docs/en/tutorials/0_config.md · OSCHINA-MIRROR/open-mmlab-mmflow

Tutorial 0: Learn about Configs

We incorporate modular and inheritance design into our config system, which makes it convenient to conduct various experiments. If you wish to inspect a config file, you may run python tools/misc/print_config.py /PATH/TO/CONFIG to see the complete config.

Config File Structure

There are 4 basic component types under configs/_base_: datasets, models, schedules, and default_runtime. Many methods, such as PWC-Net, can be easily constructed with one of each. Configs that are composed of components from _base_ are called primitive.

For all configs under the same folder, it is recommended to have only one primitive config. All other configs should inherit from the primitive config. In this way, the maximum inheritance level is 3.

For easy understanding, we recommend that contributors inherit from existing methods. For example, if some modification is made based on PWC-Net, users may first inherit the basic PWC-Net structure by specifying _base_ = '../pwcnet/pwcnet_slong_8x1_flyingchairs_384x448.py', and then modify the necessary fields in the config file.

If you are building an entirely new method that does not share its structure with any existing method, you may create a folder xxx under configs.

Please refer to the mmcv documentation for details.

Config File Naming Convention

We follow the style below to name config files. Contributors are advised to follow the same style.

{model}_{schedule}_[gpu x batch_per_gpu]_{training datasets}_[input_size].py

{xxx} is a required field and [yyy] is optional.

{model}: model type, such as pwcnet, flownets, etc.
{schedule}: training schedule. Following FlowNet2's convention, we use slong, sfine and sshort, or the number of iterations, such as 150k (150 thousand iterations).
[gpu x batch_per_gpu]: number of GPUs and samples per GPU, such as 8x1.
{training datasets}: training dataset, such as flyingchairs, flyingthings3d_subset, flyingthings3d.
[input_size]: the size of the training images.
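To make the convention concrete, here is a minimal sketch of a parser that splits a config file name into these fields. The function parse_config_name is hypothetical and not part of MMFlow. It assumes that the model and schedule are single tokens without underscores and that both optional fields look like NxM.

```python
import re

# Both optional fields ("8x1", "384x448") have the form <int>x<int>.
_DIMS = re.compile(r"^\d+x\d+$")


def parse_config_name(filename):
    """Split a config file name into the fields of the naming convention."""
    stem = filename[:-3] if filename.endswith(".py") else filename
    tokens = stem.split("_")
    if len(tokens) < 3:
        raise ValueError(f"not enough fields in {filename!r}")
    model, schedule, rest = tokens[0], tokens[1], tokens[2:]

    gpus = input_size = None
    # The optional GPU field sits right after the schedule.
    if _DIMS.match(rest[0]):
        gpus = rest.pop(0)
    # The optional input size is the last field.
    if rest and _DIMS.match(rest[-1]):
        input_size = rest.pop()
    if not rest:
        raise ValueError(f"missing training dataset in {filename!r}")

    # Dataset names may contain underscores, e.g. flyingthings3d_subset.
    return dict(model=model, schedule=schedule, gpus=gpus,
                datasets="_".join(rest), input_size=input_size)


fields = parse_config_name("pwcnet_slong_8x1_flyingchairs_384x448.py")
print(fields)
```

For the example name, the sketch yields model pwcnet, schedule slong, 8 GPUs with 1 sample each, dataset flyingchairs and input size 384x448.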

Config System

To help users get a basic idea of a complete config and the modules in MMFlow, we make brief comments on the config of PWC-Net trained on FlyingChairs with the slong schedule. For more detailed usage and the corresponding alternatives for each module, please refer to the API documentation and the tutorial in MMDetection (https://github.com/open-mmlab/mmdetection/blob/master/docs/tutorials/config.md).

_base_ = [
    '../_base_/models/pwcnet.py', '../_base_/datasets/flyingchairs_384x448.py',
    '../_base_/schedules/schedule_s_long.py', '../_base_/default_runtime.py'
]  # base config files which we build the new config file on.

_base_/models/pwcnet.py is the basic model config file for PWC-Net.

model = dict(
    type='PWCNet',  # The algorithm name
    encoder=dict(  # Encoder module config
        type='PWCNetEncoder',  # The name of the encoder in PWC-Net.
        in_channels=3,  # The input channels
        # The type of this sub-module. If net_type is Basic, the number of convolution layers of each level is 3;
        # if net_type is Small, the number of convolution layers of each level is 2.
        net_type='Basic',
        pyramid_levels=[
            'level1', 'level2', 'level3', 'level4', 'level5', 'level6'
        ],  # The list of feature pyramid levels that are the keys of the output dict.
        out_channels=(16, 32, 64, 96, 128, 196),  # List of numbers of output channels of each pyramid level.
        strides=(2, 2, 2, 2, 2, 2),  # List of strides of each pyramid level.
        dilations=(1, 1, 1, 1, 1, 1),  # List of dilations of each pyramid level.
        act_cfg=dict(type='LeakyReLU', negative_slope=0.1)),  # Config dict for each activation layer in ConvModule.
    decoder=dict(  # Decoder module config.
        type='PWCNetDecoder',  # The name of the flow decoder in PWC-Net.
        in_channels=dict(
            level6=81, level5=213, level4=181, level3=149, level2=117),  # Input channels of the basic dense block.
        flow_div=20.,  # The constant divisor to scale the ground truth value.
        corr_cfg=dict(type='Correlation', max_displacement=4, padding=0),
        warp_cfg=dict(type='Warp'),
        act_cfg=dict(type='LeakyReLU', negative_slope=0.1),
        scaled=False,  # Whether to scale the correlation by the number of elements involved in calculating it.
        post_processor=dict(type='ContextNet', in_channels=565),  # The configuration for the post processor.
        flow_loss=dict(  # The loss function configuration.
            type='MultiLevelEPE',
            p=2,
            reduction='sum',
            weights={  # The weights for different levels of flow.
                'level2': 0.005,
                'level3': 0.01,
                'level4': 0.02,
                'level5': 0.08,
                'level6': 0.32
            }),
    ),
    # model training and testing settings
    train_cfg=dict(),
    test_cfg=dict(),
    init_cfg=dict(
        type='Kaiming',
        nonlinearity='leaky_relu',
        layer=['Conv2d', 'ConvTranspose2d'],
        mode='fan_in',
        bias=0))
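The decoder in_channels and the post processor in_channels may look arbitrary, but they can be derived from the other values in this config. The sketch below shows one reading that reproduces them. It assumes the usual PWC-Net layout, which is not spelled out in this tutorial: each decoder below level6 receives the correlation volume, the encoder features of that level, 2 channels of upsampled flow and 2 channels of upsampled features, and the dense block has layer widths 128, 128, 96, 64 and 32.

```python
# Values copied from the model config above.
encoder_out = dict(level1=16, level2=32, level3=64,
                   level4=96, level5=128, level6=196)
max_displacement = 4

# A correlation with displacement d in each direction yields (2d + 1)^2 channels.
corr_channels = (2 * max_displacement + 1) ** 2

# Assumption: 2 channels of upsampled flow plus 2 channels of upsampled features.
UPSAMPLED_FLOW = 2
UPSAMPLED_FEAT = 2

# The coarsest level has no flow from a previous level, only the correlation.
decoder_in = {"level6": corr_channels}
for level in ("level5", "level4", "level3", "level2"):
    decoder_in[level] = (corr_channels + encoder_out[level]
                         + UPSAMPLED_FLOW + UPSAMPLED_FEAT)

# Assumption: densely connected layers whose outputs are concatenated
# with the block input before being passed to the post processor.
dense_widths = (128, 128, 96, 64, 32)
context_in = decoder_in["level2"] + sum(dense_widths)

print(decoder_in)
print(context_in)
```

Under these assumptions the computed values match in_channels of the decoder and of ContextNet, so changing out_channels or max_displacement would require updating both fields.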

in _base_/datasets/flyingchairs_384x448.py

dataset_type = 'FlyingChairs'  # Dataset name
data_root = 'data/FlyingChairs/data'  # Root path of dataset

img_norm_cfg = dict(mean=[0., 0., 0.], std=[255., 255., 255], to_rgb=False)  # Image normalization config to normalize the input images

train_pipeline = [ # Training pipeline
    dict(type='LoadImageFromFile'),  # load images
    dict(type='LoadAnnotations'),  # load flow data
    dict(type='ColorJitter',  # Randomly change the brightness, contrast, saturation and hue of an image.
         brightness=0.5,  # How much to jitter brightness.
         contrast=0.5,  # How much to jitter contrast.
         saturation=0.5,  # How much to jitter saturation.
         hue=0.5),  # How much to jitter hue.
    dict(type='RandomGamma', gamma_range=(0.7, 1.5)),  # Randomly gamma correction on images.
    dict(type='Normalize', **img_norm_cfg),  # Normalization config, the values are from img_norm_cfg
    dict(type='GaussianNoise', sigma_range=(0, 0.04), clamp_range=(0., 1.)),  # Add Gaussian noise with sigma uniformly sampled from [0, 0.04], then clamp values to [0., 1.].
    dict(type='RandomFlip', prob=0.5, direction='horizontal'),  # Random horizontal flip
    dict(type='RandomFlip', prob=0.5, direction='vertical'),   # Random vertical flip
    # Random affine transformation of images
    # Keys of global_transform and relative_transform should be the subset of
    #     ('translates', 'zoom', 'shear', 'rotate'). And also, each key and its
    #     corresponding values has to satisfy the following rules:
    #         - translates: the translation ratios along x axis and y axis. Defaults
    #             to (0., 0.).
    #         - zoom: the min and max zoom ratios. Defaults to (1.0, 1.0).
    #         - shear: the min and max shear ratios. Defaults to (1.0, 1.0).
    #         - rotate: the min and max rotate degree. Defaults to (0., 0.).
    dict(type='RandomAffine',
         global_transform=dict(
            translates=(0.05, 0.05),
            zoom=(1.0, 1.5),
            shear=(0.86, 1.16),
            rotate=(-10., 10.)
        ),
         relative_transform=dict(
            translates=(0.00375, 0.00375),
            zoom=(0.985, 1.015),
            shear=(1.0, 1.0),
            rotate=(-1.0, 1.0)
        )),
    dict(type='RandomCrop', crop_size=(384, 448)),  # Random crop the image and flow as (384, 448)
    dict(type='DefaultFormatBundle'),  # It simplifies the pipeline of formatting common fields, including "img1", "img2" and "flow_gt".
    dict(
        type='Collect',  # Collect data from the loader relevant to the specific task.
        keys=['imgs', 'flow_gt'],
        meta_keys=('img_fields', 'ann_fields', 'filename1', 'filename2',
                   'ori_filename1', 'ori_filename2', 'filename_flow',
                   'ori_filename_flow', 'ori_shape', 'img_shape',
                   'img_norm_cfg')),
]

test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='InputResize', exponent=4),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='TestFormatBundle'),  # It simplifies the pipeline of formatting common fields, including "img1"
    # and "img2".
    dict(
        type='Collect',
        keys=['imgs'],  # Collect data from the loader relevant to the specific task.
        meta_keys=('flow_gt', 'filename1', 'filename2', 'ori_filename1',
                   'ori_filename2', 'ori_shape', 'img_shape', 'img_norm_cfg',
                   'scale_factor', 'pad_shape'))  # 'flow_gt' in img_meta is used for online evaluation.
]
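The Normalize entries in the pipelines rely on Python's ** unpacking to reuse img_norm_cfg. The short sketch below shows that the unpacked form is equivalent to writing the keys out by hand. The per-pixel arithmetic at the end assumes the common (value - mean) / std normalization, and the sample pixel values are made up for illustration.

```python
img_norm_cfg = dict(mean=[0., 0., 0.], std=[255., 255., 255.], to_rgb=False)

# ** expands the dict into keyword arguments of the enclosing dict() call.
unpacked = dict(type='Normalize', **img_norm_cfg)
explicit = dict(type='Normalize',
                mean=[0., 0., 0.], std=[255., 255., 255.], to_rgb=False)
print(unpacked == explicit)

# Assumption: normalization computes (value - mean) / std per channel,
# so mean 0 and std 255 map 8-bit pixel values into [0, 1].
pixel = [0, 51, 255]  # hypothetical channel values of one pixel
normalized = [(v - m) / s
              for v, m, s in zip(pixel, img_norm_cfg['mean'], img_norm_cfg['std'])]
print(normalized)
```

Keeping the values in one intermediate variable means the training and test pipelines cannot drift apart.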

data = dict(
    train_dataloader=dict(
        samples_per_gpu=1,  # Batch size of a single GPU
        workers_per_gpu=5,  # Worker to pre-fetch data for each single GPU
        drop_last=True),  # Drops the last non-full batch

    val_dataloader=dict(
        samples_per_gpu=1,  # Batch size of a single GPU
        workers_per_gpu=2,  # Worker to pre-fetch data for each single GPU
        shuffle=False),  # Whether shuffle dataset.

    test_dataloader=dict(
        samples_per_gpu=1,  # Batch size of a single GPU
        workers_per_gpu=2,  # Worker to pre-fetch data for each single GPU
        shuffle=False),  # Whether shuffle dataset.

    train=dict(  # Train dataset config
        type=dataset_type,
        pipeline=train_pipeline,
        data_root=data_root,
        split_file='data/FlyingChairs_release/FlyingChairs_train_val.txt',  # train-validation split file
    ),

    val=dict(
        type=dataset_type,
        pipeline=test_pipeline,
        data_root=data_root,
        test_mode=True),

    test=dict(
        type=dataset_type,
        pipeline=test_pipeline,
        data_root=data_root,
        test_mode=True)
)

in _base_/schedules/schedule_s_long.py

# optimizer
optimizer = dict(
    type='Adam', lr=0.0001, weight_decay=0.0004, betas=(0.9, 0.999))
optimizer_config = dict(grad_clip=None)
# learning policy
lr_config = dict(
    policy='step',
    by_epoch=False,
    gamma=0.5,
    step=[400000, 600000, 800000, 1000000])
runner = dict(type='IterBasedRunner', max_iters=1200000)
checkpoint_config = dict(by_epoch=False, interval=100000)
evaluation = dict(interval=100000, metric='EPE')

in _base_/default_runtime.py

log_config = dict(  # config to register logger hook
    interval=50,  # Interval to print the log
    hooks=[
        dict(type='TextLoggerHook'),
        dict(type='TensorboardLoggerHook')
    ])  # The logger used to record the training process.
dist_params = dict(backend='nccl')  # Parameters to setup distributed training, the port can also be set.
log_level = 'INFO'  # The level of logging.
load_from = None  # load models as a pre-trained model from a given path. This will not resume training.
workflow = [('train', 1)]  # Workflow for runner. [('train', 1)] means there is only one workflow and the workflow named 'train' is executed once.

Modify config through script arguments

When submitting jobs using "tools/train.py" or "tools/test.py", you may specify --cfg-options to in-place modify the config.

Update config keys of dict chains.

The config options can be specified following the order of the dict keys in the original config. For example, --cfg-options model.encoder.in_channels=6.
Update keys inside a list of configs.

Some config dicts are composed as a list in your config. For example, the training pipeline data.train.pipeline is normally a list e.g. [dict(type='LoadImageFromFile'), ...]. If you want to change 'LoadImageFromFile' to 'LoadImageFromWebcam' in the pipeline, you may specify --cfg-options data.train.pipeline.0.type=LoadImageFromWebcam.
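The two kinds of keys described so far can be read as paths into nested dicts and lists. The sketch below illustrates that reading with a hypothetical helper, set_by_dotted_key. It is a simplified stand-in and not the code that tools/train.py or mmcv actually runs.

```python
def set_by_dotted_key(cfg, dotted_key, value):
    """Walk a nested dict/list structure along a dotted key and set the value."""
    parts = dotted_key.split(".")
    node = cfg
    for part in parts[:-1]:
        # Inside a list the path component is an index, otherwise a dict key.
        node = node[int(part)] if isinstance(node, list) else node[part]
    last = parts[-1]
    if isinstance(node, list):
        node[int(last)] = value
    else:
        node[last] = value


# A small excerpt in the shape of the configs shown above.
cfg = dict(
    model=dict(encoder=dict(in_channels=3)),
    data=dict(train=dict(pipeline=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations'),
    ])),
)

set_by_dotted_key(cfg, 'model.encoder.in_channels', 6)
set_by_dotted_key(cfg, 'data.train.pipeline.0.type', 'LoadImageFromWebcam')
print(cfg['model']['encoder'], cfg['data']['train']['pipeline'][0])
```

The index 0 in the second key selects the first entry of the pipeline list, which is why the order of the transforms matters when you override them from the command line.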
Update values of list/tuples.

The value to be updated may also be a list or a tuple. For example, the config file normally sets workflow=[('train', 1)]. If you want to change this key, you may specify --cfg-options workflow="[(train,1),(val,1)]". Note that the quotation mark " is necessary to support list/tuple data types, and that NO white space is allowed inside the quotation marks in the specified value.

FAQ

Ignore some fields in the base configs

Sometimes, you may set _delete_=True to ignore some of the fields in base configs. You may refer to mmcv for a simple illustration.

You may have a careful look at this tutorial for a better understanding of this feature.
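As a rough mental model of what _delete_=True changes, the sketch below merges a child dict into a base dict, with and without the marker. The function merge_cfg is a simplified, hypothetical illustration and not mmcv's implementation. The SGD settings are made-up example values.

```python
import copy


def merge_cfg(base, child):
    """Recursively merge child into a copy of base, honoring _delete_=True."""
    merged = copy.deepcopy(base)
    for key, value in child.items():
        if isinstance(value, dict):
            value = dict(value)  # shallow copy so the caller's dict is untouched
            delete = value.pop('_delete_', False)
            if not delete and isinstance(merged.get(key), dict):
                # Default behavior: keep the base keys and override matching ones.
                merged[key] = merge_cfg(merged[key], value)
            else:
                # _delete_=True (or no base dict): drop the base keys entirely.
                merged[key] = merge_cfg({}, value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = dict(optimizer=dict(
    type='Adam', lr=0.0001, weight_decay=0.0004, betas=(0.9, 0.999)))

# Without _delete_, only lr changes and the other Adam fields are kept.
tweaked = merge_cfg(base, dict(optimizer=dict(lr=0.00005)))

# With _delete_=True, the Adam-specific fields do not leak into the new dict.
swapped = merge_cfg(base, dict(optimizer=dict(
    _delete_=True, type='SGD', lr=0.01, momentum=0.9)))

print(tweaked['optimizer'])
print(swapped['optimizer'])
```

Without the marker, the second override would still carry betas from the base config, which an optimizer that does not accept that argument would reject.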

Use intermediate variables in configs

Some intermediate variables are used in the config files, like train_pipeline and test_pipeline in datasets. It's worth noting that when modifying intermediate variables in the child configs, users need to pass the intermediate variables into the corresponding fields again. An intuitive example can be found in this tutorial.
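A minimal sketch of such a child config follows. It redefines train_pipeline and then passes it into data.train.pipeline again. The shortened pipeline and the crop size of (320, 448) are illustrative assumptions, not a recommended setup.

```python
_base_ = '../pwcnet/pwcnet_slong_8x1_flyingchairs_384x448.py'

img_norm_cfg = dict(mean=[0., 0., 0.], std=[255., 255., 255.], to_rgb=False)

# Redefine the intermediate variable with a different (illustrative) crop size.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='RandomCrop', crop_size=(320, 448)),
    dict(type='DefaultFormatBundle'),
    dict(
        type='Collect',
        keys=['imgs', 'flow_gt'],
        meta_keys=('img_fields', 'ann_fields', 'filename1', 'filename2',
                   'ori_filename1', 'ori_filename2', 'filename_flow',
                   'ori_filename_flow', 'ori_shape', 'img_shape',
                   'img_norm_cfg')),
]

# Redefining train_pipeline alone is not enough: the base config already
# stored its own list in data.train.pipeline, so the field is set again here.
data = dict(train=dict(pipeline=train_pipeline))
```

If the last line were omitted, the dataset would keep using the pipeline from the base config and the new crop size would have no effect.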
