Emerging interest has been directed toward recognizing previously unseen objects given very few training examples, a task known as few-shot object detection (FSOD). Recent research demonstrates that a good feature embedding is the key to reaching favorable few-shot learning performance. We observe that object proposals with different Intersection-over-Union (IoU) scores are analogous to the intra-image augmentation used in contrastive approaches. We exploit this analogy and incorporate supervised contrastive learning to achieve more robust object representations in FSOD. We present Few-Shot object detection via Contrastive proposals Encoding (FSCE), a simple yet effective approach to learning contrastive-aware object proposal encodings that facilitate the classification of detected objects. We notice that the degradation of average precision (AP) for rare objects mainly comes from misclassifying novel instances as confusable classes, and we ease this misclassification issue by promoting instance-level intra-class compactness and inter-class variance via our contrastive proposal encoding loss (CPE loss). Our design outperforms current state-of-the-art works in any shot and all data splits, with up to +8.8% on the standard PASCAL VOC benchmark and +2.7% on the challenging COCO benchmark. Code is available at: https://github.com/bsun0802/FSCE.git
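The core idea above, pulling together proposal embeddings of the same class and pushing apart those of different classes, can be sketched as a simplified supervised contrastive loss over RoI features. This is a minimal NumPy illustration, not the paper's implementation: the actual CPE loss additionally re-weights proposals with an IoU-based function and operates on features from a learned contrastive head, while here a hard IoU cutoff is used as a simplification.

```python
import numpy as np

def cpe_loss_sketch(embeddings, labels, ious, tau=0.2, iou_threshold=0.7):
    """Simplified supervised contrastive loss over proposal embeddings.

    embeddings: (N, D) proposal features, L2-normalized below.
    labels:     (N,) class label of the ground-truth box each proposal matches.
    ious:       (N,) IoU of each proposal with its matched ground-truth box;
                proposals below `iou_threshold` are ignored (a hard cutoff,
                standing in for the paper's IoU re-weighting function).
    tau:        temperature of the softmax over similarities.
    """
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / tau  # pairwise cosine similarities, temperature-scaled
    keep = ious >= iou_threshold
    n = len(labels)
    loss, count = 0.0, 0
    for i in np.where(keep)[0]:
        # positives: other proposals matched to the same class
        pos = np.where((labels == labels[i]) & (np.arange(n) != i))[0]
        if len(pos) == 0:
            continue
        others = np.concatenate([np.arange(0, i), np.arange(i + 1, n)])
        # log-softmax of each positive against all other proposals,
        # averaged over the positives of anchor i
        log_denom = np.log(np.exp(sim[i, others]).sum())
        loss += -np.mean(sim[i, pos] - log_denom)
        count += 1
    return loss / max(count, 1)
```

With well-separated classes the loss is low; shuffling labels so that dissimilar proposals become "positives" raises it, which is exactly the intra-class compactness / inter-class variance the abstract describes.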
```bibtex
@inproceedings{sun2021fsce,
    title={FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding},
    author={Sun, Bo and Li, Banghuai and Cai, Shengcai and Yuan, Ye and Zhang, Chi},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2021}
}
```
Note: ALL the reported results use the data split released from the FSCE official repo, unless stated otherwise. Currently, each setting is only evaluated with one fixed few shot dataset. Please refer to here for more details about the dataset and data preparation.
Following the original implementation, it consists of 3 steps:
Step 1: Base training
Step 2: Reshape the bbox head of the base model
Step 3: Few shot fine-tuning
```shell
# step1: base training for voc split1
bash ./tools/detection/dist_train.sh \
    configs/detection/fsce/voc/split1/fsce_r101_fpn_voc-split1_base-training.py 8

# step2: reshape the bbox head of base model for few shot fine-tuning
python -m tools.detection.misc.initialize_bbox_head \
    --src1 work_dirs/fsce_r101_fpn_voc-split1_base-training/latest.pth \
    --method randinit \
    --save-dir work_dirs/fsce_r101_fpn_voc-split1_base-training

# step3: few shot fine-tuning
bash ./tools/detection/dist_train.sh \
    configs/detection/fsce/voc/split1/fsce_r101_fpn_voc-split1_1shot-fine-tuning.py 8
```
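Step2 performs "surgery" on the base-training checkpoint: with `--method randinit`, the bbox head's classifier and regressor are replaced by randomly initialized layers sized for base + novel classes, while all other weights are kept. A rough NumPy sketch of that idea follows; the key names (`roi_head.bbox_head.fc_cls.*`, `fc_reg.*`) follow the usual mmdetection convention and are assumptions here, not a transcript of `initialize_bbox_head`.

```python
import numpy as np

def reshape_bbox_head(state_dict, num_base, num_novel, feat_dim=1024, rng=None):
    """Sketch of a `randinit`-style bbox-head reshape.

    Keeps backbone/neck weights untouched and replaces the classification
    and regression layers with randomly initialized ones sized for
    (num_base + num_novel) classes, so fine-tuning can predict novel classes.
    """
    rng = rng or np.random.default_rng(0)
    num_total = num_base + num_novel
    new_sd = dict(state_dict)  # shallow copy; untouched keys carry over
    # classification branch: one output per class plus background
    new_sd['roi_head.bbox_head.fc_cls.weight'] = rng.normal(
        0.0, 0.01, size=(num_total + 1, feat_dim))
    new_sd['roi_head.bbox_head.fc_cls.bias'] = np.zeros(num_total + 1)
    # regression branch: 4 box deltas per class (class-specific regression)
    new_sd['roi_head.bbox_head.fc_reg.weight'] = rng.normal(
        0.0, 0.001, size=(num_total * 4, feat_dim))
    new_sd['roi_head.bbox_head.fc_reg.bias'] = np.zeros(num_total * 4)
    return new_sd
```

For PASCAL VOC this would go from 15 base classes to 15 + 5 = 20 total classes, which is why the reshaped checkpoint (not the raw base checkpoint) is what step3 loads.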
Note: In step2, the reshaped base model is saved by default to
`work_dirs/{BASE TRAINING CONFIG}/base_model_random_init_bbox_head.pth`.
When the model is saved to a different path, please update the argument `load_from`
in the step3 few shot fine-tuning configs, instead of using `resume_from`.
To use a downloaded pre-trained checkpoint, set `load_from` to the downloaded checkpoint path.

arch | contrastive loss | Split | Base AP50 | ckpt(step1) | ckpt(step2) | log |
---|---|---|---|---|---|---|
r101_fpn | N | 1 | 80.9 | ckpt | ckpt | log |
r101_fpn | N | 2 | 82.0 | ckpt | ckpt | log |
r101_fpn | N | 3 | 82.1 | ckpt | ckpt | log |
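When fine-tuning from one of the released step2 checkpoints rather than a locally produced one, the fine-tuning config should point `load_from` at the downloaded file. A hypothetical config override (the path is illustrative, not a real released filename):

```python
# In the step3 few shot fine-tuning config, override the initial weights:
load_from = ('work_dirs/fsce_r101_fpn_voc-split1_base-training/'
             'base_model_random_init_bbox_head.pth')
```

Using `load_from` initializes weights only; `resume_from` would also restore the optimizer state and epoch counter of base training, which is not what fine-tuning wants.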
arch | contrastive loss | Split | Shot | Base AP50 | Novel AP50 | ckpt | log |
---|---|---|---|---|---|---|---|
r101_fpn | N | 1 | 1 | 78.4 | 41.2 | ckpt | log |
r101_fpn | N | 1 | 2 | 77.8 | 51.1 | ckpt | log |
r101_fpn | N | 1 | 3 | 76.1 | 49.3 | ckpt | log |
r101_fpn | N | 1 | 5 | 75.9 | 59.4 | ckpt | log |
r101_fpn | N | 1 | 10 | 76.4 | 62.6 | ckpt | log |
r101_fpn | Y | 1 | 3 | 75.0 | 48.9 | ckpt | log |
r101_fpn | Y | 1 | 5 | 75.0 | 58.8 | ckpt | log |
r101_fpn | Y | 1 | 10 | 75.5 | 63.3 | ckpt | log |
r101_fpn | N | 2 | 1 | 79.8 | 25.0 | ckpt | log |
r101_fpn | N | 2 | 2 | 78.0 | 30.6 | ckpt | log |
r101_fpn | N | 2 | 3 | 76.4 | 43.4 | ckpt | log |
r101_fpn | N | 2 | 5 | 77.2 | 45.3 | ckpt | log |
r101_fpn | N | 2 | 10 | 77.5 | 50.4 | ckpt | log |
r101_fpn | Y | 2 | 3 | 76.3 | 43.3 | ckpt | log |
r101_fpn | Y | 2 | 5 | 76.6 | 45.9 | ckpt | log |
r101_fpn | Y | 2 | 10 | 76.8 | 50.4 | ckpt | log |
r101_fpn | N | 3 | 1 | 79.0 | 39.8 | ckpt | log |
r101_fpn | N | 3 | 2 | 78.4 | 41.5 | ckpt | log |
r101_fpn | N | 3 | 3 | 76.1 | 47.1 | ckpt | log |
r101_fpn | N | 3 | 5 | 77.4 | 54.1 | ckpt | log |
r101_fpn | N | 3 | 10 | 77.7 | 57.4 | ckpt | log |
r101_fpn | Y | 3 | 3 | 75.6 | 48.1 | ckpt | log |
r101_fpn | Y | 3 | 5 | 76.2 | 55.7 | ckpt | log |
r101_fpn | Y | 3 | 10 | 77.0 | 57.9 | ckpt | log |
Note: In few shot fine-tuning, only the `fc_cls` and `fc_reg` layers are fine-tuned.

arch | contrastive loss | Base mAP | ckpt(step1) | ckpt(step2) | log |
---|---|---|---|---|---|
r101_fpn | N | 39.50 | ckpt | ckpt | log |
arch | shot | contrastive loss | Base mAP | Novel mAP | ckpt | log |
---|---|---|---|---|---|---|
r101_fpn | 10 | N | 31.7 | 11.7 | ckpt | log |
r101_fpn | 30 | N | 32.3 | 16.4 | ckpt | log |