Performance of selected object detection algorithms
Detectors highlighted in red are papers that I consider must-reads. This is only my personal opinion, and the other papers are important too, so I recommend reading them if you have time.
FPS (speed) depends on the hardware (e.g., CPU, GPU, RAM), so it is hard to compare fairly across papers. The solution would be to measure every model on hardware with equivalent specifications, but that is very difficult and time-consuming.
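To illustrate why FPS figures are tied to the machine they were measured on, here is a minimal timing sketch. `measure_fps` and `dummy_infer` are hypothetical names introduced for this example, and the workload is a stand-in, not a real detector.

```python
import time


def measure_fps(infer, n_warmup=5, n_runs=50):
    """Return how many times per second `infer` completes on this machine.

    `infer` is assumed to be a zero-argument callable that runs one
    forward pass on one image.
    """
    # Warm-up runs are excluded so one-time setup costs do not skew the result.
    for _ in range(n_warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_runs):
        infer()
    elapsed = time.perf_counter() - start
    return n_runs / elapsed


def dummy_infer():
    # Stand-in workload; replace with a real detector's forward pass.
    sum(i * i for i in range(10_000))


fps = measure_fps(dummy_infer)
print(f"{fps:.1f} FPS on this machine")
```

Running the same script on two different machines will generally print different numbers, which is exactly the comparison problem described above. When timing on a GPU, the framework's device synchronization call would also be needed before reading the clock.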
Detector | VOC07 (mAP@IoU=0.5) | VOC12 (mAP@IoU=0.5) | COCO (mAP@IoU=0.5:0.95) | Published In |
---|---|---|---|---|
R-CNN | 58.5 | - | - | CVPR'14 |
SPP-Net | 59.2 | - | - | ECCV'14 |
MR-CNN | 78.2 (07+12) | 73.9 (07+12) | - | ICCV'15 |
Fast R-CNN | 70.0 (07+12) | 68.4 (07++12) | 19.7 | ICCV'15 |
Faster R-CNN | 73.2 (07+12) | 70.4 (07++12) | 21.9 | NIPS'15 |
YOLO v1 | 66.4 (07+12) | 57.9 (07++12) | - | CVPR'16 |
G-CNN | 66.8 | 66.4 (07+12) | - | CVPR'16 |
AZNet | 70.4 | - | 22.3 | CVPR'16 |
ION | 80.1 | 77.9 | 33.1 | CVPR'16 |
HyperNet | 76.3 (07+12) | 71.4 (07++12) | - | CVPR'16 |
OHEM | 78.9 (07+12) | 76.3 (07++12) | 22.4 | CVPR'16 |
MPN | - | - | 33.2 | BMVC'16 |
SSD | 76.8 (07+12) | 74.9 (07++12) | 31.2 | ECCV'16 |
GBDNet | 77.2 (07+12) | - | 27.0 | ECCV'16 |
CPF | 76.4 (07+12) | 72.6 (07++12) | - | ECCV'16 |
R-FCN | 79.5 (07+12) | 77.6 (07++12) | 29.9 | NIPS'16 |
DeepID-Net | 69.0 | - | - | TPAMI'16 |
NoC | 71.6 (07+12) | 68.8 (07+12) | 27.2 | TPAMI'16 |
DSSD | 81.5 (07+12) | 80.0 (07++12) | 33.2 | arXiv'17 |
TDM | - | - | 37.3 | CVPR'17 |
FPN | - | - | 36.2 | CVPR'17 |
YOLO v2 | 78.6 (07+12) | 73.4 (07++12) | - | CVPR'17 |
RON | 77.6 (07+12) | 75.4 (07++12) | 27.4 | CVPR'17 |
DeNet | 77.1 (07+12) | 73.9 (07++12) | 33.8 | ICCV'17 |
CoupleNet | 82.7 (07+12) | 80.4 (07++12) | 34.4 | ICCV'17 |
RetinaNet | - | - | 39.1 | ICCV'17 |
DSOD | 77.7 (07+12) | 76.3 (07++12) | - | ICCV'17 |
SMN | 70.0 | - | - | ICCV'17 |
Light-Head R-CNN | - | - | 41.5 | arXiv'17 |
YOLO v3 | - | - | 33.0 | arXiv'18 |
SIN | 76.0 (07+12) | 73.1 (07++12) | 23.2 | CVPR'18 |
STDN | 80.9 (07+12) | - | - | CVPR'18 |
RefineDet | 83.8 (07+12) | 83.5 (07++12) | 41.8 | CVPR'18 |
SNIP | - | - | 45.7 | CVPR'18 |
Relation-Network | - | - | 32.5 | CVPR'18 |
Cascade R-CNN | - | - | 42.8 | CVPR'18 |
MLKP | 80.6 (07+12) | 77.2 (07++12) | 28.6 | CVPR'18 |
Fitness-NMS | - | - | 41.8 | CVPR'18 |
RFBNet | 82.2 (07+12) | - | - | ECCV'18 |
CornerNet | - | - | 42.1 | ECCV'18 |
PFPNet | 84.1 (07+12) | 83.7 (07++12) | 39.4 | ECCV'18 |
Pelee | 70.9 (07+12) | - | - | NIPS'18 |
HKRM | 78.8 (07+12) | - | 37.8 | NIPS'18 |
M2Det | - | - | 44.2 | AAAI'19 |
R-DAD | 81.2 (07++12) | 82.0 (07++12) | 43.1 | AAAI'19 |
ScratchDet | 84.1 (07++12) | 83.6 (07++12) | 39.1 | CVPR'19 |
Libra R-CNN | - | - | 43.0 | CVPR'19 |
Reasoning-RCNN | 82.5 (07++12) | - | 43.2 | CVPR'19 |
FSAF | - | - | 44.6 | CVPR'19 |
AmoebaNet + NAS-FPN | - | - | 47.0 | CVPR'19 |
Cascade-RetinaNet | - | - | 41.1 | CVPR'19 |
TridentNet | - | - | 48.4 | ICCV'19 |
DAFS | 85.3 (07+12) | 83.1 (07++12) | 40.5 | ICCV'19 |
Auto-FPN | 81.8 (07++12) | - | 40.5 | ICCV'19 |
FCOS | - | - | 44.7 | ICCV'19 |
FreeAnchor | - | - | 44.8 | NeurIPS'19 |
DetNAS | 81.5 (07++12) | - | 42.0 | NeurIPS'19 |
NATS | - | - | 42.0 | NeurIPS'19 |
AmoebaNet + NAS-FPN + AA | - | - | 50.7 | arXiv'19 |
EfficientDet | - | - | 51.0 | arXiv'19 |
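The column headers in the table above refer to IoU thresholds: the VOC columns count a detection as correct at a single threshold of 0.5, while the COCO column averages AP over thresholds from 0.50 to 0.95 in steps of 0.05. The sketch below shows only the IoU test for a single box pair, not a full mAP computation. The function names and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so non-overlapping boxes get an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# COCO-style thresholds: 0.50, 0.55, ..., 0.95
COCO_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred, gt, thresholds):
    """Return the thresholds at which `pred` counts as a match for `gt`."""
    value = iou(pred, gt)
    return [t for t in thresholds if value >= t]


gt_box = (0, 0, 10, 10)
pred_box = (1, 1, 10, 10)
print(iou(pred_box, gt_box))  # intersection 81, union 100
print(thresholds_passed(pred_box, gt_box, COCO_THRESHOLDS))
```

In this example the prediction is a match under the VOC criterion (IoU of at least 0.5) but fails the stricter COCO thresholds above its IoU, which is one reason COCO scores in the table are much lower than VOC scores for the same detector.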
Statistics of commonly used object detection datasets. The table is taken from this survey paper.
Challenge | Object Classes | Images (Train) | Images (Val) | Images (Test) | Annotated Objects (Train) | Annotated Objects (Val) |
---|---|---|---|---|---|---|
PASCAL VOC Object Detection Challenge | ||||||
VOC07 | 20 | 2,501 | 2,510 | 4,952 | 6,301 (7,844) | 6,307 (7,818) |
VOC08 | 20 | 2,111 | 2,221 | 4,133 | 5,082 (6,337) | 5,281 (6,347) |
VOC09 | 20 | 3,473 | 3,581 | 6,650 | 8,505 (9,760) | 8,713 (9,779) |
VOC10 | 20 | 4,998 | 5,105 | 9,637 | 11,577 (13,339) | 11,797 (13,352) |
VOC11 | 20 | 5,717 | 5,823 | 10,994 | 13,609 (15,774) | 13,841 (15,787) |
VOC12 | 20 | 5,717 | 5,823 | 10,991 | 13,609 (15,774) | 13,841 (15,787) |
ILSVRC Object Detection Challenge | ||||||
ILSVRC13 | 200 | 395,909 | 20,121 | 40,152 | 345,854 | 55,502 |
ILSVRC14 | 200 | 456,567 | 20,121 | 40,152 | 478,807 | 55,502 |
ILSVRC15 | 200 | 456,567 | 20,121 | 51,294 | 478,807 | 55,502 |
ILSVRC16 | 200 | 456,567 | 20,121 | 60,000 | 478,807 | 55,502 |
ILSVRC17 | 200 | 456,567 | 20,121 | 65,500 | 478,807 | 55,502 |
MS COCO Object Detection Challenge | ||||||
MS COCO15 | 80 | 82,783 | 40,504 | 81,434 | 604,907 | 291,875 |
MS COCO16 | 80 | 82,783 | 40,504 | 81,434 | 604,907 | 291,875 |
MS COCO17 | 80 | 118,287 | 5,000 | 40,670 | 860,001 | 36,781 |
MS COCO18 | 80 | 118,287 | 5,000 | 40,670 | 860,001 | 36,781 |
Open Images Object Detection Challenge | ||||||
OID18 | 500 | 1,743,042 | 41,620 | 125,436 | 12,195,144 | - |
The main papers on the datasets used for object detection are as follows.
- [PASCAL VOC] The PASCAL Visual Object Classes (VOC) Challenge | [IJCV'10] | [pdf]
- [PASCAL VOC] The PASCAL Visual Object Classes Challenge: A Retrospective | [IJCV'15] | [pdf] | [link]
- [ImageNet] ImageNet: A Large-Scale Hierarchical Image Database | [CVPR'09] | [pdf]
- [ImageNet] ImageNet Large Scale Visual Recognition Challenge | [IJCV'15] | [pdf] | [link]
- [COCO] Microsoft COCO: Common Objects in Context | [ECCV'14] | [pdf] | [link]
- [Open Images] The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale | [arXiv'18] | [pdf] | [link]
- [DOTA] DOTA: A Large-scale Dataset for Object Detection in Aerial Images | [CVPR'18] | [pdf] | [link]
- [Objects365] Objects365: A Large-Scale, High-Quality Dataset for Object Detection | [ICCV'19] | [link]