This repo is based on S2A-Net and mmdetection. Thanks to the original authors!
There are two key issues that limit further improvement of existing rotated object detectors: 1) the periodic sudden change of the parameters in the rotated bounding box (RBBox) definition causes numerical discontinuity in the loss (such as the smooth L1 loss); 2) there is a gap between optimizing the RBBox regression loss and optimizing the evaluation metric. In this paper, we define a new distance between two convex polygons that describes both their degree of overlap and their degree of non-overlap. Based on this smooth distance, we propose the Polygon-to-Polygon distance loss (P2P Loss). The distance is derived from the summed areas of the triangles specified by the vertices of one polygon and the edges of the other, so the P2P Loss is continuous, differentiable, and inherently free from any particular RBBox definition. The P2P Loss is not only consistent with the detection metrics but can also measure how far, and how similar, one RBBox is from another even when the two are completely non-overlapping. These properties allow a RetinaNet using the P2P Loss to achieve 79.15% mAP on the DOTA dataset, which is competitive with many state-of-the-art rotated object detectors.
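The triangle-area idea behind the distance can be sketched as follows: for every vertex of one polygon and every edge of the other, form a triangle and accumulate its area, in both directions. This is only an illustrative NumPy sketch of that idea (the function names, symmetric summation, and lack of normalization are our assumptions, not the paper's exact formulation):

```python
import numpy as np

def triangle_area(p, a, b):
    # Area of the triangle formed by vertex p and edge (a, b),
    # via half the absolute value of the 2D cross product.
    return 0.5 * abs((a[0] - p[0]) * (b[1] - p[1])
                     - (b[0] - p[0]) * (a[1] - p[1]))

def p2p_distance(poly1, poly2):
    """Illustrative polygon-to-polygon distance: sum of triangle areas
    over every (vertex, edge) pair, taken in both directions.
    A sketch of the triangle-area idea, not the paper's exact formula."""
    total = 0.0
    for src, dst in ((poly1, poly2), (poly2, poly1)):
        n = len(dst)
        for p in src:
            for i in range(n):
                total += triangle_area(p, dst[i], dst[(i + 1) % n])
    return total

# Identical boxes give a small baseline distance; a shifted,
# non-overlapping box gives a strictly larger one.
box = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)
shifted = box + [2.0, 0.0]
print(p2p_distance(box, box), p2p_distance(box, shifted))
```

Because every term is a polynomial in the vertex coordinates, a distance built this way stays continuous and differentiable even when the two boxes do not overlap, which is the property the loss relies on.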
*All results are reported on DOTA-v1.0 test-dev.
Method | Model | Backbone | MS | Rotate | box AP | Download |
---|---|---|---|---|---|---|
SmoothL1 | RetinaNet | R-50-FPN | - | - | 69.41/69.92/69.88 | model/model/model |
P2P Loss | RetinaNet | R-50-FPN | - | - | 70.89/70.91/71.05 | model/model/model |
P2P Loss# | RetinaNet | R-50-FPN | - | - | 72.38/72.21/72.20 | model/model/model |
P2P Loss# | RetinaNet | R-50-FPN | ✓ | ✓ | 78.308 | model |
P2P Loss# | RetinaNet | R-101-DCN | ✓ | ✓ | 79.155 | model |
Please refer to install.md for installation and dataset preparation.
Please see getting_started.md for the basic usage of MMDetection.
@article{yang2022polygon,
title={Polygon-to-Polygon Distance Loss for Rotated Object Detection},
author={Yang, Yang and Chen, Jifeng and Zhong, Xiaopin and Deng, Yuanlong},
year={2022}
}
@article{han2020align,
title={Align Deep Features for Oriented Object Detection},
author={Han, Jiaming and Ding, Jian and Li, Jie and Xia, Gui-Song},
journal={arXiv preprint arXiv:2008.09397},
year={2020}
}
@inproceedings{xia2018dota,
title={DOTA: A large-scale dataset for object detection in aerial images},
author={Xia, Gui-Song and Bai, Xiang and Ding, Jian and Zhu, Zhen and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={3974--3983},
year={2018}
}
@inproceedings{Ding_2019_CVPR,
title={Learning RoI Transformer for Oriented Object Detection in Aerial Images},
author={Ding, Jian and Xue, Nan and Long, Yang and Xia, Gui-Song and Lu, Qikai},
booktitle={The IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
month={June},
year={2019}
}
@article{chen2019mmdetection,
title={MMDetection: Open mmlab detection toolbox and benchmark},
author={Chen, Kai and Wang, Jiaqi and Pang, Jiangmiao and Cao, Yuhang and Xiong, Yu and Li, Xiaoxiao and Sun, Shuyang and Feng, Wansen and Liu, Ziwei and Xu, Jiarui and others},
journal={arXiv preprint arXiv:1906.07155},
year={2019}
}