
Cascaded Zoom-in Detector for High Resolution Aerial Images

License: MIT

This is the PyTorch implementation of our paper:
Cascaded Zoom-in Detector for High Resolution Aerial Images
Akhil Meethal, Eric Granger, Marco Pedersoli
[arXiv] [CVPRw]

Accepted at: CVPRw 2023 (EarthVision Workshop organized by IEEE GRSS)

The method proposed in this paper can be easily integrated into the detector of your choice to improve its small object detection performance. In this repo, we demonstrate it with the two-stage Faster RCNN detector and the one-stage, anchor-free FCOS detector.

Installation

Prerequisites

  • Linux or macOS with Python ≥ 3.6
  • PyTorch ≥ 1.5 and torchvision that matches the PyTorch installation.
  • Detectron2

Install PyTorch in Conda env

# create conda env
conda create -n detectron2 python=3.6
# activate the environment
conda activate detectron2
# install PyTorch >=1.5 with GPU
conda install pytorch torchvision -c pytorch

Build Detectron2 from Source

Follow the INSTALL.md to install Detectron2.

Dataset download

  1. Download VisDrone dataset

Follow the instructions on the VisDrone page.

  2. Organize the dataset as follows:
croptrain/
└── datasets/
    └── VisDrone/
        ├── train/
        ├── val/
        ├── annotations_VisDrone_train.json
        └── annotations_VisDrone_val.json
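
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch using only the Python standard library; the helper name `missing_entries` is hypothetical and not part of this repo, and the default path assumes the script is run from the directory that contains `croptrain/`.

```python
from pathlib import Path

# Entries expected under croptrain/datasets/VisDrone, taken from the tree above.
EXPECTED_ENTRIES = (
    "train",
    "val",
    "annotations_VisDrone_train.json",
    "annotations_VisDrone_val.json",
)


def missing_entries(dataset_root, expected=EXPECTED_ENTRIES):
    """Return the expected entries that are absent from dataset_root (hypothetical helper)."""
    root = Path(dataset_root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # Assumed location; adjust if your checkout lives elsewhere.
    missing = missing_entries("croptrain/datasets/VisDrone")
    if missing:
        print("Missing entries:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```

The same helper could be pointed at a DOTA directory by passing a different `expected` tuple.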

The original annotations provided with the VisDrone dataset are in PASCAL VOC format. I used this code to convert them to COCO-style annotations: VOC2COCO.
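
To give a rough idea of what such a conversion involves, here is a minimal sketch that turns PASCAL VOC XML documents into a COCO-style dictionary. It is not the VOC2COCO tool mentioned above. The function name `voc_to_coco` and the class list passed to it are hypothetical, and the sketch assumes each XML file carries the usual VOC fields (`filename`, `size`, and `object` entries with `name` and `bndbox`). It also keeps pixel coordinates as-is, without the 1-based offset correction some converters apply.

```python
import xml.etree.ElementTree as ET


def voc_to_coco(xml_strings, class_names):
    """Convert VOC XML documents (one per image) into a COCO-style dict (illustrative only)."""
    # Category ids start at 1, in the order the class names are given.
    cat_ids = {name: i + 1 for i, name in enumerate(class_names)}
    coco = {
        "images": [],
        "annotations": [],
        "categories": [{"id": i, "name": n} for n, i in cat_ids.items()],
    }
    ann_id = 1
    for image_id, xml_text in enumerate(xml_strings, start=1):
        root = ET.fromstring(xml_text)
        size = root.find("size")
        coco["images"].append({
            "id": image_id,
            "file_name": root.findtext("filename"),
            "width": int(size.findtext("width")),
            "height": int(size.findtext("height")),
        })
        for obj in root.iter("object"):
            box = obj.find("bndbox")
            xmin, ymin, xmax, ymax = (
                float(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
            )
            w, h = xmax - xmin, ymax - ymin
            coco["annotations"].append({
                "id": ann_id,
                "image_id": image_id,
                "category_id": cat_ids[obj.findtext("name")],
                "bbox": [xmin, ymin, w, h],  # COCO boxes are [x, y, width, height]
                "area": w * h,
                "iscrowd": 0,
            })
            ann_id += 1
    return coco
```

The resulting dictionary can be written out with `json.dump` to produce a file in the same shape as the annotation files named in the tree above.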

Update: I am sharing the JSON files I generated for the VisDrone dataset via Google Drive below.

a) annotations_VisDrone_train.json

b) annotations_VisDrone_val.json

  3. Download DOTA dataset

Please follow the instructions on the DOTA page. Organize it the same way as above. You can also download the JSON files for the training and validation sets below:

a) annotations_DOTA_train.json

b) annotations_DOTA_val.json
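
After downloading or generating an annotation file, a quick sanity check is to count its images, annotations, and per-category boxes. The sketch below assumes the file follows the standard COCO layout with `images`, `annotations`, and `categories` keys; the function name `summarize_coco` is hypothetical.

```python
import json
from collections import Counter


def summarize_coco(json_path):
    """Return image count, annotation count, and per-category annotation counts."""
    with open(json_path, "r", encoding="utf-8") as f:
        coco = json.load(f)
    # Map category ids to readable names; unknown ids are grouped under "unknown".
    id_to_name = {c["id"]: c["name"] for c in coco.get("categories", [])}
    per_category = Counter(
        id_to_name.get(a["category_id"], "unknown")
        for a in coco.get("annotations", [])
    )
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(coco.get("annotations", [])),
        "per_category": dict(per_category),
    }
```

For example, `summarize_coco("croptrain/datasets/VisDrone/annotations_VisDrone_val.json")` would report the totals for the VisDrone validation file, assuming it sits at the path shown in the tree above.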

Training

  • Train the basic supervised model on the VisDrone dataset
python train_net.py \
      --num-gpus 1 \
      --config-file configs/Base-RCNN-FPN.yaml \
      OUTPUT_DIR outputs_FPN_VisDrone
  • Train the basic supervised model on the DOTA dataset
python train_net.py \
      --num-gpus 1 \
      --config-file configs/Dota-Base-RCNN-FPN.yaml \
      OUTPUT_DIR outputs_FPN_DOTA
  • Train the Cascaded Zoom-in Detector on the VisDrone dataset
python train_net.py \
      --num-gpus 1 \
      --config-file configs/RCNN-FPN-CROP.yaml \
      OUTPUT_DIR outputs_FPN_CROP_VisDrone
  • Train the Cascaded Zoom-in Detector on the DOTA dataset
python train_net.py \
      --num-gpus 1 \
      --config-file configs/Dota-RCNN-FPN-CROP.yaml \
      OUTPUT_DIR outputs_FPN_CROP_DOTA

Resume the training

python train_net.py \
      --resume \
      --num-gpus 1 \
      --config-file configs/Base-RCNN-FPN.yaml \
      OUTPUT_DIR outputs_FPN_VisDrone
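
The training and resume commands above differ only in the config file, the output directory, and the `--resume` flag. If you launch runs from a script, a small helper along these lines may be convenient. This is only a sketch: the `RUNS` keys and the function name `build_train_command` are hypothetical, while the config paths, output directories, and flags are copied from the commands above.

```python
import shlex

# Config files and output directories as they appear in the commands above.
# The dictionary keys are made-up labels for illustration.
RUNS = {
    "visdrone_baseline": ("configs/Base-RCNN-FPN.yaml", "outputs_FPN_VisDrone"),
    "dota_baseline": ("configs/Dota-Base-RCNN-FPN.yaml", "outputs_FPN_DOTA"),
    "visdrone_crop": ("configs/RCNN-FPN-CROP.yaml", "outputs_FPN_CROP_VisDrone"),
    "dota_crop": ("configs/Dota-RCNN-FPN-CROP.yaml", "outputs_FPN_CROP_DOTA"),
}


def build_train_command(run_name, num_gpus=1, resume=False):
    """Build the train_net.py argument list for one of the runs above."""
    config_file, output_dir = RUNS[run_name]
    cmd = ["python", "train_net.py"]
    if resume:
        cmd.append("--resume")
    cmd += [
        "--num-gpus", str(num_gpus),
        "--config-file", config_file,
        "OUTPUT_DIR", output_dir,
    ]
    return cmd


if __name__ == "__main__":
    # Print each command instead of running it.
    for name in RUNS:
        print(" ".join(shlex.quote(part) for part in build_train_command(name)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to start the run.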

Evaluation

python train_net.py \
      --eval-only \
      --num-gpus 1 \
      --config-file configs/Base-RCNN-FPN.yaml \
      MODEL.WEIGHTS <your weight>.pth

Results comparison on the VisDrone dataset

Citing Cascaded Zoom-in Detector

If you use Cascaded Zoom-in Detector in your research or wish to refer to the results published in the paper, please use the following BibTeX entry.

@inproceedings{meethal2023czdetector,
    title={Cascaded Zoom-in Detector for High Resolution Aerial Images},
    author={Meethal, Akhil and Granger, Eric and Pedersoli, Marco},
    booktitle={CVPRw},
    year={2023},
}

Also, if you use Detectron2 in your research, please use the following BibTeX entry.

@misc{wu2019detectron2,
  author =       {Yuxin Wu and Alexander Kirillov and Francisco Massa and
                  Wan-Yen Lo and Ross Girshick},
  title =        {Detectron2},
  howpublished = {\url{https://github.com/facebookresearch/detectron2}},
  year =         {2019}
}

License

This project is licensed under the MIT License, as found in the LICENSE file.
