Giter VIP home page Giter VIP logo

one-shot-object-detection's Introduction

One-Shot Object Detection with Co-Attention and Co-Excitation

Introduction

Image

One-Shot Object Detection with Co-Attention and Co-Excitation
Ting-I Hsieh, Yi-Chen Lo, Hwann-Tzong Chen, Tyng-Luh Liu
Neural Information Processing Systems (NeurIPS), 2019
slide, poster

This project is a pure pytorch implementation of One-Shot Object Detection. A majority of the code is modified from jwyang/faster-rcnn.pytorch.

What we are doing and going to do

  • Support tensorboardX.
  • Upload the ImageNet pre-trained model.
  • Provide Reference image.
  • Provide checkpoint model.
  • Train PASCAL_VOC datasets

Preparation

First of all, clone the code

git clone https://github.com/timy90022/One-Shot-Object-Detection.git

1. Prerequisites

  • Ubuntu 16.04
  • Python or 3.6
  • Pytorch 1.0

2. Data Preparation

  • COCO: Please also follow the instructions in py-faster-rcnn to prepare the data. See the scripts provided in this repository.

3. Pretrained Model

We use ResNet50 as the pretrained model in our experiments. This pretrained model is trained by excluding all COCO-related ImageNet classes, which is acheived via matching the WordNet synsets of ImageNet classes to COCO classes. As a result, we keep only 933,052 images from the remaining 725 classes, while the original dataset contains 1,284,168 images of 1000 classes. The pretrained model is available at

Download and unzip them into the ../data/

4. Reference images

The reference images are retrieved by cropping out the patches with respect to the predicted bounding boxes of Mask R-CNN, and the bounding boxes need to satisfy the following conditions:

  • The IOU threshold > 0.5
  • The score confidence > 0.7

The reference images are available at

Download and unzip them into the ../data/

5. Compilation

This step can be referred to jwyang/faster-rcnn.pytorch. Install all the python dependencies using pip:

pip install -r requirements.txt

Compile the cuda dependencies using following simple commands:

cd lib
python setup.py build develop

It will compile all the modules you need, including NMS, ROI_Pooing, ROI_Align, and ROI_Crop. The default version is compiled with Python 2.7.

As pointed out in this issue, if you encounter some error during the compilation, you might miss to export the CUDA paths to your environment.

Train

Before training, set the right directory to save and load the trained models. Change the arguments "save_dir" and "load_dir" in trainval_net.py and test_net.py to adapt to your environment.

In coco dataset, we split it into 4 groups. It will train and test different category. Just to adjust "--g"(1~4). If you want to train other settings, you should sepcify "--g 0"

If you want to train parts of the dataset, try to modify "--seen".

  • 1 --> Training, session see train_categories(config file) class
  • 2 --> Testing, session see test_categories(config file) class
  • 3 --> session see train_categories + test_categories class

To train a model with ResNet50 on COCO, simply run

CUDA_VISIBLE_DEVICES=$GPU_ID python trainval_net.py \
                   --dataset coco --net res50 \
                   --bs $BATCH_SIZE --nw $WORKER_NUMBER \
                   --lr $LEARNING_RATE --lr_decay_step $DECAY_STEP \
                   --cuda --g $SPLIT --seen $SEEN

Above, BATCH_SIZE and WORKER_NUMBER can be set adaptively according to your GPU memory size. On NVIDIA V100 GPUs with 32G memory, it can be up to batch size 16.

If you have multiple (say 8) V100 GPUs, then just use them all! Try

python trainval_net.py --dataset coco --net res50 \
                       --bs $BATCH_SIZE --nw $WORKER_NUMBER \
                       --lr $LEARNING_RATE --lr_decay_step $DECAY_STEP \
                       --cuda --g $SPLIT --seen $SEEN --mGPUs

Test

If you want to evlauate the detection performance of ResNet50 model on COCO test set, you can train by yourself or download the models from Google Drive and unzip them into the ./models/res50/.

Simply run

python test_net.py --dataset coco --net res50 \
                   --s $SESSION --checkepoch $EPOCH --p $CHECKPOINT \
                   --cuda --g $SPLIT

Specify the model session, checkepoch and checkpoint, e.g., SESSION=1, EPOCH=10, CHECKPOINT=1663.

If you want to test our model checkpoint, simple run

For coco first group:

python test_net.py --s 1  --g 1 --a 4 --cuda

For coco second group:

python test_net.py --s 2  --g 2 --a 4 --cuda

Acknowledgments

Code is based on jwyang/faster-rcnn.pytorch and AlexHex7/Non-local_pytorch.

Citation

@incollection{NIPS2019_8540,
  title 	= {One-Shot Object Detection with Co-Attention and Co-Excitation},
  author 	= {Hsieh, Ting-I and Lo, Yi-Chen and Chen, Hwann-Tzong and Liu, Tyng-Luh},
  booktitle 	= {Advances in Neural Information Processing Systems 32},
  year		= {2019},
  publisher 	= {Curran Associates, Inc.}
}

one-shot-object-detection's People

Contributors

timy90022 avatar htchen avatar

Watchers

James Cloos avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.