git clone https://github.com/lufficc/SSD
cd SSD
conda create --name ssd
conda activate ssd
python -m pip install yacs vizer opencv-python tqdm tensorboardX
python -m pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html
update-moreh --target 22.8.3 --force
For this task, I tested with the VOC dataset, the COCO you can do the same, just replace the VOC_ROOT into the COCO_ROOT with the exact root to the COCO path. The data that I used for train was downloaded via following scripts
# /bin/bash
# > get_voc.sh
VOC_ROOT="/path/to/voc"
cd $VOC_ROOT
wget https://pjreddie.com/media/files/VOCtrainval_11-May-2012.tar
wget https://pjreddie.com/media/files/VOCtrainval_06-Nov-2007.tar
wget https://pjreddie.com/media/files/VOCtest_06-Nov-2007.tar
tar xf VOCtrainval_11-May-2012.tar
tar xf VOCtrainval_06-Nov-2007.tar
tar xf VOCtest_06-Nov-2007.tar
> VOC_ROOT=/data/work/dataset/VOCdevkit python train.py --config-file configs/vgg_ssd300_voc0712.yaml
> VOC_ROOT=/data/work/dataset/VOCdevkit python train.py --config-file configs/mobilenet_v3_ssd320_voc0712.yaml
> VOC_ROOT=/data/work/dataset/VOCdevkit python train.py --config-file configs/mobilenet_v2_ssd320_voc0712.yaml
> VOC_ROOT=/data/work/dataset/VOCdevkit python train.py --config-file configs/efficient_net_b3_ssd300_voc0712.yaml
# Choose one of them
Note: With torchvision==1.1x: the following command will produce false
>>> `ssd/utils/nms.py` Line 7:
if torchvision.__version__ >= '0.3.0':
_nms = torchvision.ops.nms
else:
warnings.warn('No NMS is available. Please upgrade torchvision to 0.3.0+')
sys.exit(-1)
Hot fix: Simply remove above lines and add following to the same file
_nms = torchvision.ops.nms
This repository implements SSD (Single Shot MultiBox Detector). The implementation is heavily influenced by the projects ssd.pytorch, pytorch-ssd and maskrcnn-benchmark. This repository aims to be the code base for researches based on SSD.
Losses | Learning rate | Metrics |
---|---|---|
- PyTorch 1.0: Support PyTorch 1.0 or higher.
- Multi-GPU training and inference: We use
DistributedDataParallel
, you can train or test with arbitrary GPU(s), the training schema will change accordingly. - Modular: Add your own modules without pain. We abstract
backbone
,Detector
,BoxHead
,BoxPredictor
, etc. You can replace every component with your own code without change the code base. For example, You can add EfficientNet as backbone, just addefficient_net.py
(ALREADY ADDED) and register it, specific it in the config file, It's done! - CPU support for inference: runs on CPU in inference time.
- Smooth and enjoyable training procedure: we save the state of model, optimizer, scheduler, training iter, you can stop your training and resume training exactly from the save point without change your training
CMD
. - Batched inference: can perform inference using multiple images per batch per GPU.
- Evaluating during training: eval you model every
eval_step
to check performance improving or not. - Metrics Visualization: visualize metrics details in tensorboard, like AP, APl, APm and APs for COCO dataset or mAP and 20 categories' AP for VOC dataset.
- Auto download: load pre-trained weights from URL and cache it.
- Python3
- PyTorch 1.0 or higher
- yacs
- Vizer
- GCC >= 4.9
- OpenCV
git clone https://github.com/lufficc/SSD.git
cd SSD
# Required packages: torch torchvision yacs tqdm opencv-python vizer
pip install -r requirements.txt
# Done! That's ALL! No BUILD! No bothering SETUP!
# It's recommended to install the latest release of torch and torchvision.
For Pascal VOC dataset, make the folder structure like this:
VOC_ROOT
|__ VOC2007
|_ JPEGImages
|_ Annotations
|_ ImageSets
|_ SegmentationClass
|__ VOC2012
|_ JPEGImages
|_ Annotations
|_ ImageSets
|_ SegmentationClass
|__ ...
Where VOC_ROOT
default is datasets
folder in current project, you can create symlinks to datasets
or export VOC_ROOT="/path/to/voc_root"
.
For COCO dataset, make the folder structure like this:
COCO_ROOT
|__ annotations
|_ instances_valminusminival2014.json
|_ instances_minival2014.json
|_ instances_train2014.json
|_ instances_val2014.json
|_ ...
|__ train2014
|_ <im-1-name>.jpg
|_ ...
|_ <im-N-name>.jpg
|__ val2014
|_ <im-1-name>.jpg
|_ ...
|_ <im-N-name>.jpg
|__ ...
Where COCO_ROOT
default is datasets
folder in current project, you can create symlinks to datasets
or export COCO_ROOT="/path/to/coco_root"
.
# for example, train SSD300:
python train.py --config-file configs/vgg_ssd300_voc0712.yaml
# for example, train SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS train.py --config-file configs/vgg_ssd300_voc0712.yaml SOLVER.WARMUP_FACTOR 0.03333 SOLVER.WARMUP_ITERS 1000
The configuration files that I provide assume that we are running on single GPU. When changing number of GPUs, hyper-parameter (lr, max_iter, ...) will also changed according to this paper: Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour.
# for example, evaluate SSD300:
python test.py --config-file configs/vgg_ssd300_voc0712.yaml
# for example, evaluate SSD300 with 4 GPUs:
export NGPUS=4
python -m torch.distributed.launch --nproc_per_node=$NGPUS test.py --config-file configs/vgg_ssd300_voc0712.yaml
Predicting image in a folder is simple:
python demo.py --config-file configs/vgg_ssd300_voc0712.yaml --images_dir demo --ckpt https://github.com/lufficc/SSD/releases/download/1.2/vgg_ssd300_voc0712.pth
Then it will download and cache vgg_ssd300_voc0712.pth
automatically and predicted images with boxes, scores and label names will saved to demo/result
folder by default.
You will see a similar output:
(0001/0005) 004101.jpg: objects 01 | load 010ms | inference 033ms | FPS 31
(0002/0005) 003123.jpg: objects 05 | load 009ms | inference 019ms | FPS 53
(0003/0005) 000342.jpg: objects 02 | load 009ms | inference 019ms | FPS 51
(0004/0005) 008591.jpg: objects 02 | load 008ms | inference 020ms | FPS 50
(0005/0005) 000542.jpg: objects 01 | load 011ms | inference 019ms | FPS 53
VOC2007 test | coco test-dev2015 | |
---|---|---|
SSD300* | 77.2 | 25.1 |
SSD512* | 79.8 | 28.8 |
Backbone | Input Size | box AP | Model Size | Download |
---|---|---|---|---|
VGG16 | 300 | 25.2 | 262MB | model |
VGG16 | 512 | 29.0 | 275MB | model |
Backbone | Input Size | mAP | Model Size | Download |
---|---|---|---|---|
VGG16 | 300 | 77.7 | 201MB | model |
VGG16 | 512 | 80.7 | 207MB | model |
Mobilenet V2 | 320 | 68.9 | 25.5MB | model |
Mobilenet V3 | 320 | 69.5 | 29.9MB | model |
EfficientNet-B3 | 300 | 73.9 | 97.1MB | model |
If you want to add your custom components, please see DEVELOP_GUIDE.md for more details.
If you have issues running or compiling this code, we have compiled a list of common issues in TROUBLESHOOTING.md. If your issue is not present there, please feel free to open a new issue.
If you use this project in your research, please cite this project.
@misc{lufficc2018ssd,
author = {Congcong Li},
title = {{High quality, fast, modular reference implementation of SSD in PyTorch}},
year = {2018},
howpublished = {\url{https://github.com/lufficc/SSD}}
}