LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection

This is the official repository with the PyTorch implementation of LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection.

☀️ If you find this work useful for your research, please kindly star our repo and cite our paper! ☀️

Highlights

  • Release a series of real-time detection models in LW-DETR, including LW-DETR-tiny, LW-DETR-small, LW-DETR-medium, LW-DETR-large and LW-DETR-xlarge, named <LWDETR_*size_60e_coco.pth>. Please refer to Hugging Face to download.
  • Release a series of pretrained models in LW-DETR. Please refer to Hugging Face to download.

1. Introduction

LW-DETR is a light-weight detection transformer that outperforms YOLOs for real-time object detection. The architecture is a simple stack of a ViT encoder, a projector, and a shallow DETR decoder. LW-DETR leverages recent advanced techniques: training-effective techniques, e.g., improved loss and pretraining, and interleaved window and global attentions for reducing the ViT encoder complexity. LW-DETR improves the ViT encoder by aggregating multi-level feature maps, i.e., the intermediate and final feature maps in the ViT encoder, to form richer feature maps, and introduces a window-major feature map organization for improving the efficiency of interleaved attention computation. LW-DETR outperforms existing real-time detectors, e.g., YOLO and its variants, on COCO and other benchmark datasets.
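
To make the window-major idea concrete, here is a minimal pure-Python sketch. It is not the repository's implementation; the function name `window_major_order` and the requirement that the feature map divide evenly into square windows are assumptions made for illustration. It lists the row-major token indices of a feature map in an order where all tokens of one window sit next to each other, so window attention can work on contiguous chunks, while global attention simply attends over the whole sequence.

```python
def window_major_order(height, width, window):
    """Return row-major token indices reordered so each window is contiguous.

    Hypothetical helper for illustration; assumes square windows that tile
    the feature map exactly.
    """
    if height % window or width % window:
        raise ValueError("height and width must be divisible by the window size")
    order = []
    for win_row in range(0, height, window):          # windows, top to bottom
        for win_col in range(0, width, window):       # windows, left to right
            for r in range(win_row, win_row + window):
                for c in range(win_col, win_col + window):
                    order.append(r * width + c)       # row-major index of token (r, c)
    return order


if __name__ == "__main__":
    # A 4x4 feature map split into 2x2 windows.
    print(window_major_order(4, 4, 2))
```

Each consecutive run of `window * window` indices in the result belongs to one window, which is the property the window-major organization relies on.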

2. Installation

Requirements

The code is developed and validated under python=3.8.19, pytorch=1.13.0, cuda=11.6, TensorRT-8.6.1.6. Higher versions may work as well.

  1. Create your own Python environment with Anaconda.
conda create -n lwdetr python=3.8.19
conda activate lwdetr
  2. Clone this repo.
git clone https://github.com/Atten4Vis/LW-DETR.git
cd LW-DETR
  3. Install PyTorch and torchvision.

Follow the instructions at https://pytorch.org/get-started/locally/.

# an example:
conda install pytorch==1.13.0 torchvision==0.14.0 pytorch-cuda=11.6 -c pytorch -c nvidia
  4. Install the required packages.

For training and evaluation:

pip install -r requirements.txt

For deployment:

Please refer to NVIDIA's documentation for TensorRT installation instructions, then run:

pip install -r deploy/requirements.txt
  5. Compile the CUDA operators.
cd models/ops
python setup.py build install
# unit test (all checks should print True)
python test.py
cd ../..
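
After installation, it can help to compare the active environment against the versions listed under Requirements. The sketch below is not part of the repository; `parse_version`, `installed_versions`, and `report` are hypothetical helpers, and the check only reports whether a component is at least as new as the validated version, which does not guarantee compatibility.

```python
import platform

# Versions the README reports as validated.
VALIDATED = {"python": "3.8.19", "pytorch": "1.13.0", "cuda": "11.6"}


def parse_version(text):
    """Turn a string such as '1.13.0+cu116' into (1, 13, 0)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        if not piece.isdigit():      # stop at suffixes such as '0rc1'
            break
        parts.append(int(piece))
    return tuple(parts)


def installed_versions():
    """Collect versions from the running interpreter; PyTorch is optional."""
    found = {"python": platform.python_version()}
    try:
        import torch
    except ImportError:
        return found
    found["pytorch"] = torch.__version__
    if torch.version.cuda is not None:   # None for CPU-only builds
        found["cuda"] = torch.version.cuda
    return found


def report(found):
    """Return one status line per validated component."""
    lines = []
    for name, wanted in VALIDATED.items():
        have = found.get(name)
        if have is None:
            status = "not found"
        elif parse_version(have) >= parse_version(wanted):
            status = "ok"
        else:
            status = "older than validated"
        lines.append(f"{name}: {have} (validated {wanted}): {status}")
    return lines


if __name__ == "__main__":
    print("\n".join(report(installed_versions())))
```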

3. Preparation

Data preparation

For the MS COCO dataset, please download and extract the COCO 2017 train and val images with annotations from http://cocodataset.org. We expect the directory structure to be the following:

COCODIR/
  ├── train2017/
  ├── val2017/
  └── annotations/
      ├── instances_train2017.json
      └── instances_val2017.json
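
The layout above can be checked before launching a long run. This is a small sketch using only the standard library; `missing_coco_entries` is a hypothetical helper, not something the repository provides, and it only checks that the entries exist, not that their contents are valid.

```python
from pathlib import Path

# Entries the README expects inside COCODIR.
EXPECTED = [
    "train2017",
    "val2017",
    "annotations/instances_train2017.json",
    "annotations/instances_val2017.json",
]


def missing_coco_entries(coco_dir):
    """Return the expected entries that do not exist under coco_dir."""
    root = Path(coco_dir)
    return [entry for entry in EXPECTED if not (root / entry).exists()]


if __name__ == "__main__":
    import sys

    missing = missing_coco_entries(sys.argv[1] if len(sys.argv) > 1 else ".")
    print("layout ok" if not missing else f"missing: {missing}")
```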

For pretraining on the Objects365 dataset, please download the Objects365 images with annotations from https://www.objects365.org/overview.html.

Model preparation

All the checkpoints can be found on Hugging Face.

  1. Pretraining on Objects365.
  • Pretraining the ViT.

We pretrain the ViT on Objects365 using CAE v2, a masked image modeling (MIM) method, starting from the pretrained models. Please refer to the following link to download the pretrained models, and put them into pretrain_weights/.

| Model | Comment |
| --- | --- |
| caev2_tiny_300e_objects365 | pretrained ViT model on objects365 for LW-DETR-tiny/small using CAE v2 |
| caev2_tiny_300e_objects365 | pretrained ViT model on objects365 for LW-DETR-medium/large using CAE v2 |
| caev2_tiny_300e_objects365 | pretrained ViT model on objects365 for LW-DETR-xlarge using CAE v2 |
  • Pretraining LW-DETR.

We retrain the encoder and train the projector and the decoder on Objects365 in a supervised manner. Please refer to the following link to download the pretrained models, and put them into pretrain_weights/.

| Model | Comment |
| --- | --- |
| LWDETR_tiny_30e_objects365 | pretrained LW-DETR-tiny model on objects365 |
| LWDETR_small_30e_objects365 | pretrained LW-DETR-small model on objects365 |
| LWDETR_medium_30e_objects365 | pretrained LW-DETR-medium model on objects365 |
| LWDETR_large_30e_objects365 | pretrained LW-DETR-large model on objects365 |
| LWDETR_xlarge_30e_objects365 | pretrained LW-DETR-xlarge model on objects365 |
  2. Finetuning on COCO. We finetune the pretrained model on COCO. If you want to reproduce our training yourself, please skip this step. If you want to directly evaluate our trained models, please refer to the following link to download the finetuned models, and put them into output/.
| Model | Comment |
| --- | --- |
| LWDETR_tiny_60e_coco | finetuned LW-DETR-tiny model on COCO |
| LWDETR_small_60e_coco | finetuned LW-DETR-small model on COCO |
| LWDETR_medium_60e_coco | finetuned LW-DETR-medium model on COCO |
| LWDETR_large_60e_coco | finetuned LW-DETR-large model on COCO |
| LWDETR_xlarge_60e_coco | finetuned LW-DETR-xlarge model on COCO |
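
The naming scheme in the tables above is regular enough to generate. The following sketch builds the expected location of a checkpoint from the model size and training stage. `checkpoint_path` is a hypothetical helper, and the `.pth` extension for the Objects365 checkpoints is an assumption carried over from the COCO file names mentioned in Highlights; the files on Hugging Face may be named differently.

```python
from pathlib import Path

SIZES = ("tiny", "small", "medium", "large", "xlarge")


def checkpoint_path(size, stage, root="."):
    """Return where the README suggests placing a downloaded checkpoint.

    stage is 'objects365' (pretrained, 30 epochs) or 'coco' (finetuned, 60 epochs).
    """
    if size not in SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    if stage == "objects365":
        # Assumed extension; the README only shows '.pth' for the COCO models.
        return Path(root) / "pretrain_weights" / f"LWDETR_{size}_30e_objects365.pth"
    if stage == "coco":
        return Path(root) / "output" / f"LWDETR_{size}_60e_coco.pth"
    raise ValueError(f"unknown stage: {stage!r}")


if __name__ == "__main__":
    for size in SIZES:
        path = checkpoint_path(size, "coco")
        print(path, "found" if path.exists() else "not downloaded yet")
```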

4. Train

To train on the COCO dataset, run the scripts/lwdetr_<model_size>_coco_train.sh script directly.

Train a LW-DETR-tiny model
sh scripts/lwdetr_tiny_coco_train.sh /path/to/your/COCODIR
Train a LW-DETR-small model
sh scripts/lwdetr_small_coco_train.sh /path/to/your/COCODIR
Train a LW-DETR-medium model
sh scripts/lwdetr_medium_coco_train.sh /path/to/your/COCODIR
Train a LW-DETR-large model
sh scripts/lwdetr_large_coco_train.sh /path/to/your/COCODIR
Train a LW-DETR-xlarge model
sh scripts/lwdetr_xlarge_coco_train.sh /path/to/your/COCODIR

5. Eval

To evaluate on the COCO dataset, run the scripts/lwdetr_<model_size>_coco_eval.sh script directly. Please refer to 3. Preparation to download a series of LW-DETR models.

Eval our pretrained LW-DETR-tiny model
sh scripts/lwdetr_tiny_coco_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint
Eval our pretrained LW-DETR-small model
sh scripts/lwdetr_small_coco_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint
Eval our pretrained LW-DETR-medium model
sh scripts/lwdetr_medium_coco_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint
Eval our pretrained LW-DETR-large model
sh scripts/lwdetr_large_coco_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint
Eval our pretrained LW-DETR-xlarge model
sh scripts/lwdetr_xlarge_coco_eval.sh /path/to/your/COCODIR /path/to/your/checkpoint

6. Deploy

Export models

You can run the scripts/lwdetr_<model_size>_coco_export.sh script to export models for deployment. Before running it, please ensure that the TensorRT and cuDNN environment variables are set correctly.

Export a LW-DETR-tiny model
# export ONNX model
sh scripts/lwdetr_tiny_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint
# convert model from ONNX to TensorRT engine as well
sh scripts/lwdetr_tiny_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint --trt
Export a LW-DETR-small model
# export ONNX model
sh scripts/lwdetr_small_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint
# convert model from ONNX to TensorRT engine as well
sh scripts/lwdetr_small_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint --trt
Export a LW-DETR-medium model
# export ONNX model
sh scripts/lwdetr_medium_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint
# convert model from ONNX to TensorRT engine as well
sh scripts/lwdetr_medium_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint --trt
Export a LW-DETR-large model
# export ONNX model
sh scripts/lwdetr_large_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint
# convert model from ONNX to TensorRT engine as well
sh scripts/lwdetr_large_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint --trt
Export a LW-DETR-xlarge model
# export ONNX model
sh scripts/lwdetr_xlarge_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint
# convert model from ONNX to TensorRT engine as well
sh scripts/lwdetr_xlarge_coco_export.sh /path/to/your/COCODIR /path/to/your/checkpoint --trt
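
The train, eval, and export commands above all follow one pattern, so they can be generated when several sizes need to be processed in a loop. This is a sketch only: `build_command` is a hypothetical helper that assembles the argument list shown in this README and does not run anything itself.

```python
import shlex

SIZES = ("tiny", "small", "medium", "large", "xlarge")
TASKS = ("train", "eval", "export")


def build_command(task, size, coco_dir, checkpoint=None, trt=False):
    """Assemble the shell command for one of the scripts/ entry points."""
    if task not in TASKS:
        raise ValueError(f"unknown task: {task!r}")
    if size not in SIZES:
        raise ValueError(f"unknown model size: {size!r}")
    if task == "train" and checkpoint is not None:
        raise ValueError("the train script takes only the COCO directory")
    if task != "train" and checkpoint is None:
        raise ValueError(f"the {task} script needs a checkpoint path")
    if trt and task != "export":
        raise ValueError("--trt only applies to the export script")

    command = ["sh", f"scripts/lwdetr_{size}_coco_{task}.sh", coco_dir]
    if checkpoint is not None:
        command.append(checkpoint)
    if trt:
        command.append("--trt")   # also convert the ONNX model to a TensorRT engine
    return command


if __name__ == "__main__":
    for size in SIZES:
        cmd = build_command("export", size, "/path/to/your/COCODIR",
                            "/path/to/your/checkpoint", trt=True)
        print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to actually launch the script.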

Run benchmark

You can use the deploy/benchmark.py tool to benchmark inference latency.

# evaluate and benchmark the latency on an ONNX model
python deploy/benchmark.py --path=/path/to/your/onnxmodel --coco_path=/path/to/your/COCODIR --run_benchmark
# evaluate and benchmark the latency on a TensorRT engine
python deploy/benchmark.py --path=/path/to/your/trtengine --coco_path=/path/to/your/COCODIR --run_benchmark
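
For readers who want to time their own inference call, the general shape of a latency measurement is sketched below. It does not reproduce what deploy/benchmark.py does internally; `measure_latency_ms` and its `warmup` and `iterations` parameters are illustrative assumptions. The sketch discards a few warm-up runs, then times repeated calls with a monotonic clock.

```python
import statistics
import time


def measure_latency_ms(run_once, warmup=10, iterations=100):
    """Time a zero-argument callable and return latency statistics in ms."""
    for _ in range(warmup):          # warm-up runs are not timed
        run_once()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "min_ms": min(samples),
    }


if __name__ == "__main__":
    # Stand-in workload; replace with a function that runs one inference.
    stats = measure_latency_ms(lambda: sum(range(10_000)))
    print({name: round(value, 4) for name, value in stats.items()})
```

When the workload runs on a GPU, the callable should wait for the device to finish (for example with `torch.cuda.synchronize()` in PyTorch) before returning, because CUDA kernels are launched asynchronously and the timer would otherwise stop too early.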

7. Main Results

The main results on the COCO dataset. We report the mAP from the original paper, followed in parentheses by the mAP obtained from re-implementation.

| Method | pretraining | Params (M) | FLOPs (G) | Model Latency (ms) | Total Latency (ms) | mAP | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| LW-DETR-tiny | | 12.1 | 11.2 | 2.0 | 2.0 | 42.6(42.9) | Link |
| LW-DETR-small | | 14.6 | 16.6 | 2.9 | 2.9 | 48.0(48.1) | Link |
| LW-DETR-medium | | 28.2 | 42.8 | 5.6 | 5.6 | 52.5(52.6) | Link |
| LW-DETR-large | | 46.8 | 71.6 | 8.8 | 8.8 | 56.1(56.1) | Link |
| LW-DETR-xlarge | | 118.0 | 174.2 | 19.1 | 19.1 | 58.3(58.3) | Link |
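
The latency column can be converted into throughput, and the two mAP figures can be compared directly. The sketch below copies the numbers from the table and rests on two assumptions: the reported latency is per image, so images per second is 1000 divided by the latency in milliseconds, and the value in parentheses is the re-implementation mAP. The field names are made up for this example.

```python
# Total latency (ms) and mAP values copied from the results table.
RESULTS = {
    "LW-DETR-tiny":   {"total_latency_ms": 2.0,  "map_paper": 42.6, "map_reimpl": 42.9},
    "LW-DETR-small":  {"total_latency_ms": 2.9,  "map_paper": 48.0, "map_reimpl": 48.1},
    "LW-DETR-medium": {"total_latency_ms": 5.6,  "map_paper": 52.5, "map_reimpl": 52.6},
    "LW-DETR-large":  {"total_latency_ms": 8.8,  "map_paper": 56.1, "map_reimpl": 56.1},
    "LW-DETR-xlarge": {"total_latency_ms": 19.1, "map_paper": 58.3, "map_reimpl": 58.3},
}


def frames_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to images per second."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    for name, row in RESULTS.items():
        fps = frames_per_second(row["total_latency_ms"])
        delta = row["map_reimpl"] - row["map_paper"]
        print(f"{name}: {fps:.0f} images/s, mAP difference {delta:+.1f}")
```

For example, the 2.0 ms total latency of LW-DETR-tiny corresponds to 500 images per second under the per-image assumption.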

8. References

Our project is conducted based on the following public paper with code:

9. Citation

If you find this code useful in your research, please kindly consider citing our paper:

    @article{chen2024lw,
        title={LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection},
        author={Chen, Qiang and Su, Xiangbo and Zhang, Xinyu and Wang, Jian and Chen, Jiahui and Shen, Yunpeng and Han, Chuchu and Chen, Ziliang and Xu, Weixiang and Li, Fanrong and others},
        journal={arXiv preprint arXiv:2406.03459},
        year={2024}
    }
