Giter VIP home page Giter VIP logo

sparsebev's Introduction

SparseBEV

This is the official PyTorch implementation for paper SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos. (ICCV 2023)

中文解读:https://zhuanlan.zhihu.com/p/654821380

Model Zoo

Setting Pretrain Training Cost NDSval NDStest FPS Weights
r50_nuimg_704x256 nuImg 21h (8x2080Ti) 55.6 - 15.8 gdrive
r50_nuimg_704x256_400q_36ep nuImg 28h (8x2080Ti) 55.8 - 23.5 gdrive
r101_nuimg_1408x512 nuImg 2d8h (8xV100) 59.2 - 6.5 gdrive
vov99_dd3d_1600x640_trainval_future DD3D 4d1h (8xA100) 84.9 67.5 - gdrive
  • We use r50_nuimg_704x256 for ablation studies and r50_nuimg_704x256_400q_36ep for comparison with others.
  • We recommend using r50_nuimg_704x256 to validate new ideas since it trains faster and the result is more stable.
  • FPS is measured with AMD 5800X CPU and RTX 3090 GPU (without fp16).
  • The noise is around 0.3 NDS.

Environment

Install PyTorch 2.0 + CUDA 11.8:

conda create -n sparsebev python=3.8
conda activate sparsebev
conda install pytorch==2.0.0 torchvision==0.15.0 pytorch-cuda=11.8 -c pytorch -c nvidia

or PyTorch 1.10.2 + CUDA 10.2 for older GPUs:

conda create -n sparsebev python=3.8
conda activate sparsebev
conda install pytorch==1.10.2 torchvision==0.11.3 cudatoolkit=10.2 -c pytorch

Install other dependencies:

pip install openmim
mim install mmcv-full==1.6.0
mim install mmdet==2.28.2
mim install mmsegmentation==0.30.0
mim install mmdet3d==1.0.0rc6
pip install setuptools==59.5.0
pip install numpy==1.23.5

Install turbojpeg and pillow-simd to speed up data loading (optional but important):

sudo apt-get update
sudo apt-get install -y libturbojpeg
pip install pyturbojpeg
pip uninstall pillow
pip install pillow-simd==9.0.0.post1

Compile CUDA extensions:

cd models/csrc
python setup.py build_ext --inplace

Prepare Dataset

  1. Download nuScenes from https://www.nuscenes.org/nuscenes and put it in data/nuscenes.
  2. Download the generated info file from Google Drive and unzip it.
  3. Folder structure:
data/nuscenes
├── maps
├── nuscenes_infos_test_sweep.pkl
├── nuscenes_infos_train_sweep.pkl
├── nuscenes_infos_val_sweep.pkl
├── samples
├── sweeps
├── v1.0-test
└── v1.0-trainval

These *.pkl files can also be generated with our script: gen_sweep_info.py.

Training

Download pretrained weights and put it in directory pretrain/:

pretrain
├── cascade_mask_rcnn_r101_fpn_1x_nuim_20201024_134804-45215b1e.pth
├── cascade_mask_rcnn_r50_fpn_coco-20e_20e_nuim_20201009_124951-40963960.pth

Train SparseBEV with 8 GPUs:

torchrun --nproc_per_node 8 train.py --config configs/r50_nuimg_704x256.py

Train SparseBEV with 4 GPUs (i.e the last four GPUs):

export CUDA_VISIBLE_DEVICES=4,5,6,7
torchrun --nproc_per_node 4 train.py --config configs/r50_nuimg_704x256.py

The batch size for each GPU will be scaled automatically. So there is no need to modify the batch_size in config files.

Evaluation

Single-GPU evaluation:

export CUDA_VISIBLE_DEVICES=0
python val.py --config configs/r50_nuimg_704x256.py --weights checkpoints/r50_nuimg_704x256.pth

Multi-GPU evaluation:

export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
torchrun --nproc_per_node 8 val.py --config configs/r50_nuimg_704x256.py --weights checkpoints/r50_nuimg_704x256.pth

Timing

FPS is measured with a single GPU:

export CUDA_VISIBLE_DEVICES=0
python timing.py --config configs/r50_nuimg_704x256.py --weights checkpoints/r50_nuimg_704x256.pth

Acknowledgements

Many thanks to these excellent open-source projects:

sparsebev's People

Contributors

afterthat97 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.