Giter VIP home page Giter VIP logo

mv2d's Introduction

MV2D

This repo is the official PyTorch implementation for paper:
Object as Query: Lifting any 2D Object Detector to 3D Detection. Accepted by ICCV 2023.

We design Multi-View 2D Objects guided 3D Object Detector (MV2D), which can lift any 2D object detector to multi-view 3D object detection. Since 2D detections can provide valuable priors for object existence, MV2D exploits 2D detectors to generate object queries conditioned on the rich image semantics. These dynamically generated queries help MV2D to recall objects in the field of view and show a strong capability of localizing 3D objects. For the generated queries, we design a sparse cross attention module to force them to focus on the features of specific objects, which suppresses interference from noises.

Preparation

This implementation is built upon PETR, and can be constructed as the install.md.

  • Environments
    Linux, Python == 3.8.10, CUDA == 11.3, pytorch == 1.11.0, mmcv == 1.6.1, mmdet == 2.25.1, mmdet3d == 1.0.0, mmsegmentation == 0.28.0

  • Detection Data
    Follow the mmdet3d to process the nuScenes dataset (https://github.com/open-mmlab/mmdetection3d/blob/master/docs/en/data_preparation.md).

  • Pretrained weights
    We use nuImages pretrained weights from mmdetection3d. Download the pretrained weights and put them into weights/ directory.

  • After preparation, you will be able to see the following directory structure:

    MV2D
    ├── mmdetection3d
    ├── configs
    ├── mmdet3d_plugin
    ├── tools
    ├── data
    │   ├── nuscenes
    │     ├── ...
    ├── weights
    ├── README.md
    

Train & Inference

cd MV2D

You can train the model following:

bash tools/dist_train.sh configs/mv2d/exp/mv2d_r50_frcnn_two_frames_1408x512_ep24.py 8 

You can evaluate the model following:

bash tools/dist_test.sh configs/mv2d/exp/mv2d_r50_frcnn_two_frames_1408x512_ep24.py work_dirs/mv2d_r50_frcnn_two_frames_1408x512_ep24/latest.pth 8 --eval bbox

Main Results

config mAP NDS checkpoint
MV2D-T_R50_1408x512_ep72 0.453 0.543 download
MV2D-S_R50_1408x512_ep72 0.398 0.470 download

Acknowledgement

Many thanks to the authors of mmdetection3d and petr.

Citation

If you find this repo useful for your research, please cite

@article{wang2023object,
  title={Object as query: Equipping any 2d object detector with 3d detection ability},
  author={Wang, Zitian and Huang, Zehao and Fu, Jiahui and Wang, Naiyan and Liu, Si},
  journal={arXiv preprint arXiv:2301.02364},
  year={2023}
}

Contact

For questions about our paper or code, please contact Zitian Wang([email protected]).

mv2d's People

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.