
[IEEE T-PAMI] Awesome BEV perception research and cookbook for audiences of all levels in autonomous driving

Home Page: https://doi.org/10.1109/TPAMI.2023.3333838

License: Apache License 2.0

Python 54.94% Shell 0.05% Jupyter Notebook 44.91% C++ 0.04% Cuda 0.06%
autonomous-driving birds-eye-view camera-detection lidar-detection perception-algorithm

bevperception-survey-recipe's Introduction

Bird's-eye-view (BEV) Perception: A Survey and Collection

Awesome BEV perception papers and a toolbox for achieving state-of-the-art performance.

Table of Contents

Introduction

This repo is associated with the survey paper "Delving into the Devils of Bird’s-eye-view Perception: A Review, Evaluation and Recipe", which provides an up-to-date literature survey for BEV perception and an open-source BEV toolbox based on PyTorch. We also introduce the BEV algorithm family, including follow-up work on BEV perception such as VCD, GAPretrain, and FocalDistiller. We hope this repo can serve not only as a good starting point for newcomers but also as a reference for current researchers in the BEV perception community.

If you find work that should be cited below, email us or simply open a PR!

Major Features

  • Up-to-date Literature Survey for BEV Perception
    We summarize important BEV perception methods from recent years, covering different modalities (camera, LiDAR, fusion) and tasks (detection, segmentation, occupancy). More details of the survey paper list can be found here.

  • Convenient BEVPerception Toolbox
    We integrate a bag of tricks in the BEV toolbox that helped us achieve 1st place in the camera-based detection track of the Waymo Open Challenge 2022. The toolbox can be used independently or as a plug-in for popular deep learning libraries. Moreover, we provide a suitable playground for beginners in this area, including a hands-on tutorial and a small-scale dataset (1/5 of WOD in KITTI format) to validate ideas. More details can be found here.

  • SOTA BEV Knowledge Distillation Algorithms
    We include important follow-up works of BEVFormer/BEVDepth/SOLOFusion from the perspective of knowledge distillation (VCD, GAPretrain, FocalDistiller). More details of each paper can be found in its README.md file under here.

| Method | Expert | Apprentice |
| --- | --- | --- |
| VCD | Vision-centric multi-modal detector | Camera-only detector |
| GAPretrain | LiDAR-only detector | Camera-only detector |
| FocalDistiller | Camera-only detector | Camera-only detector |
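As background, the expert/apprentice pattern above can be sketched as a generic BEV feature-imitation loss. This is a hypothetical illustration (the function name `bev_distill_loss` and the foreground mask are my own), not the official VCD/GAPretrain/FocalDistiller code:

```python
import numpy as np

def bev_distill_loss(student_feat, teacher_feat, fg_mask=None):
    """Hypothetical sketch: mean-squared imitation loss between an
    apprentice (student) BEV feature map and an expert (teacher) one.
    fg_mask, if given, restricts the loss to foreground BEV cells,
    in the spirit of focal-style distillers."""
    diff = (student_feat - teacher_feat) ** 2
    if fg_mask is not None:
        diff = diff * fg_mask                       # zero out background cells
        return float(diff.sum() / max(fg_mask.sum(), 1.0))
    return float(diff.mean())
```

In practice the expert is frozen and only the apprentice receives gradients; the official papers differ in where the imitation is applied (features, depth, or responses).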

What's New

2023/11/04 Our survey was accepted by IEEE T-PAMI.

2023/10/26 A new paper VCD is coming soon with official implementation.

2023/09/06 We have a new version of the survey. Check it out!

2023/04/06 Two new papers GAPretrain and FocalDistiller are coming soon with official implementation.

2022/10/13 v0.1 was released.

  • Integrate some practical data augmentation methods for BEV camera-based 3D detection in the toolbox.
  • Offer a pipeline to process the Waymo dataset (camera-based 3D detection).
  • Release a baseline (with config) for the Waymo dataset, as well as 1/5 of the Waymo dataset in KITTI format.

Literature Survey

The general picture of BEV perception at a glance: it consists of three sub-parts based on the input modality. BEV perception is a general task built on top of a series of fundamental tasks. For completeness of the whole perception stack in autonomous driving, we list other topics as well. More details can be found in the survey paper.

We have summarized the important datasets and methods of recent years for BEV perception in academia, as well as the different roadmaps used in industry.

We have also summarized some conventional methods for different tasks.

BEV Toolbox

The BEV toolbox provides useful recipes for BEV camera-based 3D object detection, including solid data augmentation strategies, efficient BEV encoder design, loss function family, useful test-time augmentation, ensemble policy, and so on. Please refer to bev_toolbox/README.md for more details.
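As a flavor of what such augmentations must handle, here is a minimal sketch (not taken from the toolbox; the helper name is my own) of a scaling augmentation for camera-based 3D detection: when the image is resized, the camera intrinsics must be scaled consistently so that 3D-to-2D projection still lands on the right pixels.

```python
import numpy as np

def scale_intrinsics(K, scale):
    """Minimal sketch: adjust a 3x3 pinhole intrinsic matrix after
    resizing the image by `scale`. fx, skew, cx (row 0) and fy, cy
    (row 1) all scale linearly; the bottom row [0, 0, 1] is untouched."""
    K_new = K.copy().astype(float)
    K_new[0, :] *= scale   # fx, skew, cx
    K_new[1, :] *= scale   # fy, cy
    return K_new

# Example: halving the image resolution halves focal lengths and principal point.
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
K_half = scale_intrinsics(K, 0.5)
```

Forgetting this intrinsics update is a classic source of silently degraded 3D detection accuracy under image-level augmentation.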

BEV Knowledge Distillation Algorithms

The BEV algorithm family includes follow-up works of BEVFormer in different aspects, ranging from plug-and-play tricks to pre-training distillation. All paper summaries are under nuscenes_playground along with the official implementations; check them out!

License and Citation

This project is released under the Apache 2.0 license.

If you find this project useful in your research, please consider citing:

@article{li2022bevsurvey,
  author={Li, Hongyang and Sima, Chonghao and Dai, Jifeng and Wang, Wenhai and Lu, Lewei and Wang, Huijie and Zeng, Jia and Li, Zhiqi and Yang, Jiazhi and Deng, Hanming and Tian, Hao and Xie, Enze and Xie, Jiangwei and Chen, Li and Li, Tianyu and Li, Yang and Gao, Yulu and Jia, Xiaosong and Liu, Si and Shi, Jianping and Lin, Dahua and Qiao, Yu},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={Delving Into the Devils of Bird's-Eye-View Perception: A Review, Evaluation and Recipe}, 
  year={2023},
  volume={},
  number={},
  pages={1-20},
  doi={10.1109/TPAMI.2023.3333838}
}
@misc{bevtoolbox2022,
  title={{BEVPerceptionx-Survey-Recipe} toolbox for general BEV perception},
  author={BEV-Toolbox Contributors},
  howpublished={\url{https://github.com/OpenDriveLab/Birds-eye-view-Perception}},
  year={2022}
}

bevperception-survey-recipe's People

Contributors

chonghaosima, cyberknight42, devlinyan, eloiz, faikit, henryjunw, hli2020, increase24


bevperception-survey-recipe's Issues

Error when evaluating?

When I evaluate the model, an error occurred.
It seems that the version of waymo-open-dataset is not right?

File "/data/BEVPerception-Survey-Recipe/waymo_playground/projects/mmdet3d_plugin/datasets/waymo_datasetV2.py", line 850, in _build_waymo_metric_config
config.let_metric_config.enabled = True
AttributeError: let_metric_config
I use waymo-open-dataset-tf-2-6-0==1.4.1
I think the version should be changed to waymo-open-dataset-tf-2-6-0==1.4.9?

some training problem

Hi, authors. I trained with the waymo_mini_r101_baseline config, but my result is inconsistent with the reported results, especially on the LET-APH metric. Here is my result:

| LET-AP | LET-APH | LET-APL |
| --- | --- | --- |
| 0.4175 | 0.2109 | 0.2829 |

My questions are:

  1. During training, I found that grad_norm often becomes NaN. Is this normal?
  2. Could you provide the training log so that I can check the result?

Missing 'filter_waymo.txt' and 'waymo_calibs.pkl'

Thanks for your Waymo support of BEVFormer! I am missing some files: 'filter_waymo.txt' and 'waymo_calibs.pkl'. Are they included in the waymo_mini Google Drive, or do we need to generate them ourselves?

Question about the paper: BEVDet does not use LiDAR to supervise depth prediction

Hello, the paper Delving into the Devils of Bird’s-eye-view Perception: A Review, Evaluation and Recipe contains the following passage, which implies that BEVDet is follow-up work to CaDDN and uses LiDAR depth information to supervise the training of its depth prediction network. In fact, BEVDet does not use LiDAR point clouds to supervise depth prediction.

The main difference between LSS [57] and CaDDN [46] is that CaDDN uses depth ground truth to supervise its categorical depth distribution prediction, thus owing a superior depth network to extract 3D information from 2D space. This track comes subsequent work such as BEVDet [47] and its temporal version BEVDet4D [64], BEVDepth [49]

Missing Extrinsic and Intrinsic npy files for example data

I'm trying to run the example code for the bev-toolbox, but it appears that the .npy files for camera intrinsics and extrinsics are missing:

FileNotFoundError                         Traceback (most recent call last)
Cell In [7], line 11
      9 imgs = [cv2.imread(f'./example/cam{i}_img.jpg') for i in range(5)]
     10 # intrinsic parameters of cameras
---> 11 cam_intr = [np.load(f'./example/cam{i}_intrinsic.npy') for i in range(5)]
     12 # extrinsic parameters of cameras
     13 cam_extr = [np.load(f'./example/cam{i}_extrinsic.npy') for i in range(5)]

File C:\Anaconda\envs\camera\Lib\site-packages\numpy\lib\npyio.py:405, in load(file, mmap_mode, allow_pickle, fix_imports, encoding, max_header_size)
    403     own_fid = False
    404 else:
--> 405     fid = stack.enter_context(open(os_fspath(file), "rb"))
    406     own_fid = True
    408 # Code to distinguish from NumPy binary files and pickles.

FileNotFoundError: [Errno 2] No such file or directory: './example/cam0_intrinsic.npy'
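Until the example calibration files ship with the repo, one workaround is to generate placeholder files with the expected names so the notebook at least loads. This is a sketch only: the values below are dummies, not real calibration, and will not give meaningful projections.

```python
import os
import numpy as np

# Create dummy intrinsic (3x3) and extrinsic (4x4) files for the 5 cameras
# the example expects. Values are placeholders, NOT real calibration.
os.makedirs("./example", exist_ok=True)
for i in range(5):
    K = np.array([[1000.0, 0.0, 640.0],
                  [0.0, 1000.0, 360.0],
                  [0.0, 0.0, 1.0]])   # placeholder pinhole intrinsics
    E = np.eye(4)                     # placeholder camera-to-ego extrinsics
    np.save(f"./example/cam{i}_intrinsic.npy", K)
    np.save(f"./example/cam{i}_extrinsic.npy", E)
```

With these in place, the loading cell runs, but any geometry-dependent output downstream is meaningless until real calibration files are supplied.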

Bug when running inference with BEVFormer on Waymo

projects/mmdet3d_plugin/datasets/waymo_datasetV2.py", line 561, in format_waymo_results
    with open(f'./filter_waymo.txt', 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: './filter_waymo.txt'

How to unzip the waymo_mini dataset from Google Drive?

Hi,
thanks for the great work!
It seems that there is something wrong with the waymo_mini dataset on Google Drive.
When I download the data and use tar -zxvf ./waymo_mini.tar-015.gz00 to extract it, an error occurs:

gzip: stdin: unexpected end of file
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now

How can I solve this? The size of the data seems right.
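The file name suggests the archive was split into numbered parts, so extracting a single part alone hits an early EOF. A common fix (a sketch, assuming GNU tar/split semantics and that all parts were downloaded) is to concatenate every part into tar before extracting, demonstrated here on a small dummy archive:

```shell
# Build a tiny archive and split it, to simulate a multi-part download.
mkdir -p demo && echo "hello" > demo/a.txt
tar -czf demo.tar.gz demo
split -b 1k -d demo.tar.gz demo.tar.gz.part-

# The fix: stream ALL parts, in order, into a single tar invocation.
cat demo.tar.gz.part-* | tar -zxf - -C /tmp
```

For the real dataset this would look something like `cat waymo_mini.tar-*.gz* | tar -zxf -`, assuming the shell glob sorts the parts in the right order; verify the ordering for the actual file names before running it.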

Code of FocalDistiller

Hi! I am very interested in the FocalDistiller!

Do you have a plan to open-source the core code of FocalDistiller? Even just the loss code would be OK!

Looking forward to your reply. Thank you!

Code for generating the waymo_mini dataset

Hi, thanks for your great work. Could you share the code for generating the waymo_mini dataset, since downloading the data from Google Drive is very slow for me?

Pretrained model on Waymo

Hi, dear authors, does the repo provide Waymo (or Waymo mini) pretrained models? It looks like the pretrained models in the data preparation are nuScenes-based. I would like to do some validation on Waymo, thanks a lot.

FocalDistiller code

Could you tell me when you will release the source code of FocalDistiller?
Looking forward to your reply!
