OccNeRF

Project Page | Paper | Checkpoints & Videos

OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields

Chubin Zhang*, Juncheng Yan*, Yi Wei*, Jiaxin Li, Li Liu, Yansong Tang, Yueqi Duan, Jiwen Lu

Updates:

🔔 2023/12/15 Initial code and paper release.

🕹 Demos

The demos are large, so they may take a moment to load. If a demo fails to load or looks blurry, click its hyperlink to view the full-resolution raw video.

📝 Introduction

In this paper, we propose OccNeRF, a method for self-supervised multi-camera occupancy prediction. Unlike methods trained with bounded 3D occupancy labels, we must handle unbounded scenes because supervision comes from raw images. To address this, we parameterize the reconstructed occupancy fields and reorganize the sampling strategy. Neural rendering converts the occupancy fields to multi-camera depth maps, which are supervised by multi-frame photometric consistency. Moreover, for semantic occupancy prediction, we design several strategies to polish the prompts and filter the outputs of a pretrained open-vocabulary 2D segmentation model.
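To make the idea of parameterizing an unbounded scene more concrete, here is a minimal one-dimensional sketch. It assumes a contraction that is linear inside an inner range and compresses everything beyond it into a finite interval. The function name `contract`, the parameters `r_in` and `alpha`, and their default values are illustrative assumptions, not taken from this repository. The parameterization actually used by OccNeRF is defined in the paper and the code.

```python
def contract(r: float, r_in: float = 40.0, alpha: float = 0.5) -> float:
    """Map an unbounded coordinate r to the open interval (-1, 1).

    Illustrative assumption (not the repository's implementation):
      * |r| <= r_in is mapped linearly onto [-alpha, alpha];
      * |r| >  r_in is squeezed into the remaining range (alpha, 1).
    The constants a and b are chosen so that both the value and the
    slope of the two pieces agree at |r| == r_in.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    x = r / r_in                         # normalised coordinate
    if abs(x) <= 1.0:
        return alpha * x                 # inner region: uniform resolution
    a = (1.0 - alpha) ** 2 / alpha
    b = (1.0 - 2.0 * alpha) / alpha
    sign = 1.0 if x > 0 else -1.0
    return sign * (1.0 - a / (abs(x) + b))   # outer region: approaches +/-1
```

With such a mapping, a fixed-size voxel grid over (-1, 1) spends a fraction `alpha` of its resolution on the region near the vehicle and the remainder on everything farther away.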

💡 Method

Method Pipeline:

We first use a 2D backbone to extract multi-camera features, which are lifted to 3D space by interpolation to obtain volume features. The parameterized occupancy fields are then reconstructed to describe unbounded scenes. To obtain the rendered depth and semantic maps, we perform volume rendering with our reorganized sampling strategy. The multi-frame depths are supervised by a photometric loss. For semantic prediction, we adopt a pretrained Grounded-SAM with prompt cleaning. The green arrow indicates supervision signals.
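The step that turns a field into a depth map can be sketched for a single ray as follows. This uses the standard NeRF-style volume rendering weights and is only an illustration under that assumption. The function name `render_depth` and its arguments are hypothetical, and the repository's renderer (including its reorganized sampling strategy) may differ.

```python
import math
from typing import Sequence


def render_depth(sigmas: Sequence[float], t_vals: Sequence[float]) -> float:
    """Expected depth along one ray from per-sample densities.

    sigmas: non-negative density at each sample point.
    t_vals: strictly increasing sample distances from the camera.
    """
    if len(sigmas) != len(t_vals) or len(t_vals) < 2:
        raise ValueError("need at least two samples and one density per sample")
    depth = 0.0
    transmittance = 1.0  # probability that the ray reaches the current sample
    for i, sigma in enumerate(sigmas):
        # Interval to the next sample; the last sample reuses the previous interval.
        j = min(i, len(t_vals) - 2)
        delta = t_vals[j + 1] - t_vals[j]
        alpha = 1.0 - math.exp(-sigma * delta)      # opacity of this interval
        depth += transmittance * alpha * t_vals[i]  # weight times distance
        transmittance *= 1.0 - alpha
    return depth
```

A ray that passes through empty space and then hits a very dense sample returns approximately the distance of that sample. A ray that hits nothing returns 0 in this sketch because the weights are not normalised.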

🔧 Installation

Clone this repo and install the dependencies:

git clone --recurse-submodules https://github.com/LinShan-Bin/OccNeRF.git
cd OccNeRF
conda create -n occnerf python=3.8
conda activate occnerf
conda install pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install -r requirements.txt

Our code is tested with Python 3.8, PyTorch 1.9.1 and CUDA 11.3 and can be adapted to other versions of PyTorch and CUDA with minor modifications.
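After installation, a quick check of the installed package versions can help spot a mismatch with the tested setup. The sketch below uses only the standard library. The helper names `installed_versions` and `mismatches` are hypothetical, and it assumes the packages are registered under the distribution names `torch` and `torchvision`, which may not hold for every install method.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Optional

# Versions stated in this README; edit these if you adapt the code to another setup.
TESTED = {"torch": "1.9.1", "torchvision": "0.10.1"}


def installed_versions(packages: Iterable[str] = TESTED) -> Dict[str, Optional[str]]:
    """Return the installed version of each package, or None if it is not found."""
    found: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def mismatches(found: Dict[str, Optional[str]],
               tested: Dict[str, str] = TESTED) -> List[str]:
    """List packages that are missing or whose version does not start with the tested one."""
    # A prefix match tolerates local build suffixes such as "+cu111".
    return [name for name, want in tested.items()
            if not (found.get(name) or "").startswith(want)]
```

Calling `mismatches(installed_versions())` inside the `occnerf` environment should return an empty list when the versions match the ones above.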

🏗 Dataset Preparation

Download the full nuScenes v1.0 dataset from nuScenes and link the data folder to ./data/nuscenes/nuscenes/.
Download the ground truth occupancy labels from Occ3D and extract gts.tar.gz to ./data/nuscenes/gts. Note that we only use the 3D occupancy labels for validation.
Generate the ground truth depth maps for validation:
```
python tools/export_gt_depth_nusc.py
```
Download the dataset index pickle file from SurroundOcc and place nuscenes_infos_train.pkl under ./data/nuscenes/. Then generate the ground truth semantic maps:
```
cd GroundedSAM_OccNeRF
bash ./run.sh
```
Download the pretrained weights of our model from Checkpoints and move them to ./ckpts/.
Refer to README.md in ./GroundedSAM_OccNeRF/ and prepare semantic prediction results of the training dataset if you want to train OccNeRF with semantic supervision.
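Linking the downloaded nuScenes folder, as described in the first step above, can be done with a symbolic link. The following is a minimal sketch, assuming a POSIX-like system where symlinks are permitted. The function name `link_nuscenes` and the `download_dir` argument are hypothetical, and the location of your download is whatever you chose.

```python
from pathlib import Path
from typing import Union


def link_nuscenes(download_dir: Union[str, Path],
                  repo_root: Union[str, Path] = ".") -> Path:
    """Create data/nuscenes/nuscenes under repo_root as a symlink to download_dir."""
    target = Path(download_dir).expanduser().resolve()
    if not target.is_dir():
        raise FileNotFoundError(f"nuScenes folder not found: {target}")
    link = Path(repo_root) / "data" / "nuscenes" / "nuscenes"
    link.parent.mkdir(parents=True, exist_ok=True)
    if link.is_symlink() or link.exists():
        # Refuse to overwrite anything that is already there.
        raise FileExistsError(f"{link} already exists")
    link.symlink_to(target, target_is_directory=True)
    return link
```

Run it from the repository root, for example `link_nuscenes("/path/to/your/nuscenes")`, where the path is a placeholder for your own download location.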

The final folder structure should look like this:

OccNeRF/
├── ckpts/
│   ├── nusc-depth/
│   │   ├── encoder.pth
│   │   ├── depth.pth
│   ├── nusc-sem/
│   │   ├── encoder.pth
│   │   ├── depth.pth
├── data/
│   ├── nuscenes/
│   │   ├── nuscenes/
│   │   │   ├── maps/
│   │   │   ├── samples/
│   │   │   ├── sweeps/
│   │   │   ├── v1.0-trainval/
│   │   ├── gts/
│   │   ├── nuscenes_depth/
│   │   ├── nuscenes_semantic/
│   │   ├── nuscenes_infos_train.pkl
├── ...
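Before training or evaluating, it can be useful to confirm that the layout above is in place. The sketch below checks only the entries listed in the tree. The names `EXPECTED_PATHS` and `missing_paths` are hypothetical helpers, not part of the repository.

```python
from pathlib import Path
from typing import List, Union

# Entries copied from the folder structure shown above.
EXPECTED_PATHS = [
    "ckpts/nusc-depth/encoder.pth",
    "ckpts/nusc-depth/depth.pth",
    "ckpts/nusc-sem/encoder.pth",
    "ckpts/nusc-sem/depth.pth",
    "data/nuscenes/nuscenes/maps",
    "data/nuscenes/nuscenes/samples",
    "data/nuscenes/nuscenes/sweeps",
    "data/nuscenes/nuscenes/v1.0-trainval",
    "data/nuscenes/gts",
    "data/nuscenes/nuscenes_depth",
    "data/nuscenes/nuscenes_semantic",
    "data/nuscenes/nuscenes_infos_train.pkl",
]


def missing_paths(repo_root: Union[str, Path] = ".") -> List[str]:
    """Return the expected entries that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]
```

Calling `missing_paths()` from the repository root returns an empty list when everything is in place. Symlinked folders count as present as long as their targets exist.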

🚀 Quick Start

Training

Train OccNeRF without semantic supervision:

python -m torch.distributed.launch --nproc_per_node=8 run.py --config configs/nusc-depth.txt

Training the full model requires at least 80 GB of GPU memory. With less GPU memory (e.g., 40 GB), you can train with a single frame by setting auxiliary_frame = False in the config file; see Section 4.4 of the paper for the ablation study. Evaluation can be done with 24 GB of GPU memory.

Train OccNeRF with semantic supervision:

python -m torch.distributed.launch --nproc_per_node=8 run.py --config configs/nusc-sem.txt

Evaluation

Evaluate the depth estimation:

python -m torch.distributed.launch --nproc_per_node=8 run.py --config configs/nusc-depth.txt --eval_only --load_weights_folder ckpts/nusc-depth

Evaluate the occupancy prediction:

python -m torch.distributed.launch --nproc_per_node=8 run.py --config configs/nusc-sem.txt --eval_only --load_weights_folder ckpts/nusc-sem
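The training and evaluation commands above differ only in the config file and the evaluation flags, so they can be assembled programmatically, for instance when scripting runs on a machine with a different number of GPUs. This is a small sketch that uses only the flags shown above. The helper name `launch_command` is hypothetical, and whether other settings need adjusting when `nproc_per_node` changes is not covered here.

```python
import sys
from typing import List, Optional


def launch_command(config: str,
                   nproc_per_node: int = 8,
                   weights_folder: Optional[str] = None) -> List[str]:
    """Build the launch command; passing weights_folder switches to evaluation."""
    cmd = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={nproc_per_node}",
        "run.py", "--config", config,
    ]
    if weights_folder is not None:
        cmd += ["--eval_only", "--load_weights_folder", weights_folder]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, for example `launch_command("configs/nusc-depth.txt", weights_folder="ckpts/nusc-depth")` for depth evaluation.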

Visualization

Visualize the depth estimation:

python tools/export_vis_data.py  # You can modify this file to choose scenes you want to visualize. Otherwise, all validation scenes will be visualized.
python -m torch.distributed.launch --nproc_per_node=8 run_vis.py --config configs/nusc-depth.txt --load_weights_folder ckpts/nusc-depth --log_dir your_log_dir
python gen_scene_video.py scene_folder_generated_by_the_above_command

🙏 Acknowledgement

Many thanks to these excellent projects:

📃 Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry.

@article{chubin2023occnerf, 
      title   = {OccNeRF: Self-Supervised Multi-Camera Occupancy Prediction with Neural Radiance Fields}, 
      author  = {Chubin Zhang and Juncheng Yan and Yi Wei and Jiaxin Li and Li Liu and Yansong Tang and Yueqi Duan and Jiwen Lu},
      journal = {arXiv preprint arXiv:2312.09243},
      year    = {2023}
}
