S3Gaussian: Self-Supervised Street Gaussians for Autonomous Driving

Nan Huang*, Xiaobao Wei, Wenzhao Zheng$^\dagger$, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang$^\ddagger$

* Work done while interning at UC Berkeley $\dagger$ Project leader $\ddagger$ Corresponding author

S3Gaussian uses 3D Gaussians to model dynamic scenes for autonomous driving without additional supervision (e.g., 3D bounding boxes).

News

  • [2024/5/31] Training & evaluation code release!
  • [2024/5/31] Paper released on arXiv.

Overview

To tackle the challenges of self-supervised street scene decomposition, we propose a multi-resolution hexplane-based encoder that encodes the 4D grid into feature planes, and a multi-head Gaussian decoder that decodes them into deformed 4D Gaussians. We optimize the whole model in a self-supervised manner, without extra annotations, and achieve superior scene decomposition and rendering quality.
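To make the encoder idea concrete, here is a minimal pure-Python sketch of a multi-resolution hexplane lookup. It is an illustration under stated assumptions, not the repository's implementation: all function names are hypothetical, the planes are filled with random values in place of learned parameters, and the fusion rule (element-wise product across the six planes, concatenation across resolution levels) follows the common HexPlane/K-Planes convention rather than anything confirmed by this README. The multi-head Gaussian decoder is not shown.

```python
import random
from itertools import combinations

# The six axis pairs over (x, y, z, t): xy, xz, xt, yz, yt, zt.
PLANE_PAIRS = list(combinations(range(4), 2))


def make_plane(res, dim, rng):
    """Create a res x res grid of dim-dimensional feature vectors (res >= 2)."""
    return [[[rng.uniform(0.5, 1.5) for _ in range(dim)]
             for _ in range(res)] for _ in range(res)]


def build_levels(resolutions, dim, seed=0):
    """Build one set of six planes per resolution level (random stand-ins)."""
    rng = random.Random(seed)
    return [[make_plane(r, dim, rng) for _ in PLANE_PAIRS] for r in resolutions]


def bilinear(plane, u, v):
    """Bilinearly interpolate a plane at continuous coordinates u, v in [0, 1]."""
    res = len(plane)
    fu, fv = u * (res - 1), v * (res - 1)
    i0, j0 = min(int(fu), res - 2), min(int(fv), res - 2)
    du, dv = fu - i0, fv - j0
    dim = len(plane[0][0])
    return [
        (1 - du) * (1 - dv) * plane[i0][j0][k]
        + du * (1 - dv) * plane[i0 + 1][j0][k]
        + (1 - du) * dv * plane[i0][j0 + 1][k]
        + du * dv * plane[i0 + 1][j0 + 1][k]
        for k in range(dim)
    ]


def hexplane_features(levels, point):
    """Return the concatenated per-level features for point = (x, y, z, t).

    Coordinates are assumed to be normalized to [0, 1]. Within a level the
    six plane features are fused by element-wise product (an assumption).
    """
    out = []
    for planes in levels:
        fused = None
        for (a, b), plane in zip(PLANE_PAIRS, planes):
            feat = bilinear(plane, point[a], point[b])
            fused = feat if fused is None else [f * g for f, g in zip(fused, feat)]
        out.extend(fused)
    return out


levels = build_levels([4, 8], dim=3)
features = hexplane_features(levels, (0.2, 0.5, 0.7, 0.1))
print(len(features))  # two levels x three channels
```

In a real model, a feature vector like this would be passed to the decoder heads, which predict how each Gaussian deforms over time.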

Getting Started

Environment Setup

The code was developed on Ubuntu 22.04 with Python 3.9 and pytorch=1.13.1+cu116, and was also tested with pytorch=2.2.1+cu118. We recommend using conda to install the dependencies.

git clone https://github.com/nnanhuang/S3Gaussian.git --recursive
cd S3Gaussian
conda create -n S3Gaussian python=3.9 
conda activate S3Gaussian

pip install -r requirements.txt
pip install -e submodules/depth-diff-gaussian-rasterization
pip install -e submodules/simple-knn
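
After installation, a quick check like the following can confirm which Python, PyTorch, and CUDA versions the environment actually provides. This is a minimal sketch; the helper name `describe_environment` is hypothetical and not part of the repository.

```python
import importlib
import sys


def describe_environment():
    """Collect the Python version and, if installed, the PyTorch and CUDA versions."""
    info = {
        "python": "{}.{}".format(*sys.version_info[:2]),
        "torch": None,
        "cuda": None,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in this environment.
        return info
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None for CPU-only builds
    return info


print(describe_environment())
```

Compare the reported values with the versions listed above (Python 3.9 and a CUDA-enabled PyTorch build) before building the submodules.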

Preparing Dataset

Follow the detailed instructions in Prepare Dataset.

Training

To train the first clip (e.g., frames 0-50), run:

python train.py -s $data_dir --port 6017 --expname "waymo" --model_path $model_path 

To try novel view synthesis, add the following option to the command:

--configs "arguments/nvs.py"

To train the next clip (e.g., frames 51-100), run:

python train.py -s $data_dir --port 6017 --expname "waymo" --model_path $model_path \
--prior_checkpoint "$prior_dir/chkpnt_fine_50000.pth"
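
When a sequence has several clips, the two commands above can be chained so that each clip starts from the previous clip's final checkpoint. The sketch below only assembles the command lines from the flags shown above. The helper names, the example paths, and the assumption that every clip ends at iteration 50000 are illustrative, and each clip's checkpoint directory is passed in explicitly because this README does not say where `$prior_dir` points.

```python
import shlex


def build_train_command(data_dir, model_path, expname="waymo", port=6017,
                        prior_checkpoint=None, configs=None):
    """Assemble the argument list for one train.py run."""
    cmd = ["python", "train.py", "-s", data_dir, "--port", str(port),
           "--expname", expname, "--model_path", model_path]
    if configs is not None:
        cmd += ["--configs", configs]
    if prior_checkpoint is not None:
        cmd += ["--prior_checkpoint", prior_checkpoint]
    return cmd


def chain_clips(clips, final_iteration=50000):
    """Return one command per clip, feeding each final checkpoint to the next clip.

    `clips` is a list of (data_dir, model_path, ckpt_dir) tuples in temporal
    order, where ckpt_dir is the directory holding that clip's checkpoints.
    """
    commands = []
    prior = None
    for data_dir, model_path, ckpt_dir in clips:
        commands.append(
            build_train_command(data_dir, model_path, prior_checkpoint=prior))
        prior = f"{ckpt_dir}/chkpnt_fine_{final_iteration}.pth"
    return commands


clips = [
    ("data/clip_000_050", "output/clip_000_050", "ckpt/clip_000_050"),
    ("data/clip_051_100", "output/clip_051_100", "ckpt/clip_051_100"),
]
for command in chain_clips(clips):
    print(shlex.join(command))
```

Each printed line can be run as is, or the argument lists can be passed to `subprocess.run` one after another so that a clip only starts once the previous one has finished.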

Also, you can load an existing checkpoint with:

python train.py -s $data_dir --port 6017 --expname "waymo" --start_checkpoint "$ckpt_dir/chkpnt_fine_30000.pth"
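
If a checkpoint directory holds several `chkpnt_fine_<iteration>.pth` files, a small helper can pick the most recent one to pass to `--start_checkpoint`. This is a sketch that assumes the file naming seen in the commands above; `latest_fine_checkpoint` is a hypothetical name, not a function from the repository.

```python
import re
from pathlib import Path

CKPT_PATTERN = re.compile(r"chkpnt_fine_(\d+)\.pth$")


def latest_fine_checkpoint(ckpt_dir):
    """Return the chkpnt_fine_<iteration>.pth path with the highest iteration, or None."""
    best = None
    best_iter = -1
    for path in Path(ckpt_dir).glob("chkpnt_fine_*.pth"):
        match = CKPT_PATTERN.search(path.name)
        if match and int(match.group(1)) > best_iter:
            best_iter = int(match.group(1))
            best = path
    return best
```

Iterations are compared as integers, so `chkpnt_fine_50000.pth` ranks above `chkpnt_fine_9000.pth`, which a plain alphabetical sort would get wrong.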

For more script examples, please check here.

Evaluation and Visualization

You can evaluate and visualize a checkpoint as follows:

python train.py -s $data_dir --port 6017 --expname "waymo" --start_checkpoint "$ckpt_dir/chkpnt_fine_50000.pth" \
--eval_only

This produces rendered RGB videos, ground-truth RGB videos, depth videos, and separate dynamic and static RGB videos.

Acknowledgments

Credits to @Korace0v0 for building 3D Gaussians for street scenes. Many thanks!

Our code is based on 4D Gaussians and EmerNeRF.

Thanks to these excellent open-source repositories!

Citation

If you find this project helpful, please consider citing the following paper:

@article{huang2024s3gaussian,
  title={S3Gaussian: Self-Supervised Street Gaussians for Autonomous Driving},
  author={Huang, Nan and Wei, Xiaobao and Zheng, Wenzhao and An, Pengju and Lu, Ming and Zhan, Wei and Tomizuka, Masayoshi and Keutzer, Kurt and Zhang, Shanghang},
  journal={arXiv preprint arXiv:2405.20323},
  year={2024}
}
