
We track weekly NeRF papers and classify them. All previously published NeRF papers have been added to the list. We provide both an English version and a Chinese version. We welcome contributions and corrections via PR.

We also provide an Excel version (the metadata) of all NeRF papers; you can add your own comments or build your own paper-analysis tools on top of the structured metadata.

[CVPR22Oral] Block-NeRF: Scalable Large Scene Neural View Synthesis


1. Introduction

Block-NeRF builds the largest neural scene representation to date, capable of rendering an entire neighborhood of San Francisco. The abstract of the Block-NeRF paper is as follows:

We present Block-NeRF, a variant of Neural Radiance Fields that can represent large-scale environments. Specifically, we demonstrate that when scaling NeRF to render city-scale scenes spanning multiple blocks, it is vital to decompose the scene into individually trained NeRFs. This decomposition decouples rendering time from scene size, enables rendering to scale to arbitrarily large environments, and allows per-block updates of the environment. We adopt several architectural changes to make NeRF robust to data captured over months under different environmental conditions. We add appearance embeddings, learned pose refinement, and controllable exposure to each individual NeRF, and introduce a procedure for aligning appearance between adjacent NeRFs so that they can be seamlessly combined. We build a grid of Block-NeRFs from 2.8 million images to create the largest neural scene representation to date, capable of rendering an entire neighborhood of San Francisco.
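The abstract mentions per-NeRF appearance embeddings and controllable exposure. As a rough illustration of how such conditioning can enter one block's network, here is a minimal PyTorch sketch; the class name, layer sizes, and encodings are illustrative assumptions, not this repo's or the paper's exact architecture.

import torch
import torch.nn as nn

class BlockNeRFSketch(nn.Module):
    """One block's NeRF MLP, conditioned on appearance and exposure (sketch)."""

    def __init__(self, num_images, pos_dim=63, dir_dim=27, emb_dim=32, hidden=256):
        super().__init__()
        # One learned appearance embedding per training image (assumed size).
        self.appearance = nn.Embedding(num_images, emb_dim)
        # Density trunk: positionally encoded xyz -> feature.
        self.trunk = nn.Sequential(
            nn.Linear(pos_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)
        # Color head sees the feature, the encoded view direction, the
        # appearance embedding, and a scalar exposure value.
        self.color_head = nn.Sequential(
            nn.Linear(hidden + dir_dim + emb_dim + 1, hidden // 2), nn.ReLU(),
            nn.Linear(hidden // 2, 3), nn.Sigmoid(),
        )

    def forward(self, x_enc, d_enc, image_ids, exposure):
        h = self.trunk(x_enc)                       # (N, hidden)
        sigma = torch.relu(self.sigma_head(h))      # (N, 1) density
        emb = self.appearance(image_ids)            # (N, emb_dim)
        rgb = self.color_head(torch.cat([h, d_enc, emb, exposure], dim=-1))
        return rgb, sigma                           # (N, 3), (N, 1)

Learned pose refinement, the third change the abstract lists, is omitted here; it jointly optimizes small per-image corrections to the camera poses alongside the network.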

The official results of Block-NeRF:

demo_block_nerf.mp4

This project is an unofficial implementation of Block-NeRF. You can expect the following results from this repository:

  1. Large-scale NeRF training. The current results are as follows:
building-demo.mp4
  2. SOTA custom scenes. Reconstruct SOTA NeRFs from your own collected photos. Here is a reconstructed video of my workstation:
sm01_04.mp4
  3. Google Colab support. Run trained Block-NeRF on Google Colab with detailed visualizations (not finished yet):

Open In Colab

Other features of this project include:

  • PyTorch implementation. The official Block-NeRF paper uses TensorFlow and requires TPUs, whereas this implementation only needs PyTorch.

  • GPU efficient. We ensure that almost all our experiments can be carried out on eight NVIDIA 2080Ti GPUs.

  • Quick download. We host many datasets on Google Drive so that downloading becomes much faster.

  • Uniform data format. The original Block-NeRF paper requires downloading tons of data from Google Cloud Platform. This repo provides processed data and convenient scripts. We provide a uniform data format that suits many large-scale neural-field datasets.

  • State-of-the-art performance. This project produces state-of-the-art rendering quality with better efficiency.

  • Quick validation. We provide quick validation tools to evaluate your ideas so that you don't need to train on the full Block-NeRF dataset.

  • Open research. Along with this project, we aim to develop a strong community working on this topic. We welcome you to join us (if you use WeChat, feel free to add me: ytc407). The contributors of this project are listed at the bottom of this page!

  • Chinese community. We will host regular Chinese tutorials and provide hands-on videos on general NeRF and on building custom NeRFs in the wild and in the city. Feel free to add me on WeChat.

Please star and watch this project if you find it helpful. Thank you very much!

2. News

  • [2022.8.24] Support the full Mega-NeRF pipeline.
  • [2022.8.18] Support all previous papers in weekly classified NeRF.
  • [2022.8.17] Support classification in weekly NeRF.
  • [2022.8.16] Support evaluation scripts and data format standard. Getting some results.
  • [2022.8.13] Add estimated camera pose and release a better dataset.
  • [2022.8.12] Add weekly NeRF functions.
  • [2022.8.8] Add the NeRF reconstruction code and doc for custom purposes.
  • [2022.7.28] The data preprocess script is finished.
  • [2022.7.20] This project started!

3. Installation

Expand / collapse installation steps.
  1. Create conda environment.
    conda create -n nerf-block python=3.9
  2. Install TensorFlow and other libs. You don't need to install TensorFlow if you download our processed data. Our version: TensorFlow with CUDA 11.7.
    pip install --upgrade pip
    pip install tensorflow opencv-python matplotlib configargparse
  3. Install other libs used for reconstructing custom scenes; this is only needed if you want to build your own scenes.
    sudo apt-get install colmap
    sudo apt-get install imagemagick  # requires sudo access
    pip install -r requirements.txt
    conda install pytorch-scatter -c pyg  # or install via https://github.com/rusty1s/pytorch_scatter
    You can use a desktop (GUI) version of COLMAP as well if you do not have sudo access on your server. However, we found that if COLMAP parameters are not set up properly, you will not get SOTA performance. A quick environment sanity check is sketched after these steps.
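After installation, a quick check like the one below can rule out GPU and dependency issues early; it only uses packages installed in the steps above.

import torch

print("PyTorch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())

try:
    # installed via `conda install pytorch-scatter -c pyg` above
    import torch_scatter
    print("torch-scatter:", torch_scatter.__version__)
except ImportError:
    print("torch-scatter is missing; install it before training.")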

4. Large-scale NeRF on the public datasets

Click the following sub-section titles to expand / collapse steps.

Note that we provide useful debugging commands in many scripts. Debug commands require only a single GPU and may run more slowly than the standard commands. Use the standard commands for running experiments and comparisons. A sample bash file looks like this:

# arguments
ARGUMENTS HERE  # we provide sample arguments with explanations and options here
# for debugging, uncomment the following line:
# DEBUG COMMAND HERE
# for standard training, keep the following line (comment it out when debugging):
STANDARD TRAINING COMMAND HERE
4.1 Download processed data.

What you should know before downloading the data:

(1) You don't need these steps if you only want results on your custom data (in other words, you can go directly to Section 5), but we recommend running on the public datasets first.

(2) Disclaimer: you should ensure that you have permission from the original data provider. You must first sign the license on the official Waymo website to get permission to download the Waymo data. Other data should likewise be downloaded and used in accordance with its original license.

(3) Our processed Waymo data is significantly smaller than the original version (19.1 GB vs. 191 GB) because we store camera poses instead of raw ray directions. Our processed data is also friendlier to PyTorch dataloaders.
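Storing poses is enough because per-pixel rays can be regenerated on the fly from each camera-to-world matrix and the intrinsics. A minimal sketch, assuming a standard pinhole camera with an OpenGL-style convention (-z forward); the repo's actual dataloader may differ:

import numpy as np

def get_rays(H, W, focal, c2w):
    """Per-pixel ray origins and directions in world space.

    H, W  : image size in pixels
    focal : focal length in pixels
    c2w   : (3, 4) camera-to-world matrix [R | t]
    """
    i, j = np.meshgrid(np.arange(W, dtype=np.float32),
                       np.arange(H, dtype=np.float32), indexing="xy")
    # Camera-space directions: x right, y up, -z forward (assumed convention).
    dirs = np.stack([(i - 0.5 * W) / focal,
                     -(j - 0.5 * H) / focal,
                     -np.ones_like(i)], axis=-1)
    rays_d = dirs @ c2w[:3, :3].T                        # rotate to world space
    rays_o = np.broadcast_to(c2w[:3, 3], rays_d.shape)   # camera center, repeated
    return rays_o, rays_d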

You can download and preprocess all of the data and pretrained models via the following commands:

bash data_preprocess/download_waymo.sh  # download the Waymo dataset
bash data_preprocess/download_mega.sh   # download the Mega-NeRF datasets from the CMU server; the total size is around 31 GB

(Optional) You may also download the Mega-NeRF datasets (the same data as downloaded by "download_mega.sh") from our Google Drive. You can download selected data from the following table:

| Dataset name | Images & poses | Masks | Pretrained models |
| --- | --- | --- | --- |
| Waymo | waymo_image_poses | Not ready | Not ready |
| Building | building-pixsfm | building-pixsfm-grid-8 | building-pixsfm-8.pt |
| Rubble | rubble-pixsfm | rubble-pixsfm-grid-8 | rubble-pixsfm-8.pt |
| Quad | ArtsQuad_dataset - quad-pixsfm | quad-pixsfm-grid-8 | quad-pixsfm-8.pt |
| Residence | UrbanScene3D - residence-pixsfm | residence-pixsfm-grid-8 | residence-pixsfm-8.pt |
| Sci-Art | UrbanScene3D - sci-art-pixsfm | sci-art-pixsfm-grid-25 | sci-art-pixsfm-25-w-512.pt |
| Campus | UrbanScene3D - campus | campus-pixsfm-grid-8 | campus-pixsfm-8.pt |

The data structures follow the Mega-NeRF standards. We provide detailed explanations with examples for each data structure in this doc. After downloading the data, unzip the files and create the folder layout via the following command:

bash data_preprocess/process_mega.sh

If you are interested in processing the raw Waymo data on your own, please refer to this doc.

4.2 Run pretrained models.

We recommend evaluating the pretrained models before training your own. This lets you quickly see the results of our provided models and rule out many environment issues. Run the following script to evaluate the pre-trained models:

bash scripts/eval_trained_models.sh
# The rendered images would be placed under ${EXP_FOLDER}, which is set to data/mega/${DATASET_NAME}/exp_logs by default.

A sample output log from running this script can be found at docs/sample_logs/eval_trained_models.txt.

4.3 Generate masks.

Why should we generate masks? (1) Masks let us convert camera poses + images into ray-based data; this way we can download the raw datasets quickly and also train quickly. (2) Masks help us manage the boundaries of rays.
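To make the mask idea concrete, here is a miniature, hedged sketch: each pixel's ray is assigned to the spatially nearest block centroid, and block k then trains only on pixels whose mask value is k. Using a single representative depth t is a simplification; a Mega-NeRF-style implementation considers many sample points along each ray.

import numpy as np

def cluster_mask(rays_o, rays_d, centroids, t=50.0):
    """Assign each ray to its nearest block centroid.

    rays_o, rays_d : (H, W, 3) ray origins and directions
    centroids      : (K, 3) block centers
    t              : representative depth along each ray (assumed)
    """
    pts = rays_o + t * rays_d                              # (H, W, 3) points
    d2 = ((pts[..., None, :] - centroids) ** 2).sum(-1)    # (H, W, K) sq. dists
    return d2.argmin(-1)                                   # (H, W) block index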

Run the following script to create masks:

bash scripts/create_cluster_mask.sh
# The output would be placed under the ${MASK_PATH}, which is set to data/mega/${DATASET_NAME}/building-pixsfm-grid-8 by default.

A sample output log from running this script can be found at docs/sample_logs/create_cluster_mask.txt. The middle part of the log has been trimmed to save space.

4.4 Train sub-modules.

Run the following command to train a sub-module (one block):

bash scripts/train_sub_modules.sh SUBMODULE_INDEX # SUBMODULE_INDEX is the index of the submodule.

A sample output log from running this script can be found at docs/sample_logs/create_cluster_mask.txt. You can also train multiple modules simultaneously via parscript to launch all the training procedures at once. I personally don't use parscript but launch all the required modules with slurm scripts. The training time without multi-processing is around one day.
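If you use neither parscript nor slurm, a plain Python launcher over the script above also works. The submodule count and one-GPU-per-job mapping below are assumptions; adjust them to your grid size and hardware.

import os
import subprocess

NUM_SUBMODULES = 8  # assumed; match your grid (e.g. *-grid-8)

procs = []
for idx in range(NUM_SUBMODULES):
    # Pin each training job to its own GPU (eight 2080Ti cards assumed).
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(idx % 8))
    procs.append(subprocess.Popen(
        ["bash", "scripts/train_sub_modules.sh", str(idx)], env=env))

for p in procs:
    p.wait()  # block until every sub-module finishes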

4.5 Merge modules.

Run the following command to merge the trained modules into a unified model:

bash scripts/merge_sub_modules.sh

After that, you can go back to Section 4.2 to evaluate your trained modules. The sample log can be found at docs/sample_logs/merge_sub_modules.txt.
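For intuition about what the merged model does at render time: the Block-NeRF paper combines the outputs of nearby blocks with inverse-distance weighting between the camera and the block centers. A minimal sketch of that weighting (the exponent and function names are assumptions, not this repo's merge code):

import numpy as np

def blend_blocks(colors, cam_pos, centers, p=4.0, eps=1e-8):
    """Blend per-block renderings by inverse distance to block centers.

    colors  : (K, 3) color predicted by each visible block
    cam_pos : (3,) camera position
    centers : (K, 3) block centers
    """
    dist = np.linalg.norm(centers - cam_pos, axis=-1)  # (K,) camera-to-center
    w = 1.0 / np.maximum(dist, eps) ** p               # inverse-distance weights
    w /= w.sum()                                       # normalize to 1
    return (w[:, None] * colors).sum(axis=0)           # blended RGB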

5. Build your custom large-scale NeRF

Expand / collapse steps for building a custom NeRF world.
  1. Put your images under the data folder. The structure should look like this:

    data
    └── Madoka            // Your folder name here.
        └── source        // Source images should be put here.
            ├── 1.png
            ├── 2.png
            └── ...

    Sample data is provided in our Google Drive folder. The Madoka and Otobai scenes can be found at this link.

  2. Run COLMAP to reconstruct the scene. This will probably take a long time.

    python tools/imgs2poses.py data/Madoka

    You can replace data/Madoka with your own data folder. If your COLMAP version is newer than 3.6 (which should not happen if you installed via apt-get), you need to change export_path to output_path on line 67 of colmap_wrapper.py. A quick sanity check on the recovered poses is sketched after this list.

  3. Train the NeRF scene.

    python run.py --config configs/custom/Madoka.py

    You can replace configs/custom/Madoka.py with other configs.

  4. Validate the training results and generate a fly-through video.

    python run.py --config configs/custom/Madoka.py --render_only --render_video --render_video_factor 8
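Before training on your scene, you can sanity-check the COLMAP step mentioned above. Assuming imgs2poses.py follows the LLFF convention (colmap_wrapper.py suggests this, but it is an assumption), it writes poses_bounds.npy with one 17-number row per registered image:

import numpy as np

pb = np.load("data/Madoka/poses_bounds.npy")   # shape (N, 17), assumed LLFF layout
print("registered images:", pb.shape[0])
poses = pb[:, :15].reshape(-1, 3, 5)           # per image: 3x5 matrix [R | t | hwf]
bounds = pb[:, 15:]                            # per image: near/far depth bounds
print("height, width, focal:", poses[0, :, 4])
print("depth range:", bounds.min(), "to", bounds.max())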

6. Citations & acknowledgements

You may cite this repo to help convince reviewers of the reproducibility of your paper. If this repo helps you, please cite it as:

@software{Zhao_PytorchBlockNeRF_2022,
author = {Zhao, Zelin and Jia, Jiaya},
month = {8},
title = {{PytorchBlockNeRF}},
url = {https://github.com/dvlab-research/BlockNeRFPytorch},
version = {0.0.1},
year = {2022}
}

The original Block-NeRF and Mega-NeRF papers can be cited as:

 @InProceedings{Tancik_2022_CVPR,
    author    = {Tancik, Matthew and Casser, Vincent and Yan, Xinchen and Pradhan, Sabeek and Mildenhall, Ben and Srinivasan, Pratul P. and Barron, Jonathan T. and Kretzschmar, Henrik},
    title     = {Block-NeRF: Scalable Large Scene Neural View Synthesis},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2022},
    pages     = {8248-8258}
}

@inproceedings{turki2022mega,
  title={Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs},
  author={Turki, Haithem and Ramanan, Deva and Satyanarayanan, Mahadev},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={12922--12931},
  year={2022}
}

We build on code and data from DVGO, Mega-NeRF, and SVOX2; thanks for their great work!

Contributors ✨

Thanks goes to these wonderful people (emoji key):


Zelin Zhao

💻 🚧

EZ-Yang

💻

This project follows the all-contributors specification. Contributions of any kind are welcome!
