NeRF and DietNeRF

Introduction

NeRF and DietNeRF implementation and some elaborations

This project was created as part of a "Tutorial-Work" at the Hebrew University of Jerusalem.
Please explore the PowerPoint presentation for more details »

View src code · NeRF Paper · DietNeRF Paper

Table of Contents

About The Project
- Built With
Getting Started
- Prerequisites
- Installation
Usage
Citations

About The Project

This project contains my implementation of, and some extensions to, the following papers:

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis

NeRF presents a method for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. The algorithm represents a scene using a fully connected deep network whose input is a single continuous 5D coordinate (spatial location (x, y, z) and viewing direction (θ, φ)) and whose output is the emitted color (R, G, B) and the volume density σ at that location.
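The per-point colors and densities become a pixel through the volume-rendering quadrature described in the NeRF paper. The following is a minimal NumPy sketch for a single ray; the function name and argument layout are illustrative assumptions and are not taken from this repository.

```python
import numpy as np

def composite_ray(colors, sigmas, deltas):
    """Composite samples along one ray into a pixel color (hypothetical helper).

    colors: (N, 3) RGB emitted at each sample
    sigmas: (N,)   volume density at each sample
    deltas: (N,)   distance between adjacent samples
    """
    colors = np.asarray(colors, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    deltas = np.asarray(deltas, dtype=float)
    # Opacity contributed by each segment of the ray.
    alphas = 1.0 - np.exp(-sigmas * deltas)
    # Transmittance: fraction of light that reaches sample i unabsorbed.
    transmittance = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = transmittance * alphas
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

A very dense first sample hides everything behind it, and a ray through empty space (all densities zero) contributes no color.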

DietNeRF introduces an auxiliary semantic consistency loss that encourages realistic rendering at novel poses. It improves fine details and the reconstruction of the scene.
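As a rough illustration of the semantic consistency idea: the DietNeRF paper compares embeddings of a rendered view and a reference image produced by a pre-trained CLIP image encoder. The sketch below assumes the embeddings are already available as plain vectors; the function name and the `weight` parameter are hypothetical, and the code in this repository may differ.

```python
import numpy as np

def semantic_consistency_loss(emb_rendered, emb_reference, weight=1.0):
    """weight * (1 - cosine similarity) between two image embeddings.

    For unit-length embeddings this equals (weight / 2) * ||a - b||^2.
    """
    a = np.asarray(emb_rendered, dtype=float)
    b = np.asarray(emb_reference, dtype=float)
    a = a / np.linalg.norm(a)  # normalize so only direction matters
    b = b / np.linalg.norm(b)
    return weight * (1.0 - float(np.dot(a, b)))
```

The loss is zero when the two embeddings point in the same direction and grows as they diverge, regardless of the camera pose the rendering came from.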

In addition to implementing the ideas presented in the papers, I tested the importance of the "Viewing Directions" input to the model, created visualizations in the form of plots and videos, and implemented some methods that helped me in the creation of those visualizations. For the calculations of the rotations in the videos, I used quaternions. For more information, please look at the Powerpoint presentation.
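To give a feel for how quaternions can drive a camera orbit, here is a small sketch in plain NumPy. It does not use the numpy-quaternion package that the project depends on, and all function names are illustrative assumptions rather than the project's API.

```python
import numpy as np

def axis_angle_quat(axis, angle):
    """Unit quaternion (w, x, y, z) for a rotation of `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    return np.concatenate([[np.cos(angle / 2.0)], np.sin(angle / 2.0) * axis])

def quat_mul(q, r):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    w1, x1, y1, z1 = q
    w2, x2, y2, z2 = r
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q via q * v * conj(q)."""
    q_conj = q * np.array([1.0, -1.0, -1.0, -1.0])
    v_quat = np.concatenate([[0.0], np.asarray(v, dtype=float)])
    return quat_mul(quat_mul(q, v_quat), q_conj)[1:]

# Eight camera positions on a circle of radius 4 around the z axis.
start = np.array([4.0, 0.0, 0.0])
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
orbit = [rotate(axis_angle_quat([0.0, 0.0, 1.0], a), start) for a in angles]
```

Each entry of `orbit` is a camera position for one video frame; the same quaternion can also be applied to the camera's orientation axes.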

Built With

The project is implemented in Python, using the TensorFlow machine learning library. This project is also GCP (Google Cloud Platform) ready. For GCP deployment, follow the Usage guidelines below.

(back to top)

Getting Started

Create a config_file.yaml, using an existing YAML file in the project as a template.
1. In the YAML file you specify which actions that execution should perform: training a network, rendering a video, creating a specific plot with a model, and more.
Put it in a directory called “config_files”.
Add the name of the config file to the “config_list” list in main.py.
Add the images and the file that holds the images' poses to the Assets directory.
Add the camera poses file to the same directory as the images:
1. If the dataset was created with Blender, add the “cam_data.json” file to the same directory (a script that renders the images and creates the JSON file can be found in the “DatasetUtils” directory).
2. If the dataset was created from real images, you can use COLMAP to produce “poses_bounds.npy”, in the same way as detailed in the LLFF (Local Light Field Fusion) project.
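Before starting a run, it can help to check that a dataset directory actually contains one of the two pose files named above. The helper below is a hypothetical convenience sketch and is not part of the project; only the two file names come from this README.

```python
from pathlib import Path

# Pose files this README mentions: Blender export first, then the LLFF/COLMAP one.
POSE_FILE_NAMES = ("cam_data.json", "poses_bounds.npy")

def find_pose_file(dataset_dir):
    """Return the path of the pose file found in dataset_dir (hypothetical helper)."""
    dataset_dir = Path(dataset_dir)
    for name in POSE_FILE_NAMES:
        candidate = dataset_dir / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(
        f"No pose file found in {dataset_dir}; expected one of {POSE_FILE_NAMES}"
    )
```

For example, `find_pose_file("Assets/my_scene")` (the directory name here is made up) returns the pose file path or raises an error that lists the expected file names.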

(back to top)

Prerequisites

It may be helpful to set up your environment with the supplied “Dockerfile” files. There is one Dockerfile for the basic environment, and one that wraps the project into an image ready to be deployed on GCP.

You can also set up your environment on your own, without the supplied Docker files.

These are the important packages that I used:

python3
numpy
numpy-quaternion
tensorflow-gpu
opencv-python
cloudpathlib[gs]

(back to top)

Installation

Clone the repo

git clone https://github.com/Sahar-E/NeRF-and-DietNeRF.git

Use Docker as the environment with the supplied Dockerfiles, or set up your own venv.

(back to top)

Usage

Create your configuration.yaml file and add it to main.py where the "TODO" points. You should start from a copy of an existing one in the project.
Run:
```
  python3 main.py
```

To use real images

If you wish to use real images, you will need to create the poses_bounds.npy file in the same way as detailed in the LLFF (Local Light Field Fusion) project. After that, the workflow is the same as using images with the cam_data.json file created by the Blender Python script in the project.
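For orientation, the LLFF project documents poses_bounds.npy as an N x 17 array: each row holds a flattened 3x5 matrix (a 3x4 camera-to-world pose plus a column with image height, width and focal length) followed by two depth bounds. A minimal sketch of splitting such an array, with a hypothetical function name that is not part of this project:

```python
import numpy as np

def split_poses_bounds(poses_bounds):
    """Split an LLFF-style (N, 17) array into poses, image info and depth bounds."""
    poses_bounds = np.asarray(poses_bounds, dtype=float)
    if poses_bounds.ndim != 2 or poses_bounds.shape[1] != 17:
        raise ValueError(f"expected shape (N, 17), got {poses_bounds.shape}")
    mats = poses_bounds[:, :15].reshape(-1, 3, 5)
    poses = mats[:, :, :4]          # (N, 3, 4) camera-to-world transforms
    hwf = mats[:, :, 4]             # (N, 3) image height, width, focal length
    bounds = poses_bounds[:, 15:]   # (N, 2) near and far depth bounds
    return poses, hwf, bounds

# Typical use, assuming the file sits next to the images:
# poses, hwf, bounds = split_poses_bounds(np.load("poses_bounds.npy"))
```

Checking the shapes this way is a quick sanity test that the COLMAP/LLFF export produced one pose per image.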

If you wish to use docker as the environment

Create an "environment" docker image using the Dockerfile in contain_dockerfile_for_base_environment directory.

If you wish to deploy the docker in the Google Cloud Platform

Create an "environment" docker image using the Dockerfile in contain_dockerfile_for_base_environment directory.
Configure the name in the Dockerfile at the root of the project, where the TODO note asks, and create the docker image.
Deploy the docker image in GCP.
Don't forget to set the Google Cloud Storage bucket in the configuration file so that the results can be synced into that bucket.

You can use the scripts in the DatasetUtils directory to create the Blender pictures, resize the images, and create JPGs from an MP4 video file:

blender_create_pictures.py

In the “DatasetUtils” directory you can find the script “blender_create_pictures.py”, which can be opened and run in Blender. This script moves the camera around the scene, renders the images, and writes the metadata they need, including the pose data, to a file called “cam_data.json”. The folders it creates are ready to be used by the project.

resize_images.py

Another script in “DatasetUtils” is “resize_images.py”, which resizes the images in a given directory to sizes that are convenient for testing the project, e.g. 50x50, 100x100, 256x256.

video2jpg.py

Also in “DatasetUtils”. Takes a video and breaks it up into JPG frames.

(back to top)

Citations

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, Ren Ng. ECCV, 2020. https://arxiv.org/pdf/2003.08934.pdf
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis. Ajay Jain, Matthew Tancik, Pieter Abbeel. ICCV, 2021. https://arxiv.org/pdf/2104.00677.pdf
Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines. Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, Abhishek Kar. ACM Transactions on Graphics (TOG), 2019. https://arxiv.org/pdf/1905.00889.pdf, https://github.com/Fyusion/LLFF
COLMAP - Pixelwise View Selection for Unstructured Multi-View Stereo. Johannes Lutz Schonberger, Enliang Zheng, Marc Pollefeys, Jan-Michael Frahm. ECCV, 2016.

(back to top)

Demo Videos

For more videos, please check out the PowerPoint presentation linked above. The videos there are 60 FPS. I recommend entering full-screen "Slide Show" mode so that the videos start automatically in the presentation.