This codebase contains the official code for the ObSuRF model by Karl Stelzner, Kristian Kersting, and Adam R. Kosiorek.
The dependencies for the model may be installed using the following commands, starting from a recent (Python 3.7 or newer) Anaconda installation:

```
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
pip install tensorboardx opencv-python pyyaml imageio matplotlib tqdm
```
To download the datasets, check the project page. The unpacked data should be placed (or symlinked) in the `./data/` directory.
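For instance, a symlink keeps the downloads out of the repository; the source path below is a placeholder for wherever you unpacked the data:

```shell
# Link the unpacked data into place. Replace /path/to/unpacked_data with
# your actual download location (placeholder, not a real path).
ln -sfn /path/to/unpacked_data ./data
```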
The model is run by executing `python train.py runs/[dataset]/[model]/config.yaml`. Visualizations, logs, and checkpoints are stored in the directory of the specified config file.
Novel view sequences may be obtained from a trained model via `render.py`, e.g. `python render.py runs/[dataset]/[model]/config.yaml --sceneid 0`. Sequences may be compiled into videos using `compile_video.py`.
The model was trained using 4 A100 GPUs with 40GB of VRAM each. If you do not have these resources available, consider editing the config files to reduce the batch size (`training: batch_size`) or the number of target points per scene (`data: num_points`). We have not found the model to be particularly sensitive to either value.
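If you want to lower both values across several configs at once, a small PyYAML helper (PyYAML is already among the listed dependencies) is one way to do it. The key paths `training: batch_size` and `data: num_points` follow the names above; the specific replacement values here are only illustrative:

```python
# Sketch: shrink resource-related settings in a run config using PyYAML.
# The key paths match the README ("training: batch_size", "data: num_points");
# the default values below are arbitrary examples, not recommendations.
import yaml

def shrink_config(path, batch_size=16, num_points=1024):
    with open(path) as f:
        cfg = yaml.safe_load(f)
    cfg["training"]["batch_size"] = batch_size
    cfg["data"]["num_points"] = num_points
    with open(path, "w") as f:
        yaml.dump(cfg, f, default_flow_style=False)
```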
Here, we provide some pretrained checkpoints for reference.
| Model | Dataset | Link |
|---|---|---|
| ObSuRF with pixel conditioning and overlap loss | CLEVR3D | Link |
| ObSuRF with pixel conditioning and overlap loss | MultiShapeNet | Link |
If you found this codebase useful, please consider citing our paper:
```
@article{stelzner2021decomposing,
  title={Decomposing 3d scenes into objects via unsupervised volume segmentation},
  author={Stelzner, Karl and Kersting, Kristian and Kosiorek, Adam R},
  journal={arXiv:2104.01148},
  year={2021}
}
```
Parts of this codebase were adapted from the following publicly available repositories: