

LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation

S-Lab, Nanyang Technological University¹;
Wangxuan Institute of Computer Technology, Peking University²;
Shanghai Artificial Intelligence Laboratory³

LN3Diff is a feedforward 3D diffusion model that creates a high-quality 3D object mesh from text within 8 V100-seconds.
Sample prompts: A standing hund. An UFO space aircraft. A sailboat with mast. An 18th century cannon. A blue plastic chair.

For more visual results, please check out our project page 📃

Code has been released; see the updates below 👊

This repository contains the official implementation of LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation


📣 Updates

[06/2024] LN3Diff got accepted to ECCV 2024 🥳!

[04/2024] Inference and training code on Objaverse, ShapeNet and FFHQ released, including pre-trained models and the training dataset.

[03/2024] Initial release.

🐪 TODO

  • Release DiT-based 3D generation framework.
  • Polish the dataset preparation and training documentation.
  • Add metrics evaluation scripts and samples.
  • Lint the code.
  • Add Gradio demo.
  • Release the new T23D Objaverse model trained on an 80K+-instance dataset.
  • Release the inference and training code.
  • Release the pre-trained checkpoints for ShapeNet and FFHQ.
  • Release the pre-trained checkpoints of the T23D Objaverse model trained on a 30K+-instance dataset.
  • Release the stage-1 VAE of Objaverse trained on an 80K+-instance dataset.

🤝 Citation

If you find our work useful for your research, please consider citing the paper:

@inproceedings{lan2024ln3diff,
    title={LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation}, 
    author={Yushi Lan and Fangzhou Hong and Shuai Yang and Shangchen Zhou and Xuyi Meng and Bo Dai and Xingang Pan and Chen Change Loy},
    year={2024},
    booktitle={ECCV},
}

🖥️ Requirements

NVIDIA GPUs are required for this project. We conducted all training on NVIDIA V100-32GiB (ShapeNet, FFHQ) and NVIDIA A100-80GiB (Objaverse) GPUs, and we have tested the inference code on NVIDIA V100. We recommend using Anaconda to manage the Python environment.

The environment can be created via conda env create -f environment_ln3diff.yml and activated via conda activate ln3diff. If you want to reuse your own PyTorch environment, install the following packages into it:

# First, make sure you have installed PyTorch (>=2.0) and xformers.
conda install -c conda-forge openexr-python git
pip install openexr lpips imageio[ffmpeg] kornia opencv-python tensorboard tqdm timm ffmpeg einops beartype blobfile ninja lmdb webdataset click torchdiffeq transformers
pip install git+https://github.com/nupurkmr9/vision-aided-gan
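As a quick sanity check (assuming a CUDA build of PyTorch), you can verify the PyTorch and xformers prerequisites with:

# Verify PyTorch >= 2.0 with CUDA support, and that xformers is importable.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import xformers; print(xformers.__version__)"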

🏃‍♀️ Inference

Download Models

The pretrained stage-1 VAE and stage-2 LDM can be downloaded via OneDrive.

Put the downloaded checkpoints under the checkpoints folder for inference. The checkpoints directory layout should be:

checkpoints
├── ffhq
│   └── model_joint_denoise_rec_model1580000.pt
├── objaverse
│   ├── model_rec1680000.pt
│   └── model_joint_denoise_rec_model2310000.pt
├── shapenet
│   ├── car
│   │   └── model_joint_denoise_rec_model1580000.pt
│   ├── chair
│   │   └── model_joint_denoise_rec_model2030000.pt
│   └── plane
│       └── model_joint_denoise_rec_model770000.pt
└── ...
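As a minimal sketch, the folder skeleton can be created and checked like this (the OneDrive download itself is manual; the paths simply mirror the tree above):

mkdir -p checkpoints/ffhq checkpoints/objaverse checkpoints/shapenet/{car,chair,plane}
# After placing the downloaded files, confirm the expected checkpoints are present, e.g.:
ls checkpoints/objaverse/model_rec1680000.pt checkpoints/objaverse/model_joint_denoise_rec_model2310000.pt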

Inference Commands

Note that to extract the mesh, 24GiB VRAM is required.
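If you are unsure whether your GPU has enough memory, a quick check is:

# Print the GPU name and total memory; mesh extraction needs roughly 24GiB.
nvidia-smi --query-gpu=name,memory.total --format=csv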

Preparation:

The Cap3D captions can be downloaded from here. Please put them at ./datasets/text_captions_cap3d.json.
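For instance, assuming the JSON file was downloaded to the current directory:

mkdir -p datasets
mv text_captions_cap3d.json ./datasets/text_captions_cap3d.json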

Stage-1 VAE 3D reconstruction

To run stage-1 VAE 3D reconstruction on Objaverse and extract the VAE latents for diffusion learning, run

bash shell_scripts/final_release/inference/sample_obajverse.sh

which should give the following result:

(Example output video: triplane_1680000_0.mp4)

The mesh extracted via marching cubes can be visualized with Blender or MeshLab:

(Mesh visualization example)

We have uploaded the pre-extracted VAE latents here; the archive contains the corresponding VAE latents (with shape 32x32x12) of 76K G-buffer Objaverse objects. Feel free to use them in your own tasks.
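As a rough sketch of how to inspect one of these latents (the file naming and storage format below are assumptions for illustration; adjust to whatever the download actually contains):

# Hypothetical check that a downloaded latent has the expected 32x32x12 shape.
python -c "import numpy as np; z = np.load('objaverse_latents/000000.npy'); print(z.shape)"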

For more G-buffer Objaverse examples, download the demo data.

Stage-2 Text-to-3D

We train the 3D latent diffusion model on top of the latents extracted in stage 1. In the bash inference scripts below, set --export_mesh True to extract a mesh from the generated tri-plane, and set the prompt variable to change the text prompt. For unconditional sampling, set the CFG guidance unconditional_guidance_scale=0. Feel free to tune the CFG guidance scale to trade off diversity and fidelity.

Note that the diffusion sampling batch size is set to 4, which costs around 16GiB of VRAM. Extracting the mesh of a single instance costs 24GiB of VRAM.

For text-to-3D on Objaverse, run

bash shell_scripts/final_release/inference/sample_obajverse.sh
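For example, a sketch of customizing that run (this assumes the script defines a prompt variable as described above; check the script for the exact variable names and flags before editing):

# Set a custom text prompt inside the sampling script, then run it.
sed -i 's/^prompt=.*/prompt="a blue plastic chair"/' shell_scripts/final_release/inference/sample_obajverse.sh
# To also extract a mesh from the generated tri-plane, make sure the script passes --export_mesh True (needs ~24GiB VRAM).
bash shell_scripts/final_release/inference/sample_obajverse.sh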

For text-to-3D on ShapeNet, run one of the following commands (for T23D on car, chair, and plane, respectively):

bash shell_scripts/final_release/inference/sample_shapenet_car_t23d.sh
bash shell_scripts/final_release/inference/sample_shapenet_chair_t23d.sh
bash shell_scripts/final_release/inference/sample_shapenet_plane_t23d.sh

For text-to-3D on FFHQ, run

bash shell_scripts/final_release/inference/sample_ffhq_t23d.sh

🏃‍♀️ Training

Dataset

For Objaverse, we use the renderings provided by G-buffer Objaverse. A demo subset for stage-1 VAE reconstruction can be downloaded from here. Note that for Objaverse training, we pre-process the raw data into WebDataset (wds) shards for fast and flexible loading. The sample shard data can be found here.

For ShapeNet, we render our own data with foreground masks for training, which can be downloaded from here. For training, we convert the raw data into an LMDB for faster loading. The pre-processed LMDB file can be downloaded from here.

For FFHQ, we use the pre-processed dataset from EG3D and compress it into an LMDB, which can also be found at the OneDrive link above.

Training Commands

Coming soon.

🗞️ License

Distributed under the S-Lab License. See LICENSE for more information.

Contact

If you have any questions, please feel free to contact us via [email protected] or open a GitHub issue.


