facebookresearch / keypointnerf

KeypointNeRF Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

License: Other

Python 100.00%

keypointnerf's Introduction

KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

Marko Mihajlovic · Aayush Bansal · Michael Zollhoefer · Siyu Tang · Shunsuke Saito

ECCV 2022

KeypointNeRF leverages human keypoints to instantly generate a volumetric radiance representation from 2-3 input images, without retraining or fine-tuning. It can represent both human faces and full bodies.

News 🆕

[2022/10/01] Combine ICON with our relative spatial keypoint encoding for fast and convenient monocular reconstruction, without requiring the expensive SMPL feature. More details are here.

Installation

Please install the Python dependencies specified in environment.yml:

conda env create -f environment.yml
conda activate KeypointNeRF

Data preparation

Please see DATA_PREP.md to set up the ZJU-MoCap dataset.

After this step the data directory follows the structure:

./data/zju_mocap
├── CoreView_313
├── CoreView_315
├── CoreView_377
├── CoreView_386
├── CoreView_387
├── CoreView_390
├── CoreView_392
├── CoreView_393
├── CoreView_394
└── CoreView_396
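
Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name missing_subjects is hypothetical, and the folder list is copied from the tree above.

```python
from pathlib import Path

# Subject folders listed in the directory tree above.
EXPECTED_SUBJECTS = [
    "CoreView_313", "CoreView_315", "CoreView_377", "CoreView_386",
    "CoreView_387", "CoreView_390", "CoreView_392", "CoreView_393",
    "CoreView_394", "CoreView_396",
]


def missing_subjects(data_root):
    """Return the expected subject folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in EXPECTED_SUBJECTS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subjects("./data/zju_mocap")
    if missing:
        print("Missing subject folders:", ", ".join(missing))
    else:
        print("All expected subject folders are present.")
```

The check only looks at folder names; it does not validate the images, camera files, or keypoints inside each subject folder.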

Train your own model on the ZJU dataset

Execute the train.py script to train the model on the ZJU dataset.

python train.py --config ./configs/zju.json --data_root ./data/zju_mocap

After training, the model checkpoint is stored under ./EXPERIMENTS/zju/ckpts/last.ckpt, which is equivalent to the one provided here.

Evaluation

To render and evaluate images, execute:

python train.py --config ./configs/zju.json --data_root ./data/zju_mocap --run_val
python eval_zju.py --src_dir ./EXPERIMENTS/zju/images_v3

To visualize the dynamic results, execute:

python render_dynamic.py --config ./configs/zju.json --data_root ./data/zju_mocap --model_ckpt ./EXPERIMENTS/zju/ckpts/last.ckpt

(The first three views of an unseen subject are the input to KeypointNeRF; the last image is a rendered novel view)

We compare KeypointNeRF with recent state-of-the-art methods. The evaluation metrics are PSNR and SSIM.

Models	PSNR ↑	SSIM ↑
pixelNeRF (Yu et al., CVPR'21)	23.17	86.93
PVA (Raj et al., CVPR'21)	23.15	86.63
NHP (Kwon et al., NeurIPS'21)	24.75	90.58
KeypointNeRF* (Mihajlovic et al., ECCV'22)	25.86	91.07

(*Note that the results of KeypointNeRF are slightly higher than the numbers reported in the original paper because the training views were not shuffled during training.)
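
For readers unfamiliar with the PSNR column, the sketch below shows the standard definition, PSNR = 10 * log10(MAX^2 / MSE). It is an illustration only: the function name psnr and the max_value parameter are assumptions, and eval_zju.py may compute the metric differently (for example on masked or cropped regions).

```python
import math


def psnr(pred, target, max_value=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixel values.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the rendered image is closer to the ground truth, which is why the column is marked with an upward arrow.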

Reconstruction from a Single Image

Our relative spatial encoding can be used to reconstruct humans from a single image. As an example, we leverage ICON and replace its expensive SDF feature with our relative spatial encoding.

While it achieves comparable quality to ICON, it is much faster and more convenient to use (*displayed image taken from pinterest.com).

3D Human Reconstruction on CAPE

Models	Chamfer ↓ (cm)	P2S ↓ (cm)
PIFu (Saito et al., ICCV'19)	3.573	1.483
ICON (Xiu et al., CVPR'22)	1.424	1.351
KeypointICON (Mihajlovic et al., ECCV'22; Xiu et al., CVPR'22)	1.539	1.358

Check the benchmark here and more details here.
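
As a rough guide to the Chamfer column, the sketch below computes a symmetric Chamfer distance between two point sets. This is a simplified point-to-point approximation under stated assumptions: the function names are hypothetical, and the benchmark's exact protocol (mesh sampling, point-to-surface distances, units) is not reproduced here.

```python
import math


def one_sided_distance(src, dst):
    """Mean distance from each point in src to its nearest neighbour in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(a, b):
    """Symmetric Chamfer distance: average of the two one-sided distances."""
    return 0.5 * (one_sided_distance(a, b) + one_sided_distance(b, a))
```

The brute-force nearest-neighbour search is quadratic in the number of points, so a spatial index would be needed for dense scans. A P2S-style metric is one-sided and measures distances from scan points to the reconstructed surface rather than to sampled points.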

Publication

If you find our code or paper useful, please consider citing:

@inproceedings{Mihajlovic:ECCV2022,
  title = {{KeypointNeRF}: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints},
  author = {Mihajlovic, Marko and Bansal, Aayush and Zollhoefer, Michael and Tang, Siyu and Saito, Shunsuke},
  booktitle = {European Conference on Computer Vision},
  year = {2022},
}

License

CC-BY-NC 4.0. See the LICENSE file.

keypointnerf's Issues

Render on synthetic dataset

Hello, thanks for this great work!

I want to try this method on NeRF synthetic dataset (https://drive.google.com/drive/folders/1JDdLGDruGNXWnM1eqY1FNL9PlStjaKWi).

Is this method applicable to non-human data?

Fail to build val dataloader

Hi, authors,

Thanks for your great work! I'm interested in the performance of KeypointNeRF on the ZJU dataset. I followed DATA_PREP to prepare the dataset and started to train KeypointNeRF.

However, I got the following error:

It occurs when the val or test dataset is built. I checked the code and found the reason:

I noticed that CoreView_313 and CoreView_315 use a different set of sampled cameras than the other subjects, see here. CoreView_313 and CoreView_315 each have 21 cameras, so index 21 here is out of bounds.

Could you please check this for me?
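
One possible workaround for the out-of-bounds index is to filter the requested camera indices against the number of cameras a subject actually has. This is a hypothetical sketch, not the repository's fix; the helper name valid_camera_indices and its arguments are assumptions.

```python
def valid_camera_indices(requested, num_cameras):
    """Keep only the requested camera indices that exist for this subject."""
    return [i for i in requested if 0 <= i < num_cameras]


# With 21 cameras the valid indices are 0..20, so index 21 is dropped.
print(valid_camera_indices([0, 7, 15, 21], num_cameras=21))
```

Dropping views changes which cameras are evaluated, so results for these two subjects would no longer be directly comparable with runs that use the full camera list.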

How to generate 3D joints files

Hi, authors

Which tool did you use to generate the zju_joints3d.zip?

I am trying to train KeypointNeRF on my own dataset.

multiple GPU mode

I really appreciate the project. Unfortunately, I'm restricted by CUDA memory and limited skills. Could you provide files for training and inference in multi-GPU mode?

Question for projection

Hi, thanks for the great work!!!

What happens if the 3D points fall outside the image (for example, at negative pixel indices) after being projected into camera views other than the target view?
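
A common way to handle this in image-based rendering is to project the points and then mask those that land outside the image or behind the camera. The sketch below illustrates that idea with a plain pinhole model; it is not taken from this repository, and the function names and the matrix conventions (K intrinsics, R and t world-to-camera) are assumptions.

```python
def project_points(points, K, R, t):
    """Project 3D world points to pixels with a pinhole camera.

    K is a 3x3 intrinsics matrix, R a 3x3 rotation and t a 3-vector
    (world to camera). Returns a list of (u, v, depth) tuples.
    """
    out = []
    for X in points:
        # World -> camera coordinates.
        cam = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
        depth = cam[2]
        if depth <= 0:
            # Point is behind the camera; pixel coordinates are undefined.
            out.append((float("nan"), float("nan"), depth))
            continue
        # Camera -> homogeneous pixel coordinates, then perspective divide.
        uvw = [sum(K[i][j] * cam[j] for j in range(3)) for i in range(3)]
        out.append((uvw[0] / uvw[2], uvw[1] / uvw[2], depth))
    return out


def in_view_mask(pixels, width, height):
    """True for projections in front of the camera and inside the image."""
    return [
        d > 0 and 0 <= u <= width - 1 and 0 <= v <= height - 1
        for u, v, d in pixels
    ]
```

Features from views where the mask is False can then be ignored or down-weighted. In PyTorch pipelines that sample features with torch.nn.functional.grid_sample, out-of-range coordinates are governed by its padding_mode argument; whether this repository relies on that behavior is not stated here.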

dataset error

Hi, thanks for your nice work. I downloaded the zju_mocap dataset, but I can't find CoreView_396, which is listed in your dataset structure.

reconstruction of human heads

Hi, author. Thanks for your great work!
It seems that the given example covers the reconstruction of human bodies. I wonder how to reconstruct human heads from two or three images.
Maybe I misread something. Looking forward to your reply!

About full body reconstruction with texture

Hey, awesome work!
Can I use KeypointNeRF to reconstruct a full body that can be exported as an OBJ file with texture?

How to use this for a custom image to animate

CUDA out of memory

Hi @markomih,

Thank you for releasing the code.

Can you share the hardware configuration you used for training?

I used an NVIDIA 3090 card to run the training code, and got the following error:


$ nvidia-smi 
Thu Jun 29 16:09:01 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:B1:00.0 Off |                  Off |
|  0%   41C    P0   103W / 450W |      0MiB / 24564MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+


$ python train.py --config ./configs/zju.json --data_root ./data/zju_mocap

RuntimeError: CUDA out of memory. Tried to allocate 222.00 MiB (GPU 0; 23.68 GiB total capacity; 20.55 GiB already allocated; 49.00 MiB free; 21.58 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
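
The error message above points to the PYTORCH_CUDA_ALLOC_CONF environment variable. A minimal sketch of setting it from Python follows; the helper name allocator_conf is hypothetical and the value 128 is purely illustrative, not a recommendation from the authors.

```python
import os


def allocator_conf(max_split_size_mb):
    """Build the value string for PYTORCH_CUDA_ALLOC_CONF."""
    return f"max_split_size_mb:{max_split_size_mb}"


# Set this before the first CUDA allocation; doing it before
# `import torch` is the safest place.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = allocator_conf(128)
```

Note that in the trace above the reserved memory (21.58 GiB) is only slightly larger than the allocated memory (20.55 GiB), so fragmentation is unlikely to be the main cause. Reducing the per-step workload, if the configuration exposes such options, is more likely to help.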

3D facial

Is there a dataset for 3D facial reconstruction? How can I generate 3D face OBJ files? Thank you!

How to extract 3D keypoints for other human datasets?

If I have SMPL vertices, how do I precompute the 3D joints?
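
For SMPL specifically, the body model ships a joint regressor matrix (J_regressor, 24 x 6890) that maps mesh vertices to joint locations as a weighted sum. The sketch below shows that linear regression on toy data; whether zju_joints3d.zip was produced this way is not stated here, and the function name regress_joints is an assumption.

```python
def regress_joints(j_regressor, vertices):
    """Compute joints as weighted sums of vertices.

    j_regressor: list of rows, one per joint, each with one weight per vertex.
    vertices:    list of (x, y, z) tuples.
    """
    return [
        tuple(sum(w * v[axis] for w, v in zip(row, vertices)) for axis in range(3))
        for row in j_regressor
    ]


# Toy example with 3 vertices and 2 joints (real SMPL uses 6890 and 24).
toy_vertices = [(0, 0, 0), (2, 0, 0), (0, 4, 0)]
toy_regressor = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
print(regress_joints(toy_regressor, toy_vertices))
```

With NumPy, the same operation is a single matrix product of the regressor with the vertex array.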

Render on my own data

Hi, author. Thanks for your great work!

I'm trying to use your pre-trained model to render my own data, but the function "decode_batch" and its "batch" parameter seem too complicated for me. Is there an easier way to do it?

Looking forward to your reply!

GPU memory usage on more images?

Hi,

Thank you for releasing the code.

I wonder how much GPU memory KeypointNeRF requires if I use more input images, for example 16. Since KeypointNeRF needs to keep the feature maps of all input images, I want to know whether 24 GB of GPU memory is enough for training.
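
A back-of-the-envelope estimate of the feature-map storage alone can be sketched as below. All numbers are hypothetical (the channel count and resolution are not taken from the repository), and the result is only a lower bound: activations kept for backpropagation, the encoder itself, and per-ray buffers add to it.

```python
def feature_map_bytes(num_images, channels, height, width, bytes_per_value=4):
    """Memory for one float32 feature map per input image, in bytes."""
    return num_images * channels * height * width * bytes_per_value


# Hypothetical setting: 16 views, 64 channels, 256 x 256 feature maps.
size = feature_map_bytes(16, 64, 256, 256)
print(f"{size / 1024 ** 2:.0f} MiB")
```

The estimate scales linearly with the number of input images, so going from 3 to 16 views multiplies this term by roughly 5.3.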

Thanks

Cannot load environment.yml: ResolvePackageNotFound

We cannot load environment.yml:
ResolvePackageNotFound:

libffi==3.4.2=h7f98852_5
setuptools==59.4.0=py38h578d9bd_0
libzlib==1.2.11=h36c2ea0_1013
intel-openmp==2021.4.0=h06a4308_3561
zlib==1.2.11=h36c2ea0_1013
mkl_fft==1.3.1=py38h8666266_1
zstd==1.5.0=ha95c52a_0
...
Could you upload the packaged environment to a network drive? Thanks!
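
ResolvePackageNotFound often appears when an environment file pins exact build strings (such as h7f98852_5) that are unavailable on the current platform or channel. A common workaround is to strip the build strings and let conda resolve compatible builds. The sketch below is one way to do that; the function name strip_build_strings is hypothetical, and it may not resolve every missing package.

```python
import re

# Matches conda entries of the form "  - name=version=build" (also with
# "==" before the version) and captures everything except the build string.
_BUILD_PIN = re.compile(r"^(\s*-\s*[^=\s]+=+[^=\s]+)=[^=\s]+\s*$")


def strip_build_strings(yaml_text):
    """Remove the trailing build string from each pinned conda dependency."""
    out = []
    for line in yaml_text.splitlines():
        match = _BUILD_PIN.match(line)
        out.append(match.group(1) if match else line)
    return "\n".join(out)
```

Lines without a build string, such as pip requirements or channel names, are left unchanged. The cleaned text can be written to a new file and passed to conda env create -f.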

UnboundLocalError

Thanks for your well-organized code. When I run train.py, I get the following error:

Could you tell me how to solve it?