3ddfa-v3's Introduction

3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation

By Zidu Wang, Xiangyu Zhu, Tianshuo Zhang, Baiqin Wang and Zhen Lei.

This repository is the official implementation of 3DDFA_V3 in CVPR2024 (Highlight).

3DDFA_V3 uses the geometric guidance of facial part segmentation for face reconstruction, improving the alignment of reconstructed facial features with the original image and excelling at capturing extreme expressions. The key idea is to transform the target and prediction into semantic point sets, optimizing the distribution of point sets to ensure that the reconstructed regions and the target share the same geometry.
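
The point-set idea can be illustrated with a small NumPy sketch. This is a simplified illustration, not the loss used in the paper: the anchor grid, the two per-anchor statistics, and all function names are assumptions made for this example. It turns two binary part masks into 2D point sets and compares how each set is distributed relative to a shared grid of anchors.

```python
import numpy as np

def mask_to_points(mask):
    """Return the (row, col) coordinates of all non-zero pixels as an (N, 2) array."""
    return np.argwhere(mask > 0).astype(np.float64)

def make_anchors(height, width, steps=4):
    """Place a steps x steps grid of anchors over the image plane (hypothetical layout)."""
    ys = np.linspace(0, height - 1, steps)
    xs = np.linspace(0, width - 1, steps)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_y.ravel(), grid_x.ravel()], axis=1)

def anchor_descriptor(points, anchors):
    """Per anchor: distance to the nearest point and mean distance to all points."""
    diff = anchors[:, None, :] - points[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)  # shape (num_anchors, num_points)
    return np.stack([dist.min(axis=1), dist.mean(axis=1)], axis=1)

def point_set_discrepancy(target_mask, predicted_mask, steps=4):
    """Mean squared difference between the anchor descriptors of two non-empty 2D masks."""
    anchors = make_anchors(*target_mask.shape, steps=steps)
    d_target = anchor_descriptor(mask_to_points(target_mask), anchors)
    d_pred = anchor_descriptor(mask_to_points(predicted_mask), anchors)
    return float(np.mean((d_target - d_pred) ** 2))
```

Identical masks give a discrepancy of zero, and the value grows as the predicted region drifts away from the target, which is the behaviour a geometry-matching objective needs.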

News

  • [08/01/2024] We provide a fast CPU renderer based on face3d. It offers inference-time rendering functionality similar to nvdiffrast.
  • [06/14/2024] We provide a fast version based on MobileNet-V3, which achieves similar results to the ResNet-50 version at a higher speed. Please note that if your environment supports ResNet-50, we still strongly recommend using the ResNet-50 version. (The MobileNet-V3 version is still under testing, and we may update it further in the future.)

Getting Started

Environment

# Clone the repo:
git clone https://github.com/wang-zidu/3DDFA-V3
cd 3DDFA-V3

conda create -n TDDFAV3 python=3.8
conda activate TDDFAV3

# The pytorch version is not strictly required.
pip install torch==1.12.1+cu102 torchvision==0.13.1+cu102 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu102
# or: conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=10.2 -c pytorch
# On Windows 10, it has been verified that version 1.10 works. You can install it with the following command: pip install torch==1.10.0+cu102 torchvision==0.11.0+cu102 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html

pip install -r requirements.txt

# Some results in the paper are rendered by pytorch3d and nvdiffrast
# This repository only uses nvdiffrast for convenience.
git clone https://github.com/NVlabs/nvdiffrast.git
cd nvdiffrast
pip install .
cd ..

# In some scenarios, nvdiffrast may not be usable. Therefore, we additionally provide a fast CPU renderer based on face3d.
# The results produced by the two renderers may have slight differences, but we consider these differences to be negligible.
# Please note that we still highly recommend using nvdiffrast.
cd util/cython_renderer/
python setup.py build_ext -i
cd ..
cd ..

Usage

  1. Please refer to this README to prepare assets and pretrained models.

  2. Run demos.

    python demo.py --inputpath examples/ --savepath examples/results --device cuda --iscrop 1 --detector retinaface --ldm68 1 --ldm106 1 --ldm106_2d 1 --ldm134 1 --seg_visible 1 --seg 1 --useTex 1 --extractTex 1 --backbone resnet50
    
    • --inputpath: path to the test data; should be an image folder.

    • --savepath: path to the output directory, where results (obj, png files) will be stored.

    • --iscrop: whether to crop the input image. Disable it only when the test images are already well cropped and resized to (224,224,3).

    • --detector: face detector used for cropping; supports retinaface (recommended) and mtcnn.

    • --ldm68, --ldm106, --ldm106_2d and --ldm134: save and show landmarks.

    • --backbone: backbone for reconstruction; supports resnet50 and mbnetv3.


    With the 3D mesh annotations provided by 3DDFA_V3, we can generate 2D facial segmentation results based on the 3D mesh:

    • --seg_visible: save and show segmentation in 2D with visible mask. When a part becomes invisible due to pose changes, the corresponding region will not be displayed. All segmentation results of the 8 parts will be shown in a single subplot.

    • --seg: save and show segmentation in 2D. When a part becomes invisible due to pose changes, the corresponding segmented region will still be displayed (obtained from 3D estimation), and the segmentation information of the 8 parts will be separately shown in 8 subplots.


    We provide two types of 3D mesh files in OBJ format as output.

    • --useTex: save an .obj file using texture from the BFM model.

    • --extractTex: save an .obj file using texture extracted from the input image. We use median-filtered-weight PCA texture to blend the texture in invisible regions (Poisson blending should give better-looking results).

  3. Results.

    • image_name.png: the visualization results.
    • image_name.npy: landmarks, segmentation, etc.
    • image_name_pcaTex.obj: 3D mesh files in OBJ format using texture from the BFM model.
    • image_name_extractTex.obj: 3D mesh files in OBJ format using texture extracted from the input image.
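
To check what a saved result actually contains, a small inspection helper can be useful. The sketch below assumes the .npy file holds either a plain array or a pickled Python dict of arrays; the README does not specify the layout, so the helper only reports what it finds. The function name is hypothetical.

```python
import numpy as np

def summarize_result(npy_path):
    """Return a mapping from entry name to array shape (or type name) for a saved result."""
    loaded = np.load(npy_path, allow_pickle=True)
    # np.save stores a dict as a 0-d object array; .item() unwraps it again.
    if loaded.dtype == object and loaded.shape == ():
        loaded = loaded.item()
    if isinstance(loaded, dict):
        return {
            key: value.shape if hasattr(value, "shape") else type(value).__name__
            for key, value in loaded.items()
        }
    return {"array": loaded.shape}
```

Calling `summarize_result("examples/results/image_name.npy")` then lists every stored entry with its shape. Only use `allow_pickle=True` on files you produced yourself, since unpickling untrusted data can execute arbitrary code.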


3D Mesh Part Masks

Please refer to this README to download our masks (annotations).

We provide new 3D mesh part masks aligned with the semantic regions in 2D face segmentation. The current version is based on BFM (with 35,709 vertices), which shares the same topology as the face models used by Deep3D, MGCNet, HRN, etc. We also provide some other useful attributes.

Synthetic Expression Data

Please refer to this README to download data.

Based on MaskGan, we introduce a new synthetic face dataset including closed-eye, open-mouth, and frown expressions.

Citation

If you use our work in your research, please cite our publication:

@inproceedings{wang20243d,
  title={3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation},
  author={Wang, Zidu and Zhu, Xiangyu and Zhang, Tianshuo and Wang, Baiqin and Lei, Zhen},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={1672--1682},
  year={2024}
}

Acknowledgements

Some functions and scripts in this implementation are based on external sources. We thank the authors for their excellent work. We benefited from the following resources: Deep3D, DECA, HRN, 3DDFA-V2, Nvdiffrast, Pytorch3D, Retinaface, MTCNN, MaskGan, DML-CSR, REALY.

Contact

We plan to train 3DDFA_V3 with a larger dataset and switch to stronger backbones or face models. Additionally, we will provide a fast version based on MobileNet. If you have any suggestions or requirements, please feel free to contact us at [email protected].

3ddfa-v3's Issues

NoW Challenge

I want to evaluate your results on the NoW Challenge, but I ran into a problem getting the 7 3D landmarks, which I simply obtain by indexing the 68 landmarks from the 3D vertices. I don't know whether this is correct.
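
The indexing described above can be sketched as follows. The index list is an assumption: it picks the four eye corners, the nose base, and the two mouth corners under the common iBUG 68-point ordering (0-based), and should be checked against the NoW evaluation protocol before use. The argument `ldm68_vertex_ids` is a hypothetical array of the 68 mesh vertex ids that correspond to the landmarks; the function name is also made up for this example.

```python
import numpy as np

# Assumed 0-based positions in the 68-point layout:
# eye corners (36, 39, 42, 45), nose base (33), mouth corners (48, 54).
SEVEN_FROM_68 = [36, 39, 42, 45, 33, 48, 54]

def select_seven_landmarks(vertices, ldm68_vertex_ids):
    """Pick 7 3D landmarks from a mesh given the vertex ids of its 68 landmarks.

    vertices: (V, 3) array of mesh vertex positions.
    ldm68_vertex_ids: (68,) integer array of vertex ids (hypothetical input).
    """
    ldm68_vertex_ids = np.asarray(ldm68_vertex_ids)
    if ldm68_vertex_ids.shape != (68,):
        raise ValueError("expected exactly 68 landmark vertex ids")
    landmarks68 = np.asarray(vertices)[ldm68_vertex_ids]  # (68, 3)
    return landmarks68[SEVEN_FROM_68]                     # (7, 3)
```

Plotting the seven selected points on the mesh is a quick way to confirm that the chosen indices land on the intended facial features.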

BFM with Neck

Hi, is there any way for this to output a mesh with a neck?

EDIT: Ideally, a model with a complete head and neck.

Thanks,
Eric

Video Demo

Hi,

Is there a plan to make a demo that works on video files?

Thanks!
Eric

pos+rot entangled in id+exp

2024-05-28_20-03-19.mp4

The two green dots are the upper and lower bounds of the base face mesh model, transformed only with rotation+pos, without id+exp.
As we can see, they do not move correctly, so we can conclude that residual movements are entangled in id+exp.

Question on Inference Speed

Thank you for the great work.

May I know the inference speed relative to 3DDFA-V2? Is it slower or faster for a simple 512x512 face? Either FPS or seconds per image with ResNet50 would help.

Thank you again

Choosing the anchors

Can you please give more details on the implementation of the anchors? I did not read about what H and W are in the paper.

Thank you for the great work.

labels

Great job! Since you've released your data, do you plan to release your labels for landmarks and face segmentation? Most face landmark detectors are not accurate enough to handle expressions such as closed eyes.

Question about the symbol $C_p$ in the paper

Thanks for your excellent work!
I have a question about the symbol $C_p$ in the paper. It seems to have two meanings. In the subsection Transforming Segmentation to 2D points, the segmentation of each pixel $C_p$ is obtained from the raw RGB image via a segmentation method. However, in Algorithm 1 of the Supplementary Material, $C_p$ is obtained from the rendered image $I_{seg}=Seg(Render(V_{3d},Tex))$.
I wonder:
(1) Do these two $C_p$ indicate the same segmentation results? If so, which image is used, the raw image or the rendered one?
(2) Since we have already obtained the pixel segmentation from the raw RGB image, why do we render an image from the 3D mesh when determining the vertex segmentation annotation?
Thanks.

How to Export Materials(OBJ)

The generated OBJ models: _extractTex.obj and _pcaTex.obj, do not seem to carry UV material and texture attributes. How to export them? Can you update the relevant code? Thank you
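
A quick way to see what an exported file carries is to count its record types. The sketch below is a generic Wavefront OBJ check, not code from this repository: it reports whether the file has UV coordinates (`vt`), material references (`mtllib`/`usemtl`), or per-vertex colours written as three extra numbers on each `v` line (a widely supported, non-standard extension). Whether the repository's exporter uses per-vertex colours is an assumption to verify on your own output.

```python
def inspect_obj(text):
    """Summarize the contents of a Wavefront OBJ file given as a string."""
    counts = {"v": 0, "vt": 0, "vn": 0, "f": 0}
    has_vertex_colors = False
    references_material = False
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        tag = parts[0]
        if tag in counts:
            counts[tag] += 1
        # "v x y z r g b": position followed by an RGB colour.
        if tag == "v" and len(parts) >= 7:
            has_vertex_colors = True
        if tag in ("mtllib", "usemtl"):
            references_material = True
    return {
        **counts,
        "has_vertex_colors": has_vertex_colors,
        "references_material": references_material,
    }
```

If the report shows vertex colours but no `vt` or `mtllib` entries, the texture is stored per vertex, and a UV-mapped material would have to be baked in a separate step with a mesh tool.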

correct 4x4 projection matrix?

currently projection matrix is 3x3

np.array([ [1015.0, 0, 0], 
           [0, 1015.0, 0],
           [112.0, 112.0, 1] ], dtype=np.float32)

and when the point is transformed, Z is discarded

pts = pts [..., :2] / pts [..., 2:3]

which is not suitable for standard graphics transformations like in OpenGL

can you provide a correct 4x4 projection matrix to transform homogeneous 3D points (x,y,z,1.0)?
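
One possible 4x4 form, as a sketch rather than the repository's own answer: it reuses the focal length (1015) and principal point (112) from the 3x3 matrix above, and assumes a 224x224 image and near/far planes of 5 and 15. It further assumes column vectors, camera-space z > 0 in front of the camera, and no y-axis flip; OpenGL pipelines often differ on those conventions, so signs may need adjusting. The function names are hypothetical.

```python
import numpy as np

def build_projection_4x4(focal=1015.0, center=112.0, size=224, znear=5.0, zfar=15.0):
    """4x4 perspective matrix mapping camera-space points to clip space.

    After the divide by w, x and y lie in [-1, 1] across the image and
    z maps linearly in 1/z from -1 at znear to +1 at zfar.
    """
    a = (zfar + znear) / (zfar - znear)
    b = -2.0 * zfar * znear / (zfar - znear)
    return np.array([
        [2.0 * focal / size, 0.0, 2.0 * center / size - 1.0, 0.0],
        [0.0, 2.0 * focal / size, 2.0 * center / size - 1.0, 0.0],
        [0.0, 0.0, a, b],
        [0.0, 0.0, 1.0, 0.0],
    ], dtype=np.float64)

def project(points, proj, size=224):
    """Project (N, 3) camera-space points; return pixel coordinates and NDC depth."""
    points = np.asarray(points, dtype=np.float64)
    homo = np.concatenate([points, np.ones((len(points), 1))], axis=1)
    clip = homo @ proj.T                    # column-vector convention: P @ x
    ndc = clip[:, :3] / clip[:, 3:4]        # perspective divide keeps z
    pixels = (ndc[:, :2] + 1.0) * 0.5 * size
    return pixels, ndc[:, 2]
```

Under these assumptions the pixel coordinates agree with the 3x3 row-vector projection, while the third NDC component preserves depth ordering for z-buffering.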

Question on Camera Position / Focal Length

From this issue here: #5 (comment), and also after reading the paper, it looks like the camera position and focal length are fixed (i.e. focal=1015, znear=5, zfar=15).

  1. How is the 1015 value arrived at? What unit is it?
  2. If we don't assume the focal length is 1015 but wish to optimize it based on monocular video frames, is it possible in this work?

Thank you for the great work.

In essence, I am trying to ask if it returns pinhole camera intrinsic and extrinsic parameters.

AssertionError("Torch not compiled with CUDA enabled")

Site packages were installed via secondary option (# or: conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=10.2 -c pytorch)
The first option was unavailable because that version of torch does not exist:
Platform: Windows 10
ERROR: Could not find a version that satisfies the requirement torch==1.12.1+cu102 (from versions: 1.5.0, 1.5.1, 1.6.0, 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.8.1+cu102, 1.9.0, 1.9.0+cu102,
1.9.1, 1.9.1+cu102, 1.10.0, 1.10.0+cu102, 1.10.1, 1.10.1+cu102, 1.10.2, 1.10.2+cu102, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1, 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1)

When I tried the secondary option, I got an error because CUDA was not found:
(TDDFAV3) PS C:\Users\oreny\PycharmProjects\3DDFA-V3\3DDFA-V3> python demo.py --inputpath examples/ --savepath examples/results --device cuda --iscrop 1 --detector retinaface --ldm68 1 --ldm106 1 --ldm106_2d 1 --ldm134 1 --seg_visible 1 --seg 1 --useTex 1 --extractTex 1 --backbone resnet50
No CUDA runtime is found, using CUDA_HOME='C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8'
Traceback (most recent call last):
  File "demo.py", line 77, in <module>
    main(parser.parse_args())
  File "demo.py", line 15, in main
    recon_model = face_model(args)
  File "C:\Users\oreny\PycharmProjects\3DDFA-V3\3DDFA-V3\model\recon.py", line 78, in __init__
    self.u = torch.tensor(model['u'], requires_grad=False, dtype=torch.float32, device=self.device)
  File "C:\Users\oreny\miniconda3\envs\TDDFAV3\lib\site-packages\torch\cuda\__init__.py", line 211, in _lazy_init
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

Any ideas how to fix it?
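
This error means the installed PyTorch build has no CUDA support; `torch.version.cuda` is `None` for such CPU-only builds. A small guard like the sketch below picks a device the installed build can actually use. The helper name is hypothetical, and passing its result as `--device cpu` to demo.py (together with the CPU renderer) is an assumption about the script rather than something confirmed here.

```python
def pick_device(requested="cuda"):
    """Return the requested device, falling back to "cpu" when CUDA is unusable."""
    if requested != "cuda":
        return requested
    try:
        import torch
    except ImportError:
        # PyTorch is not installed at all, so only CPU work is possible.
        return "cpu"
    # is_available() is False for CPU-only builds and for machines without a usable GPU.
    return "cuda" if torch.cuda.is_available() else "cpu"
```

If the function returns "cpu" on a machine that does have an NVIDIA GPU, reinstalling a CUDA-enabled PyTorch wheel that matches the local driver is the usual fix.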
