
3ddfa_v2's Introduction

3ddfa_v2's People

Contributors

ak391, cleardusk, conorturner


3ddfa_v2's Issues

About test accuracy

First of all, thanks for such excellent work. Following the training approach of 3DDFA_V1, plus training data I synthesized myself, the test accuracy of the model has improved greatly. The results are as follows:
[screenshot of the test accuracy table]
I used a MobileNetV2 model; the training method is the same as in 3DDFA_V1, with 60 shape parameters and 29 expression parameters. With a larger model, the accuracy improves further. Stability in real tests has also improved greatly.
Therefore, I look forward to the open-sourcing of the 3DDFA_V2 training code. Thank you very much.

Extracting UV textures in the video for all the frames

Describe the bug
Extracting UV textures for multiple frames of a video gives a black image as output after the first frame. The output becomes greyish after the first frame and then eventually turns black.

To Reproduce
add the line
uv_tex(img, ver_lst, tddfa.tri, show_flag=args.show_flag, wfp=wfp)
at line 131 of demo_video_smooth.py, below
elif args.opt == '3d':

Expected behavior
UV textures should be produced for all the images (frames) in the video.

What makes such a lightweight backbone work so well?

Compared to the previous version of your work, 3DDFA, 3DDFA_V2's structure is much simpler but achieves better results. So I wonder if the meta-joint loss is the reason that enables MobileNet to outperform previous works. I would also like to know your opinion on applying these methods (looking ahead, combining different losses) to other tasks.

Ear, Neck and Landmark Visibility

Hi,

Thanks for sharing the great work!

Here I have a few questions:

  1. In the old 3DDFA (3DDFA_V1), the reconstructed vertices cover the ear and neck regions, but these are excluded in the new 3DDFA_V2. However, in some applications we would like to get the details around those regions; is there an easy way to adapt the new 3DDFA_V2 model to cover the ears and neck?

  2. In the old 3DDFA paper (CVPR ver.), the authors visualized the visibility of each landmark. I am wondering how to get the landmark visibility from the reconstructed vertices; could you please give me some hints?

Looking forward to your reply!
Best regards.

Algorithm 2 of the paper

[screenshot of Algorithm 2 from the paper]

In the red circle shown above, why do you divide N by 3?

N is the number of points; T(:, 4) has three elements, representing the displacement of the coordinates in each of the three dimensions.

If we take the first predicted displacement as Tx and its corresponding ground-truth displacement as Tgx, then the resulting difference in the reconstructed 3D shape should be:
||(Tx-Tgx, Tx-Tgx, ..., Tx-Tgx)|| = |Tx-Tgx|·√N

Is there any mistake in my derivation above?
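
For reference, the reasoning above can be written out explicitly (a sketch of the asker's derivation, not the paper's notation): if only the x-translation differs, the 3 x N error matrix has N identical nonzero entries, so its norm scales with √N rather than √(N/3):

\Delta S = \begin{pmatrix} T_x - T_{gx} & \cdots & T_x - T_{gx} \\ 0 & \cdots & 0 \\ 0 & \cdots & 0 \end{pmatrix}_{3 \times N},
\qquad
\|\Delta S\|_F = \sqrt{\sum_{i=1}^{N} (T_x - T_{gx})^2} = |T_x - T_{gx}|\,\sqrt{N}.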

meta-joint

Hi, I implemented meta-joint training and find that VDC dominates in the early stage, because the VDC loss value converges from about 493 to 200 very quickly. This is the opposite of the conclusion in the paper, where fWPDC dominates in the early stage and VDC guides in the late stage. Any suggestions? @cleardusk

How to get the frontal face?

Given a face photo in a large pose, how can I get a frontal face picture using this 3D model?
Has anyone implemented this function?
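
A minimal sketch of one common approach (my own illustration, not a repo function): recover the head-rotation matrix R from the estimated pose parameters, then apply its inverse to the reconstructed vertices so the shape faces the camera. Producing an actual frontal photo would additionally require re-rendering the texture onto the rotated vertices.

import numpy as np

def frontalize_vertices(vertices, R):
    # vertices: (3, N) reconstructed 3D points; R: (3, 3) estimated head rotation
    centered = vertices - vertices.mean(axis=1, keepdims=True)
    return R.T @ centered  # R is (near-)orthonormal, so R.T undoes the rotation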

'Namespace' object has no attribute 'dense_flag'

Thanks for releasing the code.
When running "python3 demo.py -f examples/inputs/emma.jpg", I get
'Namespace' object has no attribute 'dense_flag'
I found this is caused by this line in demo.py: https://github.com/cleardusk/3DDFA_V2/blob/master/demo.py#L44
parser.add_argument('--dense_flg', default='true', type=str2bool, help='whether reconstructing dense')
but in line 32, https://github.com/cleardusk/3DDFA_V2/blob/master/demo.py#L32
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=args.dense_flag)
Obviously, args.dense_flag does not match '--dense_flg'.
Maybe this should be fixed.
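
A minimal fix sketch, simply making the flag name consistent with the attribute read later (both lines are taken from the snippets quoted above):

parser.add_argument('--dense_flag', default='true', type=str2bool, help='whether reconstructing dense')
...
ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=args.dense_flag)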

Eval on AFLW2000

Hi, Jianzhu,
I evaluated 3DDFA_V2 on AFLW2000 with the script from 3DDFA, and the results are:
[ 0, 30] Mean: 2.735, Std: 1.127
[30, 60] Mean: 3.477, Std: 1.431
[60, 90] Mean: 4.543, Std: 1.961
[ 0, 90] Mean: 3.585, Std: 0.742

The model seems like MobileNet(M+R) in your paper, right?

How to visualise dense landmarks on videos?

I know I can use "python3 demo_video.py -f examples/inputs/videos/214.avi --opt 3d" to render dense landmarks, but how can I get the 3D dense landmarks alone, without rendering? I am planning to use them for a facial-recognition embedding.
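
A minimal sketch of getting the dense vertices directly, without rendering, reusing the calls that already appear in demo.py (assuming tddfa(img, boxes) returns (param_lst, roi_box_lst) as in the snippets quoted elsewhere on this page; the helper name is my own):

def get_dense_vertices(tddfa, frame, boxes):
    # same calls as the demo scripts, but skipping the rendering step
    param_lst, roi_box_lst = tddfa(frame, boxes)
    ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=True)
    return ver_lst[0]  # (3, N) array: x, y, z for every dense vertex of the first face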

Head Generation

I want to know: once I have obtained the 3D face using 3DDFA, how can I generate the whole head (without hair)?
Does anyone have a good idea? Thanks a lot!


Training detail for 3DDFA_V2

I am looking for the training details for this model, but they are not provided with the code base. I referred to 3DDFA, but the results are not good. Could you please provide the training details?

Thanks,

NME metric

Hello, thanks for your excellent work!

The NME on different datasets is an important metric. For a fair comparison, could you share the code for calculating the NME? Or is there any official code to compute the NME metrics and the visibility vector shown in "Pose-Invariant 3D Face Alignment" (ICCV 2015), as follows:
[figures from the PIFA (ICCV 2015) paper illustrating the visibility vector]

Looking forward to your reply. Good luck!
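
For reference, a minimal NME sketch under a common convention (an assumption, not the authors' official script): the mean point-to-point error normalized by the square root of the ground-truth bounding-box area.

import numpy as np

def compute_nme(pred_pts, gt_pts):
    # pred_pts, gt_pts: (68, 2) arrays of 2D landmark coordinates
    min_xy, max_xy = gt_pts.min(axis=0), gt_pts.max(axis=0)
    bbox_size = np.sqrt(np.prod(max_xy - min_xy))  # sqrt(width * height) of the ground truth
    dists = np.linalg.norm(pred_pts - gt_pts, axis=1)
    return dists.mean() / bbox_size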

Resnet weights

Hello there! Thank you for your excellent work. I was curious whether you have any plans to provide weights for the ResNet backbone?

About the output

Is it possible to export the result as a mesh file with texture? Thank you.
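
A minimal export sketch (a generic Wavefront .obj writer of my own, not the repo's serializer): dump the reconstructed vertices, triangles, and per-vertex colors, which viewers such as MeshLab display as texture.

def write_obj_with_colors(path, vertices, triangles, colors):
    # vertices, colors: (N, 3) float arrays; triangles: (M, 3) int array of vertex indices
    with open(path, 'w') as f:
        for v, c in zip(vertices, colors):
            f.write(f'v {v[0]} {v[1]} {v[2]} {c[0]} {c[1]} {c[2]}\n')
        for t in triangles:
            f.write(f'f {t[0] + 1} {t[1] + 1} {t[2] + 1}\n')  # .obj indices are 1-based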

How to make expressions more expressive?

Hi, the model is very accurate for face shapes.
However, it seems to have only 10 parameters for expressions.
Eye and eyebrow motions are not well captured; mouth expressions are good.
How can this be improved? Do I need to train a model myself with more expression parameters?
Thank you very much for your reply.
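
For context, a sketch of how the regressed parameter vector is commonly split (the 62-dim layout of 12 pose + 40 shape + 10 expression is my assumption, consistent with the "only 10 expression parameters" observation above); making expressions richer would mean retraining with a larger expression basis.

import numpy as np

param = np.zeros(62, dtype=np.float32)   # regressed 3DMM parameters for one face (assumed layout)
pose = param[:12].reshape(3, 4)          # camera/pose matrix
alpha_shp = param[12:52].reshape(40, 1)  # 40 shape coefficients
alpha_exp = param[52:].reshape(10, 1)    # 10 expression coefficients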

Bounding box of AFLW2000-3D

Hi!
I am trying to calculate the NME on the AFLW2000-3D dataset, but I cannot find bounding-box annotations for it. The "roi" information provided with AFLW2000-3D is strange; I don't think it is the face bounding box.
Could you tell me how to get bounding-box annotations for the AFLW2000-3D dataset?
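
A common workaround sketch (my assumption, not an official annotation): derive the face box from the 68 ground-truth landmarks (the pt3d_68 field) shipped with each AFLW2000-3D .mat file.

import scipy.io as sio

def bbox_from_landmarks(mat_path):
    pts = sio.loadmat(mat_path)['pt3d_68']  # (3, 68): ground-truth x, y, z landmarks
    x, y = pts[0], pts[1]
    return [x.min(), y.min(), x.max(), y.max()]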

Can you tell me which variable stores the landmarks?

Hello, I would like to know if face recognition is possible using this code. Can you tell me which file and which variable store the facial landmarks?

I am trying to perform face recognition by feeding the landmark variable into a face recognition model. Is that possible?
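
A minimal sketch of where the landmarks live, based on the calls quoted elsewhere on this page (variable names as in demo.py; the (3, 68) layout and the flattening step are my own illustration):

ver_lst = tddfa.recon_vers(param_lst, roi_box_lst, dense_flag=False)
landmarks = ver_lst[0]              # (3, 68) array for the first face: rows are x, y, z
features = landmarks.T.flatten()    # e.g. a naive 204-d vector to feed a recognizer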

models

May I ask what the input shape of the models is, and in which folder the models' parameter files are located?
Thank you.

Test results are not very satisfactory

I ran the demo and found that the landmark detection on video is not very stable, and the demo with smoothing filtering localizes inaccurately when blinking. It is not even as stable as my own landmark detection algorithm, which I developed with only 10M computation and 106 landmarks.

Build failure

Platform: macOS Catalina 10.15.7 (19H2)
gcc version:

Configured with: --prefix=/Applications/Xcode.app/Contents/Developer/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX10.15.sdk/usr/include/c++/4.2.1
Apple clang version 12.0.0 (clang-1200.0.32.27)
Target: x86_64-apple-darwin19.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
 (pytorch) quanhaoguo@QuanhaodeMacBook-Pro 3DDFA_V2 % sh ./build.sh
running build_ext
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
running build_ext
skipping 'lib/rasterize.cpp' Cython extension (up-to-date)
building 'Sim3DR_Cython' extension
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I/Users/quanhaoguo/anaconda3/envs/pytorch/include -arch x86_64 -I/Users/quanhaoguo/anaconda3/envs/pytorch/include -arch x86_64 -I/Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include -I/Users/quanhaoguo/anaconda3/envs/pytorch/include/python3.6m -c lib/rasterize.cpp -o build/temp.macosx-10.7-x86_64-3.6/lib/rasterize.o -std=c++11
clang: warning: include path for libstdc++ headers not found; pass '-stdlib=libc++' on the command line to use the libc++ standard library instead [-Wstdlibcxx-not-found]
In file included from lib/rasterize.cpp:624:
In file included from /Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/arrayobject.h:4:
In file included from /Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/ndarrayobject.h:12:
In file included from /Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/ndarraytypes.h:1822:
/Users/quanhaoguo/anaconda3/envs/pytorch/lib/python3.6/site-packages/numpy/core/include/numpy/npy_1_7_deprecated_api.h:17:2: warning: 
      "Using deprecated NumPy API, disable it with "          "#define
      NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION" [-W#warnings]
#warning "Using deprecated NumPy API, disable it with " \
 ^
In file included from lib/rasterize.cpp:629:
lib/rasterize.h:5:10: fatal error: 'cmath' file not found
#include "cmath"
         ^~~~~~~
1 warning and 1 error generated.
error: command 'gcc' failed with exit status 1

Feature request: add onnxruntime inference

Feature request

I have benchmarked the onnxruntime library and found that its latency (for MobileNet in our case) is rather small. However, my personal time is limited, so I hope anyone interested in this repo can contribute by adding onnxruntime inference : )

The onnxruntime tutorial is here.
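
A minimal onnxruntime inference sketch (assuming the MobileNet backbone has already been exported to ONNX; the file name 'mb1_120x120.onnx', the 120x120 input size, and the output shape are assumptions on my part):

import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession('mb1_120x120.onnx', providers=['CPUExecutionProvider'])
input_name = sess.get_inputs()[0].name
crop = np.random.rand(1, 3, 120, 120).astype(np.float32)  # preprocessed face crop
param = sess.run(None, {input_name: crop})[0]              # regressed 3DMM parameter vector
print(param.shape)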

Question about the function `parse_roi_box_from_bbox`

Thanks for your great work!
When I read your released code at this place:

def parse_roi_box_from_bbox(bbox):

I could not understand the purpose of this operation. Of course, the face crop should be a square.

But I guess this is done because the face detector tends to detect the upper part of the face, so you shift the bounding box downwards, is that right? By the way, are the hyper-parameters in this function, like 0.14 and 1.58, chosen empirically?

Looking forward to your reply!
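
For context, a sketch of what the function appears to do, reconstructed from the hyper-parameters mentioned above (my reading, not a copy of the repo code): build a square crop whose side is 1.58x the mean box size, with the center shifted downwards by 0.14x the box size.

def parse_roi_box_from_bbox_sketch(bbox):
    left, top, right, bottom = bbox[:4]
    old_size = ((right - left) + (bottom - top)) / 2   # mean of box width and height
    center_x = (left + right) / 2
    center_y = (top + bottom) / 2 + old_size * 0.14    # shift the center downwards
    size = old_size * 1.58                             # enlarge to a square crop
    return [center_x - size / 2, center_y - size / 2,
            center_x + size / 2, center_y + size / 2]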

landmark inaccuracy on the RAVDESS dataset


Hi, this project is really very useful for several downstream tasks.
Currently, I'm using 3DDFA_V2 to reconstruct some talking faces from the RAVDESS dataset.
This is a very clean in-lab dataset with high-resolution heads against a white background.
However, the reconstruction accuracy does not seem good; several landmarks on the lips are not aligned.
Here are some examples:
[original image]
[reconstructed image]
Obviously, the lips are closed in the original image but open in the reconstructed one.
I'm wondering whether I should adjust some parameters when conducting 3D reconstruction on videos?

ValueError: Buffer dtype mismatch, expected 'int_t' but got 'long long'

The line xx1 = np.maximum(x1[i], x1[order[1:]]) below throws this error in FaceBoxes\utils\nms\py_cpu_nms.py:

import numpy as np


def py_cpu_nms(dets, thresh):
    """Pure Python NMS baseline."""
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]

    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(i)
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]

    return keep
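
A hedged workaround sketch (my assumption about the cause, not a confirmed fix): the message suggests the compiled cpu_nms extension expects platform 'int_t' indices while argsort yields int64 ('long long') on 64-bit Windows, so casting the indices explicitly before they reach the Cython buffer often avoids the mismatch.

import numpy as np

dets = np.random.rand(10, 5).astype(np.float32)  # dummy detections: x1, y1, x2, y2, score
order = dets[:, 4].argsort()[::-1]               # int64 on most 64-bit platforms
order = order.astype(np.int32)                   # cast to match the expected buffer dtype
print(order.dtype)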

scipy.io.loadmat fail, decompressing error.


File "mio5_utils.pyx", line 548, in scipy.io.matlab.mio5_utils.VarReader5.read_full_tag File "mio5_utils.pyx", line 556, in scipy.io.matlab.mio5_utils.VarReader5.cread_full_tag File "streams.pyx", line 176, in scipy.io.matlab.streams.ZlibInputStream.read_into File "streams.pyx", line 163, in scipy.io.matlab.streams.ZlibInputStream._fill_buffer zlib.error: Error -2 while decompressing data: inconsistent stream state

When I run demo.py, the error happens on the line "from utils.uv import uv_tex". I searched for it on Baidu and Google but found nothing. Has anyone encountered this? Please share.

sh ./build.sh help me

Hello, I am a student just starting out in machine learning. I am trying to run the uploaded code after installing Python with Anaconda.
At the "sh ./build.sh" step, because I am in a Windows environment, I cannot use sh.
If this code can be run on Windows, I would appreciate it if you could tell me how.

Align the coordinates of 3D landmarks

Hey, congrats on this beautiful work, very interesting actually.

I was wondering if there is a way to:

  1. get a 3D representation of the landmarks, i.e. (x, y, z) coordinates for each landmark point;
  2. align the arrays of landmarks of different faces with a reference set of landmarks.

Ideally I would like all the 3D landmark coordinates aligned to the same reference, for example: all the landmark coordinates expressed as those of a frontal face where the proportions of the face are all similar.
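
A minimal alignment sketch (my own illustration, not part of the repo): take the (3, 68) landmark array returned by recon_vers, transpose it to (68, 3), and rigidly align it to a reference set with the Kabsch algorithm after removing translation and scale.

import numpy as np

def align_to_reference(landmarks, reference):
    # landmarks, reference: (N, 3) arrays of corresponding 3D landmark coordinates
    def normalize(pts):
        pts = pts - pts.mean(axis=0)         # remove translation
        return pts / np.linalg.norm(pts)     # remove scale
    a, b = normalize(landmarks), normalize(reference)
    u, _, vt = np.linalg.svd(a.T @ b)        # Kabsch: optimal rotation from the SVD
    d = np.sign(np.linalg.det(u @ vt))       # guard against reflections
    return a @ (u @ np.diag([1.0, 1.0, d]) @ vt)

# usage sketch: aligned = align_to_reference(ver_lst[1].T, ver_lst[0].T)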
