
Comments (5)

D-X-Y commented on August 26, 2024

Hi Roberto, thanks for trying our code. Would you mind letting me know how you obtained these numbers? What normalization distance did you use? The released model is trained following https://github.com/D-X-Y/SAN#300-w and should achieve performance similar to the results in the paper.


bobetocalo commented on August 26, 2024

Hi,

I have processed 300W using the trained model you provide, SAN_300W_GTB_itn_cpm_3_50_sigma4_128x128x8.

I use inter-pupil distance normalization, as in the literature. I attach some example images obtained with your model. Are these images correct? (The NME is shown at the bottom left of each image.)

(Attached example images: 3051542838_1, image_013, image_042, image_048, image_082, image_097_1.)

Are these images similar to the ones you obtained in your paper? I am unable to reach a 3.98 NME on the full set with the model I am using (https://github.com/D-X-Y/SAN#evaluation-on-the-single-image).

I look forward to your response.

Best regards,
Roberto Valle


D-X-Y commented on August 26, 2024

Hi, I just updated the README; you can follow https://github.com/D-X-Y/SAN#evaluate-on-300-w to evaluate the released model on 300-W. I think there may be two reasons for the worse results in your evaluation:

  1. We use the inter-ocular distance for normalization, which is larger than the inter-pupil distance and therefore yields a smaller NME (as mentioned in the second paragraph of Sec. 4.2 and in https://github.com/D-X-Y/SAN#normalization).
  2. The pre-trained model uses the ground-truth face bounding box from the official 300-W website. If you use a different type of face bounding box from the one used during training, you will get worse performance.

Note about the normalization distance: we follow "A deep regression architecture with two-stage re-initialization for high performance facial landmark detection" (CVPR 2017) and "300 faces in-the-wild challenge: The first facial landmark localization challenge" (ICCV-W 2013).
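
A minimal sketch of the two normalizing distances for the standard 68-point 300-W markup (the landmark indices below follow that annotation scheme and are assumptions for illustration; the released evaluation code may compute the distances slightly differently):

```python
import numpy as np

def nme(pred, gt, norm="inter-ocular"):
    """Mean point-to-point error divided by the chosen normalizing distance.

    `pred` and `gt` are (68, 2) arrays of 300-W landmark coordinates
    (0-based indices following the standard 68-point markup).
    """
    pred, gt = np.asarray(pred, dtype=float), np.asarray(gt, dtype=float)
    if norm == "inter-ocular":
        # Outer eye corners: points 37 and 46 (1-based) -> indices 36 and 45.
        d = np.linalg.norm(gt[36] - gt[45])
    elif norm == "inter-pupil":
        # Pupils are not annotated in 300-W, so approximate them with the
        # centers of the two eye contours (points 37-42 and 43-48).
        d = np.linalg.norm(gt[36:42].mean(axis=0) - gt[42:48].mean(axis=0))
    else:
        raise ValueError(f"unknown normalization: {norm}")
    return np.linalg.norm(pred - gt, axis=1).mean() / d
```

Because the inter-ocular distance is larger than the inter-pupil distance, the same predictions give a smaller NME under inter-ocular normalization.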


bobetocalo commented on August 26, 2024

I am using the ground truth face bounding box from the 300W annotations.

I have changed the normalization measure to the eye-corner (inter-ocular) distance. The results are not the same as those you report in the paper, but they are closer to the 3.98 mentioned before.

 > Total:  NME 4.308   AUC 48.612   FR 5.225
 > Helen:  NME 3.571   AUC 55.421   FR 0.909
 > LFPW:   NME 3.735   AUC 53.465   FR 0.893
 > Common: NME 3.637   AUC 54.626   FR 0.903
 > iBUG:   NME 7.062   AUC 23.856   FR 22.963
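
For reference, a minimal sketch of how such summary numbers are commonly computed from per-image NMEs, assuming the usual 0.08 cut-off for the AUC and the failure rate (the exact evaluation script used here may differ):

```python
import numpy as np

def auc_and_failure_rate(nmes, threshold=0.08, steps=1000):
    """AUC of the cumulative error distribution up to `threshold` and the
    failure rate, both in percent. `nmes` are per-image NMEs expressed as
    fractions (e.g. 0.043 for 4.3%)."""
    nmes = np.asarray(nmes, dtype=float)
    xs = np.linspace(0.0, threshold, steps)
    ced = [(nmes <= x).mean() for x in xs]       # cumulative error distribution
    auc = np.trapz(ced, xs) / threshold * 100.0  # area normalized by the cut-off
    fr = (nmes > threshold).mean() * 100.0       # share of images above the cut-off
    return auc, fr
```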

However, I encourage you to revise Table 1 in your paper, because the cited literature (SDM, ESR, LBF, CFSS) does not use eye-corner (inter-ocular) normalization either.

Finally, could you provide some example images from 300W with your predictions and the NME obtained?

Also, are you going to release the SAN model trained on AFLW?


D-X-Y commented on August 26, 2024

If you run bash scripts/300W/300W-EVAL.sh 0, you will get the NME results on the common set, challenge set (iBUG), and full set, i.e., 3.37, 6.82, and 4.04. These results are very similar to the ones reported in the paper. I guess the difference between your results and mine might be caused by some pre-processing step?

Thanks for your suggestion about modifying Table 1. Before we submitted our paper, we did not notice that SDM, ESR, LBF, and CFSS use inter-pupil normalization; we simply copied the numbers from "A deep regression architecture with two-stage re-initialization for high performance facial landmark detection". That is our mistake. Several months after the CVPR camera-ready deadline, we noticed this mistake but could not change that version. I have updated this information in our README and will clarify it in our following papers.

You can refer to Figure 8 in the paper for examples. I am approaching some deadlines and cannot provide the NME for specific examples right now.

For the SAN model trained on AFLW: if you want to reproduce the results, you can run the commands described at https://github.com/D-X-Y/SAN#aflw. I cannot find the trained model, but I have just re-run the code and should obtain models in several hours. After training finishes, I will share a Google Drive link.

