Comments (7)

patrikhuber commented on August 26, 2024

Hi!

This library is not only about landmark detection - as the title says, it's a C++11 implementation of the supervised descent optimisation method, and it's quite generic. In fact, the examples/ directory contains first-run examples for approximating an arbitrary function, 3D pose estimation with a face model, and landmark detection. You are right that this code doesn't directly contain an implementation of the "Fitting 3D Morphable Models using Local Features" paper. However, this library is used as the foundation of the algorithm presented in the paper; it's the "heart" of the algorithm. So I would disagree that it's "misleading". However, we are happy if you want to cite our other paper, "Random Cascaded-Regression Copse for Robust Facial Landmark Detection", instead. I'm also open to suggestions if you still think it's misleading.

from superviseddescent.

mkutny commented on August 26, 2024

My understanding is that the 4dface logic is the following:

  • detect landmarks (using RCR in superviseddescent)
  • estimate pose given the landmarks (eos)
  • fit the shape (eos)

Now with regard to the references:

  • The 4dface page references "A Multiresolution ..." and "Fitting 3DMM ...". Both papers discuss pose estimation and shape fitting but do not cover landmark detection.
  • The eos page references "A Multiresolution ...", which is perfectly ok.
  • The superviseddescent page refers to "Supervised Descent Method and Its Applications to Face Alignment" (which is ok) and to "Fitting 3DMM ...".

As far as I can see, all the references miss landmark detection. And the problem with "Fitting 3DMM ..." is that it's written as though landmark detection was not needed for pose estimation/shape fitting (suggesting quite the opposite - that PE/SF could be used for LD).

Is my current understanding of the code's logic correct, or am I missing something?

patrikhuber commented on August 26, 2024

> My understanding is that the 4dface logic is the following: ...

Yes, that's correct!

> As I can see all the references miss landmark detection.

In the case of 4dface, this is kind of intentional. The landmark detection is not really what sets 4dface apart - the landmark detection is just a necessity. Actually, you can plug in any landmark detection, for example a commercial one, or dlib, and it might even run better. However, we provide it with our RCR model (which is quite good too!), to offer a "complete package".
That's true, we could add "Random Cascaded-Regression Copse for Robust Facial Landmark Detection" to the references list (also on the superviseddescent page). I'll think about that - it's our paper too anyway ;-)
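
To illustrate the "plug in any landmark detection" point, here is a minimal sketch of the three-step pipeline with the detector passed in as a parameter. All names (`run_pipeline`, the `dummy_*` stages) are hypothetical stand-ins for illustration, not the 4dface, eos or superviseddescent API, and the stages do toy arithmetic rather than real fitting.

```python
def run_pipeline(image, detect_landmarks, estimate_pose, fit_shape):
    """Run detection -> pose estimation -> shape fitting with pluggable stages."""
    landmarks = detect_landmarks(image)   # step 1: any detector can be used here
    pose = estimate_pose(landmarks)       # step 2: pose from the landmarks
    shape = fit_shape(landmarks, pose)    # step 3: shape from landmarks and pose
    return {"landmarks": landmarks, "pose": pose, "shape": shape}


# Hypothetical stand-in stages, only here to make the sketch runnable.
def dummy_detector(image):
    # Pretend these are two detected 2D landmarks.
    return [(10.0, 20.0), (30.0, 20.0)]


def dummy_pose(landmarks):
    # Use the landmark centroid as a toy "pose".
    n = len(landmarks)
    return (sum(x for x, _ in landmarks) / n, sum(y for _, y in landmarks) / n)


def dummy_shape(landmarks, pose):
    # Express the landmarks relative to the toy pose.
    cx, cy = pose
    return [(x - cx, y - cy) for x, y in landmarks]


result = run_pipeline(None, dummy_detector, dummy_pose, dummy_shape)
```

Swapping the detector only means passing a different callable as `detect_landmarks`; the later stages stay unchanged as long as the landmark format is the same.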

> And the problem with "Fitting 3DMM ..." is that it's written in a way as though landmark detection was not needed for pose estimation/shape fitting (suggesting quite the opposite - that PE/SF could be used for LD).

Aah! I think here you might indeed be misunderstanding that algorithm. In "Fitting 3DMM ...", we directly estimate the pose and shape parameters from local features (initialised with a face box). No landmarks are needed. In fact, of course, trivially, once we have obtained the pose and shape parameters through the regressors, we can render the model and project whichever landmarks we want.
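
As a rough illustration of that last point, the sketch below builds a shape instance from shape coefficients and projects selected vertices to 2D. It assumes a toy linear shape model (three vertices, one basis vector, made-up values), a rotation about the y-axis only, and a scaled orthographic camera. None of this is taken from the paper or from eos; all names are hypothetical.

```python
import math

# Toy linear shape model: 3 vertices and 1 shape basis vector (made-up values).
MEAN_SHAPE = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.5)]
SHAPE_BASIS = [
    [(-0.1, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.2, 0.0)],
]


def shape_instance(coeffs):
    """Return the 3D vertices: mean shape plus a weighted sum of basis vectors."""
    vertices = []
    for i, mean_v in enumerate(MEAN_SHAPE):
        v = list(mean_v)
        for c, basis in zip(coeffs, SHAPE_BASIS):
            for axis in range(3):
                v[axis] += c * basis[i][axis]
        vertices.append(tuple(v))
    return vertices


def project(vertices, yaw, scale, tx, ty):
    """Rotate about the y-axis by yaw (radians), then project scaled-orthographically."""
    c, s = math.cos(yaw), math.sin(yaw)
    points = []
    for x, y, z in vertices:
        x_rot = c * x + s * z  # y is unchanged by a rotation about the y-axis
        points.append((scale * x_rot + tx, scale * y + ty))
    return points


def landmarks_from_parameters(coeffs, yaw, scale, tx, ty, landmark_ids):
    """Project only the vertices chosen as landmarks."""
    projected = project(shape_instance(coeffs), yaw, scale, tx, ty)
    return [projected[i] for i in landmark_ids]
```

Here `landmark_ids` picks which model vertices count as landmarks, so any set of landmarks can be read off once the parameters are known.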

However, as you noticed, this algorithm is not directly used in 4dface (or eos) - rather, and that's what I mentioned earlier, eos and superviseddescent are used as the foundation of the "Fitting 3DMM ..." algorithm, which is why I think the reference is appropriate. I agree with you that for 4dface, we should probably put a better-fitting reference, specific to 4dface - but we don't have that yet, so the other papers are kind of the "best fit", to give us credit for our work (which I think is again appropriate).

Feel free to ask if you have further questions or concerns.

mkutny commented on August 26, 2024

The first time I read "Fitting 3DMM ..." I was sure it described an algorithm to fit the face w/o landmarks. As SD was used in 4dface, I expected 4dface to be a markerless implementation. It turned out not to be. So I inferred that "Fitting 3DMM ..." was misleading.

Finally it cleared up: SD had been prepared for "Fitting 3DMM ..." but another implementation (4dface) was based on SD instead.

Now I wonder why 4dface doesn't use SD as intended in "Fitting 3DMM"? Is it just a step towards a "Fitting 3DMM" implementation?

patrikhuber commented on August 26, 2024

> Finally it cleared up

Great! :-) So I think your biggest misconception was that SD is a particular algorithm of only one paper, when in fact it's an optimisation algorithm (like gradient descent) that can be used for many tasks - depending on how you set them up, for "traditional" landmark detection, or for "markerless" face fitting. (and for much more)
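
To make the "generic optimisation algorithm" point concrete, here is a minimal sketch in the spirit of supervised descent for a scalar problem: given y = h(x), recover x by applying a cascade of learned linear updates x <- x + a * (h(x) - y) + b. It is plain Python for illustration, not the superviseddescent C++ API; the function names and the choice of a scalar least-squares fit per level are assumptions.

```python
import math


def fit_line(features, targets):
    """Scalar least squares: find a, b so that targets ~ a * features + b."""
    n = len(features)
    mean_f = sum(features) / n
    mean_t = sum(targets) / n
    var = sum((f - mean_f) ** 2 for f in features)
    cov = sum((f - mean_f) * (t - mean_t) for f, t in zip(features, targets))
    a = cov / var if var > 0 else 0.0
    b = mean_t - a * mean_f
    return a, b


def train_cascade(h, xs_true, x0, levels):
    """Learn one (a, b) regressor per level from known ground-truth parameters."""
    ys = [h(x) for x in xs_true]      # observations for each training sample
    xs = [x0] * len(xs_true)          # every sample starts at the same initial guess
    cascade = []
    for _ in range(levels):
        feats = [h(x) - y for x, y in zip(xs, ys)]       # residual in observation space
        deltas = [xt - x for xt, x in zip(xs_true, xs)]  # desired parameter update
        a, b = fit_line(feats, deltas)
        cascade.append((a, b))
        xs = [x + a * f + b for x, f in zip(xs, feats)]  # apply the learned update
    return cascade


def predict(h, cascade, y, x0):
    """Apply the learned updates in order, starting from the initial guess."""
    x = x0
    for a, b in cascade:
        x = x + a * (h(x) - y) + b
    return x


if __name__ == "__main__":
    train_xs = [-1 + 0.05 * i for i in range(41)]
    learned = train_cascade(math.sin, train_xs, x0=0.0, levels=3)
    print(predict(math.sin, learned, math.sin(0.3), x0=0.0))
```

Nothing in the cascade is specific to faces: swapping `h` and the parameter being estimated changes the task, which is the sense in which the same method covers landmark detection, pose estimation or direct model fitting. Training does require the ground-truth parameters (`xs_true`) for every sample.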

> SD had been prepared for "Fitting 3DMM ..." but another implementation (4dface) was based on SD instead.

Well, more or less - the superviseddescent library was developed initially for "traditional" landmark detection, but yes, we always had an algorithm like in "Fitting 3DMM ..." in mind, so that's why we developed the framework in a generic way. But this point is not really important.

> Now I wonder why 4dface doesn't use SD as intended in "Fitting 3DMM"?

One "problem" with the "direct" estimate in "Fitting 3DMM..." is that you need ground-truth camera and shape information, which you don't really have, or at least not a lot of, and not for "realistic" in-the-wild databases like ibug/LFPW/Helen etc. There are some ways around it and some interesting stuff I'd like to try, but it's rather low priority and I never got to it. If you're interested in that, you can have a look at Stan Li's paper, who published more or less the same idea at a similar time.
In any case, for the purpose of 4dface, it was much easier to just train a "traditional" landmark detection on 3000 ibug images with superviseddescent (like the RCR), and be done with it. It has the additional advantage that you can plug in any landmark detection if you have a better one (for example a commercial one).

mkutny commented on August 26, 2024

Thanks for the reference, I'll definitely look at the paper!

patrikhuber commented on August 26, 2024

I'll close this issue. Feel free to re-open or open a new one if you have further questions.
