Comments (11)

uricamic commented on June 1, 2024

Hi @mousomer,

the input image is rescaled internally to a fixed resolution, the so-called normalized frame. If the face is smaller than this (you can check the precise size in the .xml models, where the <bw_width> and <bw_height> tags define it), the detection will be "more precise"; when the face is bigger, a systematic error is introduced by scaling the image down to this fixed resolution.
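
To check a model's normalized frame size, the two tags can be read straight from the model file. The following is a minimal Python sketch, assuming the values are stored as plain numbers inside `<bw_width>` and `<bw_height>`; the function name and the example fragment are hypothetical, and a real model file contains much more.

```python
import xml.etree.ElementTree as ET


def normalized_frame_size(xml_text):
    """Return (width, height) from the first <bw_width>/<bw_height> tags found."""
    root = ET.fromstring(xml_text)
    width = root.find(".//bw_width")
    height = root.find(".//bw_height")
    if width is None or height is None:
        raise ValueError("model XML has no <bw_width>/<bw_height> tags")
    # float() tolerates surrounding whitespace and values written as "40." or "40.0"
    return int(float(width.text)), int(float(height.text))


# Hypothetical fragment for illustration only, not a real model file.
example = "<model><bw_width>40</bw_width><bw_height>40</bw_height></model>"
print(normalized_frame_size(example))
```

For a file on disk, the text can be loaded first with `open(path).read()` and passed to the same function.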

In short, if the images are bigger, retraining with a bigger normalized frame would increase the precision. However, the bigger the normalized frame, the slower the detection (and therefore also the training of the model) would be.

If you have any further questions, please do not hesitate to ask either here or by email.

from clandmark.

mousomer commented on June 1, 2024

I see, thanks. From reviewing the code I had the impression that the NormalizedFrame has a constant size (per model type). Was I wrong?

uricamic commented on June 1, 2024

Hi @mousomer,

yes, it is a constant size per model type. But the input image is always rescaled to this size. So it can detect landmarks on faces of "arbitrary" size; however, the detection precision is influenced by, among other things, the normalized frame size.

mousomer commented on June 1, 2024

So there is an optimal face size per model?

uricamic commented on June 1, 2024

Yep, we could call faces that are the same size as the model's normalized frame (or smaller) optimal, because there is no precision loss due to downscaling.

However, it is definitely not necessary to have very large normalized frames. Look, for example, at the results of CLandmark in the 300-W and 300-VW challenges, where the face size per example was very big. Our solution, C2F-DPM, used a normalized frame of 80 x 80 px for the coarse detector and 160 x 160 px for the fine one.
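
As a rough illustration of the downscaling loss, the ratio between the face size and the normalized frame size tells how many input pixels collapse into one normalized-frame pixel. This sketch is only arithmetic under that assumption, not a description of CLandmark's internals; `downscale_factor` is a hypothetical helper, the 400 px face is a made-up example, and 80 px and 160 px are the frame sizes quoted above.

```python
def downscale_factor(face_size_px, frame_size_px):
    """Input pixels per normalized-frame pixel; 1.0 means no downscaling."""
    if face_size_px <= 0 or frame_size_px <= 0:
        raise ValueError("sizes must be positive")
    # Faces at or below the frame size are not downscaled, so the factor is clamped to 1.0.
    return max(1.0, face_size_px / frame_size_px)


# A hypothetical 400 px face against the coarse (80 px) and fine (160 px) frames.
for frame in (80, 160):
    print(frame, downscale_factor(400, frame))
```

A factor of 1.0 corresponds to the "optimal" case described above; larger factors indicate how coarse a one-pixel step in the normalized frame becomes when mapped back to the input image.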

mousomer commented on June 1, 2024

Thanks.
Well, the problem I'm having is with the joint MV models (profiles and half-profiles). Suppose the vertical eyes-to-mouth distance is 100 pixels. What box should I pass to detect_optimized?

uricamic commented on June 1, 2024

I would first go with the detected face size and check the results; only if they were not satisfactory would I start thinking about re-training the model.

The learning scripts for the jointmv model are very time-consuming. I have some unpublished improvements which reduce the training time from 2 weeks to 2 days for the current model, but it will take some time before they are published. Both variants are also quite heavy on memory (around 20 GB of RAM is needed).

mousomer commented on June 1, 2024

Ah, but I'm trying to work with third-party detectors. I guess I could run the OpenCV cascade first and gather statistics from there.
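
Gathering such statistics could look roughly like the sketch below. It assumes the detector returns boxes as `(x, y, width, height)` tuples, which is the rectangle layout OpenCV's `CascadeClassifier.detectMultiScale` produces; `summarize_boxes`, the sample boxes, and the 80 px frame size are hypothetical placeholders.

```python
from statistics import median


def summarize_boxes(boxes, frame_size_px):
    """Summarize detector boxes given as (x, y, width, height) tuples."""
    # Use the longer side of each box as its face size.
    sizes = [max(w, h) for (_x, _y, w, h) in boxes]
    if not sizes:
        raise ValueError("no boxes to summarize")
    larger = sum(1 for s in sizes if s > frame_size_px)
    return {
        "min": min(sizes),
        "median": median(sizes),
        "max": max(sizes),
        # Share of faces that would be downscaled to fit the normalized frame.
        "fraction_larger_than_frame": larger / len(sizes),
    }


# Made-up boxes standing in for real detector output.
boxes = [(10, 20, 64, 64), (0, 0, 120, 118), (5, 5, 200, 210), (30, 40, 90, 90)]
stats = summarize_boxes(boxes, frame_size_px=80)
print(stats)
```

If most faces turn out to be much larger than the model's normalized frame, that would be a hint that the downscaling loss discussed earlier in the thread matters for the data at hand.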

uricamic commented on June 1, 2024

Yeah, I haven't tried OpenCV cascades for profiles myself yet, but it should surely be possible.

mousomer commented on June 1, 2024

That's not what you're using for [pre-model] detection? (I was assuming that's the right thing to do because that's what you use in the static_input.cpp example.)

uricamic commented on June 1, 2024

Nope, I was using a commercial face detector (http://www.eyedea.cz/) for the development of the landmark detector. It provides square face boxes for faces at arbitrary yaw angles.
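
When a third-party detector returns non-square boxes, one possible way to approximate square boxes is to expand each box around its center. This is only a sketch of that idea: `square_box` is a hypothetical helper, and whether the result matches the boxes the models were trained on is an assumption the thread does not confirm.

```python
def square_box(x, y, w, h):
    """Expand an (x, y, width, height) box to a square with the same center."""
    side = max(w, h)
    cx, cy = x + w / 2.0, y + h / 2.0
    # The result may extend past the image border; no clipping is done here.
    return (cx - side / 2.0, cy - side / 2.0, side, side)


# A made-up tall box, as a profile detector might return.
print(square_box(100, 50, 60, 100))
```

Comparing landmark results with and without this adjustment on a few images would show whether it helps for a given detector.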
