
Comments (3)

vitoralbiero commented on June 2, 2024

Hello,

Thank you for pointing this out!
Our model does indeed output rotation vectors, so the predictions in the notebooks should be converted to Euler angles before they are compared. I will update the evaluation notebooks to reflect this change.
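For readers following along, here is a minimal sketch of that conversion, not the code from the notebooks. It uses only the standard library, turns a rotation vector into a matrix with Rodrigues' formula, and then reads off Euler angles. The function names are hypothetical, and the decomposition R = Rx(a) · Ry(b) · Rz(c) is an assumption made for illustration; the convention actually needed for a given benchmark has to be checked separately.

```python
import math


def rotvec_to_matrix(rotvec):
    """Rodrigues' formula: rotation vector (axis * angle, radians) -> 3x3 matrix."""
    rx, ry, rz = rotvec
    theta = math.sqrt(rx * rx + ry * ry + rz * rz)
    if theta < 1e-12:
        # No rotation: return the identity.
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = rx / theta, ry / theta, rz / theta
    c, s = math.cos(theta), math.sin(theta)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]


def matrix_to_euler_xyz(R):
    """Angles (a, b, c) in radians with R = Rx(a) @ Ry(b) @ Rz(c).

    Assumed convention for this sketch; ill-conditioned near b = +/-90 degrees.
    """
    b = math.asin(max(-1.0, min(1.0, R[0][2])))
    a = math.atan2(-R[1][2], R[2][2])
    c = math.atan2(-R[0][1], R[0][0])
    return a, b, c


def rotvec_to_euler_xyz_degrees(rotvec):
    """Convenience wrapper: rotation vector in radians -> Euler angles in degrees."""
    return tuple(math.degrees(v) for v in matrix_to_euler_xyz(rotvec_to_matrix(rotvec)))
```

Comparing a raw rotation vector against Euler-angle ground truth component by component is only meaningful for rotations about a single axis, which is why the conversion has to happen before the errors are computed.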
Thanks!

from img2pose.

KarlKulator commented on June 2, 2024

Hi,

There is still the problem that the Euler angle convention and the coordinate system convention do not match the ones used in AFLW2000-3D.
This can lead to different errors for yaw, pitch, and roll, even when the same convention is used for both ground truth and estimate. The AFLW2000-3D convention is the one usually used, so your results are not comparable to the literature.

When looking at the person from the front as an observer, in the AFLW2000-3D coordinate system the X-axis points to the right, the Y-axis to the top, and the Z-axis towards the observer. In your coordinate system (OpenCV), the Y- and Z-axes are inverted.
Additionally, the Euler (Tait-Bryan) convention in AFLW2000-3D is 'XYZ' intrinsic rotations (with counterclockwise rotation).

So to convert your rotvec to the AFLW2000-3D Euler convention, you need to use the function you already provided in #17 (comment) (convert_to_aflw), but then use its return value directly and not transform it again. The errors will differ from your current ones.
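To illustrate only the axis inversion described above, here is a minimal sketch. It is not the convert_to_aflw function from #17, and all names in it are hypothetical. It assumes the two frames differ exactly by flipping Y and Z, so a rotation matrix R becomes F · R · F with F = diag(1, -1, -1), and a rotation vector simply has its Y and Z components negated. The angle order and signs stored in pose_para would still need to be checked against the dataset.

```python
import math

# Assumption for this sketch: the frames differ by inverting Y and Z, keeping X.
AXIS_SIGNS = (1.0, -1.0, -1.0)


def flip_yz_matrix(R):
    """Express rotation matrix R in the frame with Y and Z inverted (F @ R @ F)."""
    return [
        [AXIS_SIGNS[i] * R[i][j] * AXIS_SIGNS[j] for j in range(3)]
        for i in range(3)
    ]


def flip_yz_rotvec(rotvec):
    """Same change of frame applied directly to a rotation vector."""
    return [sign * value for sign, value in zip(AXIS_SIGNS, rotvec)]


def rot_y(angle):
    """Rotation by `angle` radians about the Y-axis (helper for the example)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


# A rotation about Y changes sign under the flip; a rotation about X would not.
flipped = flip_yz_matrix(rot_y(0.3))
expected = rot_y(-0.3)
```

This is why yaw and roll errors can change sign or magnitude between the two frames while pitch (about X) is unaffected by the flip itself.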

Ideally, you would additionally use the ground truth from AFLW2000-3D (pose_para) instead of your own ground truth.


vitoralbiero commented on June 2, 2024

Hello @KarlKulator,

Thank you for the suggestion.

As our model is not constrained to yaw angles in the range (-90, 90) degrees, the AFLW2000-3D convention fails to measure our errors accurately. One example is the image below, where the error is qualitatively small in the visualization and small in zxy rotation angles, but large in xyz rotation angles, because yaw is predicted slightly above 90 degrees.

[Image: example_xyz_zxy]
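The effect can be reproduced numerically with a small sketch. It assumes yaw is the rotation about the Y-axis and that the xyz angles come from the decomposition R = Rx(a) · Ry(b) · Rz(c); both are assumptions for illustration. Instead of the zxy angles mentioned above, it uses the geodesic angle between the two rotations as a convention-independent reference, which is a substitution and not the metric used in the notebooks. All function names are hypothetical.

```python
import math


def rot_y(deg):
    """Rotation by `deg` degrees about the Y-axis (assumed here to be yaw)."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def euler_xyz_degrees(R):
    """Angles (a, b, c) in degrees with R = Rx(a) @ Ry(b) @ Rz(c); b is limited to [-90, 90]."""
    b = math.asin(max(-1.0, min(1.0, R[0][2])))
    a = math.atan2(-R[1][2], R[2][2])
    c = math.atan2(-R[0][1], R[0][0])
    return [math.degrees(v) for v in (a, b, c)]


def wrapped_abs_diff(a, b):
    """Absolute angular difference in degrees, wrapped into [0, 180]."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


def geodesic_degrees(R1, R2):
    """Angle of the relative rotation R1^T @ R2, computed from its trace."""
    trace = sum(R1[i][j] * R2[i][j] for i in range(3) for j in range(3))
    return math.degrees(math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0))))


ground_truth = rot_y(85.0)  # yaw just below 90 degrees
prediction = rot_y(95.0)    # yaw just above 90 degrees

per_angle_errors = [
    wrapped_abs_diff(p, g)
    for p, g in zip(euler_xyz_degrees(prediction), euler_xyz_degrees(ground_truth))
]
geodesic_error = geodesic_degrees(ground_truth, prediction)
```

In this example the two poses are only 10 degrees apart as rotations, yet the xyz decomposition folds the predicted Y angle back into [-90, 90] and compensates with half-turns on the other two axes, so two of the three per-angle errors come out at about 180 degrees.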

Nevertheless, in response to your question, we released a model trained with constrained yaw poses and updated the evaluation notebooks to use the AFLW2000-3D ground-truth (pose_para) with its standard convention (xyz).

The updated version obtains state-of-the-art accuracy when measured in the AFLW2000-3D representation (in fact, pose estimation results improved in some cases). We will update the camera-ready version of our CVPR'21 paper and the arXiv version accordingly.

