Comments (3)

aleksandrkim61 commented on June 12, 2024

Hi, thank you for checking out our work!

While our IDs (identity switch) numbers for pedestrians aren't the best overall, I think it's important to make holistic comparisons: look at all the metrics and consider each method's strengths. For instance:

  • The only two methods with higher HOTA numbers (HOTA being the metric that takes the other tracking metrics into account, including IDs, and tries to produce an overall score) do not work on the Car class, which might mean that they employ techniques that do not generalize to other classes. I do not remember those methods well at this point, so perhaps they simply did not report their Car numbers for some other reason. EagerMOT, on the other hand, is the best for cars AND has good results for pedestrians. Some parts of the method could have been optimised for a specific class, but we chose a more general approach; also check out our results on nuScenes with 7 tracked classes.
  • CenterTrack is only slightly better for pedestrians but worse for cars by roughly the same amount, so I personally view the two methods as making the same tradeoff in opposite directions.
  • The other methods have lower overall HOTA despite lower IDs, which could be explained by lower recall: obviously it is easier to produce fewer IDs when tracking fewer objects. I do not remember whether this was the reason for their lower HOTA numbers, but it also shows that a single metric is not enough to compare methods. HOTA is not a perfect representation either, and we should all use metrics that are appropriate for the use case at hand. EagerMOT is very easy to understand, and each parameter and step of the logic can be modified to suit your exact needs and preferred tradeoffs, for example to get lower IDs at the cost of lower recall.

In our particular case, the association method is intentionally extremely simple, relying on a simple motion model in 3D and IoU in 2D. In both cases, precise bounding boxes are very important for correct matching, and those are a lot easier to get for cars than for most other classes: cars have a lot of training data and do not make rapid or agile movements. Pedestrians are the hardest class to track because they have small bounding boxes, are omnidirectional and do not necessarily accelerate smoothly, which is hard to take into account with simple motion models.
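To illustrate why small boxes make IoU-based matching fragile, here is a minimal sketch of 2D IoU for axis-aligned boxes. The function name `iou_2d`, the `(x1, y1, x2, y2)` box format and the pixel sizes are assumptions chosen for illustration; this is not taken from the EagerMOT code.

```python
def iou_2d(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap along each axis, clamped at zero when the boxes do not intersect
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Hypothetical box sizes: the same 5-pixel horizontal error in both cases
car_iou = iou_2d((0, 0, 100, 60), (5, 0, 105, 60))      # 100 x 60 px box
pedestrian_iou = iou_2d((0, 0, 20, 50), (5, 0, 25, 50))  # 20 x 50 px box
print(f"car: {car_iou:.2f}, pedestrian: {pedestrian_iou:.2f}")
```

With these made-up sizes the same localization error costs the narrow pedestrian box far more overlap than the car box, so any fixed IoU threshold is easier to miss for pedestrians.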

At the heart of it all is the fact that assumptions about how a car moves do not hold for pedestrians and vice versa, which is why a single type of motion model does not work equally well for everything.
You probably already knew a lot of this, but I started writing and decided to finish the full thought :)
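As a rough illustration of the motion-model point, here is a sketch that assumes a constant-velocity prediction. That is one common choice of simple motion model and is used here only as an assumption, not as a statement of what EagerMOT implements; the speeds and time step are made up.

```python
import math


def predict_constant_velocity(position, velocity, dt):
    """Predict the next position assuming the velocity stays unchanged."""
    return tuple(p + v * dt for p, v in zip(position, velocity))


# Hypothetical pedestrian: walking at 1.5 m/s along x, then turning 90 degrees
dt = 0.5  # seconds between frames, illustrative value
predicted = predict_constant_velocity((0.0, 0.0), (1.5, 0.0), dt)
actual = (0.0, 1.5 * dt)  # where the pedestrian ends up after the sudden turn
error = math.dist(predicted, actual)
print(f"predicted={predicted}, actual={actual}, error={error:.2f} m")
```

In this toy case the prediction error is about a metre, which is larger than a pedestrian's footprint, while a car can hardly change direction that abruptly between frames.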

In conclusion, while our IDs results are worse than we would like, this configuration allowed us to get better overall HOTA numbers. If IDs are the most important thing for you, you can tweak the parameters to reduce them. I am quite sure that if you get better pedestrian metrics, you will see a noticeable drop for cars. Feel free to revisit the section of the paper where motion modeling and the parameters are explained; you should be able to tell which ones to tweak if needed.

I hope this answers your questions.

from eagermot.

Zhangyongtao123 commented on June 12, 2024

Thank you very much for your detailed explanation!
Have you ever run any experiments on the Waymo dataset? I'm very curious about EagerMOT's tracking results there (the Waymo dataset has a higher frame rate).


aleksandrkim61 commented on June 12, 2024

Waymo was not used; we did not get to it in time for ICRA.

