
Comments (16)

mrharicot commented on July 29, 2024

Hi,

To convert disparities to depth, use the usual formula: depth = baseline * focal / disparity. We assume both the baseline and the focal length are known.
The disparities generated by the model are normalized by the image width; just scale them by the width to get values in pixels.
You can find the focal length values for KITTI here; its baseline is ~0.54 m.
For Cityscapes, the baseline is 0.22 m and the focal length is 2262 pixels for a width of 2048.

Please remember that the disparities generated by the model are normalized to the image width, you thus need to scale them back before converting to depth.
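A minimal sketch of the conversion described above, using the KITTI numbers from this thread (the 0.05 disparity value is just an illustrative input, not from the dataset):

```python
import numpy as np

def disp_to_depth(norm_disp, image_width, focal_px, baseline_m):
    """Convert a disparity map normalized by image width to metric depth."""
    disp_px = norm_disp * image_width       # undo the width normalization
    return baseline_m * focal_px / disp_px  # depth = baseline * focal / disparity

# KITTI example: width 1242 px, focal 721.5377 px, baseline ~0.54 m
depth = disp_to_depth(np.array([0.05]), 1242, 721.5377, 0.54)
```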

from monodepth.

SAmmarAbbas commented on July 29, 2024

@mrharicot I think you have written the wrong formula in your first comment. Shouldn't it be depth = baseline * focal / disparity?


mrharicot commented on July 29, 2024

You need to use the baseline from the dataset it was last trained on, in this case KITTI.


mrharicot commented on July 29, 2024

You need to use the one from the dataset used for training.
However, if you test on images captured with a different focal length and aspect ratio, the model might not produce reliable results.


harishannavajjala commented on July 29, 2024

Oh... what if there is no second camera? Say I used only a single-camera setup. From my understanding, this model works on single mono-camera images at test time. I emailed you the image I used and the disparity map I got. Thanks!


mrharicot commented on July 29, 2024

You need to use the baseline of the dataset you trained on!


villanuevab commented on July 29, 2024

@mrharicot thank you for your answer. What about for models fine-tuned on multiple datasets? For example, if we used the pre-trained model model_city2kitti, which baseline would we use for inference, KITTI's or Cityscapes'?


villanuevab commented on July 29, 2024

Thank you. I forgot to ask: would the focal length to use also be the one used during training? Or during inference?


villanuevab commented on July 29, 2024

Regarding the width_to_focal dict you pointed to, how do you calculate the key:value pairs from a focal length? For example, from an 8mm focal length camera used during training?


mrharicot commented on July 29, 2024

Those parameters were obtained from the intrinsic matrices provided with the KITTI dataset.
To obtain your camera's focal length in pixels, you should calibrate it using a checkerboard; there are lots of tutorials on how to do this with OpenCV.
If you want to read more about calibration, take a look at chapter 14 of this book:
http://web4.cs.ucl.ac.uk/staff/s.prince/book/book.pdf
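As a rough alternative to full calibration, the pinhole model gives a back-of-envelope conversion from a lens focal length in mm to pixels, provided the sensor width is known. A sketch (the 6.4 mm sensor width below is a made-up example, not a real spec):

```python
def focal_mm_to_px(focal_mm, sensor_width_mm, image_width_px):
    # Pinhole model: f_px = f_mm * image_width_px / sensor_width_mm
    return focal_mm * image_width_px / sensor_width_mm

# e.g. an 8 mm lens on a hypothetical 6.4 mm-wide sensor at 1242 px width
f_px = focal_mm_to_px(8.0, 6.4, 1242)
```

Calibration with a checkerboard remains the reliable way to get this number, since it also accounts for distortion.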


keishatsai commented on July 29, 2024

I am also struggling with this issue. I have tried using the focal length and baseline of the training set while doing inference on a different dataset. The depth computed with the training set's focal length and baseline was very large, over 1000 meters (when the true distance is about 2 m); however, I got reasonable results using the test set's focal length and baseline. Is there anything wrong with my implementation?


mrharicot commented on July 29, 2024

@SAmmarAbbas Good catch!! I can't believe no one saw this before! :)


SAmmarAbbas commented on July 29, 2024

Also, there are four different focal lengths given for the KITTI dataset; can I use any of them? And if I use the first one, do I scale it by the width of my image, e.g. image_width * 721.5377 / 1242?

width_to_focal[1242] = 721.5377
width_to_focal[1241] = 718.856
width_to_focal[1224] = 707.0493
width_to_focal[1238] = 718.3351
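One way to read that table in code, with a proportional rescaling fallback when the image width is not one of the known KITTI widths (the rescaling rule is standard pinhole behavior and my own addition, not something from the repo):

```python
# KITTI focal lengths (pixels) keyed by image width, as listed above
width_to_focal = {1242: 721.5377, 1241: 718.856, 1224: 707.0493, 1238: 718.3351}

def focal_for_width(width, ref_width=1242):
    """Exact value for a known KITTI width; otherwise scale the reference focal."""
    if width in width_to_focal:
        return width_to_focal[width]
    # Focal length in pixels scales linearly with image width under resizing
    return width_to_focal[ref_width] * width / ref_width
```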


mrharicot commented on July 29, 2024

Is your test image from KITTI?
As has been discussed here before, there is no obvious way to convert the disparity of an image whose focal length and aspect ratio differ from KITTI's.


ShubhamNagarkar commented on July 29, 2024

I used a single camera to create a depth image. Now I want to calculate the approximate distance of a certain pixel. In my case, since there is no baseline, what approach should I use? I want to find distances of objects in the range of 5-10 meters only. Also, I am using inputs from my phone camera. Please help.


salmankh47 commented on July 29, 2024

How can I get ground truth dense disparity/depth map?

[image: sparse_depth]

This is what I am getting immediately after viewing gt_depth from the following code:

gt_depth = width_to_focal[width] * 0.54 / (gt_disp + (1.0-gt_mask))
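A fuller version of that one-liner, with the mask used to zero out pixels that have no ground truth (the final zero-out step is my addition; the `(1 - mask)` term in the denominator just prevents division by zero at invalid pixels):

```python
import numpy as np

def gt_depth_from_disp(gt_disp, gt_mask, focal_px, baseline_m=0.54):
    # gt_mask is 1.0 where ground-truth disparity is valid, 0.0 elsewhere;
    # adding (1 - mask) keeps the denominator nonzero at invalid pixels
    depth = focal_px * baseline_m / (gt_disp + (1.0 - gt_mask))
    return depth * gt_mask  # invalid pixels set to 0

# toy example: one valid pixel, one invalid pixel
gt_disp = np.array([[62.1, 0.0]])
gt_mask = np.array([[1.0, 0.0]])
depth = gt_depth_from_disp(gt_disp, gt_mask, 721.5377)
```

Note that KITTI's LiDAR ground truth is sparse by nature, which is why the visualized map looks like scattered points rather than a dense image.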

