
Comments (16)

mrharicot commented on July 29, 2024

Hi,

To convert disparities to depth, use the usual formula: depth = baseline * focal / disparity. We assume both the baseline and the focal length are known.
The disparities generated by the model are normalized by the image width; just scale them by the width to get values in pixels.
You can find the focal length values for KITTI here; its baseline is ~0.54 m.
For Cityscapes, the baseline is 0.22 m and the focal length is 2262 pixels for a width of 2048.

Please remember that the disparities generated by the model are normalized to the image width, you thus need to scale them back before converting to depth.
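A minimal sketch of the conversion described above, using the KITTI numbers from this thread (the 0.05 disparity value is just an illustrative input, not from the dataset):

```python
import numpy as np

def disp_to_depth(norm_disp, image_width, focal_px, baseline_m):
    """Convert a disparity map normalized by image width to metric depth."""
    disp_px = norm_disp * image_width       # undo the width normalization
    return baseline_m * focal_px / disp_px  # depth = baseline * focal / disparity

# KITTI example: width 1242 px, focal 721.5377 px, baseline ~0.54 m
depth = disp_to_depth(np.array([0.05]), 1242, 721.5377, 0.54)
```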

from monodepth.

SAmmarAbbas commented on July 29, 2024

@mrharicot I think you have written the wrong formula in your first comment. Shouldn't it be depth = baseline * focal / disparity?


mrharicot commented on July 29, 2024

You need to use the baseline from the dataset it was last trained on, in this case KITTI.


mrharicot commented on July 29, 2024

You need to use the one from the dataset used for training.
However, if you test on images captured with a different focal length and aspect ratio, the model might not produce reliable results.


harishannavajjala commented on July 29, 2024

Oh... what if there is no second camera? Say I used only a single-camera setup. From my understanding, this model works on single mono-camera images at test time. I emailed you the image I used and the disparity map I got. Thanks!


mrharicot commented on July 29, 2024

You need to use the baseline of the dataset you trained on!


villanuevab commented on July 29, 2024

@mrharicot thank you for your answer. What about for models fine-tuned on multiple datasets? For example, if we used the pre-trained model model_city2kitti, which baseline would we use for inference, KITTI's or Cityscapes'?


villanuevab commented on July 29, 2024

Thank you. I forgot to ask: would the focal length to use also be the one used during training? Or during inference?


villanuevab commented on July 29, 2024

Regarding the width_to_focal dict you pointed to, how do you calculate the key:value pairs from a focal length? For example, from an 8mm focal length camera used during training?


mrharicot commented on July 29, 2024

Those parameters were obtained from the intrinsic matrices provided with the KITTI dataset.
To obtain your camera's focal length in pixels, you should calibrate it using a checkerboard; there are lots of tutorials on how to do this with OpenCV.
If you want to read more about calibration, take a look at chapter 14 of this book:
http://web4.cs.ucl.ac.uk/staff/s.prince/book/book.pdf
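As a rough alternative to full calibration, the pinhole model gives a back-of-envelope conversion from a lens focal length in mm to pixels, provided the sensor width is known. A sketch (the 6.4 mm sensor width below is a made-up example, not a real spec):

```python
def focal_mm_to_px(focal_mm, sensor_width_mm, image_width_px):
    # Pinhole model: f_px = f_mm * image_width_px / sensor_width_mm
    return focal_mm * image_width_px / sensor_width_mm

# e.g. an 8 mm lens on a hypothetical 6.4 mm-wide sensor at 1242 px width
f_px = focal_mm_to_px(8.0, 6.4, 1242)
```

Calibration with a checkerboard remains the reliable way to get this number, since it also accounts for distortion.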


keishatsai commented on July 29, 2024

I am also struggling with this issue. I have tried using the focal length and baseline of the training set while doing inference on a different dataset. The depth computed with the training set's focal length and baseline was very large, over 1000 meters (when the true distance is about 2 m); however, I got reasonable results using the test set's focal length and baseline. Is there anything wrong with my implementation?


mrharicot commented on July 29, 2024

@SAmmarAbbas Good catch!! I can't believe no one saw this before! :)


SAmmarAbbas commented on July 29, 2024

Also, there are four different focal lengths given for the KITTI dataset; can I use any of them? And if I use the first one, do I scale it by the width of my image, e.g. image_width * 721.5377 / 1242?

width_to_focal[1242] = 721.5377
width_to_focal[1241] = 718.856
width_to_focal[1224] = 707.0493
width_to_focal[1238] = 718.3351
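One way to read that table in code, with a proportional rescaling fallback when the image width is not one of the known KITTI widths (the rescaling rule is standard pinhole behavior and my own addition, not something from the repo):

```python
# KITTI focal lengths (pixels) keyed by image width, as listed above
width_to_focal = {1242: 721.5377, 1241: 718.856, 1224: 707.0493, 1238: 718.3351}

def focal_for_width(width, ref_width=1242):
    """Exact value for a known KITTI width; otherwise scale the reference focal."""
    if width in width_to_focal:
        return width_to_focal[width]
    # Focal length in pixels scales linearly with image width under resizing
    return width_to_focal[ref_width] * width / ref_width
```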


mrharicot commented on July 29, 2024

Is your test image from KITTI?
As has been discussed here before, there is no obvious way to convert the disparity of an image whose focal length and aspect ratio differ from KITTI's.


ShubhamNagarkar commented on July 29, 2024

I used a single camera to create a depth image. Now I want to calculate the approximate distance of a certain pixel. In my case, since there is no baseline, what approach should I use? I want to find distances of objects in the range of 5-10 meters only. Also, I am using inputs from my phone camera. Please help.


salmankh47 commented on July 29, 2024

How can I get ground truth dense disparity/depth map?

[image: sparse_depth]

This is what I am getting immediately after viewing gt_depth from the following code:

gt_depth = width_to_focal[width] * 0.54 / (gt_disp + (1.0-gt_mask))
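A fuller version of that one-liner, with the mask used to zero out pixels that have no ground truth (the final zero-out step is my addition; the `(1 - mask)` term in the denominator just prevents division by zero at invalid pixels):

```python
import numpy as np

def gt_depth_from_disp(gt_disp, gt_mask, focal_px, baseline_m=0.54):
    # gt_mask is 1.0 where ground-truth disparity is valid, 0.0 elsewhere;
    # adding (1 - mask) keeps the denominator nonzero at invalid pixels
    depth = focal_px * baseline_m / (gt_disp + (1.0 - gt_mask))
    return depth * gt_mask  # invalid pixels set to 0

# toy example: one valid pixel, one invalid pixel
gt_disp = np.array([[62.1, 0.0]])
gt_mask = np.array([[1.0, 0.0]])
depth = gt_depth_from_disp(gt_disp, gt_mask, 721.5377)
```

Note that KITTI's LiDAR ground truth is sparse by nature, which is why the visualized map looks like scattered points rather than a dense image.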

