
diode-devkit's Introduction

DIODE: A Dense Indoor and Outdoor DEpth Dataset

DIODE (Dense Indoor/Outdoor DEpth) is a dataset that contains diverse high-resolution color images with accurate, dense, and far-range depth measurements. DIODE is the first public dataset to include RGBD images of indoor and outdoor scenes obtained with one sensor suite.

Refer to our homepage, dataset sample gallery and technical report for more details.

Dataset Download

We have released the train and validation splits of DIODE depth and DIODE normal, including RGB images, depth maps, depth validity masks and surface normal maps. Test set is coming soon.

Download links:

  1. DIODE Depth (RGB images, depth maps, and depth validity masks):

     | Partition          | Amazon Web Services | Baidu Cloud Storage | MD5 Hash                         |
     |--------------------|---------------------|---------------------|----------------------------------|
     | Train (81GB)       | train.tar.gz        | train.tar.gz        | 3a94632398fe1d002d89f11743f748b1 |
     | Validation (2.6GB) | val.tar.gz          | val.tar.gz          | 5c895d09201b88973c8fe4552a67dd85 |

  2. DIODE Normal (normal maps only):

     | Partition          | Amazon Web Services  | Baidu Cloud Storage  | MD5 Hash                         |
     |--------------------|----------------------|----------------------|----------------------------------|
     | Train (126GB)      | train_normals.tar.gz | train_normals.tar.gz | 9c0617ebe1eaf1928fdf3344e1c42aef |
     | Validation (4.6GB) | val_normals.tar.gz   | val_normals.tar.gz   | 323ccaf60abebdb59705dcd8b529d709 |
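
After downloading, the archives can be checked against the MD5 hashes above, for example with a small Python snippet (a sketch; any md5sum tool works equally well):

    import hashlib

    def md5sum(path, chunk_size=1 << 20):
        """Compute the MD5 hex digest of a large file in chunks."""
        h = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    # example: verify the validation archive against the hash in the table above
    assert md5sum("val.tar.gz") == "5c895d09201b88973c8fe4552a67dd85"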

Dataset Layout

DIODE data is organized hierarchically. The detailed directory structure is illustrated in the layout diagram accompanying the repository.

File Naming Conventions and Formats

The dataset consists of RGB images, depth maps, depth validity masks and surface normal maps. Their formats are as follows:

RGB images (*.png): RGB images with a resolution of 1024 × 768.

Depth maps (*_depth.npy): Depth ground truth with the same resolution as the images.

Depth validity masks (*_depth_mask.npy): Binary depth validity masks where 1 indicates valid sensor returns and 0 otherwise.

Surface normal maps (*_normal.npy): Surface normal vector ground truth with the same resolution as the images. Invalid normals are represented as (0,0,0).
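
These files can be read with standard tools. Below is a minimal sketch (NumPy and Pillow assumed; the file prefix is a placeholder, and the shape comments follow the 1024 × 768 resolution stated above):

    import numpy as np
    from PIL import Image

    prefix = "path/to/some_sample"  # placeholder: directory + file-name prefix of one sample

    rgb = np.array(Image.open(prefix + ".png"))       # (768, 1024, 3) uint8
    depth = np.load(prefix + "_depth.npy").squeeze()  # (768, 1024) float depth
    mask = np.load(prefix + "_depth_mask.npy")        # (768, 1024), 1 = valid sensor return
    normals = np.load(prefix + "_normal.npy")         # (768, 1024, 3), (0, 0, 0) = invalid

    valid_depth = depth[mask > 0]                     # keep only pixels with valid returns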

Devkit

This development toolkit contains:

  1. A JSON file that enumerates the data in DIODE. The layout of this file is explained in diode.py. It serves as the single point of reference during data loading.
  2. A sample PyTorch data loading module (a minimal standalone sketch is shown after this list).
  3. A Jupyter notebook demo showcasing data loading, metadata querying, and depth and normal visualization.
  4. A text file documenting the camera intrinsics.
  5. A Python file for computing metrics using NumPy.
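
Independent of the devkit's own diode.py, a minimal standalone PyTorch Dataset over such a file list might look like the sketch below (the class name, constructor arguments, and file-list handling here are illustrative assumptions, not the devkit's actual API):

    import os.path as osp

    import numpy as np
    from PIL import Image
    from torch.utils.data import Dataset

    class MiniDIODE(Dataset):
        """Illustrative loader: file_list holds per-sample path prefixes, e.g. as
        enumerated by the devkit's meta JSON (exact layout documented in diode.py)."""

        def __init__(self, data_root, file_list):
            self.data_root = data_root
            self.file_list = file_list

        def __len__(self):
            return len(self.file_list)

        def __getitem__(self, idx):
            prefix = osp.join(self.data_root, self.file_list[idx])
            rgb = np.array(Image.open(prefix + ".png"))
            depth = np.load(prefix + "_depth.npy").squeeze()
            mask = np.load(prefix + "_depth_mask.npy")
            return rgb, depth, mask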

Citation

@article{diode_dataset,
    title={{DIODE}: {A} {D}ense {I}ndoor and {O}utdoor {DE}pth {D}ataset},
    author={Igor Vasiljevic and Nick Kolkin and Shanyi Zhang and Ruotian Luo and
            Haochen Wang and Falcon Z. Dai and Andrea F. Daniele and Mohammadreza Mostajabi and
            Steven Basart and Matthew R. Walter and Gregory Shakhnarovich},
    journal={CoRR},
    volume={abs/1908.00463},
    year={2019},
    url={http://arxiv.org/abs/1908.00463}
}

Contact

If you have any questions, please contact us at [email protected].

diode-devkit's People

Contributors

ivasiljevic, w-hc

diode-devkit's Issues

Could you provide the scale factor?

Hello, could you please provide the scale factor of the depth maps? When I want to convert a depth map to a 3D point cloud, I cannot find this parameter.
Thank you
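
For what it is worth, assuming the depth .npy files already store metric depth and a pinhole camera model with intrinsics taken from intrinsics.txt, backprojection to a point cloud looks roughly like the sketch below (fx, fy, cx, cy are placeholders to be filled in from the intrinsics file):

    import numpy as np

    def depth_to_points(depth, valid_mask, fx, fy, cx, cy):
        """Backproject an H x W depth map to an N x 3 point cloud (pinhole model)."""
        h, w = depth.shape
        u, v = np.meshgrid(np.arange(w), np.arange(h))
        x = (u - cx) * depth / fx
        y = (v - cy) * depth / fy
        points = np.stack([x, y, depth], axis=-1)
        return points[valid_mask > 0]  # keep only valid sensor returns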

Bug on lines 111-115 in diode.py?

    im_fname = osp.join(self.data_root, '{}.png'.format(im))
    de_fname = osp.join(self.data_root, '{}_depth.npy'.format(im))
    de_mask_fname = osp.join(self.data_root, '{}_depth_mask.npy'.format(im))

    im = np.array(Image.open(osp.join(self.data_root, im_fname)))

On the last line, im_fname is joined with self.data_root a second time, even though it already contains it.
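
A likely fix (a sketch; im_fname already contains self.data_root from the join above):

    # im_fname is already a full path under self.data_root, so open it directly
    im = np.array(Image.open(im_fname))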

How to handle invalid depths

From the paper, it seems that there are some invalid depths that were shown as black pixels in the depth map. I'm wondering if there is any way to handle those, perhaps using inpainting?
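
One simple option, independent of the devkit, is a nearest-valid-neighbour fill based on SciPy's distance transform (a sketch; proper image inpainting is of course also possible):

    import numpy as np
    from scipy import ndimage

    def fill_invalid_depth(depth, valid_mask):
        """Replace invalid pixels with the value of the nearest valid pixel."""
        invalid = (valid_mask == 0)
        # For every invalid pixel, find the coordinates of the nearest valid pixel.
        _, idx = ndimage.distance_transform_edt(invalid, return_indices=True)
        return depth[tuple(idx)]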

name format for each file

Hi,

Thanks for the work. In our research, we need to recover the extrinsic parameters of the virtual camera.

I wonder how the file name corresponds to the elevation and azimuth angles of the rotation.

00007_00082_outdoor_020_000

Is 020 the azimuth, and does 000 correspond to -20° elevation?

Thanks

Add LICENSE

Hello, what is the LICENSE of this dataset? Is it meant only for academic use? Am I allowed to use it for commercial purposes? Please add a license file to clarify this.

Noise point cloud from bi-linear interpolation

As #3 mentioned, depth values around edges are noisy. This is a common issue in depth estimation and could have been avoided.

As is well known, depth sensors return invalid values around edges, which usually show up as black holes rather than as random points.

However, if the incomplete depth map is resized with bilinear interpolation, pixels around the holes are resampled from a mix of valid depths and zero values, so the new depth varies between 0 and the ground truth, which appears to be the case here.

Is the raw data from the depth sensor available? With it, we could try to fix the issue by reprocessing the depth maps.
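
A common workaround (a sketch, not the dataset's official pipeline) is a mask-aware resize: resize depth × mask and the mask separately, then renormalise, so zeros in the holes do not bleed into valid pixels:

    import cv2
    import numpy as np

    def resize_depth(depth, valid_mask, new_w, new_h):
        """Resize a depth map with holes without mixing zeros into valid depths."""
        m = valid_mask.astype(np.float32)
        d = cv2.resize(depth * m, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
        wgt = cv2.resize(m, (new_w, new_h), interpolation=cv2.INTER_LINEAR)
        out = np.where(wgt > 0, d / np.maximum(wgt, 1e-6), 0.0)
        new_mask = (wgt > 0.5).astype(np.float32)  # keep pixels dominated by valid samples
        return out, new_mask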

Non-square pixels and baseline results

Hi. I notice that the images have been produced with different vertical and horizontal focal lengths. The images are slightly vertically stretched, their size should be (1024 x 735) to have square pixels.

When running the baselines, did you feed the images as-is or did you correct for this?

thanks

How to train with the DIODE dataset

Thanks for your great work, but I have a question: how can I train with this dataset when there are many zeros in the depth maps? In the NYUv2 dataset there are no zeros in the depth maps. How should I deal with the pixels without depth? Thanks again.
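
One common recipe (a sketch, not an official baseline) is to use the depth validity mask so that the loss only covers pixels with valid ground truth:

    import torch

    def masked_l1_loss(pred, target, valid_mask):
        """L1 loss over pixels that have valid ground-truth depth only."""
        valid = valid_mask > 0
        if valid.sum() == 0:
            return pred.new_tensor(0.0)
        return torch.abs(pred[valid] - target[valid]).mean()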

Incorrect FoV 60° x 45°, maybe 42.3° x 32.4°?

Both "intrinsics.txt" and the technical report claim a horizontal FoV of 60° and a vertical FoV of 45°. However, the point clouds corresponding to such FoV appear distorted. For example, the surfaces of the corner are apparently not orthogonal. The provided FoV is larger than expected.

I have examined a few cases where surfaces that are obviously supposed to be orthogonal and then fit the optimal value of FoV to make those surfaces orthogonal. The values I obtain are 42.3° horizontally and 32.4° vertically, with a standard deviation of less than 1°. The point cloud looks visually better.

Since the depth maps and surfaces look smooth and accurate, such an error is likely introduced in a post-processing step. Moreover. the FoV of 60°x45° does not imply a consistent focal length in the X-Y axes, as tan(30°):tan(22.5°) is far from the 4:3 aspect ratio. Would the authors please check the FoV after cropping?

Here are some side-by-side comparisons between the point clouds with the original FoV (60°) on the left and the corrected FoV (42.3°) on the right. All of them are viewed from the top.

[figure: top-down point cloud comparisons, original 60° FoV (left) vs. corrected 42.3° FoV (right)]
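
For reference, a quick consistency check of the two FoV pairs under a pinhole model with the 1024 × 768 resolution stated above (square pixels require fx ≈ fy):

    import math

    W, H = 1024, 768  # image resolution

    def focal_lengths(hfov_deg, vfov_deg):
        """Focal lengths (in pixels) implied by horizontal and vertical FoV."""
        fx = (W / 2) / math.tan(math.radians(hfov_deg) / 2)
        fy = (H / 2) / math.tan(math.radians(vfov_deg) / 2)
        return fx, fy

    print(focal_lengths(60.0, 45.0))   # ~(887, 927): inconsistent, non-square pixels
    print(focal_lengths(42.3, 32.4))   # ~(1324, 1322): nearly identical focal lengths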

Test dataset release date

The readme mentions a test set:

We have released the train and validation splits of DIODE depth and DIODE normal, including RGB images, depth maps, depth validity masks and surface normal maps. Test set is coming soon.

Do you have an estimate for when it will be released? Thanks!

Noisy point cloud

[screenshots: point cloud renderings showing noisy points along intensity edges]

Hi, thanks for the dataset. However, when visualizing the data as a point cloud, points along intensity edges are very noisy, which suggests that the depth values at those points are wrong. Any idea how to fix this?
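
One common mitigation (a sketch; the relative threshold is an arbitrary assumption) is to drop "flying pixels" whose depth changes sharply relative to their neighbours before building the point cloud:

    import numpy as np

    def drop_flying_pixels(depth, valid_mask, rel_thresh=0.03):
        """Return a mask that also excludes pixels on sharp depth discontinuities."""
        gy, gx = np.gradient(depth)
        jump = np.sqrt(gx ** 2 + gy ** 2)
        keep = (jump < rel_thresh * np.maximum(depth, 1e-6)) & (valid_mask > 0)
        return keep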

Normalization

Thanks for the great work and dataset.
Can you please give details of the normalization applied to the ground-truth depth?

Intrinsics

Hi! Thanks for your work, it is a very interesting dataset for such an important problem!

I wonder, is it somehow possible to get a good approximation of the intrinsics of the camera this dataset was taken with?

It may be crucial for some tasks and not that hard to evaluate using the checkerboard printed pattern.

Best regards.
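
For completeness, a standard checkerboard calibration with OpenCV looks roughly like the sketch below (the image paths and board size are placeholders, and it of course requires access to the physical camera):

    import glob

    import cv2
    import numpy as np

    pattern = (9, 6)  # inner-corner count of the printed checkerboard (assumption)
    objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
    objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

    obj_points, img_points = [], []
    for path in glob.glob("calib/*.png"):  # placeholder path to checkerboard photos
        gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        found, corners = cv2.findChessboardCorners(gray, pattern)
        if found:
            obj_points.append(objp)
            img_points.append(corners)

    _, K, dist, _, _ = cv2.calibrateCamera(
        obj_points, img_points, gray.shape[::-1], None, None)
    print(K)  # 3x3 intrinsic matrix containing fx, fy, cx, cy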
