
AdobeIndoorNav Dataset

Dataset Overview

Figure 1. The AdobeIndoorNav Dataset and other 3D scene datasets. Our dataset supports research on robot visual navigation in real-world scenes. It provides visual inputs given a robot position: (a) the original 3D point cloud reconstruction; (b) the densely sampled locations shown on the 2D scene map; (c) four example RGB images captured by the robot camera and their corresponding locations and poses. Sample views from 3D synthetic and real-world reconstructed scene datasets: (d) observation images from two synthetic datasets: SceneNet RGB-D and AI2-THOR; (e) rendered images from two real-world scene datasets: Stanford 2D-3D-S and ScanNet.

About the paper

Arxiv Version: https://arxiv.org/abs/1802.08824

Project Page: https://cs.stanford.edu/~kaichun/adobeindoornav/

Video: https://youtu.be/iqo1ihr_qXI

Contact: [email protected]

About this repository

This repository contains the AdobeIndoorNav dataset and the relevant code for visualization. The dataset is proposed and used in the paper The AdobeIndoorNav Dataset: Towards Deep Reinforcement Learning based Real-world Indoor Robot Visual Navigation by Kaichun Mo, Haoxiang Li, Zhe Lin, and Joon-Young Lee. We design a semi-automatic pipeline to collect a new dataset for robot indoor visual navigation. Our dataset includes 3D reconstructions of real-world scenes as well as densely captured real 2D images from those scenes. It provides high-quality visual inputs with real-world scene complexity to the robot at dense grid locations.

Dependencies

All the code is tested with Python 2.7. Please run the following command to install the dependencies.

       pip install -r requirements.txt

The Dataset

Please check the README.md under the datasets folder to download the dataset.

The first-version dataset contains 24 scenes (i.e. 15 office rooms, 5 conference rooms, 2 open spaces, 1 kitchen, and 1 storage room). For each scene, we provide the raw 3D point cloud in ply format, the 2D obstacle map and laser-scan map, the ground-truth world graph map, and a set of densely captured 360-degree panoramic images at each observation location.
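
For a quick sanity check after downloading, the raw point cloud of a scene can be inspected with a few lines of Python. The snippet below is a minimal sketch only: the plyfile library and the exact scene path are assumptions, so adjust them to the actual layout described in the datasets README.

    # Minimal sketch: inspect one scene's raw point cloud.
    # The scene path below is hypothetical; check the datasets README for the real layout.
    import numpy as np
    from plyfile import PlyData  # pip install plyfile (assumed; install separately if needed)

    ply = PlyData.read('datasets/adobeindoornav_dataset/et12-kitchen/et12-kitchen.ply')
    vertices = ply['vertex']
    points = np.stack([vertices['x'], vertices['y'], vertices['z']], axis=-1)
    print('loaded %d points' % len(points))
    print('bounding box:', points.min(axis=0), points.max(axis=0))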

The dataset splits are in the splits folder. It contains the train/test split and all scene sub-category splits.

The dataset statistics are in the stats folder. It contains the sparse landmark location ids (stats/landmark_targets) and the dense SIFT-featureful location ids, as introduced in the paper.

Quick Start

You can run the following command to quickly browse the et12-kitchen scene.

        bash quick_browse.sh

To prepare all 24 scenes for visualization, run the following command. This will take a while, so please be patient. Make sure you have downloaded the dataset and placed it under the datasets/adobeindoornav_dataset folder.

        bash prepare_all_scenes.sh

To prepare the scenes with random camera jitters and visual noise, please run

        bash prepare_all_scenes_with_jitters.sh

Code Details

You first need to crop regular images from the 360-degree panoramic images. To process each scene, go to the scripts folder and run

        python crop_panorama_images.py [scene_name]

We also provide functionality to add camera jitters and random noise to the visual inputs; check the following for more details.

        python crop_panorama_images.py --help
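
For intuition about what this cropping step does, the sketch below shows one standard way to sample a pinhole-camera view at a given heading from an equirectangular panorama. This is an illustrative simplification, not the repository's crop_panorama_images.py; the field of view, output resolution, and axis conventions are assumptions.

    # Minimal sketch: sample a perspective view from an equirectangular panorama.
    # Conventions (x right, y down, z forward; yaw about the vertical axis) are assumptions.
    import numpy as np
    from PIL import Image

    def crop_perspective(pano, yaw_deg=0.0, fov_deg=90.0, out_hw=(300, 400)):
        pano = np.asarray(pano)
        ph, pw = pano.shape[:2]
        h, w = out_hw
        f = (w / 2.0) / np.tan(np.radians(fov_deg) / 2.0)  # focal length in pixels

        # Output pixel grid -> unit camera rays
        xs, ys = np.meshgrid(np.arange(w) - w / 2.0, np.arange(h) - h / 2.0)
        dirs = np.stack([xs, ys, np.full_like(xs, f)], axis=-1)
        dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

        # Rotate rays to the requested heading (about the vertical axis)
        yaw = np.radians(yaw_deg)
        x, y, z = dirs[..., 0], dirs[..., 1], dirs[..., 2]
        xr = x * np.cos(yaw) + z * np.sin(yaw)
        zr = -x * np.sin(yaw) + z * np.cos(yaw)

        # Rays -> longitude/latitude -> panorama pixel coordinates (nearest neighbor)
        lon = np.arctan2(xr, zr)                    # [-pi, pi]
        lat = np.arcsin(np.clip(y, -1.0, 1.0))      # [-pi/2, pi/2]
        px = ((lon / np.pi + 1.0) / 2.0 * pw).astype(int) % pw
        py = np.clip(((lat / np.pi + 0.5) * ph).astype(int), 0, ph - 1)
        return Image.fromarray(pano[py, px])

    # e.g. four views 90 degrees apart from one location's panorama:
    # views = [crop_perspective(Image.open('loc_0.jpg'), yaw_deg=a) for a in (0, 90, 180, 270)]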

To run batch generation for all 24 scenes, please run

        bash run_crop_all_scenes_without_jitters.sh
        bash run_crop_all_scenes_with_jitters.sh

Then, we dump the data into HDF5 files. To process each scene, go to the scripts folder and run

        python gen_visu_h5.py [scene_name]

This command defaults to loading the images from the data/panorama_images_cropped_rgb_images folder and generating one image per location. To load the cropped images from a different folder or to load more images per location, use the following

        python gen_visu_h5.py [scene_name] --data_dir [data_dir] --num_imgs_per_loc [num_imgs_per_loc]
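
To illustrate roughly what such an HDF5 dump contains, the sketch below packs a scene's cropped images into one file with h5py. The dataset names, directory layout, and grouping here are assumptions, not necessarily what gen_visu_h5.py actually produces.

    # Minimal sketch: pack per-location cropped RGB images into one HDF5 file.
    # File naming and grouping are assumptions, not the script's actual output format.
    import glob, os
    import h5py
    import numpy as np
    from PIL import Image

    def dump_scene_h5(img_dir, out_path, num_imgs_per_loc=1):
        files = sorted(glob.glob(os.path.join(img_dir, '*.png')))
        imgs = np.stack([np.asarray(Image.open(f)) for f in files])
        # Group consecutive crops so imgs[location, view] indexes one RGB image
        imgs = imgs.reshape((-1, num_imgs_per_loc) + imgs.shape[1:])
        with h5py.File(out_path, 'w') as h5:
            h5.create_dataset('rgb', data=imgs, compression='gzip')
            h5.create_dataset('filenames', data=np.array(files, dtype='S'))

    # dump_scene_h5('../data/panorama_images_cropped_rgb_images/et12-kitchen',
    #               'et12-kitchen_visu.h5', num_imgs_per_loc=1)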

To run in batch, please use

        bash run_gen_visu_h5_without_jitters.sh
        bash run_gen_visu_h5_with_jitters.sh

Finally, we are ready to use a keyboard-controlled agent to visualize the scene. Go to the visu folder and run

        python keyboard_agent.py [scene_name]

Run the following to see more options. You can disable the overhead map or specify a target observation (shown as a red arrow on the map).

        python keyboard_agent.py --help
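
Conceptually, the keyboard agent just walks the scene's grid graph and shows the stored observation at each node. The sketch below captures that idea only; the graph and observation data structures and the key bindings are assumptions, not keyboard_agent.py's actual implementation.

    # Minimal sketch of a keyboard-controlled grid agent (data structures assumed).
    def run_agent(graph, images, start, target=None, show=None):
        """graph: {loc_id: {'w'/'a'/'s'/'d': neighbor_id or None}};
           images: {loc_id: RGB observation}; show: optional display callback."""
        loc = start
        while True:
            if show is not None:
                show(images[loc])                     # current first-person observation
            if target is not None and loc == target:
                print('reached target location %s' % target)
                break
            key = raw_input('move (w/a/s/d, q to quit) at %s: ' % loc)  # input() in Python 3
            if key == 'q':
                break
            nxt = graph.get(loc, {}).get(key)
            if nxt is not None:                       # ignore moves blocked by obstacles
                loc = nxt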

Citing this work

If the dataset is useful for your research, please consider citing the following paper:

    @article{Mo18AdobeIndoorNav,
        Author = {Kaichun Mo and Haoxiang Li and Zhe Lin and Joon-Young Lee},
        Title = {The AdobeIndoorNav Dataset: Towards Deep Reinforcement Learning based Real-world Indoor Robot Visual Navigation},
        Year = {2018},
        Eprint = {arXiv:1802.08824},
    }

License

MIT License
