
Dataset for "DEFT: Dexterous Fine-Tuning for Real-World Hand Policies"

This repository contains instructions to download the pre-processed data used to train the grasp affordance model in "DEFT: Dexterous Fine-Tuning for Real-World Hand Policies." The data is drawn from the Ego4D, Epic-Kitchens, and HOI4D datasets and pre-processed with the labels necessary for training.

Instructions

Download the data from https://drive.google.com/drive/folders/1E8RqVa8RDRlNJGX0FDt5aWV48ndo-XFJ.

Alternatively, download directly to disk using gdown:

# Install the Google Drive downloader
pip install gdown
# Fetch the entire data folder by its Drive folder ID
gdown --folder 1E8RqVa8RDRlNJGX0FDt5aWV48ndo-XFJ
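gdown saves the folder into the current directory under the Drive folder's name; the extraction step below assumes it is deft-data-all. An optional check that all archives arrived (the path is an assumption based on that step):

# List the downloaded archives with their sizes; adjust the path if
# gdown saved the folder under a different name.
ls -lh deft-data-all/*.tar.gz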

Unzip all data:

# tar takes one archive per invocation, so loop over each .tar.gz
# (passing the glob directly would treat later matches as member names)
for f in deft-data-all/*.tar.gz; do tar -xzf "$f" -C deft-data-all; done
# Remove the archives once everything is extracted
rm deft-data-all/*.tar.gz
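As a quick sanity check (the internal layout of the archives isn't documented here, so this simply lists what was unpacked):

# Show the top-level entries and count the extracted files
ls deft-data-all
find deft-data-all -type f | wc -l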

Attribution

This data is modified from the Ego4D, Epic-Kitchens 100, and HOI4D datasets. We used a subset of the images from the videos in those datasets, detected the affordances, and provided labels for model training. Both Epic-Kitchens and HOI4D are licensed under CC BY-NC 4.0; Ego4D is distributed under its own license agreement.

License

This dataset is licensed under CC BY-NC 4.0.

BibTeX

If you use this dataset, please cite:

@inproceedings{kannan2023deft,
         title={DEFT: Dexterous Fine-Tuning for Real-World Hand Policies},
         author={Kannan, Aditya and Shaw, Kenneth and Bahl, Shikhar and Mannam, Pragna and Pathak, Deepak},
         booktitle={Conference on Robot Learning (CoRL)},
         year={2023}}


Issues

Request for Clarification on Dataset Preprocessing Process

Thanks for your great work. I have a question about the preprocessing pipeline behind deft-data. According to the official metadata of the Ego4D dataset, only around 15,000 annotated clips are available, and some may not include human hands. However, your paper mentions using over 60,000 clips from Ego4D.

Was the deft data derived from the raw Ego4D videos rather than from the official annotated clips? Additionally, if raw full-scale videos were used, could you share how the task descriptions were extracted from these videos (e.g., using some type of video captioning model)?
