
Dataset for "DEFT: Dexterous Fine-Tuning for Real-World Hand Policies"

This repository contains instructions to download the pre-processed data used to train the grasp affordance model in "DEFT: Dexterous Fine-Tuning for Real-World Hand Policies." The data is drawn from the Ego4D, Epic-Kitchens, and HOI4D datasets and pre-processed with the labels necessary for training.

Instructions

Download the data from https://drive.google.com/drive/folders/1E8RqVa8RDRlNJGX0FDt5aWV48ndo-XFJ.

Alternatively, download directly to disk using gdown:

# Install the Google Drive downloader
pip install gdown
# Fetch the entire data folder by its Drive folder ID
gdown --folder 1E8RqVa8RDRlNJGX0FDt5aWV48ndo-XFJ
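gdown saves the folder into the current directory under the Drive folder's name; the extraction step below assumes it is deft-data-all. An optional check that all archives arrived (the path is an assumption based on that step):

# List the downloaded archives with their sizes; adjust the path if
# gdown saved the folder under a different name.
ls -lh deft-data-all/*.tar.gz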

Unzip all data:

# tar takes one archive per invocation, so loop over each .tar.gz
# (passing the glob directly would treat later matches as member names)
for f in deft-data-all/*.tar.gz; do tar -xzf "$f" -C deft-data-all; done
# Remove the archives once everything is extracted
rm deft-data-all/*.tar.gz
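As a quick sanity check (the internal layout of the archives isn't documented here, so this simply lists what was unpacked):

# Show the top-level entries and count the extracted files
ls deft-data-all
find deft-data-all -type f | wc -l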

Attribution

This data is modified from the Ego4D, Epic-Kitchens 100, and HOI4D datasets. We used a subset of the images from the videos in those datasets, detected the affordances, and provided labels for model training. Both Epic-Kitchens and HOI4D are licensed under CC BY-NC 4.0; Ego4D is distributed under its own license agreement.

License

This dataset is licensed under CC BY-NC 4.0.

BibTeX

If you use this dataset, please cite:

@inproceedings{kannan2023deft,
         title={DEFT: Dexterous Fine-Tuning for Real-World Hand Policies},
         author={Kannan, Aditya and Shaw, Kenneth and Bahl, Shikhar and Mannam, Pragna and Pathak, Deepak},
         booktitle={Conference on Robot Learning (CoRL)},
         year={2023}}


Issues

Request for Clarification on Dataset Preprocessing Process

Thanks for your great work. I have a question about the preprocessing pipeline behind deft-data. According to the official metadata of the Ego4D dataset, only around 15,000 annotated clips are available, and some may not include human hands. However, your paper mentions using over 60,000 clips from Ego4D.

Was the deft data derived from the raw Ego4D videos rather than from the official annotated clips? Additionally, if raw full-scale videos were used, could you share how the task descriptions were extracted from these videos (e.g., using some type of video captioning model)?
