
dexpoint-release's Introduction

DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation

Yuzhe Qin*, Binghao Huang*, Zhao-Heng Yin, Hao Su, Xiaolong Wang, CoRL 2022.

DexPoint is a novel system and algorithm for reinforcement learning from point clouds. This repo contains the simulated environments and training code for DexPoint.

[Teaser figure]

Bibtex

@article{dexpoint,
  title          = {DexPoint: Generalizable Point Cloud Reinforcement Learning for Sim-to-Real Dexterous Manipulation},
  author         = {Qin, Yuzhe and Huang, Binghao and Yin, Zhao-Heng and Su, Hao and Wang, Xiaolong},
  journal        = {Conference on Robot Learning (CoRL)},
  year           = {2022},
}

Installation

git clone [email protected]:yzqin/dexpoint-release.git
cd dexpoint-release
conda create --name dexpoint python=3.8
conda activate dexpoint
pip install -e .

Download the data file for the scene from the Google Drive link below and place day.ktx at assets/misc/ktx/day.ktx.

pip install gdown
gdown https://drive.google.com/uc?id=1Xe3jgcIUZm_8yaFUsHnO7WJWr8cV41fE
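
gdown downloads into the current directory; assuming the file arrives there as day.ktx, move it into place:

mkdir -p assets/misc/ktx
mv day.ktx assets/misc/ktx/day.ktx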

File Structure

  • dexpoint: main content for the environment, utils, and other code needed for RL training.
  • assets: robot and object models, and other static files
  • example: entry files to learn how to use the DexPoint environment
  • docker: Dockerfile that creates a container for headless training on a server

Quick Start

Use DexPoint environment and extend it for your project

Run the file below and explore the comments in it to familiarize yourself with the basic architecture of the DexPoint environment. Check the printed messages to understand the observation space, action space, cameras, and simulation speed of these environments.

The environment used for training in the DexPoint paper can be found in example_dexpoint_grasping.py.
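
Below is a minimal usage sketch, not verbatim from the repo: the import path, the AllegroRelocateRLEnv class name, and its keyword arguments are inferred from the issue code quoted later on this page, so verify them against example_dexpoint_grasping.py.

import numpy as np
from dexpoint.env.rl_env.relocate_env import AllegroRelocateRLEnv  # import path assumed

# object_name is assumed to be a YCB object identifier.
env = AllegroRelocateRLEnv(object_name="mustard_bottle", use_visual_obs=True, use_gui=False)
obs = env.reset()
# With use_visual_obs=True the observation is a dict of arrays (see the issues below).
print({key: value.shape for key, value in obs.items()})
action = np.zeros(env.action_space.shape)  # zero action as a smoke test
obs, reward, done, info = env.step(action)
print(reward, done)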

Training

Download the ShapeNet models from Google Drive and place them inside the following directory: dexpoint-release/assets/shapenet/.

The DexPoint repo uses the same training code and environment interface as DexArt for RL training. Please check the training code here to train DexPoint with PPO.
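
For orientation, the call shape with vanilla Stable Baselines3 looks roughly like the sketch below. This is not the actual DexArt training script, which uses a customized PPO with a point-cloud feature extractor and extra arguments (see the issue further down this page); the import path and hyperparameters here are placeholders.

from stable_baselines3 import PPO
from dexpoint.env.rl_env.relocate_env import AllegroRelocateRLEnv  # import path assumed

env = AllegroRelocateRLEnv(object_name="mustard_bottle", use_visual_obs=True, use_gui=False)
# MultiInputPolicy handles dict observations; hyperparameters below are illustrative only.
model = PPO("MultiInputPolicy", env, n_steps=2048, batch_size=256,
            learning_rate=3e-4, verbose=1)
model.learn(total_timesteps=1_000_000)
model.save("dexpoint_ppo_grasp")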

Acknowledgements

We would like to thank the following projects and people for making this work possible:

Example extensions of the DexPoint environment framework in other projects:

DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects (CVPR 2023): extends DexPoint to articulated object manipulation.

From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation (RA-L 2022): uses teleoperation for data collection in the DexPoint environment.


dexpoint-release's Issues

Regarding loading a custom object and replicating the training code

Hi @yzqin,
I had a couple of questions I was hoping to get some insight on.

  1. Upon inspecting load_ycb_object() in dexpoint-release/dexpoint/utils/ycb_object_utils, I see that one needs the following files for loading an object:
  • textured_simple.obj
  • collision.obj
    However, in order to load a new object (like a spoon for example) in the codebase, I am unable to find the corresponding collision.obj on the YCB dataset website. Where may I obtain these assets?
  2. I was trying to replicate the training of the policy in the codebase but encountered an error.

Here is how to reproduce my error:
Clone the modified codebase here.
Run the command: python training_ppo.py
I then get the following error:
[screenshot of the error attached]

It would be great if you could kindly give some insight to resolve these issues.

Thanks in advance!

Reproducible issue: cannot converge to a valid grasp policy

Hi yz,
I am currently following your dex-series work, nice job!
I followed the instructions and copied the dexpoint env under the dexart training root. The main training code is organized as below; some env setup code is omitted, and the policy is trained on the YCB dataset instead of ShapeNet:
def get_3d_policy_kwargs(extractor_name):
    feature_extractor_class = PointNetImaginationExtractorGP
    feature_extractor_kwargs = {
        "pc_key": "relocate-point_cloud",
        # "pc_key": "instance_1-point_cloud",
        # "gt_key": "instance_1-seg_gt",
        "extractor_name": extractor_name,
        "imagination_keys": [f"imagination_{key}" for key in IMG_CONFIG["relocate_goal_robot"].keys()],
        "state_key": "state",
    }
    policy_kwargs = {
        "features_extractor_class": feature_extractor_class,
        "features_extractor_kwargs": feature_extractor_kwargs,
        # "net_arch": [dict(pi=[64, 64], vf=[64, 64])],
        "activation_fn": nn.ReLU,
    }
    return policy_kwargs

def training():
    env_params = dict(object_name=object_name, rotation_reward_weight=rotation_reward_weight,
                      randomness_scale=1, use_visual_obs=use_visual_obs, use_gui=False,
                      # no_rgb=True
                      )
    if "CUDA_VISIBLE_DEVICES" in os.environ:
        env_params["device"] = "cuda"
    # Renamed from `environment` so the variable matches its use below.
    env = AllegroRelocateRLEnv(**env_params)
    model = PPO("PointCloudPolicy", env, verbose=1,
                n_epochs=args.ep,
                n_steps=(args.n // args.workers) * horizon,
                learning_rate=args.lr,
                batch_size=args.bs,
                seed=seed,
                policy_kwargs=get_3d_policy_kwargs(extractor_name=extractor_name),
                min_lr=args.lr,
                max_lr=args.lr,
                adaptive_kl=0.02,
                target_kl=0.2,
                )
    obs = env.reset()
    if pretrain_path is not None:
        state_dict: OrderedDict = torch.load(pretrain_path)
        model.policy.features_extractor.extractor.load_state_dict(state_dict, strict=False)
        print("load pretrained model: ", pretrain_path)
    rollout = int(model.num_timesteps / (horizon * args.n))
    if args.freeze:
        model.policy.features_extractor.extractor.eval()
        for param in model.policy.features_extractor.extractor.parameters():
            param.requires_grad = False
        print("freeze model!")
    model.learn(
        total_timesteps=int(env_iter),
        reset_num_timesteps=False,
        iter_start=rollout,
        callback=None,
    )
The experiment result looks like this:
[training-curve screenshot attached]

Is the training code organized correctly? Or could you give some advice on how to reproduce the experiment results?

Training issue: Should get_observation return array instead of dict?

Hi!
I find that when use_obs is set to True, the environment returns a dict of observations after invoking the get_observation function.
However, the dict is not supported when invoking the feature_extractor of Stable Baselines3.
So, in order to use the dexpoint env on top of dexart, should the get_observation fn in the env be modified to return an array instead of a dict?

Specifically: modify the get_observation fn in
dexpoint-release-main/dexpoint/env/rl_env/base.py
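
For reference, here is a minimal sketch of how a Stable Baselines3 features extractor can consume a dict observation directly. This is purely illustrative: DictFlattenExtractor is a hypothetical name, it is not the repo's PointNetImaginationExtractorGP, and it assumes every entry of the observation dict is a Box with a fixed shape.

import gym
import numpy as np
import torch
import torch.nn as nn
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor

class DictFlattenExtractor(BaseFeaturesExtractor):
    # Hypothetical example: flatten every entry of a Dict observation and concatenate.
    def __init__(self, observation_space: gym.spaces.Dict, features_dim: int = 256):
        super().__init__(observation_space, features_dim)
        # Total input size after flattening all sub-spaces of the dict.
        flat_dim = sum(int(np.prod(space.shape)) for space in observation_space.spaces.values())
        self.net = nn.Sequential(nn.Linear(flat_dim, features_dim), nn.ReLU())

    def forward(self, observations: dict) -> torch.Tensor:
        # Flatten each tensor past the batch dimension, then concatenate.
        flat = torch.cat([obs.flatten(start_dim=1) for obs in observations.values()], dim=1)
        return self.net(flat)

# Usage with a dict-observation env:
# PPO("MultiInputPolicy", env, policy_kwargs={"features_extractor_class": DictFlattenExtractor})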

Reproduce training stage

Hi! Great work of yours.

I'm trying to reproduce the training stage of DexPoint (mostly copied from DexArt), and I noticed that you gave training code in an issue, but when I ran it, I got this error. Do you have any idea about what might be missing here?
[error screenshot attached]

How to avoid penetration between dexterous hands and objects?

Thanks for your wonderful work.
During the execution of the Dexpoint example, I observed a collision issue between the hand and the object during grasping, leading to unrealistic behavior as shown in the figure.
[figure: the hand penetrating the object during grasping]

The collision settings between the hand and objects are defined in the source program. Could you assist me in resolving this issue?
Thank you for any help!
