
diffusion_policy's People

Contributors

cheng-chi, pointw

diffusion_policy's Issues

robomimic_img demo file is too large to download

Dear author:
Thank you very much for your work. Over the past few days I have been trying to reproduce diffusion policy in the robomimic environment, but I found that the processed robomimic_img demonstration files you provide are a full 78 GB, which takes quite a long time to download. I am very surprised the files are this large, because the demonstration datasets on the robomimic website seem to be only a few hundred MB per task. If I only want to reproduce the image-input can (ph) task, how should I handle the dataset: do I have to download all of the demonstration data, or can I download the dataset from the robomimic website and post-process it for imitation learning?

[screenshot omitted] Thank you!

Question: virtual environment rendering/acceleration

Hi there! Thanks for your impressive work and beautiful code :)
I tried to run lift_image_abs with transformer hybrid workspace HEADLESS, but it logged that:

[root][INFO] Command '['/mambaforge/envs/robodiff/lib/python3.9/site-packages/egl_probe/build/test_device', '0']' returned non-zero exit status 1.
[root][INFO] - Device 0 is not available for rendering

and this repeats on all 4 GPUs. Afterwards, I found the "Eval LiftImage" process is really slow. I wonder whether I should enable or install some driver for hardware acceleration?

nvidia-smi output during Eval (GPU-Util stays at 0%): [screenshot omitted]

top output during Eval: [screenshot omitted]

wandb monitoring data: [screenshot omitted]

Regarding the issue of loss differences in the DDPM algorithm between the training and validation sets.

Hello @cheng-chi ! Thank you for sharing your beautiful code as open-source. I have integrated your code into a custom environment that I've developed. After training, I noticed that the loss on the training set for the DDPM algorithm consistently decreased, whereas the loss on the validation set kept increasing (the final loss on the training set was 10e-5, and on the validation set, it was 1.3), showing a significant difference in magnitude. I also checked the training logs you provided and observed a similar magnitude difference, although the increase in validation set loss wasn't very pronounced (for example, the final losses in data/experiments/image/pusht/diffusion_policy_cnn/train_0/logs.json were 0.00024978463497540187 and 0.24248942732810974, respectively). Moreover, the success rate of the closed-loop test for the final model was just over 70% (compared to a 98% success rate with expert data). Therefore, I would like to inquire whether this issue could be affecting the test performance and if you have any good debugging experience to share.
Thank you and look forward to your reply!

Question about the observation

Hello, thank you for providing the code! I just started learning about diffusion models and was deeply inspired after reading your paper. In the paper, you mention using observations to guide the CNN-based diffusion model for action generation, transforming the observation Ot into a sequence of observation embeddings with a shared MLP. Is this something you first proposed, or an improvement on previous papers? Thank you!
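For readers who, like the asker, are new to this: a minimal sketch of what such a shared observation encoder could look like (my own illustration; the class name and layer sizes are not the repository's):

import torch
import torch.nn as nn

class SharedObsEncoder(nn.Module):
    # the same MLP is applied to the observation at every timestep,
    # producing one embedding per observation step
    def __init__(self, obs_dim: int, emb_dim: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(obs_dim, emb_dim),
            nn.Mish(),
            nn.Linear(emb_dim, emb_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_obs_steps, obs_dim) -> (batch, n_obs_steps, emb_dim)
        return self.mlp(obs)

enc = SharedObsEncoder(obs_dim=5, emb_dim=256)
print(enc(torch.randn(2, 3, 5)).shape)  # torch.Size([2, 3, 256])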

Policy evaluation during training

Thank you for the beautiful code!

In the evaluation of the training process, I see you have logged the train_action_mse_error, which samples trajectories from the training set and calculates the error between the predicted action and the target. Is there a specific reason why there isn't a corresponding validation_action_mse_error that calculates this error on the validation set?
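For context, a minimal sketch of what such a validation_action_mse_error could look like, assuming the policy's predict_action returns a dict containing 'action_pred' as in the repository's policies (treat this as an illustration, not the repository's code):

import torch

def validation_action_mse(policy, val_dataloader, device):
    # mirror train_action_mse_error, but over validation batches:
    # predict actions from observations and compare to the ground truth
    errors = []
    with torch.no_grad():
        for batch in val_dataloader:
            obs = {k: v.to(device) for k, v in batch['obs'].items()}
            gt_action = batch['action'].to(device)
            pred_action = policy.predict_action(obs)['action_pred']
            errors.append(torch.nn.functional.mse_loss(pred_action, gt_action))
    return torch.stack(errors).mean().item()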

Any suggestions on speeding up training?

Hi Cheng, this work is incredible and elegant! I tried the code, and I found that the training time for each task and method is around 12 hrs. I am also testing on the robomimic can task, and it seems to take a while to get a performant policy. I am wondering whether you have any suggestions for speeding up the training process.

about the keypoint?

I know from the paper that the keypoints in PushT are 9 2D positions obtained from the ground-truth pose of the block, but I still don't know what they mean and which positions they are. What do they have to do with the x and y positions of the block?
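As an illustration of the relationship being asked about (the coordinates below are made up, not the repository's exact keypoint layout): the 9 keypoints are fixed 2D points in the T-block's local frame, so they are fully determined by the block's pose (x, y, theta):

import numpy as np

# hypothetical keypoints in the block's local frame (illustrative values)
local_keypoints = np.array([
    [-40, -10], [0, -10], [40, -10],   # along the top bar of the T
    [-40,  10], [0,  10], [40,  10],
    [-10,  40], [10,  40], [0,  70],   # down the stem
], dtype=np.float64)

def keypoints_from_pose(x, y, theta):
    # rotate the local keypoints by the block's orientation,
    # then translate by its (x, y) position
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    return local_keypoints @ rot.T + np.array([x, y])

print(keypoints_from_pose(256.0, 256.0, np.pi / 4))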

Code for Franka Panda robot

Hi,

In the paper, you mentioned that you executed the policy on two different robots: UR5 and Franka. Is it possible to release the code for Franka as well? Thanks!

How to access to the depth data in the simulation?

I haven't found a way to inspect the depth data. It seems the version of robomimic used here is somewhat old, but I am not sure about it.

Do you have any idea how to visualize the depth image and how to access the camera parameters?

[Question] How to demo in simulation?

Hi,

I would like to use this amazing Diffusion Policy method in a custom simulation. As a first step, I am trying to reproduce some of the simulations given in the paper ('lift', 'can', 'square', ...) and understand how this code works to see how I can make an environment for my own simulation.

In the README, there are instructions on how to train and evaluate policies in simulation, but they use already-gathered data for the pygame PushT example. Moreover, in the code itself, I can only find demo_real_robot.py (which obviously interfaces with the real robot) and a demo_pusht.py which interfaces with pygame, but nothing that interfaces with e.g. the robomimic simulations (which would show, for instance, how visual information is retrieved from the simulation, so I could try to replicate that).

Maybe I am missing something obvious, but is there an easy way to reproduce one of the simulated examples from the paper, from demonstration through evaluation? Or would this require custom code that is not in this repo?

Thank you!

About Dataset Padding

Thanks for providing this beautiful code and documentation!
I have been reading the implementation of Dataset and SequenceSampler in the Colab example and I have a question about it.

def create_sample_indices(
        episode_ends: np.ndarray, sequence_length: int,
        pad_before: int = 0, pad_after: int = 0):
    indices = list()
    for i in range(len(episode_ends)):
        start_idx = 0
        if i > 0:
            start_idx = episode_ends[i-1]
        end_idx = episode_ends[i]
        episode_length = end_idx - start_idx

        # allow windows to start up to pad_before steps before the episode
        # and to overhang up to pad_after steps past its end
        min_start = -pad_before
        max_start = episode_length - sequence_length + pad_after

        # range stops one idx before end
        for idx in range(min_start, max_start+1):
            # clip the window to the data that actually exists in the buffer
            buffer_start_idx = max(idx, 0) + start_idx
            buffer_end_idx = min(idx+sequence_length, episode_length) + start_idx
            # where the valid data lands inside the (possibly padded) sample
            start_offset = buffer_start_idx - (idx+start_idx)
            end_offset = (idx+sequence_length+start_idx) - buffer_end_idx
            sample_start_idx = 0 + start_offset
            sample_end_idx = sequence_length - end_offset
            indices.append([
                buffer_start_idx, buffer_end_idx,
                sample_start_idx, sample_end_idx])
    indices = np.array(indices)
    return indices

I understand that we need to pad $T_o-1$ (pad_before) steps before an episode to make sure the first action is predicted conditioned on an observation window of length $T_o$. But why do we need to pad $T_a-1$ (pad_after) steps after an episode? What would go wrong if we set max_start to episode_length - sequence_length?
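For concreteness, a small worked example of how pad_after changes the generated windows (using the create_sample_indices above; as I understand it, the point of pad_after is that action steps near the end of an episode still appear at every position of the prediction horizon, with the sampler typically filling the overhang by repeating the last valid step):

import numpy as np

episode_ends = np.array([5])  # one episode of length 5

# without pad_after: only window starts 0 and 1 fit fully inside the episode
print(create_sample_indices(episode_ends, sequence_length=4))

# with pad_after=3: windows may overhang the episode end, so the final
# actions are still trained at every horizon position
print(create_sample_indices(episode_ends, sequence_length=4, pad_after=3))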

Incorrect Image Normalization

Hi @cheng-chi, this work is incredible! I read the code carefully and I have a doubt about image normalization.
For example, in real_pusht_image_dataset.py, the following code normalizes the image to [-1, 1]

for key in self.rgb_keys:
    normalizer[key] = get_image_range_normalizer()
return normalizer

In multi_image_obs_encoder.py, ImageNet statistics are also applied, but these assume the image is in [0, 1]:

if imagenet_norm:
    this_normalizer = torchvision.transforms.Normalize(
        mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

I don’t know what impact this bug has on the final performance; it may also be that my understanding is wrong.
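For what it's worth, a tiny sketch of the composition being questioned, assuming an image the range normalizer has already mapped to [-1, 1]:

import torch
import torchvision

# an all-gray image after range normalization to [-1, 1]
# (0.0 here corresponds to pixel value 0.5 on the [0, 1] scale)
img = torch.zeros(3, 224, 224)

imagenet_norm = torchvision.transforms.Normalize(
    mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

# ImageNet statistics assume inputs in [0, 1]; on [-1, 1] data the result
# is shifted, e.g. channel 0 becomes (0 - 0.485) / 0.229 ≈ -2.12
print(imagenet_norm(img)[:, 0, 0])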

Could you please clarify the evaluation code? (possible bug)

Hello,

Thank you so much for your amazing work and beautiful code. However, while reading the code I got confused by this reward collection section. Could you please clarify why you use len(self.env_fns) in the first line of this block?
https://github.com/columbia-ai-robotics/diffusion_policy/blob/27395b75008269ebac3ceb2192fadd647f288e7f/diffusion_policy/env_runner/robomimic_lowdim_runner.py#L320-L325

My understanding is that the current code only takes part of the trajectories into consideration when the number of running simulators is smaller than the number of trajectories to test. Please correct me if I am wrong, but this line should be for i in range(n_inits): to take all trajectories into consideration. Could you take a look? Could this issue affect the numbers you reported in the paper?

Similarly at:
https://github.com/columbia-ai-robotics/diffusion_policy/blob/27395b75008269ebac3ceb2192fadd647f288e7f/diffusion_policy/env_runner/robomimic_image_runner.py#L327
https://github.com/columbia-ai-robotics/diffusion_policy/blob/27395b75008269ebac3ceb2192fadd647f288e7f/diffusion_policy/env_runner/kitchen_lowdim_runner.py#L282
https://github.com/columbia-ai-robotics/diffusion_policy/blob/27395b75008269ebac3ceb2192fadd647f288e7f/diffusion_policy/env_runner/blockpush_lowdim_runner.py#L238
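To make the concern concrete, a toy illustration of the suspected truncation (the numbers are made up; whether the runners actually drop trajectories is for the authors to confirm):

n_inits = 56   # trajectories requested for evaluation
n_envs = 28    # parallel simulators, i.e. len(self.env_fns)
max_rewards = list(range(n_inits))  # stand-in for per-trajectory rewards

# looping over len(self.env_fns) aggregates only the first n_envs results
subset = [max_rewards[i] for i in range(n_envs)]
full = [max_rewards[i] for i in range(n_inits)]
print(len(subset), len(full))  # 28 vs 56: half the trajectories ignored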

By the way, I'm also curious why you take the mean of ten checkpoints as your evaluation metric; is there a specific reason for doing so?

Thank you so much for your time.

Best regards,

Xiang Li

How to run state-based notebook with pred_horizon which is not a power of 2?

Hi,
Thanks a lot for making the code very easy to interpret and set up!
I have been playing around with the policy, and it seems that it currently cannot handle a "pred_horizon" that is not a power of 2. For example, it works for pred_horizon = 2, 4, 8, 16, ... but fails for other values. Is there a quick solution to this?
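One likely explanation (an assumption based on the U-Net structure, not a confirmed answer): each downsampling stage of the 1D U-Net halves the temporal dimension, so pred_horizon has to be divisible by 2 raised to the number of downsampling stages, rather than strictly being a power of 2. A quick helper to round up to a valid horizon:

import math

def next_valid_horizon(pred_horizon: int, n_downsamples: int = 2) -> int:
    # with the notebook's default down_dims (3 levels, i.e. 2 downsampling
    # steps) the horizon must be a multiple of 2**2 = 4; adjust
    # n_downsamples if you change down_dims
    factor = 2 ** n_downsamples
    return math.ceil(pred_horizon / factor) * factor

print(next_valid_horizon(10))  # -> 12

Under this reading, values like 12 should also work, so the constraint is divisibility rather than powers of two; rounding the horizon up and keeping n_action_steps as desired may be the quick fix.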

Important components

Hi, I am currently looking at implementing a diffusion model for policy learning and was very impressed by your work! I was wondering what components of your approach you found to be particularly important for good results? 3 things I specifically was curious about were:

  • I see you use EMA, did you find that the model predictions were particularly unimodal/overfit to recent training data without it?
  • Was the causal attention masking used in the transformer variant crucial in getting this architecture to work, or do you think simply decoding waypoints from a more BERT-style encoder architecture would work?
  • In the appendix it seems you used a particularly large model for the CNN variant and say that you always found larger CNN -> better performance. Was the performance of much smaller CNNs (e.g. ~10M) much worse?

Spatial softmax instead of global avg pooling?

Hi, thank you for your beautiful code ❤️

In section III.B of your paper you mention that you replace the global average pooling in the ResNet with a spatial softmax.
However, I cannot find where this is done in your code.

I can only see where you change the batch norm for a group norm
https://github.com/columbia-ai-robotics/diffusion_policy/blob/0d00e02b45e9e3f37f4eeb68bff076b68d9e9d44/diffusion_policy/model/vision/multi_image_obs_encoder.py#L62-L69

and where you remove the fully connected final layer
https://github.com/columbia-ai-robotics/diffusion_policy/blob/0d00e02b45e9e3f37f4eeb68bff076b68d9e9d44/diffusion_policy/model/vision/model_getter.py#L15

but not where you change the average pooling.

Am I missing something or did you actually use average pooling, contrary to what's stated in the paper?
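For reference, a minimal sketch of the spatial-softmax operation being discussed (my own illustration; for the image tasks the encoder is presumably assembled through robomimic's observation utilities, which may be where the pooling replacement actually lives rather than in multi_image_obs_encoder.py):

import torch
import torch.nn as nn

class SpatialSoftmax(nn.Module):
    # replaces global average pooling: converts a (B, C, H, W) feature map
    # into C expected 2D keypoint coordinates, shape (B, C, 2)
    def __init__(self, height: int, width: int):
        super().__init__()
        ys, xs = torch.meshgrid(
            torch.linspace(-1, 1, height),
            torch.linspace(-1, 1, width),
            indexing="ij")
        self.register_buffer("pos", torch.stack([xs, ys], dim=-1).reshape(-1, 2))

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        b, c, h, w = feat.shape
        attn = torch.softmax(feat.reshape(b, c, h * w), dim=-1)
        return attn @ self.pos  # expected keypoint locations

feat = torch.randn(8, 512, 7, 7)  # e.g. a ResNet-18 feature map before pooling
print(SpatialSoftmax(7, 7)(feat).shape)  # torch.Size([8, 512, 2])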

Mujoco eq_active attribute not found

I was having an issue with the Kitchen environment training: I kept receiving this error and training never started

 co/index.py", line 628, in struct_indexer
    attr = getattr(struct, field_name)
AttributeError: 'MjModel' object has no attribute 'eq_active'

After some debugging, I found the issue is with the dm_control package version. I solved it by updating to version v1.0.16.
I could update the conda environment yaml, but I'm not sure whether everyone is hitting this error.

shapely.errors.TopologicalError: The operation 'GEOSIntersection_r' could not be performed.

Eval PushtImageRunner 1/1: 0%| | 0/300 [00:00<?, ?it/s][2024-03-05 13:47:41,822][shapely.geos][ERROR] - TopologyException: Input geom 1 is invalid: Self-intersection at or near point 210.5008311236281 342.48464083840656 at 210.5008311236281 342.48464083840656
(similar TopologyException ERROR and INFO lines repeat for about a dozen more self-intersection points)
ERROR: Received the following error from Worker-13: TopologicalError: The operation 'GEOSIntersection_r' could not be performed. Likely cause is invalidity of the geometry <shapely.geometry.multipolygon.MultiPolygon object at 0x7f8d033b37f0>
ERROR: Shutting down Worker-13.
(the same TopologicalError and shutdown messages repeat for Workers 24, 4, 25, 23, 21, 9, 5, 7, 38, 52, 39, and 53)
ERROR: Raising the last exception back to the main process.

train config

train_robomimic_image_workspace.yaml
What does this configuration file mean?
Will the data from robomimic be used to train with your method?
I want to collect my own data from robomimic and train with diffusion_policy. How can I do that?

diffusion_policy crashes when executing the reset_async function in gym's async_vector_env.py

I rented a V100 on Alibaba Cloud. After installing diffusion_policy there, following the repository's instructions under "Running for a single seed", I ran
python train.py --config-dir=. --config-name=image_pusht_diffusion_policy_cnn.yaml training.seed=42 training.device=cuda:0 hydra.run.dir='data/outputs/${now:%Y.%m.%d}/${now:%H.%M.%S}${name}${task_name}'
The program only completes one epoch of training normally. After the first epoch finishes, it crashes when the reset_async function in gym's async_vector_env.py is called. Is there a mistake in the run(self, policy: BaseImagePolicy) function in pusht_image_runner.py that triggers this exception, so the source code needs to be modified? Or is it that a single V100 simply cannot run diffusion_policy, which causes the exception described above?

Question regarding real-world experiment dataflow design

Hi Cheng, I found that in demo_real_robot.py, you sync data from different sources manually, instead of using ROS. Also, multiple realsense cameras are handled by yourself instead of multiple ROS nodes. Is there any reason behind this design?

EOF Error in "async_vector_env.py"

Hi, I am trying to run the command in the README.md for Reproducing Simulation Benchmark Results:


============= Initialized Observation Utils with Obs Spec =============

using obs modality: low_dim with keys: ['agent_pos']
using obs modality: rgb with keys: ['image']
using obs modality: depth with keys: []
using obs modality: scan with keys: []
/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and will be removed in 0.15, please use 'weights' instead.
  warnings.warn(
/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or `None` for 'weights' are deprecated since 0.13 and will be removed in 0.15. The current behavior is equivalent to passing `weights=None`.
  warnings.warn(msg)
[2023-12-02 23:07:29,677][diffusion_policy.model.diffusion.conditional_unet1d][INFO] - number of parameters: 2.515119e+08
Diffusion params: 2.515119e+08
Vision params: 1.119709e+07
pygame 2.1.2 (SDL 2.0.16, Python 3.9.15)
Hello from the pygame community. https://www.pygame.org/contribute.html
wandb: Currently logged in as: jehanyang (jehan_testcrew). Use `wandb login --relogin` to force relogin
wandb: wandb version 0.16.0 is available!  To upgrade, please run:
wandb:  $ pip install wandb --upgrade
wandb: Tracking run with wandb version 0.13.3
wandb: Run data is saved locally in /home/projectimit/diffusion_project/diffusion_policy/data/outputs/2023.12.02/23.07.27_train_diffusion_unet_hybrid_pusht_image/wandb/run-20231202_230734-1g8u9a71
wandb: Run `wandb offline` to turn off syncing.
wandb: Syncing run 2023.01.16-20.20.06_train_diffusion_unet_hybrid_pusht_image
wandb: ⭐️ View project at https://wandb.ai/jehan_testcrew/diffusion_policy_debug
wandb: 🚀 View run at https://wandb.ai/jehan_testcrew/diffusion_policy_debug/runs/1g8u9a71
Process Worker<AsyncVectorEnv>-55:                                              
Killed
(robodiff) projectimit@RCHI-CPU-4:~/diffusion_project/diffusion_policy$ Traceback (most recent call last):
  File "/home/projectimit/diffusion_project/diffusion_policy/diffusion_policy/gym_util/async_vector_env.py", line 622, in _worker_shared_memory
    command, data = pipe.recv()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
    raise EOFError
EOFError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/projectimit/diffusion_project/diffusion_policy/diffusion_policy/gym_util/async_vector_env.py", line 669, in _worker_shared_memory
    pipe.send((None, False))
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Process Worker<AsyncVectorEnv>-54:
Traceback (most recent call last):
  File "/home/projectimit/diffusion_project/diffusion_policy/diffusion_policy/gym_util/async_vector_env.py", line 622, in _worker_shared_memory
    command, data = pipe.recv()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 255, in recv
    buf = self._recv_bytes()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 419, in _recv_bytes
    buf = self._recv(4)
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 388, in _recv
    raise EOFError
EOFError

The above block repeats about 50 times.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/home/projectimit/diffusion_project/diffusion_policy/diffusion_policy/gym_util/async_vector_env.py", line 669, in _worker_shared_memory
    pipe.send((None, False))
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 211, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 416, in _send_bytes
    self._send(header + buf)
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 373, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
Exception in thread MsgRouterThr:
Traceback (most recent call last):
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/site-packages/wandb/sdk/interface/router.py", line 70, in message_loop
    msg = self._read_message()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/site-packages/wandb/sdk/interface/router_queue.py", line 36, in _read_message
    msg = self._response_queue.get(timeout=1)
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/queues.py", line 117, in get
    res = self._recv_bytes()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 217, in recv_bytes
    self._check_closed()
  File "/home/projectimit/miniforge3/envs/robodiff/lib/python3.9/multiprocessing/connection.py", line 141, in _check_closed
    raise OSError("handle is closed")
OSError: handle is closed

[Question] Why a_{t-1}, a_{t-2}, ... also contribute to diffusion loss?

Dear Cheng @cheng-chi,

Thank you for your elegant and inspiring codes! I have a little question about the loss computation of noise prediction.

I think that the actions before $a_t$ (i.e., $a_{t-1}, a_{t-2}, \dots$) in a sampled prediction horizon also contribute to the loss, because they are not masked out. Am I right about this?

If letting the noise on $a_{t-1}, a_{t-2}, \dots$ contribute to the loss is a deliberate design, I wonder what the reason is. Is it for action consistency, or for convenience? Just taking the actions from time $t$ onward and performing diffusion on $(a_t, a_{t+1}, \dots)$ also seems intuitive. Why, when we are at time $t$, do we still predict actions in the past (I know they are not returned by predict_action)?

Thank you for your time!

Regards,
Dongjie
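For context, a minimal sketch of the alternative Dongjie describes, i.e. masking the past action steps out of the noise-prediction loss (purely illustrative; as noted, the repository appears to let all horizon steps contribute):

import torch
import torch.nn.functional as F

def masked_diffusion_loss(noise_pred, noise, n_past_steps):
    # noise_pred, noise: (B, horizon, action_dim)
    # zero the contribution of action steps before the current time t
    # (the mean still divides by all elements; rescale if you need an
    # unbiased average over the unmasked steps)
    loss = F.mse_loss(noise_pred, noise, reduction='none')
    loss[:, :n_past_steps, :] = 0.0
    return loss.mean()

pred = torch.randn(2, 16, 7)
target = torch.randn(2, 16, 7)
print(masked_diffusion_loss(pred, target, n_past_steps=1))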

Possible Issue with Data Organization in Buffer Addition of kitchen_lowdim_dataset.py

Hello,

Thank you so much for your amazing work and beautiful code.

I'm currently working with the Franka Kitchen dataset, where the dimensions of the numpy array masks suggest that there are 566 demonstrations, each represented by a column of a (409, 566)-shaped array. Each column in masks indicates the existence of data per timestep for one demonstration.

However, I have encountered what seems to be a potential issue in the code where episodes are being added to the replay buffer. Here is the snippet:

data_directory = pathlib.Path(dataset_dir)
observations = np.load(data_directory / "observations_seq.npy")
actions = np.load(data_directory / "actions_seq.npy")
masks = np.load(data_directory / "existence_mask.npy")

self.replay_buffer = ReplayBuffer.create_empty_numpy()
for i in range(len(masks)):
    eps_len = int(masks[i].sum())
    obs = observations[i,:eps_len].astype(np.float32)
    action = actions[i,:eps_len].astype(np.float32)
    data = {
        'obs': obs,
        'action': action
    }
    self.replay_buffer.add_episode(data)

From this code, it appears that each iteration of the loop is supposed to handle a single demonstration. However, the indexing used (observations[i,:eps_len] and actions[i,:eps_len]) seems to imply that the demonstrations are organized by rows rather than columns. If each demonstration is indeed a column in the observations and actions arrays, the correct indexing should possibly be column-wise rather than row-wise.

Could you please confirm whether the demonstrations are intended to be represented as rows or columns in the dataset? If they are indeed columns, would the correct approach be to modify the indexing to reflect this structure?

Thank you for looking into this matter. I am looking forward to your clarification.

Best regards

Question about the controller used for diffusion policy

Hi,

Thank you for your fantastic work and beautiful code! As described in the paper, you use position control instead of velocity control on the robomimic tasks for diffusion policy. However, I didn't find the corresponding changes to the robomimic environment controller in the code, except for the use of "abs_action". Did I miss something?

Thank you so much for your time.

Best regards,

Weikang

Regarding pretrained models

Hello,

Thanks again for this great project. It would be great if you could help diagnose this issue regarding the pretrained models.

When I try to evaluate your pretrained models of the hybrid CNN setting, I found them not working properly on Push-T, Transport ph, and transport mh. There could be more but I haven't tried them yet.
Basically, the action trajectory is relatively reasonable (not random noisy actions), but the agent just could not finish the task. (Push-T mean score: 0.09, Transport mean score: 0)
However, when I train a model from scratch and evaluate it, it works fine, which suggests the evaluation code is correct.
I tested on two machines at different locations, and all models were downloaded directly from your website. I also performed an integrity check and confirmed that the two copies of the models on the two machines are identical. The training code properly loads the model file, and the number of epochs matches the filename, but it just does not generate the correct actions. After days of debugging, I could not find any promising direction to look into.

So could you please share some insights on what may cause this issue?

Thank you so much!

Best regards,

Extending to a new robot

Hi,

Really great work. We are trying to extend this codebase to a different robot and with a significantly different learning setup.

In your README, it is mentioned:

"Most of our implementations of Dataset uses a combination of ReplayBuffer and SequenceSampler to generate samples. Correctly handling padding at the beginning and the end of each demonstration episode according to To and Ta is important for good performance. Please read our SequenceSampler before implementing your own sampling method."

Is it really that simple to understand exactly how SequenceSampler works? I feel like it's a bit unapproachable and it would take quite some time to really parse the code and understand what's happening. Has anyone else tried to extend the code without adhering to this structure?

I am in the process of doing that, but as I continue, I am significantly altering parts of the training code because, again, I am not using the existing SequenceSampler, etc. Any pointers would be helpful!

Visualization of the trajectories

I like your work, I have learned so much from your code and paper. 👍

I still have a question about how to visualize intermediate results, for example the generated trajectories at different denoising steps K.

Do you have any suggestions about this? I would greatly appreciate it.
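A minimal, self-contained sketch of one way to do this: keep every intermediate sample of the reverse-diffusion loop and plot a few of them. The toy denoiser below stands in for the policy's noise-prediction network; in the real code you would record the trajectory inside the policy's conditional sampling loop instead:

import torch
import matplotlib.pyplot as plt
from diffusers import DDPMScheduler

class ToyDenoiser(torch.nn.Module):
    # stand-in for the policy's noise-prediction U-Net
    def forward(self, x, t):
        return 0.1 * x

scheduler = DDPMScheduler(num_train_timesteps=100)
scheduler.set_timesteps(100)
model = ToyDenoiser()

trajectory = torch.randn(1, 16, 2)  # (batch, horizon, action_dim)
intermediates = []
for t in scheduler.timesteps:
    noise_pred = model(trajectory, t)
    trajectory = scheduler.step(noise_pred, t, trajectory).prev_sample
    intermediates.append(trajectory[0].clone())

# plot the 2D action trajectory at a few denoising steps
for k in [0, 50, 99]:
    traj = intermediates[k]
    plt.plot(traj[:, 0], traj[:, 1], label=f'after step {k}')
plt.legend()
plt.show()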

Running eval.py "main workspace = cls(cfg, output_dir=output_dir) TypeError: __init__() got an unexpected keyword argument 'output_dir'?

Appreciate your work and detailed github page, well done!
Just a minor question: when I reproduced the "pusht" example following the instructions, it all worked out well. However, when I switch to other examples (I only changed the corresponding yaml and ckpt files and names), it always throws the error below:

~/diffusion_policy$ python eval.py --checkpoint data/epoch=5900-test_mean_score=1.000.ckpt --output_dir data/transport_eval_output --device cuda:0
Traceback (most recent call last):
  File "/home/ubuntu/diffusion_policy/eval.py", line 64, in <module>
    main()
  File "/home/ubuntu/anaconda3/envs/robodiff/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/robodiff/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/ubuntu/anaconda3/envs/robodiff/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/ubuntu/anaconda3/envs/robodiff/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/ubuntu/diffusion_policy/eval.py", line 34, in main
    workspace = cls(cfg, output_dir=output_dir)
TypeError: __init__() got an unexpected keyword argument 'output_dir'

I get the same error for the "can" case, for example. Why does this happen, and how can I fix it?
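My guess at the cause (an assumption, to be confirmed by the authors): eval.py constructs the workspace with cls(cfg, output_dir=output_dir), so any workspace subclass whose __init__ does not accept and forward output_dir would raise exactly this TypeError. A minimal sketch of the shape such a fix could take:

# assumed import; BaseWorkspace lives in the repository's workspace package
from diffusion_policy.workspace.base_workspace import BaseWorkspace

class TrainSomeTaskWorkspace(BaseWorkspace):  # class name is hypothetical
    def __init__(self, cfg, output_dir=None):
        # accepting and forwarding output_dir lets eval.py instantiate the
        # workspace with cls(cfg, output_dir=output_dir)
        super().__init__(cfg, output_dir=output_dir)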

Running the diffusion policy colab, subprocess-exited-with-error

I am attempting to run the state-based colab. But the first cell seems to be running into an issue.

[screenshot omitted]

I think the first cell tries to run the installation code, but I get the error as shown in the screenshot above. For your copy/paste convenience, the error message is:

Python 3.9.16
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Is there something that I am missing with regards to how to run and use colab? I am also running into the same error with the vision-based Diffusion Policy colab. For colab, normally I just run the cells by clicking the arrow that runs the cell, or just SHIFT+ENTER.

ConditionalUnet1D up_modules

Hi, thanks for this amazing project!

I'm having trouble using the ConditionalUnet1D as part of a custom low dimensional policy.
I have a pretty simple setup with a set of 12-dimensional actions and corresponding 42-dimensional observations. I want to predict the action for a given observation, i.e. horizon=1, n_action_steps=1 and n_obs_steps=1.

When calling the ConditionalUnet1D model (i.e. the forward method) I keep getting a mismatch of dimensions error:

for idx, (resnet, resnet2, upsample) in enumerate(self.up_modules):
    x = torch.cat((x, h.pop()), dim=1)

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list.

Here is the link to the line in the repo.

This seems to stem from the previous iteration of the for loop where the upsample call returns a tensor with dimensions (256, 512, 2). This is incompatible with the next entry in the h list which has dimensions (256, 512, 1). Due to the mismatch in dimension 2, they cannot be concatenated along axis 1.

If I simply comment out the upsample call (i.e. this line), everything seems to work fine and I even get reasonable results.

Might there be an issue with the upsample module or did I not configure my dimensions correctly?

Thanks!
Jannes
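A quick reproduction of the shape mismatch with horizon=1, using Conv1d/ConvTranspose1d parameters that (to my understanding) mirror the repository's Downsample1d and Upsample1d: a temporal length of 1 survives downsampling but is upsampled to 2, which no longer matches the skip connection:

import torch

down = torch.nn.Conv1d(512, 512, kernel_size=3, stride=2, padding=1)
up = torch.nn.ConvTranspose1d(512, 512, kernel_size=4, stride=2, padding=1)

x = torch.randn(1, 512, 1)   # temporal length 1, i.e. horizon=1
h = down(x)
print(h.shape)               # torch.Size([1, 512, 1]): length stays 1
print(up(h).shape)           # torch.Size([1, 512, 2]): length becomes 2

If that is indeed the cause, padding the horizon up to a multiple of 2**n_downsamples (and slicing out the single action you need from the prediction) is probably safer than deleting the upsample call.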

Question: about the franka's end-effector

Thanks for your novel work!

I'm curious: what is the end-effector of the Panda robot in the 6DoF flipping task? Did you 3D-print it, or is it an official/third-party component? I'd appreciate it if you could point me in the right direction.

Realsense grabbing data issue

In single_realsense.py there is the following code:

# grab data
data = dict()
data["camera_receive_timestamp"] = receive_time
# realsense reports in ms
data["camera_capture_timestamp"] = frameset.get_timestamp() / 1000
if self.enable_color:
    color_frame = frameset.get_color_frame()
    data["color"] = np.asarray(color_frame.get_data())
    t = color_frame.get_timestamp() / 1000
    data["camera_capture_timestamp"] = t
    # print('device', time.time() - t)
    # print(color_frame.get_frame_timestamp_domain())
if self.enable_depth:
    data["depth"] = np.asarray(frameset.get_depth_frame().get_data())
if self.enable_infrared:
    data["infrared"] = np.asarray(
        frameset.get_infrared_frame().get_data()
    )

I'm wondering why the update for camera_capture_timestamp only occurs when color_frame is involved, but not when dealing with depth and infrared. Is it because enable_color is always set to True, so the timestamp is always based on the acquisition of the color image for all three types of images?

Goal conditioning via FiLM?

Authors,

First, I greatly appreciate your insightful paper and well-written code!

My question is in regards to goal-conditioning. In section 3.1 of your paper (Network Architecture Options), when discussing the CNN-based Diffusion Policy, you mention:

"However, goal conditioning is still possible with the same FiLM conditioning method used for observations."

I wonder if you could comment along these lines a bit? As the sampled trajectory is the conditional probability p(A|O), would one simply encode the goal observation and concatenate it to the initializing observations Ot? Or do you mean something else, like a restructuring to sample the joint trajectory p(A, O)? I'm sorry, I think I'm missing something.

Thanks very much for your time,

Robert Mash
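For readers unfamiliar with FiLM, a minimal sketch of the conditioning mechanism Robert asks about (illustrative; in the CNN-based policy the conditioning vector is the observation embedding, and goal conditioning in the sense asked about would amount to concatenating a goal embedding onto it):

import torch
import torch.nn as nn

class FiLMBlock(nn.Module):
    # FiLM: the conditioning vector predicts a per-channel scale and bias
    # that modulate the features of the temporal CNN
    def __init__(self, cond_dim: int, channels: int):
        super().__init__()
        self.to_scale_bias = nn.Linear(cond_dim, channels * 2)

    def forward(self, x, cond):
        # x: (B, C, T) features; cond: (B, cond_dim)
        scale, bias = self.to_scale_bias(cond).chunk(2, dim=-1)
        return x * scale.unsqueeze(-1) + bias.unsqueeze(-1)

film = FiLMBlock(cond_dim=260, channels=512)
print(film(torch.randn(8, 512, 16), torch.randn(8, 260)).shape)
# goal conditioning could then be as simple as:
# cond = torch.cat([obs_embedding, goal_embedding], dim=-1)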

Slow Diffusion Policy Performance on Transport MH Dataset (Image Input)

Dear @cheng-chi,

I would like to bring to your attention a performance issue I've encountered when working with the Transport MH dataset (image input) in robomimic. In particular, the performance of the diffusion policy seems to be significantly slower than expected.

Here are the details of my setup:

  • Hardware: NVIDIA TITAN RTX
  • Iteration speed: approximately 1.20 iterations/second
  • Time per epoch: approximately 40 minutes

The command I use to run the training process is as follows:

python train.py --config-dir configs/image/transport_mh/diffusion_policy_cnn --config-name=config.yaml training.seed=42 training.device=cuda:2 hydra.run.dir=data/outputs/${now:%Y.%m.%d}/${now:%H.%M.%S}_${name}_${task_name} dataloader.batch_size=64 dataloader.num_workers=8

Given these circumstances, I was wondering if there might be some room for optimization or if this is the expected speed considering the complexity of the task.

Also, could you provide details about the hardware you are using and the amount of time it typically takes for the diffusion policy to train on your setup? This could help me understand if what I am experiencing is within the expected range.

Looking forward to your insights.

Best Regards,
@shim0114

Real Robot Data Collection Issue

Hello 🤝, @cheng-chi. I encountered an issue while applying your code to collect data on an iiwa7 robot. When the arm reaches certain positions, interpolating orientations with Euler angles causes sudden jerks or accelerations, which can be dangerous. I suggest using quaternions for interpolation instead; I have tested this and found that quaternion interpolation resolves the problem described.
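A minimal sketch of the suggested quaternion (slerp) interpolation using SciPy (illustrative; swapping this into the controller's pose interpolation is left to the reader):

import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# two end-effector orientations where naive Euler interpolation misbehaves:
# interpolating yaw from +170° to -170° component-wise sweeps 340° through
# zero instead of taking the short 20° path through ±180°
key_times = [0.0, 1.0]
key_rots = Rotation.from_euler('xyz', [[0, 0, 170], [0, 0, -170]], degrees=True)

slerp = Slerp(key_times, key_rots)
print(slerp([0.5]).as_euler('xyz', degrees=True))  # ≈ [0, 0, ±180]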

questions about reproduce on real ur5 robot

Hello:
@cheng-chi Thank you very much for your work. Currently, I want to reproduce the diffusion policy on the real ur5 robot, but I still have the following questions. Can you give me some advice?
1. I observed that you recorded more than a hundred demonstrations for the PushT task, and I tried training on these demonstrations directly with your code. However, each training epoch takes about 10 minutes on my machine, so the 600 epochs in the configuration file would take a very long time; is this normal? What is the minimum number of epochs required to train a task? I noticed that you seem able to train a policy in only 12 hours. My GPU is an RTX 3060 12G.
2. In addition, I noticed that you do not provide demonstration examples for the cup-righting and spilling tasks. If I want to train a brand-new task, do I only need to run the scripts you provided on GitHub in the same way? Are the action spaces the same across these different tasks?
Thank you and looking forward to your reply!🥺

real experiments issue

Hi,

In real experiments, the output (action sequences) always points in a weird direction. Can you please advise me on the following questions:

  1. Do we need to calibrate the cameras or apply a transformation matrix between the front camera and the robot?
  2. Do we need to rebuild the simulation environment exactly (camera positions)?

By the way, I was stuck in a loop on the problem of how to relate the camera and the end effector in a global coordinate frame. After revisiting diffusion policy in detail, I think the desired end-effector pose is output in base coordinates, and the object positions captured by the camera are also effectively mapped into the base frame by the diffusion model, rather than remaining in the camera frame. Please correct me if I misunderstood anything, thanks!
I appreciate your kind reply and help!

Best regards
