zhouxian / act3d-chained-diffuser

A unified architecture for multimodal multi-task robotic policy learning.

Python 94.51% C 4.01% Makefile 0.04% Batchfile 0.06% Lua 0.93% Shell 0.46%

act3d-chained-diffuser's Issues

Multi-task Training

Hi,

I found that all of the code (Act3D & DiffusionTracjectory) uses single-task training, but the paper reports multi-task performance. How should the models be trained in the multi-task setting?

Thanks.

Distributed training not working

Hi, thanks for releasing the code!

I am trying to run distributed multi-GPU training, but the process always launches on a single GPU, even though the main_kepose.py script prints torch.cuda.device_count() > 1.

Is there a parameter I need to change, or do I need to define the world size somewhere?
Right now I get torch.distributed.get_world_size() = 1.

PS. I am running on a cluster with user-level privileges; I am not sure whether that makes a difference.
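
One thing worth checking here, offered as a sketch rather than a confirmed diagnosis: torch.distributed.get_world_size() reflects how the processes were launched, not how many GPUs are visible. Launchers such as torchrun export WORLD_SIZE, RANK, and LOCAL_RANK to each process they start. The helper below (launch_info is a hypothetical name, not part of this repository) reads those variables with the standard library only, so it can be called at the top of the training script to see whether a launcher was involved at all.

```python
import os
from typing import Mapping, Optional


def launch_info(env: Optional[Mapping[str, str]] = None) -> dict:
    """Summarise the distributed-launch variables visible to this process."""
    env = os.environ if env is None else env
    return {
        "world_size": int(env.get("WORLD_SIZE", "1")),
        "rank": int(env.get("RANK", "0")),
        "local_rank": int(env.get("LOCAL_RANK", "0")),
        # Without a launcher these variables are normally unset, so only one
        # process starts even when several GPUs are visible.
        "launched_by_launcher": "WORLD_SIZE" in env,
    }


if __name__ == "__main__":
    print(launch_info())
```

If launched_by_launcher is False, starting the script with something like torchrun --nproc_per_node=<num_gpus> main_kepose.py ... is one thing to try. Whether this repository's scripts expect that launcher is an assumption.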

Request for checkpoint for local trajectory diffuser

Hi, thanks for sharing your work. I checked out the checkpoints you posted in 3d_diffuser_actor, where I could find checkpoints for Act3D and 3d diffuser actor.

Do the checkpoints for 3d_diffuser_actor also work with the local trajectory diffuser in act3d_chained_diffuser? If not, could you kindly provide the checkpoints? Thanks so much.

instructions.pkl issue

In the instructions.pkl file, each task only has variation 1. Is this what's expected?

Hyperparameters for single & multitask training

Hello! Thank you for the code and the interesting results.

I have attempted to reproduce the results and noticed that the hyperparameters for both single-task and multi-task training are missing. I did not find any information on how the single-task and multi-task setups differ in either the code or the paper.

Could you please clarify the difference? Are the hyperparameters the same for both settings?

timeline for code release

Hi, your work looks amazing! Is there a timeline for the full release of the code? Thanks.

Bug in data_gen.py

    def __getitem__(self, index: int) -> None:
        ...
        try:
            demo, state_ls, action_ls = get_observation(
                task, variation, episode, self.env
            )
        except (FileNotFoundError, RuntimeError, IndexError, EOFError) as e:
            print(e)
            return

        state_ls = einops.rearrange(
            state_ls,
            "t 1 (m n ch) h w -> t n m ch h w",
            ch=3,
            n=len(args.cameras),
            m=2,
        )

        frame_ids = list(range(len(state_ls) - 1))
        num_frames = len(frame_ids)
        attn_indices = get_attn_indices_from_demo(task, demo, args.cameras)

        if (task in self.variable_lengths and num_frames > self.max_eps_dict[task]) or (
            task not in self.variable_lengths and num_frames != self.max_eps_dict[task]
        ):
            print(f"ERROR ({task}, {variation}, {episode})")
            print(f"\t {len(frame_ids)} != {self.max_eps_dict[task]}")
            return

        state_dict: List = [[] for _ in range(5)]
        print("Demo {}".format(episode))
        state_dict[0].extend(frame_ids)
        state_dict[1].extend(state_ls[:-1])
        state_dict[2].extend(action_ls[1:])
        state_dict[3].extend(attn_indices)
        state_dict[4].extend(action_ls[:-1])  # gripper pos

        np.save(taskvar_dir / f"ep{episode}.npy", state_dict)  # type: ignore

I encountered an issue in the __getitem__ method of a dataset processing script. The line frames.insert(0, 0) adds an element at position 0, but after attn_indices = get_attn_indices_from_demo(task, demo, args.cameras), there's no further manipulation of attn_indices. This discrepancy leads to an error when saving the data using np.save(taskvar_dir / f"ep{episode}.npy", state_dict). The error message is as follows:

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (5,) + inhomogeneous part.

How can I modify the script to resolve this issue?

The problem seems to arise from inconsistent data shapes within state_dict, especially regarding attn_indices, which does not align with the modifications made to other data sequences in the script. Any suggestions or corrections would be greatly appreciated.
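
A quick way to test the hypothesis above is to compare the lengths of the five lists just before saving. The sketch below uses only the standard library; FIELD_NAMES, report_field_lengths, and mismatched_fields are hypothetical labels for the five slots of state_dict, not identifiers from the repository.

```python
from typing import Dict, List, Sequence

# Hypothetical labels for the five slots filled in __getitem__.
FIELD_NAMES = ["frame_ids", "states", "actions", "attn_indices", "gripper_pos"]


def report_field_lengths(state_dict: Sequence[Sequence]) -> Dict[str, int]:
    """Map each slot of state_dict to its length."""
    return {name: len(field) for name, field in zip(FIELD_NAMES, state_dict)}


def mismatched_fields(state_dict: Sequence[Sequence]) -> List[str]:
    """Return the slots whose length differs from the number of frame ids."""
    lengths = report_field_lengths(state_dict)
    expected = lengths["frame_ids"]
    return [name for name, n in lengths.items() if n != expected]


if __name__ == "__main__":
    # Toy episode: three frames, but attn_indices has one extra entry.
    toy = [[0, 1, 2], ["s0", "s1", "s2"], ["a1", "a2", "a3"],
           ["i0", "i1", "i2", "i3"], ["a0", "a1", "a2"]]
    print(mismatched_fields(toy))  # ['attn_indices']
```

Independently of any length mismatch, NumPy 1.24 and later raise this ValueError whenever a nested list cannot be converted to a regular array, where earlier versions fell back to an object array with a warning. Wrapping the list explicitly, as in np.save(path, np.array(state_dict, dtype=object)), is a commonly used workaround; whether it suits the loader in this repository is untested here.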

Data set root does not exists: /projects/act3d-chained-diffuser/data_preprocessing/c2farm

Hi, thank you for releasing the code! I tried to run it, but encountered the following error while executing data_gen.py:

Exception has occurred: RuntimeError
Data set root does not exists: /projects/act3d-chained-diffuser/data_preprocessing/c2farm
  File "/projects/act3d-chained-diffuser/RLBench/rlbench/environment.py", line 75, in _check_dataset_structure
    raise RuntimeError(
  File "/projects/act3d-chained-diffuser/RLBench/rlbench/environment.py", line 67, in __init__
    self._check_dataset_structure()
  File "/projects/act3d-chained-diffuser/utils/utils_with_rlbench.py", line 324, in __init__
    self.env = Environment(
  File "/projects/act3d-chained-diffuser/data_preprocessing/data_gen.py", line 78, in __init__
    self.env = RLBenchEnv(
  File "/projects/act3d-chained-diffuser/data_preprocessing/data_gen.py", line 171, in <module>
    dataset = Dataset(args)
RuntimeError: Data set root does not exists: /projects/act3d-chained-diffuser/data_preprocessing/c2farm

What is the c2farm folder, and where can I get its contents?
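
RLBench raises this error when the directory passed as its dataset root is missing. The path in the traceback sits next to data_gen.py, which suggests (an inference, not confirmed by the source) that c2farm is only a default location for raw demonstrations that have to be generated or downloaded first. A small standard-library check like the following, where list_task_dirs is a hypothetical helper, can show what a candidate folder contains before it is handed to the script:

```python
from pathlib import Path
from typing import List


def list_task_dirs(dataset_root: str) -> List[str]:
    """Return the sub-folder names under dataset_root, or raise with a hint."""
    root = Path(dataset_root).expanduser()
    if not root.is_dir():
        raise FileNotFoundError(
            f"Dataset root not found: {root}. Point the script at the folder "
            "that holds the generated RLBench demonstrations."
        )
    # Plain files are ignored; only folders can hold per-task demonstrations.
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

An empty list or a FileNotFoundError both mean the folder cannot serve as a dataset root yet.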

Can you release the ckpt for multi-task evaluation?

Hi, I'm very interested in your work. I have attempted to reproduce the multi-task training but did not get a reasonable result. Can you release the ckpt for evaluation?

AttributeError: 'Act3D' object has no attribute 'prepare_action'

Hi there, I am running the evaluation with bash online_evaluation/eval_single_task_mandoo.sh, but it fails with an AttributeError: the model class Act3D does not contain the method prepare_action.

Error when generating raw train and val data

Some tasks report errors when generating data.
Will this have a negative impact on the results?

I1129 19:28:34.234854 140468340590400 task_environment.py:152] Bad demo. Error in task close_drawer. Demo was completed, but was not successful.
Process 0 // Task: reach_target // Variation: 0 // Demo: 27
Process 2 // Task: close_drawer // Variation: 0 // Demo: 15
Process 0 // Task: reach_target // Variation: 0 // Demo: 28
Process 1 // Task: close_fridge // Variation: 0 // Demo: 8
Process 0 // Task: reach_target // Variation: 0 // Demo: 29
Process 2 // Task: close_drawer // Variation: 0 // Demo: 16
Process 0 // Task: reach_target // Variation: 0 // Demo: 30
Process 0 // Task: reach_target // Variation: 0 // Demo: 31
Process 2 // Task: close_drawer // Variation: 0 // Demo: 17
Process 0 // Task: reach_target // Variation: 0 // Demo: 32
Process 1 // Task: close_fridge // Variation: 0 // Demo: 9
Process 0 // Task: reach_target // Variation: 0 // Demo: 33
Process 2 // Task: close_drawer // Variation: 0 // Demo: 18
Process 0 // Task: reach_target // Variation: 0 // Demo: 34

About condition on start-end pose when diffusion model sampling

Dear authors,

Hi!

When I read the sampling part of compute_trajectory in model/trajectory_optimization/diffusion_model.py, I am confused about the effect of _use_goal_at_test.

# Condition on start-end pose
# ...
# end pose
if self._use_goal_at_test:
	for d in range(len(cond_data)):
		neg_len_ = -trajectory_mask[d].sum().long()
		cond_data[d][neg_len_ - 1] = goal_gripper[d]
		cond_mask[d][neg_len_ - 1:] = 1
cond_mask = cond_mask.bool()

Here is my question:
During evaluation, the sampled trajectory is not overwritten with the end pose, because _use_goal_at_test is set to False here. Why does evaluation keep the end pose predicted by the diffusion model instead of overwriting it with the action predicted by Act3D?

I am looking forward to your reply. Thank you for your time.

Regards,
Dongjie
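
To make the index arithmetic in the quoted loop concrete, here is a minimal pure-Python sketch for a single trajectory. It assumes that trajectory_mask marks padded steps with 1, which the excerpt does not state; condition_on_goal and padding_mask are hypothetical names, plain lists stand in for tensors, and a scalar stands in for a gripper pose.

```python
from typing import List


def condition_on_goal(
    cond_data: List[float],
    cond_mask: List[bool],
    padding_mask: List[bool],
    goal: float,
) -> None:
    """Write the goal into the last non-padded step and mark it as fixed."""
    neg_len = -sum(padding_mask)  # minus the number of padded steps
    last_valid = neg_len - 1      # negative index of the last real step
    cond_data[last_valid] = goal
    # The goal step and every padded step after it are treated as known.
    for i in range(len(cond_mask) + last_valid, len(cond_mask)):
        cond_mask[i] = True


data = [0.0] * 5
mask = [False] * 5
condition_on_goal(data, mask, [False, False, False, True, True], goal=9.0)
print(data)  # [0.0, 0.0, 9.0, 0.0, 0.0]
print(mask)  # [False, False, True, True, True]
```

With _use_goal_at_test set to False this overwrite is skipped at sampling time, so the final step comes from the diffusion model's own output, which is the behaviour the question asks about.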

Influence of image resolution

Is there any experiment showing the effect of the image resolution you use (256 x 256) compared with PerAct's (128 x 128)?

Missing trajectory generation in data_preprocessing

Hi, thanks for your work and the published code.

When running data_preprocessing/compute_workspace_bounds.py, the following error occurs:

Exception has occurred: IndexError

index 5 is out of bounds for axis 0 with size 5
  File "/projects/act3d-chained-diffuser/datasets/dataset_engine.py", line 240, in <listcomp>
    self._interpolate_traj(episode[5][i]) for i in frame_ids
  File "/projects/act3d-chained-diffuser/datasets/dataset_engine.py", line 239, in __getitem__
    traj_items = [
  File "/projects/act3d-chained-diffuser/data_preprocessing/compute_workspace_bounds.py", line 85, in <module>
    ep = dataset[i]
IndexError: index 5 is out of bounds for axis 0 with size 5

This error is due to the missing trajectory generation in data_preprocessing/data_gen.py.
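
The data_gen.py excerpt quoted in an earlier issue on this page fills state_dict with five lists (indices 0 to 4), while the loader in this traceback reads episode[5]. A small guard makes that mismatch explicit before indexing. It is only a sketch: TRAJECTORY_SLOT and has_trajectory_slot are hypothetical names, and the meaning of slot 5 is inferred from the traceback alone.

```python
from typing import Sequence

TRAJECTORY_SLOT = 5  # index read by the loader in the traceback above


def has_trajectory_slot(episode: Sequence) -> bool:
    """True if the saved episode has an entry at TRAJECTORY_SLOT."""
    return len(episode) > TRAJECTORY_SLOT


if __name__ == "__main__":
    five_slots = [[] for _ in range(5)]  # what the quoted data_gen.py saves
    print(has_trajectory_slot(five_slots))  # False
```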

How to jointly optimize action detector and trajectory diffuser?

Dear authors,

Thank you for your inspiring work!

I noticed that the ChainedDiffuser paper mentions "...train both the action detector and the trajectory diffuser jointly" and "we train the first 2 terms till convergence, and then add the 3rd term for joint optimization". However, I could not find any code for joint optimization, because the only model in main_trajectory.py is a DiffusionPlanner.

Could you please explain in more detail how Act3D and DiffusionPlanner are actually trained jointly?

Regards,
Dongjie

Code gets stuck in sampling

When running on the turn_tap dataset, the code gets stuck in sample_ghost_points_uniform_sphere because it cannot find a point close enough to the center (the lr norm > 1 for nearly all the points).

What is the recommended solution in this case? I am afraid that directly changing the point selection threshold will hurt performance.
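
If the hang comes from a rejection loop that rarely accepts a candidate, one alternative is to sample inside the ball directly, so that every draw is accepted. The following is a standard-library sketch of that technique, not the repository's sample_ghost_points_uniform_sphere; sample_in_ball and its parameters are hypothetical, and how the real function handles workspace bounds is not covered.

```python
import math
import random
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


def sample_in_ball(center: Point, radius: float, n: int,
                   rng: Optional[random.Random] = None) -> List[Point]:
    """Draw n points uniformly from a solid 3D ball without rejection."""
    rng = rng or random.Random()
    points: List[Point] = []
    for _ in range(n):
        # A normalised Gaussian vector gives a uniformly distributed direction.
        direction = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in direction)) or 1.0
        # The cube root makes the density uniform in volume, not in radius.
        r = radius * rng.random() ** (1.0 / 3.0)
        points.append(tuple(c + r * d / norm for c, d in zip(center, direction)))
    return points
```

This keeps the acceptance region unchanged and only changes how candidates are generated, so the selection threshold itself does not need to be touched.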

Success rate < 1.0 reported in validate_data_generation.py

Hi, thank you for the published code!

I am not very familiar with RLBench and wonder how to interpret a success rate below 1.0 for a variation.

Does it mean that the episode data is bad and that such episodes should be dropped?

Regarding Evaluation Checkpoints

Hi, Could you please provide evaluation checkpoints for the model? Thanks, Harsh

Compatibility issues

Hi Zhouxian,

Amazing work!

I tried to run your code following the steps in the README, but I got some errors with the module "tap". Could you tell me which tap package you installed and how you installed it?

Here are some error messages:

ModuleNotFoundError: No module named 'tap'

   except MemcachedError, e:
                         ^
SyntaxError: invalid syntax

    import exceptions
ModuleNotFoundError: No module named 'exceptions'

Best,
XP
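
These three messages are consistent with an unrelated Python 2 era package having been installed under the name tap: except MemcachedError, e: and import exceptions are Python 2 constructs that fail on Python 3. The tap module that provides a Tap argument-parser base class is distributed on PyPI as typed-argument-parser; that this repository depends on that package is an assumption here, not something the issue confirms. The hypothetical helper below, diagnose_tap, reports which situation the current environment is in.

```python
import importlib


def diagnose_tap() -> str:
    """Classify the installed `tap` module as 'missing', 'broken', 'other' or 'ok'."""
    try:
        module = importlib.import_module("tap")
    except ModuleNotFoundError as err:
        # err.name is the module that could not be found; anything other than
        # "tap" means a package named tap exists but fails on its own imports.
        return "missing" if err.name == "tap" else "broken"
    except Exception:
        # For example a SyntaxError raised by Python 2 code in the package.
        return "broken"
    # typed-argument-parser exposes the Tap base class at the top level.
    return "ok" if hasattr(module, "Tap") else "other"


if __name__ == "__main__":
    print(diagnose_tap())
```

If the result is anything other than ok, uninstalling the conflicting package and then running pip install typed-argument-parser is one thing to try.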

CoppeliaSim symlink error

Hi. I'm using this paper for my research and am currently reproducing the results. I ran into this issue while installing PyRep, specifically with the command below:

" pip install -r requirements.txt; pip install -e .; cd .. "

Please see the below screenshot for more details.

The command "pip install -e ." is what causes the trouble.

Could you release the trained checkpoint for evaluation?

Hello, I am very interested in your work. Could you release the trained checkpoint for evaluation?
