kvablack / susie

Code for subgoal synthesis via image editing

Home Page: https://rail-berkeley.github.io/susie

License: MIT License

Python 100.00%

susie's Introduction

susie

Code for the paper Zero-Shot Robotic Manipulation With Pretrained Image-Editing Diffusion Models.

This repository contains the code for training the high-level image-editing diffusion model on video data. For training the low-level policy, see the BridgeData V2 repository; we use the gc_ddpm_bc agent, unmodified, with an action prediction horizon of 4 and the delta_goals relabeling strategy.

For integration with the CALVIN simulator and reproducing our simulated results, see our fork of the calvin-sim repo and the corresponding documentation in the BridgeData V2 repository.

Creating datasets: this repo uses dlimp for dataloading. Check out the scripts/ directory inside dlimp for creating TFRecords in a compatible format.
Installation: run pip install -r requirements.txt to install the package versions confirmed to work with this codebase, then run pip install -e . to install the package itself. Only tested with Python 3.10. You'll also have to manually install Jax for your platform (see the Jax installation instructions). Make sure you have the Jax version specified in requirements.txt (rather than using --upgrade as suggested in the Jax docs).
Training: once the missing dataset paths have been filled in inside base.py, you can start training by running python scripts/train.py --config configs/base.py:base.
Evaluation: robot evaluation scripts are provided in the scripts/robot directory. You probably won't be able to run them, since you don't have our robot setup, but they are there for reference. See create_sample_fn in susie/model.py for canonical sampling code.
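
The advice above about matching the Jax version in requirements.txt can be checked from Python. This is a minimal sketch using only the standard library. It assumes the file pins Jax with a line of the form jax==X.Y.Z (the exact format of requirements.txt is an assumption), and pinned_version and check_pin are hypothetical helper names, not part of this repository.

```python
import re
from importlib import metadata


def pinned_version(requirements_text, package):
    """Return the version pinned as `package==X.Y.Z`, or None if not pinned that way."""
    # Anchored at line start so that "jax" does not match "jaxlib".
    pattern = re.compile(rf"^{re.escape(package)}==([^\s;#]+)", re.IGNORECASE)
    for line in requirements_text.splitlines():
        match = pattern.match(line.strip())
        if match:
            return match.group(1)
    return None


def check_pin(requirements_text, package="jax"):
    """Compare the installed version of `package` against the pinned one."""
    wanted = pinned_version(requirements_text, package)
    if wanted is None:
        return f"{package} is not pinned with '==' in the given requirements"
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed (wanted {wanted})"
    if installed == wanted:
        return f"{package} {installed} matches the pin"
    return f"{package} {installed} is installed, but {wanted} is pinned"
```

From the repository root, print(check_pin(open("requirements.txt").read())) would report whether the installed Jax matches the pin.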

Model Weights

The UNet weights for our best-performing model, trained on BridgeData and Something-Something for 40k steps, are hosted on HuggingFace. They can be loaded using FlaxUNet2DConditionModel.from_pretrained("kvablack/susie", subfolder="unet"). Use with the standard Stable Diffusion v1-5 VAE and text encoder.

Here's a quickstart for getting out-of-the-box subgoals using this repo:

from susie.model import create_sample_fn
from susie.jax_utils import initialize_compilation_cache
import requests
import numpy as np
from PIL import Image

initialize_compilation_cache()

IMAGE_URL = "https://rail.eecs.berkeley.edu/datasets/bridge_release/raw/bridge_data_v2/datacol2_toykitchen7/drawer_pnp/01/2023-04-19_09-18-15/raw/traj_group0/traj0/images0/im_12.jpg"

sample_fn = create_sample_fn("kvablack/susie")
image = np.array(Image.open(requests.get(IMAGE_URL, stream=True).raw).resize((256, 256)))
image_out = sample_fn(image, "open the drawer")

# to display the images if you're in a Jupyter notebook
display(Image.fromarray(image))
display(Image.fromarray(image_out))


susie's Issues

Training code on Calvin

Hi, Thanks for your excellent work!

I'm wondering whether you could release the training code for CALVIN?

low-level GCBC policy checkpoint

Hi,

Thanks for the great work. I cannot find the checkpoint for the low-level GCBC policy. Could you please provide the checkpoint for the low-level GCBC policy trained on the bridgedata_v2 dataset? Thanks again. I would really appreciate your help.

Error: failed to EGL with glad.

If I set 'show_gui' to true, the error won't occur. If I want to render in headless mode, what should I do? Thanks.

Unable to run train.py with basic settings.

Hello, I'm trying to run train.py, but I get an error when loading the instruct-pix2pix model. I'm not sure which model this refers to on Hugging Face.

Repository Not Found for url: https://huggingface.co/instruct-pix2pix/resolve/main/unet/config.json.
Please make sure you specified the correct `repo_id` and `repo_type`.
If you are trying to access a private or gated repo, make sure you are authenticated.
Invalid username or password.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/chouyang/susie/susie/scripts/train.py", line 633, in <module>
    app.run(main)
  File "/home/chouyang/susie/env/lib/python3.10/site-packages/absl/app.py", line 308, in run
    _run_main(main, args)
  File "/home/chouyang/susie/env/lib/python3.10/site-packages/absl/app.py", line 254, in _run_main
    sys.exit(main(argv))
  File "/home/chouyang/susie/susie/scripts/train.py", line 279, in main
    pretrained_model_def, pretrained_params = load_pretrained_unet(
  File "/home/chouyang/susie/susie/susie/model.py", line 146, in load_pretrained_unet
    model_def, params = FlaxUNet2DConditionModel.from_pretrained(
  File "/home/chouyang/susie/env/lib/python3.10/site-packages/diffusers/models/modeling_flax_utils.py", line 307, in from_pretrained
    model, model_kwargs = cls.from_config(
  File "/home/chouyang/susie/env/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 218, in from_config
    config, kwargs = cls.load_config(pretrained_model_name_or_path=config, return_unused_kwargs=True, **kwargs)
  File "/home/chouyang/susie/env/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 362, in load_config
    raise EnvironmentError(
OSError: instruct-pix2pix is not a local folder and is not a valid model identifier listed on 'https://huggingface.co/models'
If this is a private repository, make sure to pass a token having permission to this repo with `use_auth_token` or log in with `huggingface-cli login`.
wandb: Waiting for W&B process to finish... (failed 1).

Any pointers? Thank you.
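
The log above shows a request to https://huggingface.co/instruct-pix2pix/..., that is, a bare name with no owner segment, and the final OSError says the string is neither a local folder nor a valid model identifier. A minimal sketch of a check that flags this before training starts is given here. classify_model_ref is a hypothetical helper, not part of susie or diffusers, and the owner/name rule is only a heuristic: it does not contact the Hub, and a few legacy Hub models do use bare names.

```python
import os


def classify_model_ref(ref):
    """Classify a model reference as a local folder, a namespaced Hub id, or a bare name.

    Heuristic only: no network request is made, so a namespaced id may still not exist.
    """
    if os.path.isdir(ref):
        return "local folder"
    parts = ref.split("/")
    if len(parts) == 2 and all(parts):
        return "namespaced id (owner/name)"
    return "bare or malformed name"


# "kvablack/susie" is the identifier used in the README; "instruct-pix2pix" is the one from the log.
for ref in ["instruct-pix2pix", "kvablack/susie"]:
    print(f"{ref!r}: {classify_model_ref(ref)}")
```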

Where can we get the checkpoint for the low level GCBC policy trained on bridgedata_v2 dataset?

Hi! This is interesting work! Recently, I tried to reproduce some results on the bridgedata_v2 dataset. However, I could not find the checkpoint for the low-level GCBC policy. It would be really appreciated if you could release it! Looking forward to your reply!

Missing config file for Calvin-sim checkpoint

Hello Kevin,

I tried evaluating your model using ./eval_susie.sh under the calvin-sim (https://github.com/pranavatreya/calvin-sim) repo. I found that I could not use the checkpoint that you have shared here https://huggingface.co/patreya/susie-calvin-checkpoints/tree/main because it does not contain configs for the models. Would you mind either pushing the configs to Hugging Face or sharing where I could find the corresponding config, especially for the gc_policy?

Thanks,

Do we need to install jax gpu version?
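
One way to see which backend an installed Jax build actually uses is to ask Jax directly. This is a minimal sketch, with describe_jax_backend a hypothetical helper name. It only reports what is installed and does not say which build this codebase requires.

```python
def describe_jax_backend():
    """Report the default Jax backend and visible devices, if Jax is installed."""
    try:
        import jax
    except ImportError:
        return "jax is not installed"
    devices = ", ".join(str(d) for d in jax.devices())
    return f"backend={jax.default_backend()} devices=[{devices}]"


print(describe_jax_backend())
```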

error when evaluating

Hello, when I try to evaluate, I get this error. What could be missing?

(susie-calvin) abdo@abdo:~/calvin-sim$ ./eval_susie.sh
Traceback (most recent call last):
  File "calvin_models/calvin_agent/evaluation/evaluate_policy_subgoal_diffusion.py", line 12, in <module>
    import jax_diffusion_model
  File "/home/abdo/calvin-sim/calvin_models/calvin_agent/evaluation/jax_diffusion_model.py", line 1, in <module>
    from susie.model import create_sample_fn
ModuleNotFoundError: No module named 'susie'
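
The traceback says that the interpreter running the evaluation script cannot import susie. A minimal sketch for confirming this from the same environment, using only the standard library, is given here. is_importable is a hypothetical helper name. If it reports the package as missing, the README's installation step (pip install -e . in a checkout of this repository) is a natural place to start, though whether that resolves this particular setup is an assumption.

```python
import importlib.util
import sys


def is_importable(module_name):
    """Return True if the top-level module `module_name` can be located by this interpreter."""
    return importlib.util.find_spec(module_name) is not None


# Printing the interpreter path helps spot a mismatch between environments.
print("interpreter:", sys.executable)
print("susie importable:", is_importable("susie"))
```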

cannot run train.py

I followed your installation instructions, but cannot run train.py.

Unable to reproduce the performance of the pretrained checkpoint for Calvin-Sim

Hello Kevin,

I am back again. Thank you for looking at this issue!

I attempted to retrain the goal-conditioned policy, but was unable to reproduce the performance of your pretrained checkpoint in calvin-sim, and was wondering whether you could potentially shed some light on what I may be missing. The answer can be short, and doesn't have to be complete. Some simple pointers would likely suffice.

First, I downloaded the calvin-sim data and preprocessed it with experiments/susie/calvin/data_conversion_scripts/goal_conditioned.py on the ABC training + D validation dataset. To get it working, I had to modify raw_dataset_path and tfrecord_dataset_path, and then comment out the following section of code

if start_idx <= scene_info["calvin_scene_D"][1]:
      ctr = D_ctr
      D_ctr += 1
      letter = "D"

Second, I trained the goal-conditioned policy on calvin-sim using the following script

python experiments/susie/calvin/calvin_gcbc.py \
    --config experiments/susie/calvin/configs/gcbc_train_config.py:gc_ddpm_bc \
    --calvin_dataset_config experiments/susie/calvin/configs/gcbc_data_config.py:all

after updating data_path, save_dir in bridge_data_v2/experiments/susie/calvin/configs/gcbc_train_config.py.

I trained the model for 2 million steps as specified in the config, and the loss went from ~2.5 to roughly 0.65 by the end of training (see the plot below). Note that I did have to resume from checkpoints multiple times along the way. I then ran evaluations on multiple checkpoints from throughout training, coupled with the pretrained diffusion model, and these are roughly the success rates I got for each number of chained instructions.

1: 57.0%
2: 21.0%
3: 7.0%
4: 2.0%
5: 1.0%

which is much worse than your pretrained gc policy + your pretrained diffusion model:

1: 81.0%
2: 65.0%
3: 46.0%
4: 30.0%
5: 21.0%

If you could potentially give me some pointers on what I may be doing incorrectly, it would be greatly appreciated! :)

pretrained model weights

Hello, may I ask where the model weight files for stable-diffusion-v1-5:flax (Flax) can be obtained?

Bad performance in evaluation

I downloaded the diffusion model and goal-conditioned policy checkpoints from https://huggingface.co/patreya/susie-calvin-checkpoints and set the values of the environment variables in eval_susie.sh, but the results are not good:
Average successful sequence length: 0.4666666666666667
Success rates for i instructions in a row:
1: 33.3%
2: 13.3%
3: 0.0%
4: 0.0%
5: 0.0%
turn_on_led: 2 / 2 | SR: 100.0%
open_drawer: 4 / 4 | SR: 100.0%
turn_on_lightbulb: 1 / 1 | SR: 100.0%
push_blue_block_right: 0 / 1 | SR: 0.0%
rotate_blue_block_right: 0 / 1 | SR: 0.0%
lift_blue_block_slider: 0 / 1 | SR: 0.0%
lift_blue_block_table: 0 / 1 | SR: 0.0%
push_pink_block_left: 0 / 2 | SR: 0.0%
move_slider_left: 0 / 3 | SR: 0.0%
push_blue_block_left: 0 / 2 | SR: 0.0%
lift_red_block_slider: 0 / 1 | SR: 0.0%
push_red_block_left: 0 / 1 | SR: 0.0%
rotate_red_block_left: 0 / 1 | SR: 0.0%
lift_red_block_table: 0 / 1 | SR: 0.0%
What could be the possible issues? Thank you for your time.
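
For reading the log above, it may help to see how the headline number relates to the per-length rates: the average successful sequence length equals the sum of the "i instructions in a row" success rates. A minimal sketch is given here, with average_sequence_length a hypothetical helper name. The fractions 5/15 and 2/15 are an inference from the rounded 33.3% and 13.3% figures, not something the log states.

```python
def average_sequence_length(success_rates):
    """Expected number of consecutively completed instructions.

    success_rates[i] is the fraction of rollouts that completed at least
    i + 1 instructions in a row, so the mean length is the sum of these fractions.
    """
    return sum(success_rates)


# Assumed fractions behind 33.3%, 13.3%, 0%, 0%, 0% (inferred, see the note above).
rates = [5 / 15, 2 / 15, 0.0, 0.0, 0.0]
print(average_sequence_length(rates))
```

With these fractions the sum is 7/15, which agrees with the reported average of about 0.467.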

Training gives nan after the first iteration

Hello, when I try to run training with the command

python scripts/train.py --config configs/base.py:base

It seems like the train loss becomes NaN immediately after the first iteration. Is this something that you have encountered before?
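
A common first step for this kind of report is to stop at the first non-finite loss so the offending batch can be inspected. A minimal, framework-agnostic sketch is given here. first_non_finite_step is a hypothetical helper, and the loss values are made up for illustration rather than taken from a real run. Jax also has a jax_debug_nans option, enabled with jax.config.update("jax_debug_nans", True), which raises an error at the operation that produces a NaN.

```python
import math


def first_non_finite_step(losses):
    """Return the index of the first NaN or infinite loss, or None if all are finite."""
    for step, loss in enumerate(losses):
        if not math.isfinite(loss):
            return step
    return None


# Illustrative values only (not from a real run): finite at step 0, NaN afterwards.
example_losses = [1.0, float("nan"), float("nan")]
print(first_non_finite_step(example_losses))
```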