Diffusion-based Generation, Optimization, and Planning in 3D Scenes


Siyuan Huang*, Zan Wang*, Puhao Li, Baoxiong Jia, Tengyu Liu, Yixin Zhu, Wei Liang, Song-Chun Zhu

This repository is the official implementation of paper "Diffusion-based Generation, Optimization, and Planning in 3D Scenes".

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior work, SceneDiffuser is intrinsically scene-aware, physics-based, and goal-oriented.

Paper | arXiv | Project | HuggingFace Demo | Checkpoints

Abstract

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is intrinsically scene-aware, physics-based, and goal-oriented. With an iterative sampling strategy, SceneDiffuser jointly formulates the scene-aware generation, physics-based optimization, and goal-oriented planning via a diffusion-based denoising process in a fully differentiable fashion. Such a design alleviates the discrepancies among different modules and the posterior collapse of previous scene-conditioned generative models. We evaluate SceneDiffuser with various 3D scene understanding tasks, including human pose and motion generation, dexterous grasp generation, path planning for 3D navigation, and motion planning for robot arms. The results show significant improvements compared with previous models, demonstrating the tremendous potential of SceneDiffuser for the broad community of 3D scene understanding.

News

  • [ 2023.04 ] We release the code for grasp generation and arm motion planning!

Setup

  1. Create a new conda environment and activate it.

    conda create -n 3d python=3.8
    conda activate 3d
  2. Install the dependencies with pip.

    pip install -r pre-requirements.txt
    pip install -r requirements.txt
    • We use PyTorch 1.11 and CUDA 11.3; modify pre-requirements.txt if you need to install a different PyTorch version.
  3. Install Isaac Gym and install pointnet2 by executing the following command (both are only needed for grasp generation and arm motion planning; a quick sanity check is sketched after this list).

    pip install git+https://github.com/daveredrum/Pointnet2.ScanNet.git#subdirectory=pointnet2
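
After the setup, a quick sanity check can confirm the environment (a minimal sketch; the pointnet2 import is only relevant if you installed it, and its import name is assumed here to be pointnet2):

python -c "import torch; print(torch.__version__, torch.version.cuda, torch.cuda.is_available())"
python -c "import pointnet2"  # assumed import name; should succeed without error if installed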

Data & Checkpoints

1. Data

You can use our pre-processed data or process the data yourself by following the instructions.

In either case, you also need to download some officially released data assets that are not pre-processed; see the instructions. Please remember to use your own data paths by modifying the following path configurations (a lookup sketch follows the list):

  • scene_model.pretrained_weights in model/*.yaml for the path of the pre-trained scene encoder (if you use a pre-trained scene encoder)

  • dataset.*_dir / dataset.*_path configurations in task/*.yaml for the paths of the data assets
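
For example, a quick way to locate all of these path entries (a minimal sketch, assuming the model/ and task/ config folders sit at the paths named above and that you run this from the repository root; exact key names vary per task):

grep -rn "pretrained_weights" model/*.yaml
grep -rnE "_(dir|path):" task/*.yaml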

2. Checkpoints

Download our pre-trained models and unzip them into a folder, e.g., ./outputs/ (a minimal sketch follows the table below).

Task                          Checkpoint                                                  Description
Pretrained Point Transformer  2022-04-13_18-29-56_POINTTRANS_C_32768                      -
Pose Generation               2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100               -
Motion Generation             2022-11-09_12-54-50_MotionGen_ddm_T200_lr1e-4_ep300         w/o start position
Motion Generation             2022-11-09_14-28-12_MotionGen_ddm_T200_lr1e-4_ep300_obser   w/ start position
Path Planning                 2022-11-25_20-57-28_Path_ddm4_LR1e-4_E100_REL               -
Grasp Generation              2022-11-15_18-07-50_GPUR_l1_pn2_T100                        -
Arm Motion Planning           2022-11-11_14-28-30_FK2Plan_ptr_T30_4                       30 denoising steps
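
A minimal sketch of the unzip step (the archive name here is hypothetical; use the file(s) you actually downloaded):

mkdir -p ./outputs
unzip scenediffuser_checkpoints.zip -d ./outputs/  # hypothetical archive name
ls ./outputs/  # each checkpoint should unpack into its own timestamped folder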

Task-1: Human Pose Generation in 3D Scenes

Train

  • Train with a single GPU

    bash scripts/pose_gen/train.sh ${EXP_NAME}
  • Train with 4 GPUs (modify scripts/pose_gen/train_ddm.sh to specify the visible GPUs; see the sketch after this list)

    bash scripts/pose_gen/train_ddm.sh ${EXP_NAME}
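
If you prefer not to edit train_ddm.sh, GPU selection can often be done from the shell (a hedged sketch; it only works if the script does not overwrite CUDA_VISIBLE_DEVICES internally):

CUDA_VISIBLE_DEVICES=0,1,2,3 bash scripts/pose_gen/train_ddm.sh ${EXP_NAME}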

Test (Quantitative Evaluation)

bash scripts/pose_gen/test.sh ${CKPT} [OPT]
# e.g., bash scripts/pose_gen/test.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT
  • [OPT] is optional for optimization-guided sampling.

Sample (Qualitative Visualization)

bash scripts/pose_gen/sample.sh ${CKPT} [OPT]
# e.g., bash scripts/pose_gen/sample.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT
  • [OPT] is optional for optimization-guided sampling.

Task-2: Human Motion Generation in 3D Scenes

The default configuration is motion generation without observation. If you want to explore the setting of motion generation with a start observation, change task.has_observation to true in all scripts in the folder ./scripts/motion_gen/ (a hedged one-liner is sketched below).
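
A hedged one-liner for flipping the flag in every motion_gen script, assuming the scripts pass the option as a Hydra override of the form task.has_observation=false (check and back up the scripts first):

sed -i 's/task.has_observation=false/task.has_observation=true/g' scripts/motion_gen/*.sh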

Train

  • Train with a single GPU

    bash scripts/motion_gen/train.sh ${EXP_NAME}
  • Train with 4 GPUs (modify scripts/motion_gen/train_ddm.sh to specify the visible GPUs)

    bash scripts/motion_gen/train_ddm.sh ${EXP_NAME}

Test (Quantitative Evaluation)

bash scripts/motion_gen/test.sh ${CKPT} [OPT]
# e.g., bash scripts/motion_gen/test.sh ./outputs/2022-11-09_12-54-50_MotionGen_ddm_T200_lr1e-4_ep300/ OPT
  • [OPT] is optional for optimization-guided sampling.

Sample (Qualitative Visualization)

bash scripts/motion_gen/sample.sh ${CKPT} [OPT]
# e.g., bash scripts/motion_gen/sample.sh ./outputs/2022-11-09_12-54-50_MotionGen_ddm_T200_lr1e-4_ep300/ OPT
  • [OPT] is optional for optimization-guided sampling.

Task-3: Dexterous Grasp Generation for 3D Objects

To run this code, you first need to change the git branch to obj by executing

git checkout obj

Make sure you have installed Isaac Gym and pointnet2; see the Setup section.

Train

  • Train with a single GPU (one GPU is enough)

    bash scripts/grasp_gen_ur/train.sh ${EXP_NAME}

Sample (Qualitative Visualization)

bash scripts/grasp_gen_ur/sample.sh ${CKPT} [OPT]
# e.g., bash scripts/grasp_gen_ur/sample.sh ./outputs/2022-11-15_18-07-50_GPUR_l1_pn2_T100/ OPT
  • [OPT] is optional for optimization-guided sampling.

Test (Quantitative Evaluation)

First, run scripts/grasp_gen_ur/sample.sh to sample some results. The quantitative metrics are then computed on these sampled results.

bash scripts/grasp_gen_ur/test.sh ${EVAL_DIR} ${DATASET_DIR}
# e.g., bash scripts/grasp_gen_ur/test.sh outputs/2022-11-15_18-07-50_GPUR_l1_pn2_T100/eval/final/2023-04-20_13-06-44 YOUR_PATH/data/MultiDex_UR
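
The sampled results land in a timestamped folder under ${CKPT}/eval/final/, as in the example above. A small sketch for picking the most recent one (assumes that folder layout):

EVAL_DIR=$(ls -td outputs/2022-11-15_18-07-50_GPUR_l1_pn2_T100/eval/final/*/ | head -n 1)
bash scripts/grasp_gen_ur/test.sh ${EVAL_DIR} YOUR_PATH/data/MultiDex_UR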

Task-4: Path Planning in 3D Scenes

Train

  • Train with a single GPU

    bash scripts/path_planning/train.sh ${EXP_NAME}
  • Train with 4 GPUs (modify scripts/path_planning/train_ddm.sh to specify the visible GPUs)

    bash scripts/path_planning/train_ddm.sh ${EXP_NAME}

Test (Quantitative Evaluation)

bash scripts/path_planning/plan.sh ${CKPT}
# e.g., bash scripts/path_planning/plan.sh ./outputs/2022-11-25_20-57-28_Path_ddm4_LR1e-4_E100_REL/

Sample (Qualitative Visualization)

bash scripts/path_planning/sample.sh ${CKPT} [OPT] [PLA]
# e.g., bash scripts/path_planning/sample.sh ./outputs/2022-11-25_20-57-28_Path_ddm4_LR1e-4_E100_REL/ OPT PLA
  • The program generates trajectories conditioned on the given start position and scene, and renders the results into images. (These are not planning results; the diffuser is simply used to generate diverse trajectories.)
  • [OPT] is optional for optimization-guided sampling.
  • [PLA] is optional for planner-guided sampling.

Task-5: Motion Planning for Robot Arms

To run this code, you first need to change the git branch to obj by executing

git checkout obj

Make sure you have installed Isaac Gym and pointnet2; see the Setup section.

Train

  • Train with a single GPU

    bash scripts/franka_planning/train.sh ${EXP_NAME}
  • Train with 4 GPUs (modify scripts/franka_planning/train_ddm.sh to specify the visible GPUs)

    bash scripts/franka_planning/train_ddm.sh ${EXP_NAME}

Test (Quantitative Evaluation)

bash scripts/franka_planning/plan.sh ${CKPT}
# e.g., bash scripts/franka_planning/plan.sh outputs/2022-11-11_14-28-30_FK2Plan_ptr_T30_4/

Citation

If you find our project useful, please consider citing us:

@article{huang2023diffusion,
  title={Diffusion-based Generation, Optimization, and Planning in 3D Scenes},
  author={Huang, Siyuan and Wang, Zan and Li, Puhao and Jia, Baoxiong and Liu, Tengyu and Zhu, Yixin and Liang, Wei and Zhu, Song-Chun},
  journal={arXiv preprint arXiv:2301.06015},
  year={2023}
}

Acknowledgments

Some code is borrowed from latent-diffusion, PSI-release, Pointnet2.ScanNet, point-transformer, and diffuser.

License

This project is licensed under the MIT License. See LICENSE for more details. The following datasets are used in this project and are subject to their respective licenses:


scene-diffuser's Issues

<lambda> already registered

Traceback (most recent call last):
  File "plan.py", line 16, in <module>
    from models.environment import create_enviroment
  File "/home/dennisushi/repos/temp/Scene-Diffuser/models/environment.py", line 551, in <module>
    class PathPlanningEnvWrapper():
  File "/home/dennisushi/repos/temp/Scene-Diffuser/utils/registry.py", line 61, in deco
    self._do_register(name, func_or_class)
  File "/home/dennisushi/repos/temp/Scene-Diffuser/utils/registry.py", line 45, in _do_register
    assert (
AssertionError: An object named '<lambda>' was already registered in 'Env' registry!

Have you encountered this before? I am trying to run the arm motion planning checkpoint with bash scripts/franka_planning/plan.sh ./chkpt/2022-11-11_14-28-30_FK2Plan_ptr_T30_4/.

NoSuchDisplayException

When I run the command 'bash scripts/pose_gen/sample.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT', the system encounters the error 'NoSuchDisplayException'. The details are as follows:

Error executing job with overrides: ['exp_dir=./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/', 'diffuser=ddpm', 'model=unet', 'task=pose_gen', 'task.visualizer.ksample=10', 'optimizer=pose_in_scene', 'optimizer.scale_type=div_var', 'optimizer.scale=2.5', 'optimizer.vposer=false', 'optimizer.contact_weight=0.02', 'optimizer.collision_weight=1.0']
Traceback (most recent call last):
  File "sample.py", line 92, in main
    visualizer.visualize(model, dataloaders['test'], vis_dir)
  File "/root/autodl-tmp/Scene-Diffuser/models/visualizer.py", line 122, in visualize
    render_prox_scene(meshes, camera_pose, save_path)
  File "/root/autodl-tmp/Scene-Diffuser/utils/visualize.py", line 65, in render_prox_scene
    r = pyrender.OffscreenRenderer(
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyrender/offscreen.py", line 31, in __init__
    self._create()
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyrender/offscreen.py", line 149, in _create
    self._platform.init_context()
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyrender/platforms/pyglet_platform.py", line 50, in init_context
    self._window = pyglet.window.Window(config=conf, visible=False,
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyglet/window/xlib/__init__.py", line 133, in __init__
    super(XlibWindow, self).__init__(*args, **kwargs)
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyglet/window/__init__.py", line 513, in __init__
    display = pyglet.canvas.get_display()
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyglet/canvas/__init__.py", line 59, in get_display
    return Display()
  File "/root/Scene-Diffuser/tutorial-env/lib/python3.8/site-packages/pyglet/canvas/xlib.py", line 88, in __init__
    raise NoSuchDisplayException(f'Cannot connect to "{name}"')
pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None"

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

Offscreen rendering

If you use osmesa as the backend, please uninstall pyopengl and then reinstall it with pip install pyopengl==3.1.5.
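
For headless machines (as in the NoSuchDisplayException report above), a common workaround is to point pyrender at an offscreen backend before running the sample scripts; a minimal sketch, assuming osmesa is available on the system:

export PYOPENGL_PLATFORM=osmesa  # or "egl" if EGL is available
pip uninstall -y pyopengl
pip install pyopengl==3.1.5
bash scripts/pose_gen/sample.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT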

question about 'lemo/normalization.pkl'

I notice this code in dataloader:
self.normalizer = NormalizerPoseMotion((xmin, xmax)) with 'lemo/normalization.pkl'
and I wonder what this is for and how to get the parameters in normalization.pkl (I don't find relevant code in LEMO)?

Weird Eval Results

I followed the steps and rendered the poses using "bash scripts/pose_gen/sample.sh ./outputs/2022-11-09_11-22-52_PoseGen_ddm4_lr1e-4_ep100/ OPT", but I got weird results.
I can't figure out how this could happen. Any suggestions?

Abnormal planning results

When I ran the command "bash scripts/path_planning/sample.sh ./outputs/2022-11-25_20-57-28_Path_ddm4_LR1e-4_E100_REL/ OPT PLA", nearly all outputs were abnormal. Any suggestions?

I encountered the following two errors repeatedly when installing the https://github.com/Silverster98/pointops package. How can I solve them?

(3d) D:\Scene-Diffuser>pip install "git+https://github.com/Silverster98/pointops"
Collecting git+https://github.com/Silverster98/pointops
Cloning https://github.com/Silverster98/pointops to c:\users\tony\appdata\local\temp\pip-req-build-_chb1j3f
Running command git clone --filter=blob:none --quiet https://github.com/Silverster98/pointops 'C:\Users\tony\AppData\Local\Temp\pip-req-build-_chb1j3f'
Resolved https://github.com/Silverster98/pointops to commit 86c68b1e63092152b4ab14ff09b832c5d2d0dc97
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [6 lines of output]
Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "<pip-setuptools-caller>", line 34, in <module>
  File "C:\Users\tony\AppData\Local\Temp\pip-req-build-_chb1j3f\setup.py", line 9, in <module>
    flag for flag in opt.split() if flag != '-Wstrict-prototypes'
AttributeError: 'NoneType' object has no attribute 'split'
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

(3d) D:\Scene-Diffuser>pip install "git+https://github.com/Silverster98/pointops"
Collecting git+https://github.com/Silverster98/pointops
Cloning https://github.com/Silverster98/pointops to c:\users\tony\appdata\local\temp\pip-req-build-m97mzi9q
Running command git clone --filter=blob:none --quiet https://github.com/Silverster98/pointops 'C:\Users\tony\AppData\Local\Temp\pip-req-build-m97mzi9q'
fatal: unable to access 'https://github.com/Silverster98/pointops/': Recv failure: Connection was reset
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/Silverster98/pointops 'C:\Users\tony\AppData\Local\Temp\pip-req-build-m97mzi9q' did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git clone --filter=blob:none --quiet https://github.com/Silverster98/pointops 'C:\Users\tony\AppData\Local\Temp\pip-req-build-m97mzi9q' did not run successfully.
│ exit code: 128
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

Queries on the missing files of franka_planning and path_planning

Thank you for your great work!
I just want to follow the five benchmarks that you provided.
However, neither the franka_planning nor the path_planning task has test files for evaluating success rates.
In addition, the visualizer called FK2PlanVisualizer seems to be missing.
Could you upload these files so that I can follow the evaluation metrics?

Appreciate your responses.
Best wishes
Thank you

Question about variable σ in Dexterous Grasp Generation for 3D Objects

Hi, thanks for your great work.

I have an issue with the variable σ in Tab. 3. In your released code, the test results (success rate, diversity, and collision) are computed in the same way as in GenDexGrasp. However, in Tab. 3 and Tab. A3, σ is used to compute the success rate, which differs from the released code.

May I ask whether the test code for reproducing Tab. 3 and Tab. A3 will be released? Or could the test results obtained with the released test.py be shared?

Also, what is the difference between the diversity metric presented in this paper and the diversity metric proposed by GenDexGrasp, which you also used in your ablation experiments? Could you elaborate on the role of the diversity metric presented in this paper?

Thanks for your time.
