
robopianist's Introduction

RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning


Video

RoboPianist is a new benchmarking suite for high-dimensional control, targeted at testing high spatial and temporal precision, coordination, and planning, all with an underactuated system frequently making-and-breaking contacts. The proposed challenge is mastering the piano through bi-manual dexterity, using a pair of simulated anthropomorphic robot hands.

This codebase contains software and tasks for the benchmark, and is powered by MuJoCo.


Latest Updates

  • [24/12/2023] Updated install script so that it checks out the correct Menagerie commit. Please re-run bash scripts/install_deps.sh to update your installation.
  • [17/08/2023] Added a pixel wrapper for augmenting the observation space with RGB images.
  • [11/08/2023] Code to train the model-free RL policies is now public, see robopianist-rl.

Getting Started

We've created an introductory Colab notebook that demonstrates how to use RoboPianist. It includes code for loading and customizing a piano playing task, and a demonstration of a pretrained policy playing a short snippet of Twinkle Twinkle Little Star. Click the button below to get started!

Open In Colab
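For reference, environments follow the standard dm_env interface, so an interaction loop looks roughly like the sketch below. The suite.load call and the task name are assumptions based on the Colab notebook; check the notebook for the exact identifiers.

import numpy as np
from robopianist import suite  # task-loading entry point used in the Colab

# The task name below is an assumption; the Colab lists the registered tasks.
env = suite.load("RoboPianist-debug-TwinkleTwinkleRousseau-v0")

spec = env.action_spec()
timestep = env.reset()
while not timestep.last():
    # Uniform random actions within the action bounds, just to exercise the env.
    action = np.random.uniform(spec.minimum, spec.maximum, size=spec.shape)
    timestep = env.step(action.astype(spec.dtype))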

Installation

RoboPianist is supported on both Linux and macOS and can be installed with Python >= 3.8. We recommend using Miniconda to manage your Python environment.

Install from source

The recommended way to install this package is from source. Start by cloning the repository:

git clone https://github.com/google-research/robopianist.git && cd robopianist

Next, install the prerequisite dependencies:

git submodule init && git submodule update
bash scripts/install_deps.sh

Finally, create a new conda environment and install RoboPianist in editable mode:

conda create -n pianist python=3.10
conda activate pianist

pip install -e ".[dev]"

To test your installation, run make test and verify that all tests pass.

Install from PyPI

First, install the prerequisite dependencies:

bash <(curl -s https://raw.githubusercontent.com/google-research/robopianist/main/scripts/install_deps.sh) --no-soundfonts

Next, create a new conda environment and install RoboPianist:

conda create -n pianist python=3.10
conda activate pianist

pip install --upgrade robopianist

Optional: Download additional soundfonts

We recommend installing additional soundfonts to improve the quality of the synthesized audio. You can easily do this using the RoboPianist CLI:

robopianist soundfont --download

For more soundfont-related commands, see docs/soundfonts.md.

MIDI Dataset

The PIG dataset cannot be redistributed on GitHub due to licensing restrictions. See docs/dataset for instructions on where to download it and how to preprocess it.

CLI

RoboPianist comes with a command line interface (CLI) that can be used to download additional soundfonts, play MIDI files, preprocess the PIG dataset, and more. For more information, see docs/cli.md.

Contributing

We welcome contributions to RoboPianist. Please see docs/contributing.md for more information.

FAQ

See docs/faq.md for a list of frequently asked questions.

Citing RoboPianist

If you use RoboPianist in your work, please use the following citation:

@inproceedings{robopianist2023,
  author = {Zakka, Kevin and Wu, Philipp and Smith, Laura and Gileadi, Nimrod and Howell, Taylor and Peng, Xue Bin and Singh, Sumeet and Tassa, Yuval and Florence, Pete and Zeng, Andy and Abbeel, Pieter},
  title = {RoboPianist: Dexterous Piano Playing with Deep Reinforcement Learning},
  booktitle = {Conference on Robot Learning (CoRL)},
  year = {2023},
}

Acknowledgements

We would like to thank the following people for making this project possible:

  • Philipp Wu and Mohit Shridhar for being a constant source of inspiration and support.
  • Ilya Kostrikov for constantly raising the bar for RL engineering and for invaluable debugging help.
  • The Magenta team for helpful pointers and feedback.
  • The MuJoCo team for the development of the MuJoCo physics engine and their support throughout the project.

Works that have used RoboPianist

  • Privileged Sensing Scaffolds Reinforcement Learning, Hu et al. (paper, website)

License and Disclaimer

MuJoCo Menagerie's license can be found here. Soundfont licensing information can be found here. MIDI licensing information can be found here. All other code is licensed under an Apache-2.0 License.

This is not an officially supported Google product.

robopianist's People

Contributors

edwhu, kevinzakka, robopianist


robopianist's Issues

OSError: undefined symbol ffi_type_uint32 when running make test after installing package from source

When installing the package from source, running make test will result in the following error:

________________________________________________________________________________________________________ ERROR collecting test session ________________________________________________________________________________________________________
../anaconda3/envs/pianist/lib/python3.10/site-packages/pluggy/_hooks.py:265: in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
../anaconda3/envs/pianist/lib/python3.10/site-packages/pluggy/_manager.py:80: in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
../anaconda3/envs/pianist/lib/python3.10/site-packages/_pytest/python.py:216: in pytest_collect_file
    module: Module = ihook.pytest_pycollect_makemodule(
../anaconda3/envs/pianist/lib/python3.10/site-packages/_pytest/config/compat.py:67: in fixed_hook
    return hook(**kw)
../anaconda3/envs/pianist/lib/python3.10/site-packages/pluggy/_hooks.py:265: in __call__
    return self._hookexec(self.name, self.get_hookimpls(), kwargs, firstresult)
../anaconda3/envs/pianist/lib/python3.10/site-packages/pluggy/_manager.py:80: in _hookexec
    return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
/opt/ros/foxy/lib/python3.8/site-packages/launch_testing/pytest/hooks.py:188: in pytest_pycollect_makemodule
    entrypoint = find_launch_test_entrypoint(path)
/opt/ros/foxy/lib/python3.8/site-packages/launch_testing/pytest/hooks.py:178: in find_launch_test_entrypoint
    module = import_path(path, root=None)
../anaconda3/envs/pianist/lib/python3.10/site-packages/_pytest/pathlib.py:533: in import_path
    importlib.import_module(module_name)
../anaconda3/envs/pianist/lib/python3.10/importlib/__init__.py:126: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1050: in _gcd_import
    ???
<frozen importlib._bootstrap>:1027: in _find_and_load
    ???
<frozen importlib._bootstrap>:1006: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:688: in _load_unlocked
    ???
<frozen importlib._bootstrap_external>:883: in exec_module
    ???
<frozen importlib._bootstrap>:241: in _call_with_frames_removed
    ???
robopianist/models/piano/__init__.py:15: in <module>
    from robopianist.models.piano.piano import Piano
robopianist/models/piano/piano.py:24: in <module>
    from robopianist.models.piano import midi_module, piano_mjcf
robopianist/models/piano/midi_module.py:23: in <module>
    from robopianist.music import midi_file, midi_message
robopianist/music/__init__.py:21: in <module>
    from robopianist.music import library, midi_file
robopianist/music/library.py:20: in <module>
    from note_seq.protobuf import music_pb2
../anaconda3/envs/pianist/lib/python3.10/site-packages/note_seq/__init__.py:22: in <module>
    from note_seq.chord_inference import ChordInferenceError
../anaconda3/envs/pianist/lib/python3.10/site-packages/note_seq/chord_inference.py:23: in <module>
    from note_seq import sequences_lib
../anaconda3/envs/pianist/lib/python3.10/site-packages/note_seq/sequences_lib.py:29: in <module>
    import pretty_midi
../anaconda3/envs/pianist/lib/python3.10/site-packages/pretty_midi/__init__.py:145: in <module>
    from .pretty_midi import *
../anaconda3/envs/pianist/lib/python3.10/site-packages/pretty_midi/pretty_midi.py:17: in <module>
    from .instrument import Instrument
../anaconda3/envs/pianist/lib/python3.10/site-packages/pretty_midi/instrument.py:6: in <module>
    import fluidsynth
../anaconda3/envs/pianist/lib/python3.10/site-packages/fluidsynth.py:48: in <module>
    _fl = CDLL(lib)
../anaconda3/envs/pianist/lib/python3.10/ctypes/__init__.py:374: in __init__
    self._handle = _dlopen(self._name, mode)
E   OSError: /lib/x86_64-linux-gnu/libwayland-client.so.0: undefined symbol: ffi_type_uint32, version LIBFFI_BASE_7.0

This error seems to come from the pyfluidsynth package: when it is installed alongside pretty_midi (which note_seq depends on), import pretty_midi raises the error above.

Installing from PyPI gives the same error.

This is all on a Python 3.10 conda environment running Ubuntu 20.04.

Integration with the Unity game engine

I'm trying to reproduce the hand movements in Unity, starting with the Twinkle Twinkle Little Star example.

First, I exported the scene with all of its assets:

from dm_control.mjcf import export_with_assets

export_with_assets(
    task.root_entity.mjcf_model,
    out_dir="/tmp/robopianist/piano_with_shadow_hands",
    out_file_name="scene.xml",
)

Second, I installed the Unity plug-in from the MuJoCo website.
Third, I converted the scene assets from OBJ to STL format, as user pkrack described here.
After that, Unity was able to load the scene, but when I launched it I received the following errors:

Element 'joint', line 1

NullReferenceException: Failed to create Mujoco runtime.
Mujoco.MjScene.StepScene () (at MuJoCo/Runtime/Components/MjScene.cs:345)
Mujoco.MjScene.FixedUpdate () (at MuJoCo/Runtime/Components/MjScene.cs:98)

The first occurrences of "range" in the XML debug file contain a comma in the values instead of a period. Maybe that's the problem, but I don't know how to change how this file is generated.

<mujoco>
  <worldbody>
    <body pos="0 0 0" quat="-1 0 0 0" gravcomp="0" name="lh_shadow_hand/_0">
      <body pos="0.4 -0.15 0.13" quat="-0.5 -0.5 0.5 0.5" gravcomp="1" name="lh_shadow_hand/lh_forearm_1">
        <inertial pos="0 0 0.09" quat="-1 0 0 0" mass="3" diaginertia="0.0138 0.0138 0.00744" />
        <joint type="slide" pos="0 0 0" axis="-1 0 0" ref="0" armature="0.0002" springref="0" springdamper="0 0" damping="67.4762" stiffness="0" solreflimit="0.02 1" solimplimit="0.9 0.95 0.001 0.5 2" solreffriction="0.02 1" solimpfriction="0.9 0.95 0.001 0.5 2" frictionloss="0.01" limited="false" margin="0" range="-0.4605 0.7605" name="lh_shadow_hand/forearm_tx_3" />
        <joint type="slide" pos="0 0 0" axis="-1.192093E-07 0 1" ref="0" armature="0.0002" springref="0" springdamper="0 0" damping="67.4762" stiffness="0" solreflimit="0.02 1" solimplimit="0.9 0.95 0.001 0.5 2" solreffriction="0.02 1" solimpfriction="0.9 0.95 0.001 0.5 2" frictionloss="0.01" limited="false" margin="0" range="0 0.06" name="lh_shadow_hand/forearm_ty_4" />
        <geom density="1000" type="mesh" mesh="mesh_6" pos="0 0 0" quat="-1 0 0 0" priority="0" contype="0" conaffinity="0" group="2" condim="3" solmix="1" solref="0.005 1" solimp="0.5 0.99 0.0001" margin="0" gap="0" friction="1 0.005 0.0001" fluidshape="none" fluidcoef="5 0.25 1.5 1 1" name="lh_shadow_hand//unnamed_geom_0_5" />
        <geom density="1000" type="mesh" mesh="mesh_8" pos="0 0 0" quat="-1 0 0 0" priority="0" contype="0" conaffinity="0" group="2" condim="3" solmix="1" solref="0.005 1" solimp="0.5 0.99 0.0001" margin="0" gap="0" friction="1 0.005 0.0001" fluidshape="none" fluidcoef="5 0.25 1.5 1 1" name="lh_shadow_hand//unnamed_geom_1_7" />
        <geom density="1000" type="mesh" mesh="mesh_10" pos="0 0 0" quat="-1 0 0 0" priority="0" contype="1" conaffinity="1" group="3" condim="3" solmix="1" solref="0.005 1" solimp="0.5 0.99 0.0001" margin="0" gap="0" friction="1 0.005 0.0001" fluidshape="none" fluidcoef="5 0.25 1.5 1 1" name="lh_shadow_hand//unnamed_geom_2_9" />
        <geom density="1000" type="box" size="0.035 0.035 0.035" pos="0 -0.01000001 0.181" quat="0.9249092 0 0.3801881 0" priority="0" contype="1" conaffinity="1" group="3" condim="3" solmix="1" solref="0.005 1" solimp="0.5 0.99 0.0001" margin="0" gap="0" friction="1 0.005 0.0001" fluidshape="none" fluidcoef="5 0.25 1.5 1 1" name="lh_shadow_hand//unnamed_geom_3_11" />
        <site type="box" size="0.001 0.001 0.001" pos="0 0 0" quat="-1 0 0 0" name="lh_shadow_hand/forearm_tx_site_12" />
        <site type="box" size="0.001 0.001 0.001" pos="0 0 0" quat="-1 0 0 0" name="lh_shadow_hand/forearm_ty_site_13" />
        <body pos="0 -0.01000001 0.21301" quat="-1 0 0 0" gravcomp="1" name="lh_shadow_hand/lh_wrist_14">
          <inertial pos="0 1.490116E-08 0.029" quat="0.5 0.5 0.5 0.5" mass="0.1" diaginertia="6.4E-05 4.38E-05 3.5E-05" />
          <joint type="hinge" pos="0 0 0" axis="-1.192093E-07 -1 0" ref="0" armature="0.0002" springref="0" springdamper="0 0" damping="0.5" stiffness="0" solreflimit="0.02 1" solimplimit="0.9 0.95 0.001 0.5 2" solreffriction="0.02 1" solimpfriction="0.9 0.95 0.001 0.5 2" frictionloss="0.01" limited="false" margin="0" range="-30,00002 10" name="lh_shadow_hand/lh_WRJ2_16" />

...
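If those commas are just locale-formatted decimal separators, one possible workaround (a sketch, not part of the original report; it assumes commas only appear between digits as decimal separators) is to post-process the exported XML before importing it into Unity:

import re

xml_path = "/tmp/robopianist/piano_with_shadow_hands/scene.xml"
with open(xml_path) as f:
    xml = f.read()

# Replace locale-style decimal commas between digits, e.g. "-30,00002" -> "-30.00002".
xml = re.sub(r"(?<=\d),(?=\d)", ".", xml)

with open(xml_path, "w") as f:
    f.write(xml)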

I attached the file to this post; feel free to take a look.

In the end, maybe it would be better to somehow copy the frame-by-frame hand positions and turn them into an animated clip? Where can I get these values from an XML file generated by a neural network?
Importing the environment as-is into Unity seems to be the more confusing approach.
debug.txt

Observation space and action space

Hi,

I have some questions on the observation and action space.

The observation space in the Colab for PianoWithShadowHands has a goal key. What exactly is that?

Next, could you give some details about the reduced action space? Is it helpful for learning? Does it limit the agent's final performance?
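For reference, both specs can be inspected directly through the standard dm_env interface (a minimal sketch, assuming an env object loaded as in the Colab):

# Assumes `env` was loaded as in the Colab; observation_spec() and
# action_spec() are part of the standard dm_env interface.
for name, spec in env.observation_spec().items():
    print(name, spec.shape, spec.dtype)  # includes the "goal" entry

print(env.action_spec())  # bounds and shape of the action space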

Availability of Training Codes

Hi,

I love your work and would like to use this benchmark for my RL research.
I want to reproduce the DroQ and PPO results shown in Figure 4 of your paper.

Do you have any plans to release the training code?
I would also appreciate it if you could provide the pre-trained models.

Error: Index 158 is out of bounds for axis 0 with size 158

I'm new to reinforcement learning. I'm facing an issue where the dm_control viewer intercepts an environment error.
Original message: index 158 is out of bounds for axis 0 with size 158. The error occurs while running twinkle_twinkle_actions.npy as the action sequence for piano_with_shadow_hands_env.py.
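One way to avoid the out-of-bounds index when the episode runs longer than the recorded sequence is to clamp the replay index (a sketch, assuming the actions are fed to the viewer through a simple index-based policy):

import numpy as np

actions = np.load("twinkle_twinkle_actions.npy")
step = 0

def replay_policy(timestep):
    # Hold the last recorded action once the sequence is exhausted instead of
    # indexing past the end of the array.
    global step
    action = actions[min(step, len(actions) - 1)]
    step += 1
    return action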

Adding variable velocity to outputted MIDI notes

My name is Herbie Turner and I'm an MIT CS MEng student working with my friend, PhD student Ruben Castro, whose undergrad MechE thesis is cited in your paper. Together, we are working on a class project using your dataset and infrastructure. We're interested in including volume (velocity) in our reward function. Right now, we have some basic formulas for MIDI note velocity as a function of key angular velocity at the time of activation, but I'm new to RL and MuJoCo and am struggling to get the key angular velocity. I tried using qvel, but it appears to be 0 for every timestep when running the example action sequence. Does anyone have suggestions?
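For reading per-key angular velocities, dm_control's binding API is one route (a sketch; task.piano.joints is a hypothetical name for the handle to the key joint elements, and env is assumed to be a loaded environment, so check the Piano entity for the actual attribute):

# `task.piano.joints` is a hypothetical handle to the piano's key hinge joints;
# physics.bind is standard dm_control and exposes per-joint qvel.
key_joints = task.piano.joints
key_qvel = env.physics.bind(key_joints).qvel  # angular velocity of each key at this step
print(key_qvel)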

PIG Dataset Issue Tracker

The PIG dataset has a few issues:

  • Wrong tempo: a song's tempo does not match the original piece's tempo.

    • Nocturne
  • Inconsistent sustain: a song can have the sustain pedal baked into the notes, which means the hands need to reach more notes than is physically possible at a given timestep.

    • Gymnopédie No. 1

We need to fix these issues or find all the affected songs so that they can be excluded from the benchmark score calculation.

Bugs with MacOS

Running the ipynb locally, I get different results than what I get on Colab.

macOS: (screenshot attached)

Colab: (screenshot attached)

Gym env version?

Hi,

Is there a gym or gymnasium version of the tasks as mentioned in the paper?


OSError from libwayland-client

I installed from source on WSL2 running Ubuntu 20.04. The installation throws no warnings or errors, but when I try to run "make test" or the tutorial notebook and import robopianist, I get the following error:

ERROR robopianist/suite/tasks/self_actuated_piano_test.py - OSError: /lib/x86_64-linux-gnu/libwayland-client.so.0: undefined symbol: ffi_type_uint32, version LIBFFI_BASE_7.0
