
Embodied Scene-aware Human Pose Estimation

Official implementation of the NeurIPS 2022 paper: "Embodied Scene-aware Human Pose Estimation". In this work, we estimate 3D poses from a simulated agent's proprioception and scene awareness, together with external third-person observations.

[Paper] [Website] [Video]

News 🚩

[June 18, 2023] Releasing the in-the-wild demo. Note that this is mainly a proof of concept: EmbodiedPose is good at recovering global motion but not at capturing high-frequency movement.

[March 31, 2023] Training code released.

[February 25, 2023] Evaluation code released.

Introduction

In this project, we develop "Embodied Human Pose Estimation", where we control a simulated humanoid to move through a scene and estimate human pose. Using 2D keypoints and scene information (camera pose and scene geometry) as input, we estimate global pose in a causal (online) fashion.

Dependencies

To create the environment, follow these steps:

  1. Create a new conda environment and install PyTorch:
conda create -n embodiedpose python=3.8
conda install pytorch torchvision torchaudio pytorch-cuda=11.6 -c pytorch -c nvidia
pip install -r requirements.txt
  2. Download and set up MuJoCo 2.1.0:
wget https://github.com/deepmind/mujoco/releases/download/2.1.0/mujoco210-linux-x86_64.tar.gz
tar -xzf mujoco210-linux-x86_64.tar.gz
mkdir ~/.mujoco
mv mujoco210 ~/.mujoco/
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/.mujoco/mujoco210/bin
  3. Download and install the Universal Humanoid Controller locally and follow its instructions to set up the data and download the models. ❗️❗️❗️Make sure you have UHC running locally before proceeding (a quick import check follows this list):
git clone git@github.com:ZhengyiLuo/UniversalHumanoidControl.git
cd UniversalHumanoidControl
pip install -e .
bash download_data.sh
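
Before moving on, you can quickly confirm the pieces are in place by importing them from Python. This is only a sketch: it assumes mujoco-py is the MuJoCo binding installed via requirements.txt and that UHC installs as the uhc package.

import torch
print("CUDA available:", torch.cuda.is_available())

import mujoco_py  # needs LD_LIBRARY_PATH to include ~/.mujoco/mujoco210/bin
print("mujoco_py version:", mujoco_py.__version__)

import uhc  # installed above with `pip install -e .`
print("UHC imported from:", uhc.__file__)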

Data processing for evaluating/training EmbodiedPose

EmbodiedPose is trained on a combination of the AMASS, kinpoly, and H36M motion datasets. We generate paired 2D keypoints from the motion capture data and randomly sampled camera information. Use the following script to download the trained models, evaluation data, and pretrained HuMoR models.

bash download_data.sh

You will need to download the SMPL model files from SMPL, SMPL-H, and SMPL-X.
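
To verify the model files are readable, you can load one with the smplx package. This is a sketch: smplx and the data/smpl directory layout are assumptions, so point model_path at wherever you placed the downloaded files.

import smplx

# "data/smpl" is an assumed location, not a repo requirement
model = smplx.create(model_path="data/smpl", model_type="smpl", gender="neutral")
print(model)  # prints the layer configuration if the files load correctly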

If you wish to train EmbodiedPose, first download the AMASS dataset from AMASS. Then run the following script on the unzipped data:

python uhc/data_process/process_amass_raw.py

which dumps the data into the amass_db_smplh.pt file. Then, run

python uhc/data_process/process_amass_db.py

For processing your own SMPL data for training, you can refer to the required data fields in process_amass_db.py.
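
To see what those fields look like, you can inspect one sequence from the processed database. This is a sketch: the path and the joblib serialization are assumptions (fall back to torch.load if joblib fails), and process_amass_db.py remains the source of truth for the exact keys.

import joblib

# assumed: the db is a dict keyed by sequence name
amass_db = joblib.load("amass_db_smplh.pt")
name, seq = next(iter(amass_db.items()))
print("sequence:", name)
for key, val in seq.items():
    shape = getattr(val, "shape", None)
    print(f"  {key}: {shape if shape is not None else type(val).__name__}")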

Evaluation

python scripts/eval_scene.py --cfg tcn_voxel_4_5 --epoch -1

Evaluate on In-the-wild Data

To run EmbodiedPose on in-the-wild data (mainly as a proof of concept, since EmbodiedPose is good at recovering global motion but not at capturing high-frequency movement), we use HybrIK to generate the camera information, 2D keypoints, and initialization pose. We do not use HybrIK's 3D pose estimates (though directly imitating them with UHC is possible) and only use the 2D keypoints.

First, run HybrIK on the in-the-wild video following their instructions:

python scripts/demo_video.py --video-name assets/demo_videos/embodied_demo.mp4 --out-dir sample_data/res_dance --save-pk 

Using the saved pk file, we further process it into the format EmbodiedPose can use with the script process_hybrik_data.py. Details on how to debug this script can be found in the notebook in_the_wild_poc.ipynb.

python scripts/process_hybrik_data.py --input sample_data/res_dance/res.pk --output sample_data/wild_processed.pkl
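
Before evaluation, you can sanity-check the processed file. A minimal sketch, assuming the pickle holds a dict of sequences (the exact keys, such as 2D keypoints and camera parameters, are defined by process_hybrik_data.py):

import joblib

data = joblib.load("sample_data/wild_processed.pkl")
for seq_name, seq in data.items():
    print(seq_name, sorted(seq.keys()))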

Finally, run the following script to evaluate the model on the in-the-wild data.

python scripts/eval_scene.py --cfg tcn_voxel_4_5 --epoch -1 --data sample_data/wild_processed.pkl

Training

python scripts/train_models.py --cfg tcn_voxel_4_5 

Citation

If you find this work useful for your research, please cite our paper:

@inproceedings{Luo2022EmbodiedSH,
  title={Embodied Scene-aware Human Pose Estimation},
  author={Zhengyi Luo and Shun Iwase and Ye Yuan and Kris Kitani},
  booktitle={Advances in Neural Information Processing Systems},
  year={2022}
}

References

This repository is built on top of the following amazing repositories:

  • State transition code is from: HuMoR
  • Part of the UHC code is from: rfc
  • SMPL models and layers are from: SMPL-X model
  • Feature extractors are from: SPIN
  • NN modules are from (khrylib): DLOW
