
hmr's Introduction

End-to-end Recovery of Human Shape and Pose

Angjoo Kanazawa, Michael J. Black, David W. Jacobs, Jitendra Malik CVPR 2018

Project Page

Requirements

  • Python 2.7
  • TensorFlow, tested on version 1.3 (the demo alone also runs with TF 1.12)

Installation

Linux Setup with virtualenv

virtualenv venv_hmr
source venv_hmr/bin/activate
pip install -U pip
deactivate
source venv_hmr/bin/activate
pip install -r requirements.txt

Install TensorFlow

With GPU:

pip install tensorflow-gpu==1.3.0

Without GPU:

pip install tensorflow==1.3.0

Windows Setup with python 3 and Anaconda

This is only partially tested.

conda env create -f hmr.yml

If you need to get chumpy, use this pinned commit:

https://github.com/mattloper/chumpy/tree/db6eaf8c93eb5ae571eb054575fb6ecec62fd86d
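
One option (my suggestion, not from the original README) is to install that pinned commit directly with pip:

pip install git+https://github.com/mattloper/chumpy.git@db6eaf8c93eb5ae571eb054575fb6ecec62fd86d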

Demo

  1. Download the pre-trained models
wget https://people.eecs.berkeley.edu/~kanazawa/cachedir/hmr/models.tar.gz && tar -xf models.tar.gz
  2. Run the demo
python -m demo --img_path data/coco1.png
python -m demo --img_path data/im1954.jpg

Images should be tightly cropped so that the height of the person is roughly 150px. On images that are not tightly cropped, you can run OpenPose and supply its output json (run it with the --write_json option). When json_path is specified, the demo computes the right scale and bbox center to run HMR:

python -m demo --img_path data/random.jpg --json_path data/random_keypoints.json

(The demo only runs on the most confident bounding box; see src/util/openpose.py:get_bbox.)
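
For reference, here is a rough sketch of what that scale/center computation does, based on the description above. This is an approximation, not the repo's exact get_bbox code: keep the most confident detection, take the bounding box of its visible keypoints, and scale so the person is about 150px tall.

# Rough sketch (NOT the repo's exact get_bbox code) of deriving scale/center
# from an OpenPose json so the person ends up roughly 150px tall.
import json
import numpy as np

def approx_get_bbox(json_path, vis_thresh=0.1):
    with open(json_path) as f:
        people = json.load(f)['people']
    kps = [np.array(p['pose_keypoints']).reshape(-1, 3) for p in people]
    # Keep the most confident detection (highest sum of keypoint confidences).
    kp = max(kps, key=lambda k: k[:, 2].sum())
    vis = kp[kp[:, 2] > vis_thresh, :2]
    min_pt, max_pt = vis.min(axis=0), vis.max(axis=0)
    center = (min_pt + max_pt) / 2.
    person_height = np.linalg.norm(max_pt - min_pt)
    scale = 150. / person_height
    return scale, center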

Webcam Demo (thanks @JulesDoe!)

  1. Download the pre-trained models as above.
  2. Run the webcam demo.
  3. Run the demo:
python -m demo --img_path data/coco1.png
python -m demo --img_path data/im1954.jpg

Training code/data

Please see doc/train.md!

Citation

If you use this code for your research, please consider citing:

@inProceedings{kanazawaHMR18,
  title={End-to-end Recovery of Human Shape and Pose},
  author = {Angjoo Kanazawa
  and Michael J. Black
  and David W. Jacobs
  and Jitendra Malik},
  booktitle={Computer Vision and Pattern Recognition (CVPR)},
  year={2018}
}

Opensource contributions

russoale has created a Python 3 version with TF 2.0: https://github.com/russoale/hmr2.0

Dawars has created a docker image for this project: https://hub.docker.com/r/dawars/hmr/

MandyMo has implemented a pytorch version of the repo: https://github.com/MandyMo/pytorch_HMR.git

Dene33 has made a .ipynb for Google Colab that takes video as input and returns .bvh animation! https://github.com/Dene33/video_to_bvh


layumi has added a 2D-to-3D color mapping function to the final obj: https://github.com/layumi/hmr

I have not tested them, but the contributions are super cool! Thank you!! Let me know if you have any mods that you would like to be added here!


hmr's Issues

How to run this code in real time

Thank you for your hard work. I ran your code and everything works fine, but I want to run this code in real time using my webcam. How can I do that?
thanks

3D Pose Performance on Human3.6M

Hi @akanazawa , I wonder how you got the numbers in table 1 and table 2.

I used the ground-truth bounding box provided by Human3.6M to crop the person, and I used the util code to scale it and do padding. Then I passed it to the network with the provided pre-trained model, but the mean reconstruction error is around 77mm (in Protocol 2).

I wonder what could be wrong. I only use the video sequence from camera 60457274, and I evaluate 1 frame from every 5 frames.

Thanks.

The number of poses in the datasets is incorrect.

I have downloaded the MoSh data (named mosh_data.tar.gz) mentioned in https://github.com/akanazawa/hmr/blob/master/doc/train.md.

However, when I use it, I find that the number of poses saved in neutrSMPL_CMU and neutrSMPL_H3.6 is larger than the number noted in the paper. Only neutrSMPL_jointLim (which I presume is the PosePrior dataset in the paper) matches.

Namely, as shown below:

  • neutrSMPL_CMU: 3,934,267 (3.9M) in the downloaded dataset, 390,000 (150K) noted in the paper.

  • neutrSMPL_H3.6: 1,872,173 (1.8M) in the downloaded dataset, 150,000 (150K) noted in the paper.

  • neutrSMPL_jointLim: 181,968 (180K) in the downloaded dataset, 180,000 (180K) noted in the paper.

Render bugs occurs when vertices are more than 20000

Hi Professor,
I am a master's student at Zhejiang University working on human face reconstruction, and I tried to use your code to render a face mesh. When I replace the human body vertices and face indices in your code with the BFM model, which has almost 50,000 vertices and 90,000 faces, the renderer doesn't work well: faces whose vertex indices are greater than about 20,000 (not precise, maybe 19,000) become black.
(image: mesh_rst0_0)
Other experiments also showed that when there are more than about 20,000 vertices (again not precise, maybe 19,000), faces that use these large-index vertices trigger this bug.
For example, in one experiment I split the face mesh into 3 parts, each containing no more than 17,000 vertices and no more than 32,500 faces, and rendered these 3 parts one after another by calling your rendering interface in a loop, setting the previous output as the background of the current step, so the 3 parts are rendered together in one image. The result is as follows:
(image: mesh_rst0_0)

I am sure the face mesh is correct, and I haven't modified the code except to replace the mesh vertices and faces with my BFM face model.
Can you give me some advice, and could you run an experiment on a large mesh with more than 20,000 vertices?
Thank you very much!

line 167 of file 'mpii_to_tfrecords.py'

Thanks for sharing the code.

Could you have a look at line 167 of 'mpii_to_tfrecords.py'? It looks like it should be moved outside of the outer if/else block.

How can I generate moshed data by myself?

In train.md I see that you generate MoSh data for the CMU dataset, Human3.6M and mpi-3d, but on that page I can't find how the MoSh data is generated. How can I generate moshed data by myself?

questions about learning rate

Hi,
I have two questions about the learning rate

  1. In your paper, you said the learning rate of the encoder is set to 1e-5, but in your code the learning rate is set to 1e-3. I am puzzled about this and wondering which one is the actual learning rate I should use.
  2. Have you adjusted the learning rate during training? I haven't found this kind of operation in your code.

Thanks.

Regarding mesh from smpl

Hi
Can I use the male or female pkl from SMPL rather than neutral_smpl_with_cocoplus_reg.pkl?
If so, what changes will I need to make in your code?

Thanks!

Get 3D model

Thank you for open-sourcing the code.
I could get images of the model, but is there any way to get the 3D model itself?
I'd like to view the model from many directions.
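
For anyone who just wants to inspect the result from arbitrary viewpoints: this is not part of the repo, but since the demo's model.predict() returns the mesh vertices, you can write them out as a standard .obj together with the SMPL face array (for example taken from the SMPL pkl or the renderer's face file) and open it in MeshLab or Blender. A minimal sketch:

# Hypothetical helper: save a predicted SMPL mesh as a Wavefront .obj file.
def save_obj(path, verts, faces):
    # verts: (6890, 3) float array; faces: (F, 3) int array, 0-indexed.
    with open(path, 'w') as f:
        for v in verts:
            f.write('v %f %f %f\n' % (v[0], v[1], v[2]))
        for tri in faces:
            # .obj face indices are 1-based.
            f.write('f %d %d %d\n' % (tri[0] + 1, tri[1] + 1, tri[2] + 1))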

how to get angle of 3d joints?

hello @akanazawa ,
The rendered image is as follows:
(image)
How can I get the joints' (x, y, z) coordinates or angles of rotation? I want to use them to model a human skeleton in Maya.

joints, verts, cams, joints3d, theta = model.predict()
I tried to use joints3d to model it, but it is wrong.
Could you give me some suggestions?
thanks!
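
Not an official answer, but a hedged sketch of one way to get per-joint rotations: as noted in another issue on this page, theta is the 85-D vector [camera (3), pose (72), shape (10)], so theta[3:75] holds 24 axis-angle rotations along the SMPL kinematic tree. Those can be converted to rotation matrices or Euler angles for a tool like Maya, for example with SciPy (assuming SciPy is installed):

# Sketch: convert HMR's theta[3:75] (24 axis-angle joint rotations) to Euler angles.
import numpy as np
from scipy.spatial.transform import Rotation

def pose_to_euler_degrees(theta):
    pose = np.asarray(theta).reshape(-1)[3:75].reshape(24, 3)  # axis-angle per joint
    return Rotation.from_rotvec(pose).as_euler('xyz', degrees=True)  # (24, 3)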

can not find "mosh/gt3d" from downloaded h36m tfrecords files

I downloaded the pre-computed tfrecords files for Human3.6M, which are supposed to have paired 3D joint annotations included, but I cannot find the "mosh/gt3d" features in any .tfrecord file.

The way I checked tfrecords content:

import tensorflow as tf
# Print the feature keys stored in each Example of a downloaded tfrecord file.
for example in tf.python_io.tf_record_iterator("downloaded_tfrecord_file"):
    result = tf.train.Example.FromString(example)
    print(result.features.feature.keys())

coco_to_tfrecords

1) For generating tf_records for the COCO dataset, src/datasets/coco_to_tfrecords.py expects pycocotools.coco, which was not found while running "prepare_datasets.sh".

2) "from .common import convert_to_example, ImageCoder, resize_img" was also not found on the path.

Compensation for changing camera parameters in a video sequence

I am running HMR on a video sequence. After completion, I get Theta (85×1) for each frame. For visualization, I used Theta[3:75] for the SMPL pose and Theta[75:85] for the SMPL shape. The visualized mesh sequence seems to be incorrect, and I think it is because I am not taking care of the inferred camera parameters, i.e. Theta[0:3].
I am not sure what exactly the get_original function in renderer.py does, but I suspect it compensates for the camera parameters. So I tried visualizing the vertices returned by get_original, and it looks a bit better. Am I correct?
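
For context, here is my unofficial reading of the camera handling, as a sketch under the assumption that Theta[0:3] is a weak-perspective camera [s, tx, ty] defined in the 224px crop's normalized coordinates: projection is roughly 2D = s * (X[:, :2] + [tx, ty]), and get_original then undoes the crop and scaling to map results back to the original image.

# Sketch of HMR-style weak-perspective projection (my reading; verify against the code).
import numpy as np

def weak_perspective_project(points3d, cam):
    # points3d: (N, 3) in the crop frame; cam = [s, tx, ty] from Theta[0:3].
    s, tx, ty = cam
    return s * (points3d[:, :2] + np.array([tx, ty]))  # (N, 2) in normalized crop coords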

code for human part segmentation

Great work, and thanks for making the code publicly available.
Human part segmentation was mentioned in your paper, and I am wondering whether you have made that part of the work publicly available?

Not working on fat men

Hi,

We have been trying your code recently with different images of our own, and we have noticed that the shape parameters don't seem to adjust correctly.
For example, we have tried with pictures of different fat men, and even though the pose is almost correctly detected, the shape is always thin and not adapted to the actual body shape.
Is there any limitation in the demo code? Or is it due to a limitation of the SMPL model? Could it be because you are using the neutral SMPL model?

In the picture you can see what we mean.

(image: fatman)

Thank you,

Alex B.

Can't create a female mesh

I have tried looking for the shape space as you suggested in #15, but I couldn't understand it. Would you please elaborate on how to use the female SMPL here?

Running this code on videos

Hi Team,

I was reading this paper and saw that you presented this work for videos along with images. I am interested to learn how you train on videos. Is it the same way as for images, or are there additional steps for running this on videos?

Thanks,
Teja

question about 'gt3d' label

Hi @akanazawa, I found that the 'gt3d' labels are weird...

  1. The 'gt3d' labels in h36m are centered at zero, which matches the augmentation code 'reflect_joints3d'. But the 'gt3d' labels in mpi_inf_3dhp are not centered... I guess this will lead to incorrect labels when performing random_flip.
  2. The 'gt3d' labels in h36m differ from the joints computed from the 'pose' and 'shape' labels in h36m.
  3. In addition, the scale and orientation of gt3d in mpi_inf_3dhp seem to differ from h36m.

To summarize, I found that all the 'gt3d' labels are not in line with my expectations, which would confuse the training. But it seems the final results are not so bad... Is there a problem with my understanding? Any explanation?

Not able to reproduce result.

System details:

Ubuntu 18.04

I have been trying to run the demo file since last night, but it still does not show anything in the output; it just stops without any error, dropping into a Python console.

Here is the terminal logs.
mirrorsize@mirrorsize-Latitude-E6420:~/HMR$ python demo.py --img_path /home/mirrorsize/HMR/data/random.jpg
Iteration 0
Iteration 1
Reuse is on!
Iteration 2
Reuse is on!
Restoring checkpoint /home/mirrorsize/HMR/src/../models/model.ckpt-667589..
Resizing so the max image size is 224..
/home/mirrorsize/HMR/src/util/renderer.py:313: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
if np.issubdtype(image.dtype, np.float):
--Return--
None

/home/mirrorsize/HMR/demo.py(91)visualize()
90 import ipdb
---> 91 ipdb.set_trace()
92

ipdb>

Can anyone help with this?

can't untar models.tar.gz: `gzip: stdin: not in gzip format`

Hi,

I downloaded models.tar.gz from https://people.eecs.berkeley.edu/~kanazawa/cachedir/hmr/models.tar.gz and then ran tar -xvzf and got the following error. See the stack trace:

tar -xvzf models.tar.gz

gzip: stdin: not in gzip format
tar: Child returned status 1
tar: Error is not recoverable: exiting now

I'm surprised no one else has reported this problem. I must be overlooking something stupid. System info: Ubuntu 16.04, 64-bit.

Thank you!

Why do you ignore global rotation in SMPL?

hello @akanazawa ,
When I scanned your code in 'batch_smpl.py', I found that you ignore the global rotation when adding the pose blend shapes. Have you changed this part from the official SMPL version?

code:
# 3. Add pose blend shapes
# N x 24 x 3 x 3
Rs = tf.reshape(
    batch_rodrigues(tf.reshape(theta, [-1, 3])), [-1, 24, 3, 3])
with tf.name_scope("lrotmin"):
    # Ignore global rotation.
    pose_feature = tf.reshape(Rs[:, 1:, :, :] - tf.eye(3), [-1, 207])

Frame of reference for 3d keypoints

What is the frame of reference with respect to which the 3D keypoints are expressed? Is this frame of reference the same across all input images?

I am trying to do perspective correction on the 3D keypoints, and I need to know the frame of reference to do this. Thanks in advance!

Can't run this with the json keypoints from OpenPose

2018-08-24 10:23:44.654694: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2018-08-24 10:23:44.743349: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:898] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2018-08-24 10:23:44.744012: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1356] Found device 0 with properties:
name: GeForce 920M major: 3 minor: 5 memoryClockRate(GHz): 0.954
pciBusID: 0000:01:00.0
totalMemory: 1.96GiB freeMemory: 1.57GiB
2018-08-24 10:23:44.744031: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1435] Adding visible gpu devices: 0
2018-08-24 10:23:45.020994: I tensorflow/core/common_runtime/gpu/gpu_device.cc:923] Device interconnect StreamExecutor with strength 1 edge matrix:
2018-08-24 10:23:45.021074: I tensorflow/core/common_runtime/gpu/gpu_device.cc:929] 0
2018-08-24 10:23:45.021086: I tensorflow/core/common_runtime/gpu/gpu_device.cc:942] 0: N
2018-08-24 10:23:45.021292: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1053] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 1345 MB memory) -> physical GPU (device: 0, name: GeForce 920M, pci bus id: 0000:01:00.0, compute capability: 3.5)
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/contrib/slim/python/slim/nets/resnet_v2.py:224: calling reduce_mean (from tensorflow.python.ops.math_ops) with keep_dims is deprecated and will be removed in a future version.
Instructions for updating:
keep_dims is deprecated, use keepdims instead
Iteration 0
Iteration 1
Reuse is on!
Iteration 2
Reuse is on!
Restoring checkpoint /home/mirrorsize/hmr/src/../models/model.ckpt-667589..
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"main", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/home/mirrorsize/hmr/demo.py", line 143, in
main(config.img_path, config.json_path)
File "/home/mirrorsize/hmr/demo.py", line 123, in main
input_img, proc_param, img = preprocess_image(img_path, json_path)
File "/home/mirrorsize/hmr/demo.py", line 108, in preprocess_image
scale, center = op_util.get_bbox(json_path)
File "src/util/openpose.py", line 19, in get_bbox
kps = read_json(json_path)
File "src/util/openpose.py", line 13, in read_json
kp = np.array(people['pose_keypoints']).reshape(-1, 3)
KeyError: 'pose_keypoints'
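
A possible workaround (my own sketch, not part of the repo): newer OpenPose releases write the key 'pose_keypoints_2d' instead of 'pose_keypoints', which is exactly what this KeyError suggests. read_json in src/util/openpose.py can be made to accept either key, for example:

# Hypothetical tweak to src/util/openpose.py:read_json -- accept both OpenPose key names.
import json
import numpy as np

def read_json(json_path):
    with open(json_path) as f:
        data = json.load(f)
    kps = []
    for people in data['people']:
        # Older OpenPose: 'pose_keypoints'; newer releases: 'pose_keypoints_2d'.
        raw = people.get('pose_keypoints', people.get('pose_keypoints_2d'))
        kps.append(np.array(raw).reshape(-1, 3))
    return kps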

Cocoplus joint order

Hi,

Would it be possible to share the joint order of the 19 cocoplus joints returned by the SMPL model code, as well as the 24 joints of the original SMPL model?

Is the ground truth mosh data of h36m incorrect?

I tried the following code, and the visualization results show that the global rotation of the ground-truth MoSh data doesn't correspond to the image. Did I get something wrong, or is the data incorrect?

flength = 1000.
renderer = SMPLRenderer(img_size=224, flength=flength)
smpl_model = SMPL('neutral_smpl_with_cocoplus_reg.pkl')

fqueue = tf.train.string_input_producer(
    ['/data/tf_datasets/tf_records_human36m_wjoints/train/h36m_train_mixed_0000.tfrecord'])
reader = tf.TFRecordReader()
_, example_serialized = reader.read(fqueue)
image_, image_size_, label_, center_, fname_, pose_, shape_, gt3d_, has_smpl3d_ = data_utils.parse_example_proto(
    example_serialized, has_3d=True)

pose_ph = tf.placeholder(tf.float32, [None, 72])
shape_ph = tf.placeholder(tf.float32, [None, 10])
verts_, joints_, Rs_ = smpl_model(shape_ph, pose_ph, True)
init = tf.global_variables_initializer()
sess = tf.train.MonitoredTrainingSession()
sess.run(init)

while 1:
    image, image_size, label, center, fname, pose, shape, gt3d, has_smpl3d = sess.run(
        [image_, image_size_, label_, center_, fname_, pose_, shape_, gt3d_, has_smpl3d_])
    verts = sess.run(verts_, feed_dict={pose_ph: np.expand_dims(pose, 0), shape_ph: np.expand_dims(shape, 0)})
    vert = verts[0]
    vert_shift = np.array([[0., 0., flength / 112.]])
    vert = vert + vert_shift
    rendered_img = renderer(vert, do_alpha=False)
    cv2.imshow('a', rendered_img)
    cv2.imshow('b', cv2.cvtColor((image*255).astype(np.uint8), cv2.COLOR_RGB2BGR))
    cv2.waitKey()
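
For reference, the snippet above roughly assumes imports along these lines; the module paths are my guess from the repo layout and may need adjusting:

# Assumed imports for the snippet above; paths are approximate.
import cv2
import numpy as np
import tensorflow as tf

from src.util import data_utils                 # parse_example_proto
from src.util.renderer import SMPLRenderer
from src.tf_smpl.batch_smpl import SMPL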

(screenshots: rendered mesh vs. corresponding input frame)

The problem about the version of libprotobuf

I am trying to use hmr on my TX2, flashed with JetPack 3.1.
The TensorFlow I am using is 1.6 (which is a strange version, but it does support Python 2 and CUDA 8.0), and it does work on my machine.
All seemed normal until I tried to run the demo. The result follows:

[libprotobuf FATAL google/protobuf/stubs/common.cc:61] This program requires version 3.5.0 of the Protocol Buffer runtime library, but the installed version is 2.6.1. Please update your library. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "bazel-out/arm-opt/genfiles/tensorflow/contrib/tpu/proto/tpu_embedding_config.pb.cc".)
terminate called after throwing an instance of 'google::protobuf::FatalException'
what(): This program requires version 3.5.0 of the Protocol Buffer runtime library, but the installed version is 2.6.1. Please update your library. If you compiled the program yourself, make sure that your headers are from the same version of Protocol Buffers as your link-time library. (Version verification failed in "bazel-out/arm-opt/genfiles/tensorflow/contrib/tpu/proto/tpu_embedding_config.pb.cc".)
Aborted (core dumped)

Is anybody here who can help me?

about the H3.6M dataset

Can you provide the camera parameters of the H3.6M dataset you uploaded? The original H3.6M has camera parameters, but I cannot find the correspondence to them.

The question about the mosh data

Hello, I have downloaded the MoSh dataset, and I have learned that there are many pose and shape samples in it.
But I want to know the proportion of male and female subjects in the data; it is important for me. I hope you can help me solve this problem, thank you!

Using 3D labels!! exit(1)

Hi,akanazawa,

First, I want to say how much I appreciate your clear explanation; it's the most wonderful repo I have seen.
I followed your instructions without any problem until the final step. When I run "sh do_train.sh", the terminal output is

Using 3D labels!!
Dont run this without any datasets with gt 3d

~/hmr/src/data_loader.py(134)get_loader_w3d()
133 import ipdb; ipdb.set_trace()
--> 134 exit(1)
ipdb>

I don't know what this warning means. Is it an issue with the datasets I used? I used the lsp, lsp_ext, mpii and human36m datasets, not all the datasets you mention.

Thank you very much~~

gender of the model

How do you choose the gender of the model?
Do I need to set it up somewhere?
Thanks

Question about part segmentation

Hi akanazawa, I want to use the part segmentation of your model for my project. But where is the code for part segmentation that you mentioned in the paper? I can't find it; can you help me add it to your repo or guide me on how to generate it? Thanks a lot.

About the source code

Hello! I am interested in human shape regression, and I think your end-to-end regression is creative in this area. I would like to know when your code will be uploaded here, so that I can understand some details in your paper.

How to make 3D templates?

Thank you for sharing this. I want to create 3D female models, but the provided template is a male model. Can you tell me how to make 3D templates? This pkl is not the same as the SMPL pkl. Hope you can help me. Thank you very much.

3d render image can't show

Hi, I successfully set up the environment and ran the project. The input and joint-projection images are shown, but the 3D mesh overlay only shows the original RGB image, and the 3D mesh and diff vp views don't show anything. Could you please tell me why?

the list of 3dhp test set

Hi, I found that you use 2929 images for testing in your paper. However, there are more images than that in the dataset I downloaded. Could you provide a list of which images you used for testing?
Thank you!

Save output 3D model.

Hi, thanks for this brilliant project.
I am having an issue with saving the 3D model from the project.
Could you please tell me the steps to save the resulting 3D model?

Thanks

using openpose keypoints

Hello
I am trying to use the json keypoints from OpenPose as an input to your demo.py.
I have modified the src/util/openpose.py to accept 'pose_keypoints_2d' instead of 'pose_keypoints', but it seems my keypoints aren't used at all.

See these images as an example (note: I have cropped original images so they fit in this box):

first image: pose/skel render from OpenPose (BODY_25)
(image)
second image: pose/skel render from hmr
(image)
third image: pose overlay from hmr
(image)

Looking at those images, my keypoints obviously got discarded; the pose of the overlaid mesh is the same as the pose/skel rendered by hmr.

What could I have done wrong?
Thanks for this great code and any help!

HMR:
Ubuntu 18 on Windows Hyper-V
Python 2.7, TensorFlow, no GPU
$python -m demo --img_path data/me.jpg --json_path data/me_keypoints.json
OpenPose:
Windows pre-built demo from git
C:\OpenPose>bin\openposedemo.exe --image_dir I:\images -write_images I:\images\out --write_json I:\images\out
