
vibe's People

Contributors

bvoq, dependabot[bot], jellyfish1456, mkocabas, oli4jansen, silvercondor, wolterlw


vibe's Issues

Question about paper

When I read

During training, “VIBE” takes in-the-wild images as input and predicts SMPL body model parameters.

and

Then, a motion discriminator takes predicted poses along with the poses sampled from the AMASS dataset and outputs a real/fake label for each sequence.

One question that bothers me: are all poses from in-the-wild images a subset of the AMASS dataset? If not, poses in the wild are complex and varied, so how can the discriminator decide they are real using only AMASS?

MPoser

Hello,

I was very interested in the sequential VAE that was trained as a motion prior (MPoser) mentioned in the paper. Is this one also included in the published code? I could not find any references/implementation of it...

Thanks

-M

Error when using --tracking_method pose

It looks like the code uses an absolute path when looking for openposetrack.

Here is the traceback:

Traceback (most recent call last):
  File "demo.py", line 383, in <module>
    main(args)
  File "demo.py", line 83, in main
    tracking_results = run_posetracker(video_file, staf_folder=args.staf_dir, display=args.display)
  File "/home/feli/git/VIBE/lib/utils/pose_tracker.py", line 91, in run_posetracker
    staf_folder=staf_folder
  File "/home/feli/git/VIBE/lib/utils/pose_tracker.py", line 32, in run_openpose
    os.chdir(staf_folder)
FileNotFoundError: [Errno 2] No such file or directory: '/home/mkocabas/developments/openposetrack'
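For what it's worth, that path appears to be just the default value of the --staf_dir argument, so pointing it at your own openpose STAF build (the same flag used elsewhere on this page) should get past the error. A hedged example, with the checkout path being illustrative:

python demo.py --vid_file sample_video.mp4 --output_folder output/ --tracking_method pose --staf_dir /path/to/openpose_STAF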

Get animated mesh as .fbx

Is there a fast way of getting the estimated animation as .fbx (or any other format) to import into 3D software? Thanks.

Questions about bbox

Thank you for your excellent work! It performs well. I have a question about the bbox. The format of the bbox is (cx, cy, w, h). When I use it to draw a rectangle (referencing the code of your multi_person_tracker), I get a wrong result, and I found that h = w. Can you help me?
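For reference, converting a (cx, cy, w, h) box to the two corner points needed for drawing looks like the sketch below; this helper is an illustration, not code from the repo (and h == w is expected if the tracker emits square boxes):

import cv2

def draw_cxcywh_bbox(img, bbox, color=(0, 255, 0)):
    # bbox = (cx, cy, w, h): box center plus width and height
    cx, cy, w, h = bbox
    top_left = (int(cx - w / 2), int(cy - h / 2))
    bottom_right = (int(cx + w / 2), int(cy + h / 2))
    cv2.rectangle(img, top_left, bottom_right, color, thickness=2)
    return img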

Question about cam projection.

Hi mkocabas, thanks for your great work!!!
I have a question about the cam projection code:

VIBE/lib/models/spin.py

Lines 426 to 439 in 945fd10

def projection(pred_joints, pred_camera):
    pred_cam_t = torch.stack([pred_camera[:, 1],
                              pred_camera[:, 2],
                              2 * 5000. / (224. * pred_camera[:, 0] + 1e-9)], dim=-1)
    batch_size = pred_joints.shape[0]
    camera_center = torch.zeros(batch_size, 2)
    pred_keypoints_2d = perspective_projection(pred_joints,
                                               rotation=torch.eye(3).unsqueeze(0).expand(batch_size, -1, -1).to(pred_joints.device),
                                               translation=pred_cam_t,
                                               focal_length=5000.,
                                               camera_center=camera_center)
    # Normalize keypoints to [-1,1]
    pred_keypoints_2d = pred_keypoints_2d / (224. / 2.)
    return pred_keypoints_2d

When I check the HMR code, they assume a weak-perspective camera and predict the scale and translation (x, y) in the camera plane. However, in your code, it seems that you assume the scale (focal length) is always the same and predict the translation from world coordinates to camera coordinates with pred_camera. What is the difference? Could you give a more detailed explanation of 2 * 5000. / (224. * pred_camera[:, 0] + 1e-9)? Thanks!!!
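For anyone else puzzling over that term: under weak perspective, a normalized image point is x_n = s * X; under full perspective with focal length f, crop resolution R, and camera depth t_z, it is x_n = (2f / R) * X / t_z. Equating the two gives t_z = 2f / (R * s), which with f = 5000 and R = 224 is exactly the quoted expression; the 1e-9 only guards against division by zero. A tiny numeric check (illustrative only):

import torch

focal, res = 5000., 224.
s = torch.tensor([1.0])             # predicted weak-perspective scale
t_z = 2 * focal / (res * s + 1e-9)  # depth giving the same image-plane scale
# weak perspective: x_n = s * X ; perspective: x_n = (2 * focal / res) * X / t_z
print(t_z)                          # tensor([44.6429])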

OpenGL error

An OpenGL error occurred. Please see the output below:
Thanks for your support

python demo.py --vid_file sample_2.mp4 --output_folder output/ --display
Running "ffmpeg -i sample_2.mp4 -f image2 -v error /tmp/sample_2_mp4/%06d.png"
Images saved to "/tmp/sample_2_mp4"
Input video number of frames 1013
Running Multi-Person-Tracker
100%|███████████████████████████████████████████| 85/85 [04:14<00:00, 2.99s/it]
Finished. Detection + Tracking FPS 3.99
Displaying results..
=> loaded pretrained model from 'data/vibe_data/spin_model_checkpoint.pth.tar'
Performance of pretrained model on 3DPW: 56.56075477600098
Loaded pretrained weights from "data/vibe_data/vibe_model_wo_3dpw.pth.tar"
Running VIBE on each tracklet...
100%|████████████████████████████████████████████████████████████████████████████████████████| 2/2 [02:09<00:00, 64.92s/it]
VIBE FPS: 7.80
Total time spent: 394.83 seconds (including model loading time).
Total FPS (including model loading time): 2.57.
Saving output results to "output/sample_2/vibe_output.pkl".
Rendering output video, writing frames to /tmp/sample_2_mp4_output
6%|█████▏ | 63/1013 [00:01<00:19, 48.56it/s]
Traceback (most recent call last):
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/latebind.py", line 41, in __call__
    return self._finalCall( *args, **named )
TypeError: 'NoneType' object is not callable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "demo.py", line 377, in <module>
    main(args)
  File "demo.py", line 292, in main
    mesh_filename=mesh_filename,
  File "/home/riotu/VIBE/lib/utils/renderer.py", line 118, in render
    rgb, _ = self.renderer.render(self.scene, flags=render_flags)
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/pyrender/offscreen.py", line 99, in render
    return self._renderer.render(scene, flags)
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/pyrender/renderer.py", line 121, in render
    self._update_context(scene, flags)
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/pyrender/renderer.py", line 709, in _update_context
    p._add_to_context()
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/pyrender/primitive.py", line 324, in _add_to_context
    self._vaid = glGenVertexArrays(1)
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/latebind.py", line 45, in __call__
    return self._finalCall( *args, **named )
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/wrapper.py", line 657, in wrapperCall
    result = wrappedOperation( *cArguments )
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 401, in __call__
    if self.load():
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 390, in load
    error_checker = self.error_checker,
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 148, in constructFunction
    if (not is_core) and not self.checkExtension( extension ):
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/platform/baseplatform.py", line 270, in checkExtension
    result = extensions.ExtensionQuerier.hasExtension( name )
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/extensions.py", line 98, in hasExtension
    result = registered( specifier )
  File "/home/riotu/VIBE/vibe/lib/python3.7/site-packages/OpenGL/extensions.py", line 105, in __call__
    if not specifier.startswith( self.prefix ):
TypeError: startswith first arg must be bytes or a tuple of bytes, not str
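This startswith TypeError is a known PyOpenGL/Python 3 incompatibility rather than a VIBE bug; upgrading PyOpenGL is the usual fix, and explicitly selecting a rendering backend before pyrender is imported sometimes helps too. A hedged sketch, not verified against this exact setup:

import os

# Must be set before pyrender/PyOpenGL are imported.
# 'egl' uses GPU-accelerated headless rendering; 'osmesa' is pure software.
os.environ['PYOPENGL_PLATFORM'] = 'egl'

import pyrender  # imported only after the backend is chosen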

Choosing a custom color for person ID and background

Is there a simple way to modify the code so I can assign a custom color to each person ID number instead of having it randomized? Such as 0: green, 1: red, >9: random?
I found the part that assigns colors convoluted and confusing (i.e. mesh_color = ..., mc = mesh_color..., and color=mc). Also, how do I disable the background or assign a solid color to it? Any help is much appreciated, thanks!
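One simple approach (a sketch assuming the renderer just needs an RGB tuple per person; this is not existing repo code) is to replace the random lookup with a fixed dictionary keyed on person ID and fall back to a random color for higher IDs:

import colorsys
import random

# Fixed RGB colors (values in [0, 1]) for the first few person IDs.
ID_COLORS = {
    0: (0.0, 1.0, 0.0),  # green
    1: (1.0, 0.0, 0.0),  # red
}

def color_for_person(person_id):
    if person_id in ID_COLORS:
        return ID_COLORS[person_id]
    # Deterministic pseudo-random color for IDs beyond the mapping.
    rng = random.Random(person_id)
    return colorsys.hsv_to_rgb(rng.random(), 0.8, 0.9)

For the background, one option is to pass a constant-color array (e.g. np.zeros_like(img)) to the renderer in place of the original frame, so the mesh is composited over a solid color instead of the video.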

Unable to render result

Thanks for the great work. I'm running the demo in a Docker container, using the CPU for computation and Mesa offscreen software rendering for pyrender. Unfortunately, I get the following error message while rendering at this line:

https://github.com/mkocabas/VIBE/blob/master/lib/utils/renderer.py#L120

  File "/opt/vibe/lib/utils/renderer.py", line 120, in render
    output_img = rgb[:, :, :-1] * valid_mask + (1 - valid_mask) * img
ValueError: operands could not be broadcast together with shapes (1080,1920,2) (1080,1920,3)

Any help getting past this point would be much appreciated.
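A guess at the root cause: rgb[:, :, :-1] assumes the renderer returned RGBA (4 channels), but some OSMesa software-rendering builds return plain RGB, so :-1 strips a color channel and the broadcast fails. A hedged patch is to slice to exactly three channels:

# Keep exactly three color channels whether the renderer gave RGB or RGBA.
rgb = rgb[:, :, :3]
output_img = rgb * valid_mask + (1 - valid_mask) * img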

pyrenderer Error!

Hi! Here is my problem: when I ran python demo.py --vid_file sample_video.mp4 --output_folder output/ --display, I got the following error:

Traceback (most recent call last):
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\egl.py", line 71, in EGL
    mode=ctypes.RTLD_GLOBAL
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\ctypesloader.py", line 45, in loadLibrary
    return dllType( name, mode )
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found.
Traceback (most recent call last):
  File "demo.py", line 33, in <module>
    from lib.utils.renderer import Renderer
  File "D:\Project\Python\VIBE-master\lib\utils\renderer.py", line 19, in <module>
    import pyrender
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\pyrender\__init__.py", line 3, in <module>
    from .light import Light, PointLight, DirectionalLight, SpotLight
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\pyrender\light.py", line 11, in <module>
    from .texture import Texture
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\pyrender\texture.py", line 8, in <module>
    from OpenGL.GL import *
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\GL\__init__.py", line 3, in <module>
    from OpenGL import error as _error
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\error.py", line 12, in <module>
    from OpenGL import platform, _configflags
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\__init__.py", line 35, in <module>
    _load()
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\__init__.py", line 32, in _load
    plugin.install(globals())
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\baseplatform.py", line 92, in install
    namespace[ name ] = getattr(self,name,None)
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\egl.py", line 94, in GetCurrentContext
    return self.EGL.eglGetCurrentContext
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\baseplatform.py", line 14, in __get__
    value = self.fget( obj )
  File "C:\Users\41656\Anaconda3\envs\pyVIBE\lib\site-packages\OpenGL\platform\egl.py", line 74, in EGL
    raise ImportError("Unable to load EGL library", *err.args)
ImportError: ('Unable to load EGL library', 22, 'The specified module could not be found.', None, 126, None, 'GL', None)

I ran this command on Windows 10. Thanks for your support!!

W (joint regressor from vertices)?

May I ask what, specifically, the W in the '3D Body Representation' section of the original paper is, the one that projects M(θ, β) to 3D keypoints, and what its architecture is?
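Not the authors, but in SMPL-style models W is typically not a network at all: it is a fixed pretrained linear regressor, a sparse 24 x 6890 matrix shipped with the body model (often called J_regressor), and each joint is simply a weighted sum of mesh vertices. A sketch of the operation, with random tensors standing in for real data:

import torch

batch, n_verts, n_joints = 2, 6890, 24
vertices = torch.randn(batch, n_verts, 3)    # M(theta, beta): posed mesh vertices
J_regressor = torch.rand(n_joints, n_verts)  # W: fixed per-model weights

# Each 3D joint is a linear combination of the mesh vertices.
joints = torch.einsum('jv,bvc->bjc', J_regressor, vertices)  # (batch, 24, 3)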

Temporal SMPLify Option Error

Hi,
Thank you for your awesome work.

I have a problem with Temporal SMPLify Option.

I installed VIBE and Openpose with STAF.

When I execute command like this

python demo.py --vid_file sample_video.mp4 --output_folder output/ --tracking_method pose --display --staf_dir /home/user/1.Code/openpose_STAF --run_smplify

I got this error

Error:
Cuda check failed (77 vs. 0): an illegal memory access was encountered

Coming from:

  • /home/user/1.Code/openpose_STAF/src/openpose/pose/poseExtractorNet.cpp:getHeatMapsCopy():228
  • /home/user/1.Code/openpose_STAF/src/openpose/gpu/cuda.cpp:cudaCheck():42
  • /home/user/1.Code/openpose_STAF/src/openpose/pose/poseExtractorNet.cpp:getHeatMapsCopy():234
  • /home/user/1.Code/openpose_STAF/src/openpose/pose/poseExtractor.cpp:getHeatMapsCopy():65
  • /home/user/1.Code/openpose_STAF/include/openpose/pose/wPoseExtractor.hpp:work():108
  • /home/user/1.Code/openpose_STAF/include/openpose/thread/worker.hpp:checkAndWork():93

Error:
Cuda check failed (77 vs. 0): an illegal memory access was encountered

Coming from:

  • /home/user/1.Code/openpose_STAF/src/openpose/net/nmsBase.cu:nmsGpu():345
  • /home/user/1.Code/openpose_STAF/src/openpose/gpu/cuda.cpp:cudaCheck():42
  • /home/user/1.Code/openpose_STAF/src/openpose/net/nmsBase.cu:nmsGpu():349
  • /home/user/1.Code/openpose_STAF/src/openpose/net/nmsCaffe.cpp:Forward_gpu():240
  • /home/user/1.Code/openpose_STAF/src/openpose/net/nmsCaffe.cpp:Forward():200
  • /home/user/1.Code/openpose_STAF/src/openpose/pose/poseExtractorCaffeStaf.cpp:forwardPass():411
  • /home/user/1.Code/openpose_STAF/src/openpose/pose/poseExtractor.cpp:forwardPass():53
  • /home/user/1.Code/openpose_STAF/include/openpose/pose/wPoseExtractor.hpp:work():108
  • /home/user/1.Code/openpose_STAF/include/openpose/thread/worker.hpp:checkAndWork():93
    F0107 11:18:46.491232 17293 syncedmem.hpp:22] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
    *** Check failure stack trace: ***
    F0107 11:18:46.491261 17294 math_functions.hpp:179] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
    *** Check failure stack trace: ***
    @ 0x7f7d02e960cd google::LogMessage::Fail()
    @ 0x7f7d02e97f33 google::LogMessage::SendToLog()
    @ 0x7f7d02e95c28 google::LogMessage::Flush()
    @ 0x7f7d02e960cd google::LogMessage::Fail()
    @ 0x7f7d02e98999 google::LogMessageFatal::~LogMessageFatal()
    @ 0x7f7d0235c853 caffe::SyncedMemory::mutable_cpu_data()
    @ 0x7f7d02368c95 caffe::Blob<>::Reshape()
    @ 0x7f7d023690ca caffe::Blob<>::Reshape()
    @ 0x7f7d02e97f33 google::LogMessage::SendToLog()
    @ 0x7f7d0236917c caffe::Blob<>::Blob()
    @ 0x7f7d048378cf op::ArrayCpuGpu<>::ArrayCpuGpu()
    @ 0x7f7d02e95c28 google::LogMessage::Flush()
    @ 0x7f7d049232b5 op::PoseExtractorCaffeStaf::netInitializationOnThread()
    @ 0x7f7d04932853 op::PoseExtractorNet::initializationOnThread()
    @ 0x7f7d02e98999 google::LogMessageFatal::~LogMessageFatal()
    @ 0x7f7d04914d21 op::PoseExtractor::initializationOnThread()
    @ 0x7f7d0490f7c1 op::WPoseExtractor<>::initializationOnThread()
    @ 0x7f7d0235bfca caffe::SyncedMemory::mutable_gpu_data()
    @ 0x7f7d049754c1 op::Worker<>::initializationOnThreadNoException()
    @ 0x7f7d0235e0d2 caffe::Blob<>::mutable_gpu_data()
    @ 0x7f7d049755f0 op::SubThread<>::initializationOnThread()
    @ 0x7f7d04977948 op::Thread<>::initializationOnThread()
    @ 0x7f7d02538590 caffe::CuDNNConvolutionLayer<>::Forward_gpu()
    @ 0x7f7d04977b17 op::Thread<>::threadFunction()
    @ 0x7f7d0236ff72 caffe::Net<>::ForwardFromTo()
    @ 0x7f7d03dac66f (unknown)
    @ 0x7f7d034ce6db start_thread
    @ 0x7f7d048f3215 op::NetCaffe::forwardPass()
    @ 0x7f7d0380788f clone

So I executed a simple command like this in the openpose_STAF folder:
./build/examples/openpose/openpose.bin --video examples/media/video.avi
It works well.

But when I run the command as in your code (pose_tracker.py):

./build/examples/openpose/openpose.bin --video examples/media/video.avi --model_pose BODY_21A --tracking 1 --render_pose 1 --display 2

I got the same error.
How can I solve this?

This is my Computer Environment.

  • Cuda:10.0, CuDNN:7.5.0, GTX2080TI x 4

As an additional question, does STAF stand for Spatio-Temporal Affinity Field?

I got an EGL module error while running in a virtual environment

I ran the project in an Anaconda environment with Python 3.7.
I installed PyTorch 1.4.0 and all the requirements, but after running the demo I get an error:
"unable to load EGL library ..."
on the EGL line.
I don't know what causes this, because EGL is not really a module that can be installed with pip.
Should I install CUDA or TensorFlow, or is torch alone enough?
I hope you can give me some advice.
Thanks

openpose posetracker and BODY_21A

Hello,

The posetracker runs openpose with the --model_pose BODY_21A setting. However, the STAF code does not seem to recognise that model (by default, at least).

-M

Motion Discriminator classifier

Is your feature request related to a problem? Please describe.
I am not clear on how the motion discriminator classifies fake/real sequences.

Describe the solution you'd like
The output seems to be an unnormalized vector of size 2 for each sequence. I was expecting a softmax layer so it could be used for a classification decision.

output = self.fc(torch.cat([avg_pool, max_pool], dim=1))

Additional context
Maybe its usage in the training loss?
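For context (an illustration, not necessarily VIBE's exact formulation): adversarial discriminators are often trained directly on unnormalized outputs, with no softmax. With a least-squares GAN objective, for instance, the raw output is simply regressed toward 1 for real AMASS sequences and 0 for generated ones:

import torch

def disc_loss(real_out, fake_out):
    # real_out / fake_out: raw discriminator outputs (no softmax needed)
    return ((real_out - 1) ** 2).mean() + (fake_out ** 2).mean()

def gen_adv_loss(fake_out):
    # The generator is rewarded when its fakes score close to "real" (1).
    return ((fake_out - 1) ** 2).mean()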

[FEATURE] For half-body estimation

Thanks for your great work!
I ran some demos but find it difficult to generalize to videos where only the upper body is captured. Would it be possible for you to give me some suggestions on how to modify your code to fit half-body capture?

Train code

Hi @mkocabas, many thanks for your awesome work!!
Do you have any plans to share the training code for VIBE and its HMR-related components?

2D joints from STAF output format

Hi @mkocabas, many thanks for your great work!
My question is about STAF tracker output format.
According to https://github.com/mkocabas/VIBE/blob/master/doc/demo.md#runtime-performance
we get:
joints2d (n_frames, 21, 3) # 2D keypoint detections by STAF if pose tracking enabled otherwise None

21 is the number of joints, but what is the last 3? As far as I can understand from the STAF repo, it should contain the x, y coordinates in the original frame, but what does it really contain?
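An educated guess based on the OpenPose output convention (not confirmed from the VIBE docs): each joint is stored as (x, y, confidence), i.e. pixel coordinates in the original frame plus a detection confidence score:

import numpy as np

joints2d = np.zeros((300, 21, 3))  # placeholder array with the documented shape
xy = joints2d[..., :2]             # (x, y) pixel coordinates in the original frame
conf = joints2d[..., 2]            # per-joint detection confidence in [0, 1]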

Rendering Problem

Hi @mkocabas, regarding the previous issue #27, can you tell us what your torch and other library versions were? I think I'm facing problems due to different versions. Kindly update the requirements.txt file to specify the versions. Thanks

obj output

Is it possible to get the outputs as .obj files?
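In principle yes: vibe_output.pkl stores per-frame vertices, and the SMPL face topology is fixed, so each frame can be written out as a Wavefront .obj. A minimal sketch, where the pkl path, person ID, and faces file are all illustrative assumptions:

import joblib
import numpy as np

output = joblib.load('output/sample_video/vibe_output.pkl')
verts = output[1]['verts']         # (n_frames, 6890, 3) for one tracked person
faces = np.load('smpl_faces.npy')  # (n_faces, 3) SMPL triangles, hypothetical file

def save_obj(path, v, f):
    with open(path, 'w') as fp:
        for x, y, z in v:
            fp.write(f'v {x} {y} {z}\n')
        for a, b, c in f + 1:      # .obj vertex indices are 1-based
            fp.write(f'f {a} {b} {c}\n')

for i, v in enumerate(verts):
    save_obj(f'frame_{i:06d}.obj', v, faces)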

pyrender error

Hi, here is my problem: when I run "python demo.py ....", it shows the error in the picture below:
[screenshot of the error]
Can you help me? I suspect something is wrong with the pyrender version.

3D Mesh problem

Hi @mkocabas, thanks for such great work. I've tested your project on two different PCs with different environments (Python 3.6 & 3.7 with Torch 1.2.0 and 1.4.0 on both PCs).

1st PC Configuration and Problem:

  • Ubuntu 16.04, RTX 2080 Ti, CUDA 9.0
    On this PC I am getting the following error:
    [screenshot 1]

I tested with your provided video. The data is saved to the pkl file. I checked the pkl file with this piece of code, and I am getting the following results, which suggests that the pkl file data and output are OK.
[screenshot 2]

So the problem is in the 3D rendering. It seems there is a problem with PyOpenGL using "egl" as the back-end platform. I changed it from "egl" to "osmesa", but that didn't work for me. I also checked that all the required libraries for OpenGL (FreeGLUT, OpenGLUT, etc.) are already up to date.

Second PC Configuration and Problem

  • Ubuntu 16.04 RTX2060 CUDA 9.0

On this PC rendering works, but the output video is the same as the input video with no 3D mesh, because the results are not being saved to the pkl file. The pkl file is generated but empty; there is no data in it. For more detail, kindly check the following screenshot.
[screenshot 3]

All the libraries and data are the same on both PCs, and I also tried to find the problem by debugging the code, but I'm unable to find a solution. Kindly suggest some solutions to the above problems.

  • Is there any need to download the SMPL data and add it to the project? As far as I checked, the required data is already in the project directory under vibe_data, provided by you.

Question about render the video

Hello @mkocabas, thanks for your awesome work. When I was running demo.py, something went wrong while rendering the output video. The details are in the picture below; can you help me solve the problem?
[screenshot of the error]

No yolov3.cfg included in the repository

After installing the dependencies with install_pip.sh and downloading the data with prepare_data.sh, I tried to run the demo with ./vibe-env/bin/python demo.py --vid_file ~/sample_video.mp4 --output_folder output/ and got a No such file or directory error regarding ${HOME}/.torch/config/yolov3.cfg. I tried to locate yolov3.cfg but cannot find it in this repo.
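A possible workaround: the config is part of the upstream darknet project, so fetching it manually into the expected directory should satisfy the loader (treat the exact layout as an assumption):

import os
import urllib.request

cfg_dir = os.path.expanduser('~/.torch/config')
os.makedirs(cfg_dir, exist_ok=True)
url = 'https://raw.githubusercontent.com/pjreddie/darknet/master/cfg/yolov3.cfg'
urllib.request.urlretrieve(url, os.path.join(cfg_dir, 'yolov3.cfg'))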

[BUG] Demo code crashing with temporal tracking

  1. OS: running the demo notebook on google Colab
  2. your python version: 3.6.9
  3. your pytorch version: 1.4.0

When running the demo instruction below, I have an error. It occurs when I try to run with the temporal SMPLify algo as per the demo document:

!python demo.py --vid_file sample_video.mp4 --output_folder output/ --tracking_method pose --display --run_smplify

then I get the following error (it seems some files are missing in the repository):
Running "ffmpeg -i sample_video.mp4 -f image2 -v error /tmp/sample_video_mp4/%06d.png" Images saved to "/tmp/sample_video_mp4" Input video number of frames 300 Traceback (most recent call last): File "demo.py", line 383, in <module> main(args) File "demo.py", line 83, in main tracking_results = run_posetracker(video_file, staf_folder=args.staf_dir, display=args.display) File "/content/VIBE/VIBE/lib/utils/pose_tracker.py", line 91, in run_posetracker staf_folder=staf_folder File "/content/VIBE/VIBE/lib/utils/pose_tracker.py", line 32, in run_openpose os.chdir(staf_folder) FileNotFoundError: [Errno 2] No such file or directory: '/home/mkocabas/developments/openposetrack'

Can you please help? Thanks a lot.

Question about the Demo.py

Thank you for your excellent work. I ran into a problem when running demo.py on Ubuntu with Python 3.7, as follows:

[screenshot of the error]

Could you please give me some suggestions? Thank you very much.

Web Cam

Is there an option to run the pre-trained model on a webcam?

reducing the input size of the model from [1, 16, 3, 1024, 1024]

I'm trying to convert the VIBE model to ONNX and am running out of GPU memory when I do. The two most common answers to this are to get more GPUs or to reduce the model input size. I'm currently on a 4-GPU, 128 GB instance and am still running out of memory. I've tried a myriad of combinations and tried modifying the code. Is it possible to change the size of the inputs to the model from the current x = torch.randn(1, 16, 3, 2048, 2048).type(dtype), where the inputs are batch_size, seq_length, num_channels, size, size?
Thank you for your help; I appreciate it. It seems that getting a bigger instance shouldn't be necessary and that I may be missing something.

# ... imports
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')
torch_model = VIBE_Demo(
    seqlen=16,
    n_layers=2,
    hidden_size=1024,
    add_linear=True,
    use_residual=False,
).to(device)
model = 'data/vibe_data/vibe_model_w_3dpw.pth.tar'

map_location = lambda storage, loc: storage
if torch.cuda.is_available():
    map_location = None
torch_model.load_state_dict(torch.load(model, map_location=torch.device('cpu')), strict=False)
torch_model.eval()
dtype = torch.cuda.FloatTensor
x = torch.randn(1, 16, 3, 2048, 2048).type(dtype)
torch_out = torch_model(x)
with torch.no_grad():
    torch.onnx.export(torch_model,
                      x,
                      "vibe.onnx",
                      export_params=True,
                      opset_version=10,
                      do_constant_folding=True,  # constant folding for optimization
                      input_names=['input'],
                      output_names=['output'],
                      dynamic_axes={'input': {0: 'batch_size'},
                                    'output': {0: 'batch_size'}})
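One hedged observation: the demo pipeline feeds the model 224 x 224 person crops (note the 224 constants in the projection code quoted earlier on this page), so a (1, 16, 3, 2048, 2048) dummy input is vastly larger than anything the network sees in practice. Exporting with a crop-sized input should cut memory dramatically:

# Assumption: the backbone consumes 224x224 crops, as in the SPIN-based demo.
x = torch.randn(1, 16, 3, 224, 224).type(dtype)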

Missing bbox and frames

Hi @mkocabas, in issue #27 I mentioned that I tested this project on two different PCs; on the first PC rendering does not work with either osmesa or egl.
On the second PC there is no problem with pyrender and rendering works, but there are no 3D predictions because the model saves an empty pkl file.
I traced the problem to demo.py line 83: tracking_results (pointing to run_posetracker) should contain arrays of bbox and frames, but in my case (on the second PC) tracking_results is an empty dict.
run_posetracker takes "video_file", "staf_folder", and "posetrack_output_folder". For "staf_folder", is it mandatory to install OpenPose STAF and set the directory? In the current situation it points to your directory.
So, is my current problem due to STAF, or is it something else?

How can I use 3d pose

Hi, thank you for your amazing work. I have tried it and got a perfect video and vibe_output.pkl file. I don't know much about 3D meshes. Can you tell me what is saved in the file? Can I get the 3D pose from it? Looking forward to your reply.
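Until the authors reply, a sketch of reading the file: the demo saves it with joblib, keyed by person ID, and the field names below follow doc/demo.md (treat them as assumptions if your version differs):

import joblib

output = joblib.load('output/sample_video/vibe_output.pkl')
person = output[1]             # results for one tracked person ID
pose = person['pose']          # (n_frames, 72) SMPL pose parameters
betas = person['betas']        # (n_frames, 10) SMPL shape parameters
joints3d = person['joints3d']  # (n_frames, n_joints, 3) 3D joint positions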

Colab error with youtube video

Hi,
Using the Google Colab demo, I am able to install and run. However, for a YouTube link, when I run this code:
!python demo.py --vid_file https://www.youtube.com/watch?v=E7ETLLEOV4s --output_folder output/ --display

I get the following error:

Donwloading YouTube video "https://www.youtube.com/watch?v=E7ETLLEOV4s"
Traceback (most recent call last):
  File "demo.py", line 382, in <module>
    main(args)
  File "demo.py", line 58, in main
    video_file = download_youtube_clip(video_file, '/tmp')
  File "/content/VIBE/lib/utils/demo_utils.py", line 88, in download_youtube_clip
    return YouTube(url).streams.first().download(output_path=download_folder)
  File "/usr/local/lib/python3.6/dist-packages/pytube/__main__.py", line 88, in __init__
    self.prefetch_init()
  File "/usr/local/lib/python3.6/dist-packages/pytube/__main__.py", line 97, in prefetch_init
    self.init()
  File "/usr/local/lib/python3.6/dist-packages/pytube/__main__.py", line 143, in init
    mixins.apply_descrambler(self.player_config_args, fmt)
  File "/usr/local/lib/python3.6/dist-packages/pytube/mixins.py", line 96, in apply_descrambler
    for i in stream_data[key].split(',')
KeyError: 'url_encoded_fmt_stream_map'

Missing bbox and frames

@mkocabas Hi, yesterday I submitted the same issue (#37) and you closed it after your response.

In issue #37 I explained the problem, and your response was that I should install STAF for the pose tracking method.
However, bbox is the default tracking method and I am not using pose; I'm using bbox as the tracking method and still facing the same problem as in #37.
Maybe the information in issue #37 is not enough for a solution, but I need your suggestions about this problem to help me solve it.

Demo not working

I would like to thank you first for the great scientific contribution made in the paper.
I got an error when trying to run the demo. This is the error message I got:

python demo.py --vid_file https://www.youtube.com/watch?v=c4DAnQ6DtF8 --output_folder output/ --display
Donwloading YouTube video "https://www.youtube.com/watch?v=c4DAnQ6DtF8"
Traceback (most recent call last):
  File "demo.py", line 377, in <module>
    main(args)
  File "demo.py", line 58, in main
    video_file = download_youtube_clip(video_file, '/tmp')
  File "/home/bilel/1-demos/VIBE/lib/utils/demo_utils.py", line 88, in download_youtube_clip
    return YouTube(url).streams.first().download(output_path=download_folder)
  File "/home/bilel/Downloads/anaconda3-2/envs/vibe-env/lib/python3.7/site-packages/pytube/__main__.py", line 88, in __init__
    self.prefetch_init()
  File "/home/bilel/Downloads/anaconda3-2/envs/vibe-env/lib/python3.7/site-packages/pytube/__main__.py", line 97, in prefetch_init
    self.init()
  File "/home/bilel/Downloads/anaconda3-2/envs/vibe-env/lib/python3.7/site-packages/pytube/__main__.py", line 143, in init
    mixins.apply_descrambler(self.player_config_args, fmt)
  File "/home/bilel/Downloads/anaconda3-2/envs/vibe-env/lib/python3.7/site-packages/pytube/mixins.py", line 96, in apply_descrambler
    for i in stream_data[key].split(',')
KeyError: 'url_encoded_fmt_stream_map'

Do you know how to solve this issue?
Thanks a lot for your support.

Animate model by pose parameters

Hi @mkocabas
I'm trying to use the pose parameters to animate my model in a 3D engine. I used the SMPL model and applied all the pose parameters to my model, but my model has some strange rotation.

These results were produced from your demo video.
Every motion looks normal; the only problem is that strange rotation.
Do you have any idea how to deal with this problem?
[screenshots 1-3]
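A common cause of this symptom (offered as a hypothesis, not a confirmed diagnosis): the first three SMPL pose parameters are the global root orientation expressed in the camera frame, so applying them verbatim inside a world-space 3D engine bakes the camera's own rotation into the character. One workaround is to strip the root orientation and re-root the character in the engine:

import numpy as np

pose = np.zeros(72)        # one frame of SMPL pose parameters (placeholder data)
global_orient = pose[:3]   # axis-angle root rotation, camera-relative
body_pose = pose[3:]       # 23 body joints x 3 axis-angle values

# Zero the camera-relative root rotation; orient the character in the engine.
pose_for_engine = np.concatenate([np.zeros(3), body_pose])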

Feet prediction quality

Dear @mkocabas Thank you for the great work!
I have watched and analysed a number of indoor videos illustrating the approach, and it looks like, while the reconstruction results are impressive overall, there are still a number of leg-positioning drawbacks, especially noticeable in the feet pose.

[frames 3120, 6100, 7359]

In some frames the feet are not detected correctly; in most of the incorrectly predicted frames the toes are raised up. IMHO, it is likely that the whole network was trained with poor feet labeling. Is that a correct assumption?

[frames 10323, 10905, 11467]

Now I am trying to understand the root cause of this issue and what can be done
to alleviate the feet prediction error.
I see that your predictor has a convolutional backbone inherited from the SPIN solution:
https://github.com/nkolot/SPIN
But I haven't figured out what data and what labelling it was trained with.
I mean, does it have only one foot joint or several? Have you retrained SPIN on your datasets?
And in the case of retraining the CNN backbone, is it necessary to retrain the temporal part of VIBE too?
Or perhaps I can leave it untouched for a while?
Thanks a lot in advance, I would greatly appreciate your response.

Question about the performance of GRU module

Hi,
I set disable_temporal to True when defining the VIBE_Demo model to test the contribution of the GRU module, but there was no difference in the results compared to when I did not set this option. Is the GRU module really useful? I didn't see ablation experiments in the paper. Do you have relevant ablation experiments verifying that the GRU module is effective? :)

Question about 3D pose

Hi @mkocabas, thanks for such a valuable repository. It's amazing work. I just tested the demo on a local video. There is no error, but the video saved in output/ is the same as the input video. I mean, there is no 3D character in the output video...
On the other hand, I tested this demo on Google Colab and it works fine, but on my local PC there are no 3D results. All the commands and command outputs are the same as on Google Colab.
Can you help me solve this problem?
