barquerogerman / flowmdm

[CVPR 2024] Official Implementation of "Seamless Human Motion Composition with Blended Positional Encodings".

Home Page: https://barquerogerman.github.io/FlowMDM/

License: Other

Python 99.85% Shell 0.15%

diffusion generative-model human-motion motion-generation human-motion-composition human-motion-extrapolation cvpr cvpr2024

flowmdm's Issues

Regarding Evaluation metric

Thank you for your amazing work!

I have a few questions about the evaluation metrics used for the transition part, specifically with the HumanML3D dataset. Given that no ground truth is available, could you please explain how FID, Div, PJ, and AUJ were calculated for this dataset?

Regarding the Peak Jerk metric, I would also like to know which values were used for the HumanML3D dataset.
Could you please share the details of the jerk calculation? Which of the 263 feature dimensions are used? Does the jerk calculation consider only the joint locations, or does it also include joint rotations? Additionally, is the delta_t in the jerk calculation defined per frame or per second?

I appreciate your time and look forward to your insights.

Best,
awdrkjlk966
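For readers with the same question, here is a minimal sketch of one way to compute jerk from joint positions with finite differences. It is an illustration, not the authors' evaluation code: the function names `jerk_magnitude` and `peak_jerk_sketch`, the `(T, J, 3)` input layout, the averaging over joints, and the `fps` parameter are all assumptions. It does show why the choice of delta_t matters: switching from per-frame to per-second units only rescales every value by `fps**3`.

```python
import numpy as np

def jerk_magnitude(joints, fps=None):
    """Per-joint jerk magnitude from joint positions.

    joints: array of shape (T, J, 3) holding 3D joint locations (assumed layout).
    fps:    if None, delta_t is one frame; otherwise delta_t is 1 / fps seconds.
    Returns an array of shape (T - 3, J).
    """
    joints = np.asarray(joints, dtype=float)
    dt = 1.0 if fps is None else 1.0 / fps
    # Jerk is the third time derivative of position, approximated here
    # by a third-order forward difference along the time axis.
    jerk = np.diff(joints, n=3, axis=0) / dt**3
    return np.linalg.norm(jerk, axis=-1)

def peak_jerk_sketch(joints, fps=None):
    """Hypothetical peak-jerk summary: average over joints, then max over time."""
    per_frame = jerk_magnitude(joints, fps).mean(axis=1)
    return per_frame.max()
```

Because the third difference of a constant-velocity trajectory is zero, any motion that moves linearly in time yields zero jerk under this definition, regardless of the units chosen for delta_t.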

Discrepancy in Model Performance Reproduction and Pretrained Model Parameters

Hello BarqueroGerman,

I'm working on replicating your model's performance but noticed a gap between my results and the pretrained model's performance. I've confirmed that my hyperparameters match the ones in your README. Could you share the pretrained model's hyperparameters to help me troubleshoot? The performance of my trained model is shown.

Thanks

Source code?

Just wondering if there's a timeline for when the source code will be available to the public? I would love to have a look and play around with it.

An error occurs when running environment.yml. What should I do?

PackagesNotFoundError: The following packages are not available from current channels:

zlib==1.2.13=h5eee18b_0
xz==5.2.6=h5eee18b_0
tk==8.6.12=h1ccaba5_0
sqlite==3.40.0=h5082296_0
setuptools==65.5.0=py38h06a4308_0
readline==8.2=h5eee18b_0
python==3.8.15=h7a1cb2a_2
pip==22.2.2=py38h06a4308_0
openssl==1.1.1s=h7f8727e_0
numpy-base==1.23.4=py38h31eccc5_0
numpy==1.23.4=py38h14f4228_0
ncurses==6.3=h5eee18b_3
mkl_random==1.2.2=py38h51133e4_0
mkl_fft==1.3.1=py38hd3c417c_0
mkl-service==2.4.0=py38h7f8727e_0
mkl==2021.4.0=h06a4308_640
libstdcxx-ng==11.2.0=h1234567_1
libgomp==11.2.0=h1234567_1
libgcc-ng==11.2.0=h1234567_1
libffi==3.4.2=h6a678d5_6
ld_impl_linux-64==2.38=h1181459_1
intel-openmp==2021.4.0=h06a4308_3561
certifi==2022.9.24=py38h06a4308_0
ca-certificates==2022.10.11=h06a4308_0
_openmp_mutex==5.1=1_gnu

Current channels:

To search for alternate channels that may provide the conda package you're
looking for, navigate to

https://anaconda.org

and use the search bar at the top of the page.
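Every package in the error above is pinned with a build string (the part after the second `=`, such as `h5eee18b_0`), and build strings are specific to one platform and channel. One common workaround, offered here as a sketch rather than the maintainers' recommended fix, is to strip the build strings so conda can resolve a compatible build itself. The helper names `strip_build` and `strip_builds` are hypothetical, and the script assumes one spec per line in either `name==version=build` or `- name=version=build` form.

```python
import re

# Matches conda specs such as "zlib==1.2.13=h5eee18b_0"
# or YAML list items such as "  - zlib=1.2.13=h5eee18b_0".
_SPEC = re.compile(r"^(\s*(?:-\s*)?)([A-Za-z0-9_.\-]+)(==?)([^=\s]+)=[^=\s]+\s*$")

def strip_build(line):
    """Drop the trailing build string from a single conda spec line.

    Lines that do not look like a build-pinned spec are returned unchanged.
    """
    match = _SPEC.match(line)
    if not match:
        return line
    prefix, name, op, version = match.groups()
    return f"{prefix}{name}{op}{version}"

def strip_builds(text):
    """Apply strip_build to every line of an environment file's contents."""
    return "\n".join(strip_build(line) for line in text.splitlines())
```

Note that some of the listed packages (for example `ld_impl_linux-64` and `libgcc-ng`) exist only for Linux, so on another operating system they would have to be removed from the file rather than merely unpinned.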

BVH file as an output

Congratulations on the awesome work!

Could you please provide code to export the output as a BVH file?

Motion with fingers

Just curious to know, do you have any plans to add motion with fingers?

How can I reduce GPU memory usage in generation?

I've noticed that FlowMDM consumes over 14GB of GPU memory during the generate phase, which is much higher than the original MDM. What could be the reason for this increased memory consumption? Is there a way to reduce the memory usage so that it can run on a GPU with only 8GB of memory?

What's going on with the ffmpeg?

When I tried to run the composition demo, I ran into the following problem:

Why split query, key, value into rotary and non-rotary parts?

I am intrigued by the code on line 943 of the file 'FlowMDM/model/x_transformers/x_transformers.py':

 (ql, qr), (kl, kr), (vl, vr) = map(lambda t: (t[..., :l], t[..., l:]), (q, k, v)) # split query, key, value into rotary and non-rotary parts

Could you please explain the rationale behind splitting the query, key, and value into rotary and non-rotary parts? I would appreciate your insight. Thank you!
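For context, the quoted line follows a pattern often called partial rotary embedding: the rotary position encoding covers only the first `l` feature dimensions, and the remaining dimensions pass through without positional rotation. The NumPy sketch below illustrates that pattern only; it is not the repository's code, and the names `rotary_freqs` and `apply_partial_rotary`, the base of 10000, and the half-split rotation layout are assumptions.

```python
import numpy as np

def rotate_half(x):
    """Map the two halves (x1, x2) of the last axis to (-x2, x1)."""
    x1, x2 = np.split(x, 2, axis=-1)
    return np.concatenate([-x2, x1], axis=-1)

def rotary_freqs(seq_len, rot_dim, base=10000.0):
    """Rotation angles of shape (seq_len, rot_dim); rot_dim must be even."""
    inv_freq = 1.0 / (base ** (np.arange(0, rot_dim, 2) / rot_dim))
    angles = np.outer(np.arange(seq_len), inv_freq)   # (seq_len, rot_dim / 2)
    return np.concatenate([angles, angles], axis=-1)  # (seq_len, rot_dim)

def apply_partial_rotary(t, freqs):
    """Rotate only the first l = freqs.shape[-1] features of t.

    t: array of shape (seq_len, dim) with dim >= l.
    The trailing dim - l features are returned unchanged.
    """
    l = freqs.shape[-1]
    tl, tr = t[..., :l], t[..., l:]  # rotary and non-rotary parts
    tl = tl * np.cos(freqs) + rotate_half(tl) * np.sin(freqs)
    return np.concatenate([tl, tr], axis=-1)
```

With this layout, the dot product between the rotated parts of a query at position m and a key at position n depends only on m - n, while the untouched trailing features carry position-independent content.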

A possibly erroneous motion occurs when I use a modified input.

When I modified the sequences in the file "composition_babel.json", I got a strange result.
These are my modifications:
Lines 43-48:
"kung fu pose"--->"kung fu pose",
"kung fu pose"--->"dance",
"kung fu pose"--->"lie down",
"step left"--->"stand up",
"throw baseball"--->"throw baseball",
"catch the ball"--->"catch the ball"

The result is here. May I ask whether this is normal?
https://github.com/BarqueroGerman/FlowMDM/assets/72643015/97122869-7fe9-4c2c-a592-998b7b013553

Regarding GT Jerk Computations

Hello. Thanks for the great work! 🙂
Could you explain why the GT jerk values are constant and do not vary along the temporal axis (unlike the generated jerk values)?
