🕹️ Official Implementation of Conditional Motion In-betweening (CMIB) 🏃

Home Page: https://jihoonerd.github.io/Conditional-Motion-In-Betweening/

Python 99.66% Shell 0.34%

conditional-motion-in-betweening's People

Contributors

Stargazers

Watchers

Forkers

peterzhousz junking1 jiahongwu1995 muhammadmoizulhaq roger-peng rmila-dneg matvezy

conditional-motion-in-betweening's Issues

Hi, how can I visualize the result in Unity?

How to use the generated actions in Blender or Unreal?

Hi! Thanks for your amazing work!
The trained network works pretty well, but I'm puzzled about how to use the generated actions in Blender or Unreal.
The outputs of the network are (1) global positions and (2) global rotations (as quaternions). Using the function quat_ik, we can get local_quaternion and local_positions. I used the generated local_quaternion to create an animation in Blender. In more detail: I took an FBX downloaded from mixamo.com, deleted the original animation, and set every bone's rotation_quaternion frame by frame.
Sadly, the result looks weird and differs from the plotted images. Below is the resulting pose:

It is expected to look like the blue pose in the following image, as plotted by the CMIB code:

I guess this is because the LaFAN1 dataset does not use the popular T-pose as its rest pose. The generated quaternions are meant to rotate the vectors defined in the offsets array (in skeleton.py), not the bones of a T-pose.
So I wonder if there is a convenient way to use the generated actions in Blender or Unreal? I believe this would be extremely helpful. Looking forward to your reply! Thanks! :)
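The relationship between the global rotations and the local ones mentioned above can be sketched as follows. This is a minimal NumPy illustration, not the repository's quat_ik: the function names, the (w, x, y, z) component order, and the parent list are assumptions made for the example. It only shows that each local rotation is the parent's inverse global rotation composed with the joint's global rotation. It does not solve the rest-pose mismatch, which would still require a rig whose rest pose matches the offsets in skeleton.py, or a separate retargeting step.

```python
import numpy as np

def quat_mul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    aw, ax, ay, az = np.moveaxis(a, -1, 0)
    bw, bx, by, bz = np.moveaxis(b, -1, 0)
    return np.stack([
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    ], axis=-1)

def quat_inv(q):
    """Inverse of a unit quaternion, which is its conjugate."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def global_to_local(global_quat, parents):
    """Convert global joint rotations to parent-relative rotations.

    global_quat: array of shape (frames, joints, 4).
    parents: parents[j] is the parent index of joint j, -1 for the root.
    """
    local = global_quat.copy()
    for j, p in enumerate(parents):
        if p >= 0:
            # local_j = inverse(global_parent) * global_j
            local[:, j] = quat_mul(quat_inv(global_quat[:, p]), global_quat[:, j])
    return local
```

If a child joint has the same global rotation as its parent, its local rotation comes out as the identity quaternion (1, 0, 0, 0), which is a quick sanity check when debugging an export.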

Add hyperparameters as arguments in train.py

Root position visualizer

A 2D projection of the root position is needed to visualize trajectories.
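One possible shape for such a projection is sketched below. It assumes root positions are stored as a (frames, 3) array and that the second axis is "up" (Y-up); both are assumptions for the example, and the function names are hypothetical. The returned 2D points can be passed to any plotting library.

```python
import numpy as np

def project_root_to_ground(root_pos, up_axis=1):
    """Drop the up axis of a (frames, 3) root trajectory, giving (frames, 2)."""
    keep = [axis for axis in range(3) if axis != up_axis]
    return np.asarray(root_pos)[:, keep]

def path_length(traj_2d):
    """Total distance travelled along a 2D trajectory."""
    steps = np.diff(traj_2d, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())
```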

Experiment with unified labeling code

To cover generating motions without predefined condition, use 'unknown' condition as an additional label
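A minimal sketch of this idea, assuming conditions are string labels mapped to integer ids. The helper names and the example labels are hypothetical and not taken from the repository. The 'unknown' entry gets its own id, and any missing or unseen label falls back to it.

```python
def build_label_map(known_labels, unknown_token="unknown"):
    """Map each known label to an id and reserve the last id for 'unknown'."""
    names = sorted(set(known_labels) - {unknown_token})
    label_to_id = {name: i for i, name in enumerate(names)}
    label_to_id[unknown_token] = len(label_to_id)
    return label_to_id

def encode(label, label_to_id, unknown_token="unknown"):
    """Return the id of a label, falling back to the 'unknown' id."""
    return label_to_id.get(label, label_to_id[unknown_token])
```

With this layout the embedding or one-hot size grows by exactly one, and unconditioned generation simply passes the 'unknown' id.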

Conditioning inside LSTM, not concatenating whole inputs

Reference:

https://arxiv.org/pdf/1412.2306.pdf

Apply the new structure to test.py

Use linear probed discriminator

The current unrolled state does not handle sequential data, which may cause the discriminator to fail to capture modality.
Consider using the last cell state as a motion descriptor and as the discriminator input.

Some questions about the input of network

The input of the transformer model is [seq_len, batch_size, embedding_dim] instead of [batch_size, seq_len, embedding_dim]. What is the purpose of this design?
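For context, PyTorch's built-in transformer modules expect [seq_len, batch_size, embedding_dim] unless batch_first=True is passed, so a sequence-first layout may simply follow that default. Whether that is the reason in this repository is an assumption. The sketch below uses a small stand-in encoder with hypothetical sizes, not the CMIB model, and shows how batch-first data can be permuted in and out.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_len, batch_size, d_model = 5, 2, 8  # illustrative sizes only

# Default layout (batch_first=False): inputs are (seq_len, batch, d_model).
layer = nn.TransformerEncoderLayer(
    d_model=d_model, nhead=2, dim_feedforward=16, dropout=0.0
)
encoder = nn.TransformerEncoder(layer, num_layers=1).eval()

x_batch_first = torch.randn(batch_size, seq_len, d_model)
x_seq_first = x_batch_first.permute(1, 0, 2)  # -> (seq_len, batch, d_model)

with torch.no_grad():
    out = encoder(x_seq_first)                # same layout as the input

out_batch_first = out.permute(1, 0, 2)        # back to (batch, seq_len, d_model)
```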

Apply Weights & Biases and AMP (automatic mixed precision)

Weights & Biases is a useful machine learning tool.
All experiments can be easily compared and checked online.
https://wandb.ai/site

Using torch.amp (automatic mixed precision) should reduce our training time.
https://pytorch.org/docs/stable/amp.html
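A minimal sketch of a mixed-precision training step is shown below. The model, optimizer, and loss are placeholders rather than the CMIB training code, and mixed precision is only enabled when a CUDA device is available; on CPU the same step runs in full precision.

```python
import torch
import torch.nn as nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # this sketch enables AMP on CUDA only

model = nn.Linear(16, 4).to(device)  # placeholder model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

def train_step(x, y):
    """Run one optimization step and return the loss as a Python float."""
    optimizer.zero_grad()
    # Forward pass and loss run under autocast when AMP is enabled.
    with torch.autocast(device_type=device.type, enabled=use_amp):
        loss = nn.functional.l1_loss(model(x), y)
    # The scaler is a pass-through when disabled.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```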

Benchmark models show different l2p,l2q from the paper

I downloaded the benchmark models from the site and tested them on the LaFAN1 dataset, but the L2P and L2Q values are different from the paper. I wonder if something is wrong with my setup, or if the benchmark models were not trained with the best settings.
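When comparing numbers, it can help to check how the metrics are computed. The sketch below follows the commonly used definition (average L2 distance per frame, with positions normalized by dataset statistics for L2P); it is an assumption that the repository's evaluation matches it exactly. Differences in the normalization statistics, the evaluated frames, or the test windows can each shift the results.

```python
import numpy as np

def l2_metric(pred, target):
    """Mean over samples and frames of the per-frame L2 norm.

    pred, target: arrays of shape (samples, frames, joints, dims).
    Passing global quaternions (dims=4) gives an L2Q-style value.
    """
    diff = (pred - target).reshape(pred.shape[0], pred.shape[1], -1)
    return float(np.mean(np.linalg.norm(diff, axis=-1)))

def l2p(pred_pos, target_pos, mean, std):
    """L2P-style value: positions are normalized before the distance."""
    return l2_metric((pred_pos - mean) / std, (target_pos - mean) / std)
```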

Use Gumbel-Softmax to handle imbalanced data

Elastic InfoGAN discusses a method for using InfoGAN in an imbalanced data setting. Contrastive learning may be too difficult to adopt, but employing Gumbel-Softmax looks feasible.
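For reference, Gumbel-Softmax sampling itself is short. The NumPy sketch below only illustrates the sampling step (a relaxed, near one-hot sample from class logits); it is not gradient-enabled and is not the Elastic InfoGAN implementation. In a training setup the logits would be learnable class-prior parameters, which is an assumption about how it would be wired in here.

```python
import numpy as np

def gumbel_softmax(logits, temperature=1.0, rng=None):
    """Draw a relaxed one-hot sample from categorical logits."""
    rng = np.random.default_rng() if rng is None else rng
    logits = np.asarray(logits, dtype=float)
    # Gumbel(0, 1) noise via the inverse CDF; the lower bound avoids log(0).
    u = rng.uniform(low=1e-10, high=1.0, size=logits.shape)
    gumbel = -np.log(-np.log(u))
    y = (logits + gumbel) / temperature
    y = y - y.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(y)
    return e / e.sum(axis=-1, keepdims=True)
```

Lower temperatures push samples closer to one-hot vectors, while higher temperatures give smoother distributions.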

Unit length representation in global coordinates

Since prediction in the global coordinate system gave better visual results, converting the input representation from local to global is in progress. However, global coordinate prediction has a critical disadvantage: it cannot guarantee the length of the links.

Predicting a unit displacement for each link and then multiplying it by the link length would give a length-preserving representation in global coordinates.
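The proposal can be sketched as follows, assuming the network outputs one raw direction vector per joint and that bone lengths and a parent list are known from the skeleton. The function name and argument layout are hypothetical. Each direction is normalized to unit length and scaled by the fixed bone length, so link lengths are preserved by construction.

```python
import numpy as np

def positions_from_directions(root_pos, directions, bone_lengths, parents):
    """Rebuild global joint positions from per-joint direction predictions.

    root_pos: (3,) global root position.
    directions: (joints, 3) raw predicted displacements (any magnitude).
    bone_lengths: (joints,) fixed link lengths; entry 0 (root) is unused.
    parents: parent index per joint, with parents[0] == -1.
             Parents are assumed to appear before their children.
    """
    directions = np.asarray(directions, dtype=float)
    norms = np.linalg.norm(directions, axis=-1, keepdims=True)
    unit = directions / np.maximum(norms, 1e-8)  # unit displacement
    pos = np.zeros_like(unit)
    pos[0] = root_pos
    for j in range(1, len(parents)):
        pos[j] = pos[parents[j]] + bone_lengths[j] * unit[j]
    return pos
```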

[Hotfix] Replace label InfoGAN encoder with direct injection

The current InfoganCodeEncoder was found to be ineffective at embedding the label into the first hidden state of the LSTM. The suspected cause is that expanding a label to the size of the LSTM's hidden state (mostly >512) does not work well.

This calls for reverting to direct injection of the label by concatenating it with the input.
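Direct injection by concatenation can be sketched as below, assuming a one-hot label appended to every frame's feature vector. The function name and shapes are illustrative and not the repository's code.

```python
import numpy as np

def inject_label(features, label_id, num_labels):
    """Append a one-hot label to each frame of a (frames, feat_dim) array."""
    one_hot = np.zeros(num_labels, dtype=features.dtype)
    one_hot[label_id] = 1.0
    tiled = np.tile(one_hot, (features.shape[0], 1))  # (frames, num_labels)
    return np.concatenate([features, tiled], axis=-1)
```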

Shaking at the start and target frames when training on my own dataset

When I trained on my own data for 175 epochs, I found that the generated sequence suddenly shakes where it joins the start and target frames. How can I reduce this?

Where can I find the code for motion data augmentation?

Based on my own understanding, the training process has three parts:

Randomized Shuffled Anchor Pose: corresponds to the random mask_start_frame.
Semantic Embedding: cond_embedding in the network structure.
Motion data augmentation: I can't find the corresponding code.

Question: how does it perform on hand/finger movement and facial expressions?

I was wondering if the method also works on finer-detail movement of smaller body parts, such as hands and facial expressions.

Cool work ;)

[Major Change] Use BERT-based Transformer Encoder / Transformer Decoder

Experiments conducted so far have shown that finding the manifold of motion generation and disentangling it is difficult.

BERT not only proposed transformer-based representation learning but also introduced the Masked Language Model (MLM) task, which closely resembles our in-betweening work.

A similar approach is studied at https://arxiv.org/abs/2103.00776.

Any kind of discriminator (1D Conv, Transformer Decoder, RNN-based Decoder) can be integrated to yield a mutual information loss.
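The analogy with MLM can be sketched as a masking step: keep the context frames and the target frame, hide everything in between, and ask the model to reconstruct the hidden frames. The function below is an illustration with hypothetical names. The constant fill value is a placeholder choice, and the repository may prepare its masked input differently.

```python
import numpy as np

def mask_inbetween(motion, context_len=1, mask_value=0.0):
    """Hide the in-between frames of a (frames, feat_dim) motion clip.

    Returns the masked clip and a boolean array that is True for the
    frames the model would be asked to predict.
    """
    frames = motion.shape[0]
    keep = np.zeros(frames, dtype=bool)
    keep[:context_len] = True  # known starting context
    keep[-1] = True            # known target frame
    masked = np.where(keep[:, None], motion, mask_value)
    return masked, ~keep
```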

Support L1 loss scheduling

Current test.py does not support continuous codes

Continuous codes are uniformly distributed in the range [-1, 1].
We need test code that varies the continuous code, similar to what we already do for the discrete code.
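One way to build such a test is to sweep a single continuous dimension across [-1, 1] while holding the others fixed, then generate one motion per row. The helper below only builds the code vectors; the function name is hypothetical and the generation call is left out because it depends on the model interface.

```python
import numpy as np

def sweep_continuous_code(base_code, dim, num_steps=5, low=-1.0, high=1.0):
    """Return (num_steps, code_dim) codes that vary only dimension `dim`."""
    base = np.asarray(base_code, dtype=float)
    codes = np.tile(base, (num_steps, 1))
    codes[:, dim] = np.linspace(low, high, num_steps)
    return codes
```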
