
mint's Introduction

AI Choreographer: Music Conditioned 3D Dance Generation with AIST++ [ICCV-2021].

Overview

This package contains the model implementation and training infrastructure of our AI Choreographer.

Get started

Pull the code

git clone https://github.com/liruilong940607/mint --recursive

Note that --recursive is important, as it also clones the submodule (orbit) automatically.

Install dependencies

conda create -n mint python=3.7
conda activate mint
conda install protobuf numpy
pip install tensorflow absl-py tensorflow-datasets librosa

sudo apt-get install libopenexr-dev
pip install --upgrade OpenEXR
pip install tensorflow-graphics tensorflow-graphics-gpu

git clone https://github.com/arogozhnikov/einops /tmp/einops
cd /tmp/einops/ && pip install . -U

git clone https://github.com/google/aistplusplus_api /tmp/aistplusplus_api
cd /tmp/aistplusplus_api && pip install -r requirements.txt && pip install . -U

Note: if you run into numpy version conflicts, try pip install numpy==1.20.

Get the data

See the website

Get the checkpoint

Download the checkpoints from Google Drive here, and put them in the folder ./checkpoints/

Run the code

  1. compile protocols
protoc ./mint/protos/*.proto --python_out=.
  2. preprocess dataset into tfrecord
python tools/preprocessing.py \
    --anno_dir="/mnt/data/aist_plusplus_final/" \
    --audio_dir="/mnt/data/AIST/music/" \
    --split=train
python tools/preprocessing.py \
    --anno_dir="/mnt/data/aist_plusplus_final/" \
    --audio_dir="/mnt/data/AIST/music/" \
    --split=testval
  3. run training
python trainer.py --config_path ./configs/fact_v5_deeper_t10_cm12.config --model_dir ./checkpoints

Note: you might want to reduce the batch_size in the config file if you run into out-of-memory issues.
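As a rough illustration of the kind of edit this note suggests, assuming the .config files are protobuf text files with a batch_size field (the surrounding message name here is a guess; check your actual config for the real layout):

```
# hypothetical excerpt of configs/fact_v5_deeper_t10_cm12.config
# (protobuf text format; only the batch_size field is taken from the note above)
train_dataset {
  batch_size: 16  # lower this value if you hit out-of-memory errors
}
```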

  4. run testing and evaluation
# caching the generated motions (seed included) to `./outputs`
python evaluator.py --config_path ./configs/fact_v5_deeper_t10_cm12.config --model_dir ./checkpoints
# calculate FIDs
python tools/calculate_scores.py
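tools/calculate_scores.py reports FID-style distances between feature distributions of real and generated motion. As a rough sketch of that metric only (not the repo's actual implementation, and feature extraction is out of scope here), the Fréchet distance between Gaussians fitted to two feature sets can be computed like this:

```python
import numpy as np
from scipy import linalg

def frechet_distance(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets,
    each of shape (num_samples, feature_dim)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    # sqrtm can return tiny imaginary parts from numerical error; drop them
    covmean = linalg.sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    diff = mu_a - mu_b
    return diff @ diff + np.trace(cov_a + cov_b - 2.0 * covmean)
```

Two samples from the same distribution score near zero; a distribution shift pushes the score up.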

Citation

@inproceedings{li2021dance,
  title={AI Choreographer: Music Conditioned 3D Dance Generation with AIST++},
  author={Ruilong Li and Shan Yang and David A. Ross and Angjoo Kanazawa},
  booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
  year = {2021}
}

mint's People

Contributors

shanyangmie

mint's Issues

protoc error

When running "protoc ./mint/protos/*.proto", it outputs "Missing output directives."

Root translation wrong after 2 seconds

Hi,

Congratulations for such a great work.

We've been trying to convert the generated animations (in the output .npy files) to an fbx, and we are almost there.
Since you are normalizing the root translation, we multiply it again by the scale of our character before using it.
However, this only works for the first 2 seconds of every generated motion, after which the root translation becomes exaggerated and wrong.

Any idea why this could be happening?

Thanks,

about the crossmodal_train.txt

Sorry, I've run into a problem:
where can I get the file crossmodal_train.txt?
The code looks for it under /mnt/data/AIST/music/, but I can't find where to download it.

Evaluation stuck at initialized model.

I0928 11:08:09.961607 139665222619776 controller.py:391] restoring or initializing model...
restoring or initializing model...
I0928 11:08:09.962239 139665222619776 controller.py:397] initialized model.
initialized model.

CUDA, Tensorflow, nvcc versions?

Hello,

which CUDA, TensorFlow, and nvcc versions are you using? I am having problems with TensorFlow: it does not recognize my GPUs, which seems related to version compatibility.

Inference - some output angles seem wrong

Visualizing the output .npy files of evaluator.py according to the README, with the provided checkpoints, it seems like some of the angles (e.g. shoulders) are wrong, maybe flipped. In this clip, green is the original sample from AIST++ and red is MINT inference.

To visualize, I implemented the opposite of the operation described here, then used Blender's SMPL-X addon to visualize. Here's my code:

    from scipy.spatial.transform import Rotation as R

    # trim first 9 entries according to https://github.com/google-research/mint - mint/tools/preprocessing.py +161
    rotations = mint_data[:seq_len, 9:]
    # per-joint flattened 3x3 rotation matrices -> axis-angle per frame/joint
    rotations = rotations.reshape([-1, 3, 3])
    rotations = R.from_matrix(rotations).as_rotvec().reshape([seq_len, (joint_dim - 1) // 9, 3])
    body_pose = rotations[:, :NUM_SMPLX_BODYJOINTS]  # FIXME - not sure about that (trimming last 3 joints from SMPL's 24 to SMPL-X's 21)

How to transfer .npy file to video

Thanks for your fancy work. I ran evaluator.py and the output files are .npy. Could you please give me some suggestions on converting the output to a video?

In calculate_beat_scores.py, what should be the result_files?

Hi. After evaluation, I tried to run calculate_beat_scores.py. However, the default result_files is '/mnt/data/aist_paper_results/*.pkl', which doesn't work. Could anyone tell me how to generate the motions and replace the default result files? Thank you very much!

crossmodal_val.txt

OSError: /mint/data/aist_plusplus_final/splits/crossmodal_val.txt not found.

freezing motion video

@liruilong940607 Hi Ruilong, I ran the evaluation code with the pretrained model you provided on Google Drive, but the resulting video looks like this: the first two seconds of motion come from the seed motion, and then the motion freezes for the remaining seconds. I don't know why. Could you please give me some suggestions?

I have no protos dic

Using "protoc ./mint/protos/*.proto", I got an error: "Missing output directives."

bvh_writer instruction

Hi Ruilong,

Would it be possible to provide instructions for bvh_writer?
I couldn't find the skeleton_csv_filename and joints_to_ignore_csv_filename files.

Best,
Wenjie

No successful evaluation is run

I0930 20:10:24.557036 139652333998720 controller.py:277] eval | step: 214501 | running complete evaluation...
eval | step: 214501 | running complete evaluation...
I0930 20:10:27.645429 139652333998720 controller.py:290] eval | step: 214501 | eval time: 3.1 sec | output: {}
eval | step: 214501 | eval time: 3.1 sec | output: {}

No evaluation is actually conducted...

Visulization with 3D character

Hi, thanks for your fancy work. I'm new to 3D visualization. I'm just curious how you visualize the generated 3D motion with a character from Mixamo. Do you use Blender or something? Could you possibly point me to some helpful websites or similar resources?

Thanks in advance.

Where is the aist_features directory?

I'm trying to run calculate_fid_scores.py and here is the error: the stack of real_features is empty.

It seems that I need to load the real_features from ./data/aist_features/*_kinetic.npy and ./data/aist_features/*_manual.npy, but I can't find them in my repository.

I'd like to know where the aist_features directory is, or how to generate the *_kinetic.npy and *_manual.npy files from the real data?

Loss steadily decreases, but FID_k gradually increases? (TensorFlow & PyTorch)

I reimplemented a version in PyTorch and added a validation loop. The best FID_k was 101 at epoch 21; with further training, FID_k kept growing with large fluctuations, and I don't know why. The loss drops to around 0.0002 and basically stops converging; FID_k reaches over 7000 and FID_g is around 25.
Training the author's original TensorFlow code from scratch, the loss drops to 0.0001 and basically stops converging; FID_k reaches over 700 and FID_g is over 30. I also cannot reproduce the author's released pretrained model at all. @liruilong940607

TF: 2.3
cuda: 10.1

pytorch: 1.9.1
cuda: 10.1

Progress update: the PyTorch reimplementation has now converged and reproduces the author's reported metrics. The fixes were as follows:

1. Match every layer's initialization to the TF version; carefully check each layer's default initializer.
2. The training set has only 952 videos; with multi-GPU training and batch_size 32, each epoch contains only a few iterations. My suggestion: after loading the file list, replicate it 10x or 20x, so each epoch has 10x or 20x as many iterations.
3. Train for enough epochs. With the list replicated 10x or 20x in the iterator, the loss needs to converge to around 0.00011; train for at least 800 epochs, with lr=1e-4 for epochs 0-200, lr=1e-5 for epochs 200-500, and lr=1e-6 for epochs 500-800.

Additionally, I added a 2-layer bidirectional LSTM after the last CrossTransformer and before the final fc. After training, the loss converges to around 0.00011, and FID_k goes as low as 22.3, smaller than the author's reported number. But a smaller metric does not mean the generated dances look good: as with the author's released model, only a few results look decent and the rest are not great.
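The recipe above uses two simple mechanisms: replicating the small training list (point 2) and a staged learning-rate schedule of 1e-4 for epochs 0-200, 1e-5 for 200-500, and 1e-6 for 500-800 (point 3). A minimal framework-free sketch of both:

```python
def replicate_list(video_list, times=10):
    """Point 2: replicate the small training list so each
    epoch contains 10-20x more iterations."""
    return video_list * times

def staged_lr(epoch):
    """Point 3: staged learning-rate schedule,
    1e-4 for epochs 0-200, 1e-5 for 200-500, 1e-6 afterwards."""
    if epoch < 200:
        return 1e-4
    elif epoch < 500:
        return 1e-5
    return 1e-6
```

In a real training loop, staged_lr(epoch) would be assigned to the optimizer's learning rate at the start of each epoch.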

How to export the saved_model correctly?

How to export the saved_model correctly? There are some errors in the exported saved_model. See below.

saved_model_cli show --dir=savedmodel/214501 --all
2021-12-21 22:36:19.753846: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

signature_def['__saved_model_init_op']:
  The given SavedModel SignatureDef contains the following input(s):
  The given SavedModel SignatureDef contains the following output(s):
    outputs['__saved_model_init_op'] tensor_info:
        dtype: DT_INVALID
        shape: unknown_rank
        name: NoOp
  Method name is: 

Defined Functions:

About the eval result

My eval result using the given checkpoint contains only a few seconds of valid dance;
after a few seconds, the result becomes total nonsense.
Is this the expected behavior, or did I do something wrong?

Correct repo to `git clone`

Hi, is this information in the README correct?

git clone https://github.com/liruilong940607/mint --recursive

Perhaps, it's git clone https://github.com/google-research/mint --recursive?

Thanks

The code is very hard to follow; asking for help

In bvh_writer.py:
{"pred_results": {"model_name_pose": joints angle array,
                  "joints_3d": joints pose array}}

What's the difference between the joints angle array and the joints pose array?

BeatAlign

Hi Ruilong,

I was wondering whether the kinematic beats are computed from all body joints?
I cannot find this code.
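For reference, the Beat Alignment Score in the paper matches kinematic beats (local minima of joint velocity magnitude) against music beats. A rough numpy sketch of that idea, not the authors' code (the averaging over all joints, the sigma value, and the exact beat definition are assumptions here):

```python
import numpy as np

def kinematic_beats(joints_3d, fps=60):
    """joints_3d: (num_frames, num_joints, 3). Kinematic beats are taken as
    local minima of the mean joint-velocity magnitude; returns times in seconds."""
    vel = np.linalg.norm(np.diff(joints_3d, axis=0), axis=2).mean(axis=1)
    # a frame is a beat if its velocity is lower than both neighbors
    minima = (vel[1:-1] < vel[:-2]) & (vel[1:-1] < vel[2:])
    return (np.where(minima)[0] + 1) / fps

def beat_align_score(music_beats, kin_beats, sigma=0.1):
    """Mean, over kinematic beats, of a Gaussian of the distance to the
    nearest music beat; 1.0 means every kinematic beat lands on a music beat."""
    kin = np.asarray(kin_beats)
    if kin.size == 0:
        return 0.0
    dists = np.abs(kin[:, None] - np.asarray(music_beats)[None, :]).min(axis=1)
    return float(np.exp(-dists ** 2 / (2 * sigma ** 2)).mean())
```

A motion that pauses exactly on the music beats scores close to 1.0; random pauses score lower.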

Can't find the data_files

data_files: "./data/_tfrecord-train"
Could you tell me where I can download the data file? Thanks a lot.
