ericguo5513 / tm2t

Official implementation of "TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts (ECCV2022)"

Home Page: https://ericguo5513.github.io/TM2T

License: MIT License

Python 100.00%

motion-generation motion-generator pytorch-implementation motion-to-text text-to-motion

tm2t's Introduction

TM2T: Stochastic and Tokenized Modeling for the Reciprocal Generation of 3D Human Motions and Texts (ECCV 2022)

[Project Page] [Paper]

Python Virtual Environment

We recommend Anaconda for creating the virtual environment.

conda env create -f environment.yaml
conda activate tm2t

If you cannot create the environment this way, install the required libraries manually:

Python = 3.7.9   # Other versions may also work but are not tested.
PyTorch = 1.6.0 (conda install pytorch==1.6.0 torchvision==0.7.0 -c pytorch)  # Other versions may also work but are not tested.
scipy
numpy
tensorflow       # Only needed for TensorBoard
spacy
tqdm
ffmpeg = 4.3.1   # Other versions may also work but are not tested.
matplotlib = 3.3.1
nlg-eval (https://github.com/Maluuba/nlg-eval)     # For evaluation of motion-to-text only
bert_score (https://github.com/Tiiiger/bert_score) # For evaluation of motion-to-text only

Finally, if you want to generate 3D motions from custom raw texts, you also need to install the spaCy language model:

python -m spacy download en_core_web_sm
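To confirm the download worked before running any generation script, a quick check like the one below may help. It is a minimal sketch that relies on spaCy models being installed as regular Python packages; the helper name spacy_model_installed is hypothetical and not part of this repository.

```python
import importlib.util


def spacy_model_installed(model_name: str = "en_core_web_sm") -> bool:
    """Return True if the model package can be located, without importing it."""
    return importlib.util.find_spec(model_name) is not None


if not spacy_model_installed():
    print("Model missing; run: python -m spacy download en_core_web_sm")
```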

Download Data & Pre-trained Models

If you only want to try our pre-trained models, you don't need to download the datasets.

Datasets

We use two 3D human motion-language datasets: HumanML3D and KIT-ML. Details and download links for both datasets are available [here].
Note that you don't need to clone that repository, since all related code is already included in this project.

Download and unzip the dataset files, create a dataset folder, and place the data files in it:

mkdir ./dataset/

Taking HumanML3D as an example, the directory should look like this:

./dataset/
./dataset/HumanML3D/
./dataset/HumanML3D/new_joint_vecs/
./dataset/HumanML3D/texts/
./dataset/HumanML3D/Mean.npy
./dataset/HumanML3D/Std.npy
./dataset/HumanML3D/test.txt
./dataset/HumanML3D/train.txt
./dataset/HumanML3D/train_val.txt
./dataset/HumanML3D/val.txt
./dataset/HumanML3D/all.txt
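
A layout like this is easy to get slightly wrong after unzipping, so a small check can save a failed run. The sketch below is not part of the repository. The helper name missing_entries is hypothetical, and the entry names mirror the listing above, with the statistics files assumed to be named Mean.npy and Std.npy.

```python
from pathlib import Path

# Entries taken from the directory listing above.
EXPECTED_ENTRIES = [
    "new_joint_vecs",
    "texts",
    "Mean.npy",
    "Std.npy",
    "test.txt",
    "train.txt",
    "train_val.txt",
    "val.txt",
    "all.txt",
]


def missing_entries(dataset_root, expected=EXPECTED_ENTRIES):
    """Return the expected files/folders that do not exist under dataset_root."""
    root = Path(dataset_root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries("./dataset/HumanML3D")
    print("Missing:", missing if missing else "nothing")
```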

Pre-trained Models

Create a checkpoint folder for the pre-trained models:

mkdir ./checkpoints

Download the models for HumanML3D from [here]. Unzip them and place them under the checkpoint directory, which should look like this:

./checkpoints/t2m/
./checkpoints/t2m/Comp_v6_KLD005/                   # A dummy folder containing information for evaluation dataloading
./checkpoints/t2m/VQVAEV3_CB1024_CMT_H1024_NRES3/  # Motion discretizer
./checkpoints/t2m/M2T_EL4_DL4_NH8_PS/              # Motion (token)-to-Text translation model
./checkpoints/t2m/T2M_Seq2Seq_NML1_Ear_SME0_N/     # Text-to-Motion (token) generation model
./checkpoints/t2m/text_mot_match/                  # Motion & Text feature extractors for evaluation

Download the models for KIT-ML from [here]. Unzip them and place them under the checkpoint directory.

Training Models

All intermediate meta files, animations, and models are saved to the checkpoint directory, under the folder specified by the argument "--name".

Training motion discretizer

HumanML3D

python train_vq_tokenizer_v3.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name t2m --n_resblk 3

KIT-ML

python train_vq_tokenizer_v3.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name kit --n_resblk 3

Tokenizing all motion data for the following training

HumanML3D

python tokenize_script.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name t2m

KIT-ML

python tokenize_script.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name kit

Training motion2text model:

HumanML3D

python train_m2t_transformer.py --gpu_id 0 --name M2T_EL4_DL4_NH8_PS --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --dataset_name t2m

KIT-ML

python train_m2t_transformer.py --gpu_id 0 --name M2T_EL3_DL3_NH8_PS --n_enc_layers 3 --n_dec_layers 3 --proj_share_weight --dataset_name kit

Training text2motion model:

HumanML3D

python train_t2m_joint_seq2seq.py --gpu_id 0 --name T2M_Seq2Seq_NML1_Ear_SME0_N --start_m2t_ep 0 --dataset_name t2m

KIT-ML

python train_t2m_joint_seq2seq.py --gpu_id 0 --name T2M_Seq2Seq_NML1_Ear_SME0_N --start_m2t_ep 0 --dataset_name kit
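
The four training stages above appear to build on one another (discretizer, tokenization, motion2text, text2motion), so it can be convenient to generate the commands in that order. The sketch below only assembles the command strings from this section. The function name training_commands is hypothetical, and the argument order differs from the listings above, which should not matter for ordinary command-line parsing.

```python
import shlex


def training_commands(dataset_name, gpu_id=0):
    """Build the four training-stage commands in the order listed above."""
    # Per the commands above: 4 encoder/decoder layers for HumanML3D (t2m), 3 for KIT-ML.
    n_layers = 4 if dataset_name == "t2m" else 3
    tokenizer = "VQVAEV3_CB1024_CMT_H1024_NRES3"
    common = f"--gpu_id {gpu_id} --dataset_name {dataset_name}"
    return [
        f"python train_vq_tokenizer_v3.py {common} --name {tokenizer} --n_resblk 3",
        f"python tokenize_script.py {common} --name {tokenizer}",
        f"python train_m2t_transformer.py {common} "
        f"--name M2T_EL{n_layers}_DL{n_layers}_NH8_PS "
        f"--n_enc_layers {n_layers} --n_dec_layers {n_layers} --proj_share_weight",
        f"python train_t2m_joint_seq2seq.py {common} "
        f"--name T2M_Seq2Seq_NML1_Ear_SME0_N --start_m2t_ep 0",
    ]


for command in training_commands("t2m"):
    # Each printed list could be passed to subprocess.run(..., check=True).
    print(shlex.split(command))
```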

Motion & text feature extractors:

We use the same extractors provided by https://github.com/EricGuo5513/text-to-motion

Generating and Animating 3D Motions (HumanML3D)

Translating motions into language (using test set)

With Beam Search:

python evaluate_m2t_transformer.py --name M2T_EL4_DL4_NH8_PS --gpu_id 2 --num_results 20 --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --ext beam_search

With Sampling:

python evaluate_m2t_transformer.py --name M2T_EL4_DL4_NH8_PS --gpu_id 2 --num_results 20 --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --sample --top_k 3 --ext top_3
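
For readers unfamiliar with the --sample --top_k 3 options, the sketch below illustrates top-k sampling in its usual sense: only the k highest-scoring candidates are kept, and one of them is drawn according to their softmax weights. This is a generic, pure-Python illustration and assumes the flag has that common meaning. It is not taken from evaluate_m2t_transformer.py, and the function name top_k_sample is hypothetical.

```python
import math
import random


def top_k_sample(logits, k, rng=random):
    """Sample an index from the k highest-scoring entries of logits."""
    # Indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax restricted to those k entries (shifted by the max for stability).
    peak = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - peak) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]


logits = [2.0, 0.5, 1.5, -1.0, 1.0]
print(top_k_sample(logits, k=3))  # always one of the indices 0, 2, 4
```

With k=1 this reduces to greedy decoding, while larger k trades determinism for diversity in the generated descriptions.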

Generating motions from texts (using test set)

python evaluate_t2m_seq2seq.py --name T2M_Seq2Seq_NML1_Ear_SME0_N --num_results 10 --repeat_times 3 --sample --ext sample

where --repeat_times sets how many sampling rounds are carried out for each description. This script produces 3x10 animations under the directory ./eval_results/t2m/T2M_Seq2Seq_NML1_Ear_SME0_N/sample/.

Sampling results from customized descriptions

python gen_script_t2m_seq2seq.py --name T2M_Seq2Seq_NML1_Ear_SME0_N  --repeat_times 3 --sample --ext customized --text_file ./input.txt

This will generate 3 animated motions for each description given in text_file ./input.txt.
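
If the descriptions come from another program, the text file can be written with a few lines of Python. The sketch below assumes the file holds one description per line, which is not spelled out above. The helper name write_descriptions is hypothetical.

```python
from pathlib import Path


def write_descriptions(path, descriptions):
    """Write one stripped, non-empty description per line; return how many were written."""
    lines = [text.strip() for text in descriptions if text.strip()]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)
```

For example, write_descriptions("./input.txt", ["a person walks forward.", "a man jumps."]) would produce a two-line file that can be passed via --text_file. The two sentences are made-up samples.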

If you have problems installing ffmpeg, you may not be able to animate the 3D results as mp4. Try gif instead.

Quantitative Evaluations

Evaluating Motion2Text

python final_evaluation_m2t.py

Evaluating Text2Motion

python final_evaluation_t2m.py

This evaluates the model performance on the HumanML3D dataset by default. You can also run it on the KIT-ML dataset by uncommenting certain lines in ./final_evaluation.py. The statistical results will be saved to ./m2t(t2m)_evaluation.log.

Misc

Contact Chuan Guo at [email protected] for any questions or comments.


tm2t's Issues

error

I ran the following scripts:
python train_vq_tokenizer_v3.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name t2m --n_resblk 3
python tokenize_script.py --gpu_id 0 --name VQVAEV3_CB1024_CMT_H1024_NRES3 --dataset_name t2m
python train_m2t_transformer.py --gpu_id 0 --name M2T_EL4_DL4_NH8_PS --n_enc_layers 4 --n_dec_layers 4 --proj_share_weight --dataset_name t2m
python train_t2m_joint_seq2seq.py --gpu_id 0 --name T2M_Seq2Seq_NML1_Ear_SME0_N --start_m2t_ep 0 --dataset_name t2m

When I ran the train_m2t_transformer.py script, I got this error:
RuntimeError: stack expects each tensor to be equal size, but got [55] at entry 0 and [65] at entry 10.
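
This message is what PyTorch raises when a batch is stacked from sequences of different lengths (here 55 and 65). A common remedy is to pad every sequence in a batch to the same length before stacking. The sketch below shows the idea in pure Python. It is a generic illustration rather than the repository's fix, and the function name pad_batch and the pad value 0 are assumptions. In PyTorch, torch.nn.utils.rnn.pad_sequence inside a custom collate_fn serves the same purpose.

```python
def pad_batch(sequences, pad_value=0):
    """Right-pad every sequence to the length of the longest one in the batch."""
    max_len = max(len(seq) for seq in sequences)
    padded = [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]
    lengths = [len(seq) for seq in sequences]  # original lengths, useful for masking
    return padded, lengths


batch, lengths = pad_batch([[5, 7, 9], [4]])
print(batch, lengths)
```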

Fail to run "final_evaluation_m2t.py"

First of all, thank you for sharing such interesting work!

I have a question about the final evaluation:
When I run python final_evaluation_m2t.py,
I get an error message: FileNotFoundError: [Errno 2] No such file or directory: './checkpoints/t2m/Comp_v6_KLD005/opt.txt'
I found the config name Comp_v6_KLD005 in your previous work, text-to-motion.
Is there any modification I need to make to get it running?

Question on "inverse_kinematics_np"

Hi, thank you for the great work!
I have a question about inverse_kinematics_np.
I am looking for an inverse kinematics routine (input: keypoints, output: quaternions) for the SMPL model. Can inverse_kinematics_np be used as an SMPL IK by setting t2m_raw_offsets to the T-pose offset?

When I load an SMPL model with all axis-angle parameters set to np.zeros((1, 24, 3)), I can convert them to quaternions (1,0,0,0).repeat(24), extract the 3D keypoints with SMPL.forward(), and get the T-pose offset.
Since inverse_kinematics_np converts 3D keypoints to quaternions, I set t2m_raw_offsets to the T-pose offset and use it as the input pose to calculate:
quat_params = src_skel.inverse_kinematics_np(the T-pose offset, face_joint_indx),
I expected quat_params to be the quaternions (1,0,0,0).repeat(24), but it gives me
array([[ 1. , 0. , 0. , 0. ],
       [ 0.97018707, 0.10117273, 0.02506009, 0.21879934],
       [ 0.9764908 , 0.08095426, -0.02427085, -0.19830072],
       [ 0.19227548, 0.97177869, 0.00742641, 0.13645965],
       [ 0.96733868, -0.09798025, -0.02706633, -0.23221391],
       [ 0.97394329, -0.06727229, 0.02661659, 0.21494284],
       [-0.02280826, -0.9866249 , 0.12702832, -0.09957472],
       [ 0.99863803, 0.03256986, 0.00384251, -0.04057946],
       [ 0.99891472, 0.01709825, -0.00162746, 0.04329439],
       [ 0.98927391, 0.13215074, 0.00712388, -0.06183069],
       [ 0.8467989 , -0.49483883, -0.06065629, 0.18543833],
       [ 0.84711134, -0.50181162, 0.0676151 , -0.16129398],
       [ 0.99846792, 0.04506058, 0.00653788, -0.03145744],
       [ 0.99325216, -0.06699904, 0.02710451, -0.09070143],
       [ 0.98986506, -0.07060396, -0.02671974, 0.12028421],
       [ 0.9634037 , 0.26020402, 0.00433816, -0.06426686],
       [ 0.98609924, 0.02350155, 0.0062857 , -0.1643672 ],
       [ 0.9903416 , 0.05510591, 0.00776913, 0.12699005],
       [ 0.99951196, 0.02822196, -0.01334839, 0.00127889],
       [ 0.99990761, -0.01145354, -0.00515127, 0.00520699],
       [ 0.99215883, 0.02453573, -0.01576269, 0.12153636],
       [ 0.99276286, 0.03386558, 0.02881835, -0.11155534],
       [ 0.99562532, -0.03552062, 0.06299305, -0.05917664],
       [ 0.99824488, -0.02248499, -0.03899435, 0.03848618]]).

Is there anything I did wrong?
Thank you!

How to convert a stick figure to a 3D person

Hi, thanks for your awesome work! How can I convert

[image]

to

[image]

Any suggestions? Thank you.

Download HumanML3D models

Hi, I can't download the HumanML3D models when using the TM2T code.

DataError

When I use "evaluate_m2t_transformer.py" to evaluate the pretrained model on the KIT dataset, this error occurs:
No such file or directory: './dataset/KIT-ML/VQVAEV3_CB1024_CMT_H1024_NRES3/02346.txt'
I found that the "texts" folder contains a series of *.txt files, so I changed the corresponding data-loading code in "dataset.py" at line 538 (opt.m_token_dir -> opt.text_dir), and then this error occurs:

Can't generate motions from texts (using test set)

When I run the command
python evaluate_t2m_seq2seq.py --name T2M_Seq2Seq_NML1_Ear_SME0_N --num_results 10 --repeat_times 3 --sample --ext sample
The output is

Traceback (most recent call last):
  File "evaluate_t2m_seq2seq.py", line 122, in <module>
    dataset = Motion2TextEvalDataset(opt, mean, std, split_file, w_vectorizer)
  File "/home/student/TM2T/data/dataset.py", line 538, in __init__
    with cs.open(pjoin(opt.m_token_dir, name + '.txt'), 'r') as f:
  File "/home/student/.conda/env/lib/python3.7/codecs.py", line 904, in open
    file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: './dataset/HumanML3D/VQVAEV3_CB1024_CMT_H1024_NRES3/004822.txt'

I have downloaded the required dataset, but there is no VQVAEV3_CB1024_CMT_H1024_NRES3 folder in the dataset folder. I have the same problem when I run the commands in "Translating motions into language (using test set)". What should I do to get the VQVAEV3_CB1024_CMT_H1024_NRES3 folder?
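
One way to narrow this down is to list which ids from the split file have no token file in the folder named in the traceback. The sketch below assumes the split file holds one id per line. The helper name ids_missing_files is hypothetical. The README also lists a separate tokenizing step (tokenize_script.py), which may be what creates this folder, although that is an assumption rather than something confirmed here.

```python
from pathlib import Path


def ids_missing_files(split_file, data_dir, suffix):
    """Return the ids listed in split_file that have no data_dir/<id><suffix> file."""
    text = Path(split_file).read_text(encoding="utf-8")
    ids = [line.strip() for line in text.splitlines() if line.strip()]
    return [name for name in ids if not (Path(data_dir) / f"{name}{suffix}").is_file()]


# Example call, using the paths from the traceback above:
# ids_missing_files("./dataset/HumanML3D/test.txt",
#                   "./dataset/HumanML3D/VQVAEV3_CB1024_CMT_H1024_NRES3", ".txt")
```

The same helper can check the motion files by passing the new_joint_vecs folder and the ".npy" suffix.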

No such file or directory: './dataset/HumanML3D/VQVAEV3_CB1024_CMT_H1024_NRES3\\004822.txt'

I would appreciate it if you could help me.
Traceback (most recent call last):
  File "evaluate_m2t_transformer.py", line 131, in <module>
    dataset = Motion2TextEvalDataset(opt, mean, std, split_file, w_vectorizer)
  File "D:\Graduate\my_direction\Text2gesture\code\TM2T-main\data\dataset.py", line 538, in __init__
    with cs.open(pjoin(opt.m_token_dir, name + '.txt'), 'r') as f:
  File "D:\ProgramData\anaconda3\envs\tm2t\lib\codecs.py", line 898, in open
    file = builtins.open(filename, mode, buffering)
FileNotFoundError: [Errno 2] No such file or directory: './dataset/HumanML3D/VQVAEV3_CB1024_CMT_H1024_NRES3\\004822.txt'

Question about training skill: 'Early Stopping!~'

Hi, thank you very much for sharing this interesting work.
I downloaded the HumanML3D data and re-ran the whole process.
While final_evaluation fails with an error, the qualitative results look very nice.

Here I want to ask some questions about the training process, as I am not sure whether I did it right.

train_vq_tokenizer_v3.py

I found that there are ema and cmt options for the quantizer. Some blogs say ema is faster; does ema work better for this task?
When I run train_vq_tokenizer_v3.py, max_epoch is 300, but about 50 epochs take me 24 hours. If I increase the batch_size and get a lower val_loss, will that speed up training without hurting performance?
I found that the break after 'Early Stopping!~' is commented out. Is that because we need to try checkpoints from longer training?
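
For context on the last question, early stopping usually means ending training once the validation loss has failed to improve for a set number of epochs. The sketch below is a generic patience-based version. The stopping criterion actually used in train_vq_tokenizer_v3.py is not shown here, and the function name early_stopping_epoch and the patience parameter are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the epoch index at which training would stop, or None if it never stops."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch  # this is where a `break` would end training
    return None


print(early_stopping_epoch([1.0, 0.8, 0.9, 0.85, 0.95, 0.7], patience=3))  # → 4
```

In this toy sequence the loss would have reached a new minimum at epoch 5, which shows why skipping the break and training longer can still yield a better checkpoint.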

train_m2t_transformer.py

When I run train_m2t_transformer.py, the val_loss stops decreasing after 20 minutes, as shown in the figure below. Is this normal? I would appreciate it if you could provide a log file for double-checking... >.<

Thank you!

can't find T2M_Seq2Seq_NML1_Ear_SME0_N in kit.zip

Hi, I can't find T2M_Seq2Seq_NML1_Ear_SME0_N in kit.zip. Can you help me?

could not find the `004822.npy` file

I have downloaded the HumanML3D dataset to the correct directory, but could not find the 004822.npy file, while 012314.npy exists.

Loading t2m_model model: Epoch 031 Total_Iter 11904
  0%|          | 0/4384 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "evaluate_t2m_seq2seq.py", line 121, in <module>
    dataset = Motion2TextEvalDataset(opt, mean, std, split_file, w_vectorizer)
  File "C:\GITHERE\TM2T\data\dataset.py", line 532, in __init__
    motion = np.load(pjoin(opt.motion_dir, name + '.npy'))
  File "C:\Users\Bush\anaconda3\envs\TM2T\lib\site-packages\numpy\lib\npyio.py", line 390, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: './dataset/HumanML3D/new_joint_vecs\\004822.npy'

FileNotFoundError

When I run "python final_evaluation_m2t.py", this error occurs: FileNotFoundError: [Errno 2] No such file or directory: './glove/our_vab_data.npy'
