
TADA! Text to Animatable Digital Avatars

Tingting Liao* · Hongwei Yi* · Yuliang Xiu · Jiaxiang Tang · Yangyi Huang · Justus Thies · Michael J. Black
* Equal Contribution

3DV 2024

Paper PDF · Project Page · YouTube Video



TADA takes text as input and produces holistic, animatable 3D avatars with high-quality geometry and texture. It enables the creation of large-scale digital character assets that are ready for animation and rendering, and that are easily editable through natural language.

NEWS (2023.9.24):

  • Use the Omnidata normal prediction model to improve normal and image consistency.

Install

  • System requirement: Ubuntu 20.04
  • Tested GPUs: RTX4090, A100, V100
  • Compiler: gcc-7.5 / g++-7.5
  • Python=3.9, CUDA=11.5, PyTorch=1.12.1
git clone git@github.com:TingtingLiao/TADA.git
cd TADA

conda env create --file environment.yml
conda activate tada 
pip install -r requirements.txt
 
cd smplx
python setup.py install 

# download omnidata normal and depth prediction model 
mkdir data/omnidata 
cd data/omnidata 
gdown '1Jrh-bRnJEjyMCS7f-WsaFlccfPjJPPHI&confirm=t' # omnidata_dpt_depth_v2.ckpt
gdown '1wNxVO4vVbDEMEpnAi_jwQObf2MFodcBR&confirm=t' # omnidata_dpt_normal_v2.ckpt
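
A quick sanity check after the downloads can catch a truncated gdown fetch early (paths follow the commands above; the size threshold is just a heuristic):

import os

# Verify the two Omnidata checkpoints exist and are non-trivially sized;
# a truncated download is a common cause of later model-loading errors.
for name in ("omnidata_dpt_depth_v2.ckpt", "omnidata_dpt_normal_v2.ckpt"):
    path = os.path.join("data", "omnidata", name)
    ok = os.path.exists(path) and os.path.getsize(path) > 1_000_000
    print(f"{path}: {'OK' if ok else 'missing or truncated'}")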

Data

Please consider citing AIST, AIST++, TalkSHOW, and MotionDiffusion if they also help your project:
@inproceedings{aist-dance-db,
  author = {Shuhei Tsuchida and Satoru Fukayama and Masahiro Hamasaki and Masataka Goto}, 
  title = {AIST Dance Video Database: Multi-genre, Multi-dancer, and Multi-camera Database for Dance Information Processing}, 
  booktitle = {Proceedings of the 20th International Society for Music Information Retrieval Conference (ISMIR)},
  year = {2019}, 
  month = {Nov} 
}

@inproceedings{li2021learn,
  title={AI Choreographer: Music Conditioned 3D Dance Generation with AIST++}, 
  author={Ruilong Li and Shan Yang and David A. Ross and Angjoo Kanazawa},
  year={2021},
  booktitle={ICCV}
}

@inproceedings{yi2023generating,
  title={Generating Holistic 3D Human Motion from Speech},
  author={Yi, Hongwei and Liang, Hualin and Liu, Yifei and Cao, Qiong and Wen, Yandong and Bolkart, Timo and Tao, Dacheng and Black, Michael J.},
  booktitle={CVPR}, 
  pages={469-480},
  month={June}, 
  year={2023} 
}

@inproceedings{tevet2023human,
  title={Human Motion Diffusion Model},
  author={Guy Tevet and Sigal Raab and Brian Gordon and Yoni Shafir and Daniel Cohen-or and Amit Haim Bermano},
  booktitle={ICLR},
  year={2023},
  url={https://openreview.net/forum?id=SJ1kSyO2jwu}
}

Usage

Training

The results will be saved in $workspace. Please change it in the configs/*.yaml files.

# single prompt training    
python -m apps.run --config configs/tada.yaml --text "Aladdin in Aladdin" 

# with Omnidata supervision 
python -m apps.run --config configs/tada_w_dpt.yaml --text "Aladdin in Aladdin" 

# multiple prompts training
bash scripts/run.sh data/prompt/fictional.txt 1 10 configs/tada.yaml

Animation

python -m apps.anime --subject "Abraham Lincoln" --res_dir your_result_path
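
For orientation, a minimal sketch of what the animation step boils down to: load a character's learned parameters and pose the SMPL-X body with an external motion source. The params.pt keys follow the character files described in the issues below; smplx is the official SMPL-X package, and the real anime.py may differ in details such as the topology the vertex offsets live on:

import torch
import smplx

# Illustrative sketch, not the repository's exact animation code.
params = torch.load("params.pt", map_location="cpu")  # 'v_offsets', 'betas', 'expression'
body = smplx.create("./data/smplx", model_type="smplx", use_pca=False)

pose = torch.zeros(1, 63)  # one frame of body pose, e.g. from an MDM motion sequence
out = body(betas=params["betas"], body_pose=pose, expression=params["expression"])
# Learned per-vertex offsets are added on top of the posed template
# (assuming they share the template's topology).
verts = out.vertices.detach() + params["v_offsets"]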

Tips

  • Using an appropriate learning rate for the SMPL-X shape parameters is important for learning an accurate body shape.
  • Omnidata normal supervision effectively improves overall geometry and texture consistency, but it requires more optimization time; a sketch of this supervision follows below.
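
A rough sketch of what the Omnidata supervision amounts to, simplified from the train_step excerpt quoted in the issues further down (the function and argument names here are illustrative, not the repository's API):

import torch.nn.functional as F

def normal_consistency_loss(rendered_normal, rendered_image, dpt_model, lambda_normal=1.0):
    # Pull the rendered normal map toward the normals that the Omnidata model
    # predicts from the rendered image; both tensors are (B, 3, H, W) maps in
    # the same coordinate frame.
    dpt_normal = dpt_model(rendered_image)
    return lambda_normal * (1.0 - F.cosine_similarity(rendered_normal, dpt_normal, dim=1).mean())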

Citation

@inproceedings{liao2024tada,
  title={{TADA! Text to Animatable Digital Avatars}},
  author={Liao, Tingting and Yi, Hongwei and Xiu, Yuliang and Tang, Jiaxiang and Huang, Yangyi and Thies, Justus and Black, Michael J.},
  booktitle={International Conference on 3D Vision (3DV)},
  year={2024}
}

Related Works

  • HumanNorm: multi-stage SDS loss and perceptual loss can help generate lifelike texture.
  • SemanticBoost: uses TADA's rigged avatars to demonstrate the generated motions.
  • SignAvatars: uses TADA's rigged avatars to demonstrate the sign language data.
  • GALA: uses TADA's avatars for asset generation.

License

This code and model are available for non-commercial scientific research purposes as defined in the LICENSE (i.e., the MIT LICENSE). Note that to use TADA you must register for SMPL-X and agree to its license, which is not the MIT LICENSE; see https://github.com/vchoutas/smplx/blob/main/LICENSE. Enjoy your journey of exploring more beautiful avatars in your own application.


Issues

timm package should be added to requirements

Hi, thanks for the great work!

I tried to run the code by following the instructions in the readme file;
however, I hit the error below, which indicates that the 'timm' package is required.

Traceback (most recent call last):
  File "/root/miniconda3/envs/tada/lib/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/envs/tada/lib/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/source/sihun/TADA/apps/run.py", line 7, in <module>
    from lib.trainer import *
  File "/source/sihun/TADA/lib/trainer.py", line 21, in <module>
    from lib.dpt import DepthNormalEstimation
  File "/source/sihun/TADA/lib/dpt.py", line 8, in <module>
    import timm
ModuleNotFoundError: No module named 'timm'

I was able to solve it and run the code by simply installing timm with pip:
$ pip install timm
Although this is a small fix, I think it would be good to add timm to requirements.txt.

The implementation of the interpolated latent code of Equation (6)

I tried to find how the interpolated latent code of Equation (6) in the paper, z = alpha * z^I + (1 - alpha) * z^N, is implemented.
Reading the code, it seems that you just use the image latent code and the normal latent code in separate losses. Is that an equivalent operation?

    def train_step(self, data, is_full_body):
        ......
        loss = self.guidance.train_step(dir_text_z, image).mean()
        if not self.dpt:
            # normal SDS
            loss += self.guidance.train_step(dir_text_z, normal).mean()
            # latent mean SDS
            # loss += self.guidance.train_step(dir_text_z, torch.cat([normal, image.detach()])).mean() * 0.1
        else:
            if p_iter < 0.3 or random.random() < 0.5:
                # normal SDS, using the normal map directly
                loss += self.guidance.train_step(dir_text_z, normal).mean()
            elif self.dpt is not None:
                # normal-image consistency loss
                dpt_normal = self.dpt(image)  # estimate a normal map from the rendered image
                dpt_normal = (1 - dpt_normal) * alpha + (1 - alpha)

                lambda_normal = self.opt.lambda_normal * min(1, self.global_step / self.opt.iters)
                loss += lambda_normal * (1 - F.cosine_similarity(normal, dpt_normal).mean())

Waiting for your reply, thanks a lot!
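
Taking Equation (6) literally, the interpolation would mix the two latents before a single SDS step, roughly as below; encode_imgs and the as_latent flag are assumed hooks in the style of common SDS pipelines, not necessarily the released API. Note that because the SDS gradient is nonlinear in the latent, two separate losses are not strictly equivalent to interpolating first.

def interpolated_sds_loss(guidance, dir_text_z, image, normal, alpha=0.5):
    # Hypothetical helper implementing z = alpha * z_I + (1 - alpha) * z_N.
    z_image = guidance.encode_imgs(image)
    z_normal = guidance.encode_imgs(normal)
    z = alpha * z_image + (1.0 - alpha) * z_normal
    return guidance.train_step(dir_text_z, z, as_latent=True).mean()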

code problem

Running python -m apps.run --config configs/tada_w_dpt.yaml --text "Aladdin in Aladdin" fails with:
AttributeError: 'Trainer' object has no attribute 'log_ptr'

How to animate/drive the result?

Thanks for your wonderful work!
I have a question about how to animate the result. The final training output seems to be a mesh, but I don't know how to use motion-diffusion-model or priorMDM to animate/drive it. Could you please give me some advice?
Thank you so much!

How to generate a single head

Hello, I would like to modify your code so that it generates only a face model, without the body parts.
Can this be achieved by modifying the part shown here:
[screenshot]

About MLPTexture

It seems that directly optimizing the texture map may introduce some noise. Why not optimize MLPTexture3D to get the mesh texture? Have you run experiments on that, since the code already implements the MLPTexture3D option?
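
For context, an MLP texture replaces the explicit UV texture map with a small coordinate network queried at surface points, which acts as a built-in smoothness prior. A minimal sketch of the idea (hypothetical code, not the repository's MLPTexture3D):

import torch
import torch.nn as nn

class TextureMLP(nn.Module):
    # Coordinate MLP mapping 3D surface points to RGB albedo.
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),  # keep colors in [0, 1]
        )

    def forward(self, xyz):   # xyz: (N, 3) points on the mesh surface
        return self.net(xyz)  # (N, 3) RGB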

Generated mesh examples

Hi tingting,

Thanks for your excellent work. I would like to know where I can find generated mesh examples.

mesh

Can I get a mesh with color and texture?

Running on 12 GB VRAM

Trying to execute the code on my RTX 3060, I unfortunately get an out-of-memory (OOM) error. Is it possible to optimize the generation so that everything fits in 12 GB of VRAM?
Or could it be run on the CPU?

Get sample00_rep00_smpl_params.npy

Thank you for your great work! Could you please upload one sample file like sample00_rep00_smpl_params.npy? I have some difficulties generating motion using MDM, and I would like to run inference with your anime.py first. Thank you very much.

How to download "ckpt_file" for anime.py?

Thank you for your work! @YuliangXiu , I am trying to run the Animation demo. May I know where to download the files for ckpt_file below? I think it's the path to the pre-trained characters. I have downloaded the TADA Extra Data and unzipped it as ./data, but there are no checkpoints in it, and the code breaks at assert os.path.exists(ckpt_file) (L404 of anime.py).

ckpt_file = f"{args.workspace}/tada/{args.subject}/checkpoints/tada_ep0150.pth"

Also, I have downloaded the TADA 100 Characters and there are also no checkpoints there, but there is a params.pt in each character's folder. However, the keys in these .pt files, dict_keys(['v_offsets', 'betas', 'expression']), do not match the ones in the code: according to your code, there should be a 'raw_albedo' key in addition to these.

Densify SMPLX

As the paper says, TADA densifies SMPL-X via adaptive upsampling and by interpolating skinning weights. But in the code (class DLMesh), I only find the subdivision of the SMPL-X 3D points. Is there any part of the code that densifies SMPL-X, i.e., which part performs the adaptive upsampling and the skinning-weight interpolation?
Thank you.
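
For reference, the standard way subdivision keeps a mesh animatable is to give each inserted midpoint vertex the average of its parent edge's skinning weights. A minimal sketch of that interpolation (hypothetical helper, not the repository's exact code):

import torch

def interpolate_skinning_weights(weights, new_edges):
    # weights: (V, J) per-vertex LBS weights; new_edges: (E, 2) endpoint ids of
    # each edge that received a midpoint vertex. Averaging keeps every new row
    # summing to 1, so the densified mesh deforms consistently under LBS.
    midpoint_w = 0.5 * (weights[new_edges[:, 0]] + weights[new_edges[:, 1]])
    return torch.cat([weights, midpoint_w], dim=0)  # (V + E, J)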

I have a question for you

Using your method, I have experimented with your model quite a bit, and I have one question.
Is it absolutely necessary to go through the training process to create a new character?
Is it possible to create a new character's 3D mesh without training?

Difficulty Implementing Stable Diffusion XL in TADA Pipeline

I attempted to integrate the Stable Diffusion XL model into the TADA pipeline to evaluate its performance and compare results with SD 2.1. However, I encountered significant challenges: training with SDXL converges very slowly, if at all, unlike the smoother convergence observed with SD 1.5 or 2.1.

I'm reaching out to ask whether there are specific reasons why TADA may not perform well with SDXL. Could particular components or configurations in the pipeline be unoptimized for SDXL, causing this discrepancy in convergence rates? Any insights or suggestions on how to address this would be greatly appreciated.
