Dynamic Typography: Bringing Text to Life via Video Diffusion Prior



Zichen Liu*, Yihao Meng*, Hao Ouyang, Yue Yu, Bolin Zhao, Daniel Cohen-Or, Huamin Qu

* Denotes equal contribution

We present an automated text animation scheme, termed "Dynamic Typography," which combines two challenging tasks. It deforms letters to convey semantic meaning and infuses them with vibrant movements based on user prompts.

We strongly recommend visiting our demo page.

Setup

git clone https://github.com/zliucz/animate-your-word.git
cd animate-your-word

Environment

To set up our environment in Linux, please run:

conda env create -f environment.yml

Next, you need to install diffvg:

conda activate dTypo
git clone https://github.com/BachiLi/diffvg.git
cd diffvg
git submodule update --init --recursive
python setup.py install
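
Before moving on, it can help to confirm that the build produced an importable module. The check below is a minimal sketch, assuming diffvg exposes its Python bindings under the module name pydiffvg and that the dTypo environment is still active. The DIFFVG_STATUS variable is only an illustrative name.

```shell
# Sanity check (sketch): try to import diffvg's Python bindings.
# Assumption: the bindings are exposed as the module `pydiffvg`.
# Run this from outside the diffvg source directory (for example after `cd ..`).
if python -c "import pydiffvg" 2>/dev/null; then
    DIFFVG_STATUS="ok"
else
    DIFFVG_STATUS="missing"
fi
echo "diffvg import check: $DIFFVG_STATUS"
```

If the check reports "missing", rerun python setup.py install inside the activated environment and read the build output for errors. Since the install steps end inside the diffvg directory, return to the animate-your-word directory before running the commands in the next section.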

Generate Your Animation!

To animate a letter within a word, run the following command:

CUDA_VISIBLE_DEVICES=0 python dynamicTypography.py \
        --word "<The Word>" \
        --optimized_letter "<The letter to be animated>" \
        --caption "<The prompt that describes the animation>" \
        --use_xformer --canonical --anneal \
        --use_perceptual_loss --use_conformal_loss  \
        --use_transition_loss

For example:

CUDA_VISIBLE_DEVICES=0 python dynamicTypography.py \
        --word "father" --optimized_letter "h" \
        --caption "A tall father walks along the road, holding his little son with his hand" \
        --use_xformer --canonical --anneal \
        --use_perceptual_loss --use_conformal_loss \
        --use_transition_loss

or

CUDA_VISIBLE_DEVICES=0 python dynamicTypography.py \
        --word "PASSION" --optimized_letter "N" \
        --caption "Two people kiss each other, one holding the others chin with his hand" \
        --use_xformer --canonical --anneal \
        --use_perceptual_loss --use_conformal_loss  \
        --use_transition_loss --schedule_rate 5.0

The output animation will be saved to "videos".
The output includes the network's weights, SVG frame logs and their rendered .mp4 files (under svg_logs and mp4_logs respectively).
We save both the in-context and the sole letter animation.
At the end of training, we output a high-quality GIF render of the last iteration (HG_gif.gif).

We provide many example run scripts in scripts; the expected resulting GIFs are in example_gifs. More results can be found on our project page.

By default, a 24-frame video will be generated, requiring about 28GB of VRAM. If there is not enough VRAM available, the number of frames can be reduced by using the --num_frames parameter.
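
As a sketch of the VRAM note above, the "father" example can be rerun with fewer frames. The value 12 is an arbitrary illustrative assumption: this README does not state which frame counts work well or how much VRAM each one needs. The file check is only a guard so the command runs from the repository root.

```shell
# Illustrative value only (assumption): half of the default 24 frames.
NUM_FRAMES=12

if [ -f dynamicTypography.py ]; then
    CUDA_VISIBLE_DEVICES=0 python dynamicTypography.py \
        --word "father" --optimized_letter "h" \
        --caption "A tall father walks along the road, holding his little son with his hand" \
        --use_xformer --canonical --anneal \
        --use_perceptual_loss --use_conformal_loss \
        --use_transition_loss \
        --num_frames "$NUM_FRAMES"
else
    echo "Run this from the animate-your-word repository root." >&2
fi
```

If the run still exceeds the available memory, lower NUM_FRAMES further and try again.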

Tips:

If the animation stays too close to the original letter's shape, set a lower --perceptual_weight; if it deviates too much from the original shape, set a higher one.

If you want the animation to be less or more geometrically similar to the original letter, set a lower or higher --angles_w, respectively.

If you want to further enforce appearance consistency between frames, set a higher --transition_weight. Note that this also reduces the motion amplitude.

Small visual artifacts can often be fixed by changing the --seed.
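
The seed tip can be tried systematically with a small loop. This is a minimal sketch that reuses the "father" example; the seed values 0, 1 and 2 are arbitrary assumptions, and SEEDS and count are illustrative names. This README does not say whether repeated runs overwrite each other under videos, so check that before launching a long sweep.

```shell
# Seeds to try (assumption: arbitrary example values).
SEEDS="0 1 2"
count=0

for seed in $SEEDS; do
    count=$((count + 1))
    if [ -f dynamicTypography.py ]; then
        CUDA_VISIBLE_DEVICES=0 python dynamicTypography.py \
            --word "father" --optimized_letter "h" \
            --caption "A tall father walks along the road, holding his little son with his hand" \
            --use_xformer --canonical --anneal \
            --use_perceptual_loss --use_conformal_loss \
            --use_transition_loss \
            --seed "$seed"
    else
        echo "Skipping seed $seed: run this from the animate-your-word repository root." >&2
    fi
done
echo "Seeds processed: $count"
```

Each run is a full optimization, so a sweep takes roughly as many times longer as there are seeds.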

Citation:

Don't forget to cite this source if it proves useful in your research!

@article{liu2024dynamic, 
	title={Dynamic Typography: Bringing Text to Life via Video Diffusion Prior}, 
	author={Zichen Liu and Yihao Meng and Hao Ouyang and Yue Yu and Bolin Zhao and Daniel Cohen-Or and Huamin Qu}, 
	year={2024}, 
	eprint={2404.11614}, 
	archivePrefix={arXiv}, 
	primaryClass={cs.CV}}

Acknowledgment:

Our implementation is based on word-as-image and live-sketch. We thank the authors for their remarkable contributions and released code.
