An Auto1111 extension that implements various text2video models, such as ModelScope and VideoCrafter, using only Auto1111 webui dependencies and downloadable models (so no logins are required anywhere).
8 GB of VRAM should be enough to run on a GPU at 256x256 with the low-VRAM VAE option on (and we are already getting reports of people generating 192x192 videos with 4 GB of VRAM). A 24-frame 256x256 video definitely fits into the 12 GB of an NVIDIA GeForce RTX 2080 Ti. We appreciate any help with this extension, especially pull requests.
VideoCrafter runs with around 9.2 GB of VRAM at the default settings.
Update 2023-03-27: VAE settings and "Keep model in VRAM" moved to the general webui settings, under the 'ModelScopeTxt2Vid' section.
Update 2023-03-26: prompt weights implemented! (ModelScope only so far, as of 2023-04-05)
Update 2023-04-05: added VideoCrafter support and renamed the extension to simply 'sd-webui-text2video'.
Prompt: flowers turning into lava
out.mp4
Prompt: cinematic explosion by greg rutkowski
vid.mp4
Prompt: really attractive anime girl skating, by makoto shinkai, cinematic lighting
gosh.mp4
Prompt: anime 1girl reimu touhou
working.mp4
Download the following files from the original HuggingFace repository. Alternatively, download the half-precision (fp16) pruned weights, which are smaller and use less VRAM when loading:
- VQGAN_autoencoder.pth
- configuration.json
- open_clip_pytorch_model.bin
- text2video_pytorch_model.pth
Put them in stable-diffusion-webui/models/ModelScope/t2v. Create those 2 folders if they are missing.
Download the pretrained T2V models either via this link or as the pruned half-precision weights, and put the model.ckpt in models/VideoCrafter/model.ckpt.
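To double-check the two layouts described above, here is a minimal Python sketch that reports which expected files are still missing. It is not part of the extension: the function name `find_missing_model_files` and the `create_dirs` option are hypothetical, and it assumes that models/VideoCrafter is relative to the stable-diffusion-webui root, just like models/ModelScope/t2v.

```python
from pathlib import Path

# Expected files, taken from the instructions above.
# Assumption: both folders live under the webui root directory.
EXPECTED = {
    "models/ModelScope/t2v": [
        "VQGAN_autoencoder.pth",
        "configuration.json",
        "open_clip_pytorch_model.bin",
        "text2video_pytorch_model.pth",
    ],
    "models/VideoCrafter": ["model.ckpt"],
}


def find_missing_model_files(webui_root, create_dirs=False):
    """Return paths of expected model files not found under webui_root.

    Hypothetical helper for illustration only. With create_dirs=True the
    model folders are created if they are missing.
    """
    root = Path(webui_root)
    missing = []
    for folder, names in EXPECTED.items():
        target = root / folder
        if create_dirs:
            target.mkdir(parents=True, exist_ok=True)
        for name in names:
            if not (target / name).is_file():
                missing.append(target / name)
    return missing


if __name__ == "__main__":
    # Adjust the path to wherever your webui checkout lives.
    for path in find_missing_model_files("stable-diffusion-webui"):
        print(f"missing: {path}")
```

An empty result means every file listed above is in place; anything printed still needs to be downloaded or moved.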
HuggingFace space:
https://huggingface.co/spaces/damo-vilab/modelscope-text-to-video-synthesis
The model PyTorch implementation from ModelScope:
https://github.com/modelscope/modelscope/tree/master/modelscope/models/multi_modal/video_synthesis
Google Colab from the devs:
https://colab.research.google.com/drive/1uW1ZqswkQ9Z9bp5Nbo5z59cAn7I0hE6R?usp=sharing
Github: