An implementation of the DiffusionOverDiffusion architecture presented in NUWA-XL as a ControlNet-like module on top of the ModelScope text2video model, for extremely long video generation.
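The core idea of DiffusionOverDiffusion is coarse-to-fine generation: a global diffusion pass produces sparse keyframes spanning the whole video, then local diffusion passes recursively fill in the frames between each pair of adjacent keyframes, conditioned on them. A minimal sketch of that scheduling logic (function names here are hypothetical, not this repo's API):

```python
def keyframe_indices(start: int, end: int, k: int) -> list[int]:
    """k evenly spaced frame indices in [start, end], endpoints included."""
    step = (end - start) / (k - 1)
    return [round(start + i * step) for i in range(k)]

def dod_schedule(num_frames: int, k: int) -> list[list[tuple[int, int]]]:
    """Coarse-to-fine segment schedule for a DiffusionOverDiffusion run.

    Level 0 covers the whole clip with k keyframes; each later level runs a
    local pass between every pair of adjacent keyframes until no segment is
    longer than one model window of k frames.
    """
    levels = []
    frontier = [(0, num_frames - 1)]
    while frontier:
        levels.append(frontier)
        nxt = []
        for a, b in frontier:
            if b - a + 1 > k:  # segment still too long, subdivide it
                ks = keyframe_indices(a, b, k)
                nxt.extend((ks[i], ks[i + 1]) for i in range(k - 1))
        frontier = nxt
    return levels
```

For example, `dod_schedule(17, 5)` yields one global segment `(0, 16)` followed by four local segments `(0, 4), (4, 8), (8, 12), (12, 16)`, each small enough for a single 5-frame diffusion pass.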
Add a low-level clip/image captioner such as BLIP-2 for when a description is not available
Use an external LLM such as OpenAI's ChatGPT or plain GPT-3, or even a local model like LLaMA, via an API to generate descriptions for higher-level videos by combining the subclip captions with a global prompt
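The two captioning steps above could be wired together roughly like this: fill in missing subclip captions with a low-level captioner (e.g. BLIP-2 via HuggingFace `transformers`), then merge the captions with the global prompt into a single instruction for the LLM. This is an illustrative sketch under assumed data shapes, not this repo's implementation; the captioner and LLM calls are left as injected callables:

```python
def ensure_captions(clips: list[dict], caption_fn) -> list[str]:
    """Return a caption per clip, calling `caption_fn` (e.g. a BLIP-2
    wrapper taking the clip's frames) only where a caption is missing."""
    return [
        c["caption"] if c.get("caption") else caption_fn(c["frames"])
        for c in clips
    ]

def build_summary_prompt(global_prompt: str, subclip_captions: list[str]) -> str:
    """Combine subclip captions with the global prompt into one request
    for an external LLM (ChatGPT, GPT-3, or a local LLaMA endpoint)."""
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(subclip_captions))
    return (
        f"Overall video: {global_prompt}\n"
        f"Consecutive subclips:\n{numbered}\n"
        "Write one concise description covering the whole span."
    )
```

The LLM's reply would then serve as the description for the higher-level (coarser) segment that spans those subclips, and the process repeats up the hierarchy.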