
### ELLA Training Fine-tuning Project

### Introduction

This project is the result of our effort to reverse engineer the ELLA training process for Stable Diffusion 1.5. We have successfully fine-tuned a model based on the ELLA architecture. Our goal is to adapt the training script to SDXL (Stable Diffusion XL) and make it accessible to the community.
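At the core of ELLA is a timestep-aware connector that maps features from a large text encoder (T5) into the conditioning space of the diffusion UNet. The following is a minimal PyTorch sketch of that kind of module; the class name, dimensions, token count, and the AdaLN-style timestep conditioning are our assumptions for illustration, not the original authors' code.

```python
import torch
import torch.nn as nn

class TimestepAwareResampler(nn.Module):
    """Sketch of an ELLA-style connector: a small Perceiver-like
    resampler whose learned queries are modulated by the diffusion
    timestep (AdaLN-style scale/shift). It maps variable-length T5
    features to a fixed number of tokens for UNet cross-attention.
    All sizes here are illustrative defaults, not the paper's exact
    configuration."""

    def __init__(self, t5_dim=2048, out_dim=768, n_queries=64, n_heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(n_queries, out_dim) * 0.02)
        self.proj_in = nn.Linear(t5_dim, out_dim)
        self.attn = nn.MultiheadAttention(out_dim, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(out_dim, elementwise_affine=False)
        # Map the (scaled) timestep to a per-channel scale and shift.
        self.time_mlp = nn.Sequential(
            nn.Linear(1, out_dim), nn.SiLU(), nn.Linear(out_dim, 2 * out_dim)
        )

    def forward(self, t5_tokens, timestep):
        # t5_tokens: (B, L, t5_dim); timestep: (B,) integer timesteps
        kv = self.proj_in(t5_tokens)
        t = timestep[:, None].float() / 1000.0
        scale, shift = self.time_mlp(t).chunk(2, dim=-1)
        q = self.queries.expand(kv.size(0), -1, -1)
        q = self.norm(q) * (1 + scale[:, None]) + shift[:, None]
        out, _ = self.attn(q, kv, kv)
        return out  # (B, n_queries, out_dim), fed to cross-attention
```

Because the queries are re-modulated at every denoising step, the text conditioning can emphasize different semantics early versus late in sampling, which is the central idea the connector implements.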

### Background

We were disappointed to learn that the original creators of ELLA did not release the training code for version 1.5 or SDXL. However, instead of waiting for an official release, we decided to take matters into our own hands and reverse engineer the training process ourselves.

### Project Structure

The repository contains the following files:

- `model.py`: implementation of the ELLA model architecture.
- `train.py`: script for fine-tuning the ELLA model.
- `requirements.txt`: list of dependencies required to run the project.
- `README.md`: this file, providing an overview of the project.

### Installation

To set up the project locally, follow these steps:

1. Clone the repository:

        git clone https://github.com/DataCTE/ELLA_Training.git

2. Navigate to the project directory:

        cd ELLA_Training

3. Install the required dependencies:

        pip install -r requirements.txt

### Usage

To fine-tune the ELLA model, run:

    python train.py

Make sure to adjust the training parameters and dataset paths in `train.py` to match your hardware and data before launching.
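The settings worth reviewing before a run are sketched below. Note these names are illustrative; the actual variable names and structure inside `train.py` may differ.

```python
# Illustrative training configuration; the real names in train.py may differ.
config = {
    "pretrained_model": "runwayml/stable-diffusion-v1-5",  # hypothetical base SD1.5 checkpoint id
    "dataset_path": "/path/to/your/dataset",               # image/caption pairs
    "learning_rate": 1e-5,        # small LR is typical when fine-tuning a connector
    "batch_size": 4,              # per-device; raise with gradient accumulation if VRAM allows
    "max_train_steps": 10_000,
    "mixed_precision": "fp16",    # or "bf16" on Ampere+ GPUs
}
```

Dataset path and checkpoint id above are placeholders; substitute your own before training.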

### Adapting to SDXL

We are actively adapting the training script to SDXL in order to take advantage of its larger backbone and improve the fine-tuning process. Stay tuned for updates on this front.

### Contributions

We welcome contributions from the community to help improve and extend this project. If you have any ideas, suggestions, or bug fixes, please feel free to open an issue or submit a pull request.

### License

This project is licensed under the MIT License.

### Acknowledgments

We would like to acknowledge the creators of ELLA for their innovative work and the inspiration they have provided. Although we were disappointed by the lack of an official release, their efforts motivated us to take on this challenge ourselves.

### Contact

If you have any questions or would like to get in touch with the project maintainers, please email us at [email protected].


### Issues

**Any progress on the SDXL adapter?**

Hey, thanks for putting this out! Been really enjoying playing with SD1.5 Ella, but I'm very aware it isn't nearly at the level of prompt understanding shown in the paper (although it honestly blows SDXL out of the water). Just wanted to see whether there's been any progress in terms of adapting the script to SDXL (and if you're planning to put out an SDXL Ella adapter at any point).

**Some questions**

Hi,

Thanks for sharing the code.

Have you run a finetuning of Ella on SD1.5 ?

Also, shouldn't it be trained on a single randomly sampled timestep per example instead of a full generation? The paper also mentions a weight decay of 0.01.

Perhaps using one of the diffusers training scripts as a base would be better; it would provide xformers support, different dtypes, batch size and gradient accumulation options, AdamW weight decay, and so on.

I'm currently running a finetune of the existing SD1.5 weights as a test (LR 1e-5, xformers + fp16 for the pipeline, fp16 for the T5 encoder); I'll let it run for a few hours.
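For reference, the single-timestep objective raised above is the standard DDPM-style epsilon-prediction loss used by the diffusers training scripts. A sketch, assuming PyTorch and diffusers-style `unet`/`scheduler` interfaces (the `connector` argument and function name are illustrative, not this repo's actual code):

```python
import torch

def training_step(unet, connector, latents, t5_tokens, scheduler):
    """One training step: sample one random timestep per example,
    noise the latents at that timestep, and regress the added noise.
    `unet` and `scheduler` follow diffusers conventions
    (UNet2DConditionModel / DDPMScheduler); `connector` is the
    timestep-aware text module. Names here are illustrative."""
    b = latents.size(0)
    t = torch.randint(
        0, scheduler.config.num_train_timesteps, (b,), device=latents.device
    )
    noise = torch.randn_like(latents)
    noisy = scheduler.add_noise(latents, noise, t)   # forward diffusion
    cond = connector(t5_tokens, t)                   # timestep-aware text tokens
    pred = unet(noisy, t, encoder_hidden_states=cond).sample
    return torch.nn.functional.mse_loss(pred, noise)
```

Because each example sees only one timestep per step, no full sampling loop is needed during training; coverage of all timesteps comes from the random sampling across batches.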
