Comments (3)
I know it's been quite a while, but I want to share my experience in case it helps. I'm using the diffusers official example to train a ControlNet tile model (https://github.com/huggingface/diffusers/tree/main/examples/controlnet), with all of the memory-saving techniques available in that training script (gradient checkpointing, xformers memory-efficient attention, 8-bit Adam, and fp16 mixed precision) to reach an effective batch size of 256. I did observe the "sudden convergence" around 3k steps, and overall it worked.
I uploaded my (working) training script here, in case anyone is interested: https://github.com/zjysteven/controlnet_tile
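For reference, the memory-saving options listed above correspond to command-line flags on the diffusers example script. A sketch of a launch command under those assumptions follows; the model path, dataset directory, and batch numbers are placeholders (8 × 32 gives the effective batch size of 256 mentioned above), so check the script's `--help` for the authoritative flag list:

```shell
# Hypothetical launch of examples/controlnet/train_controlnet.py with the
# memory-saving options from the comment above. Paths and sizes are
# placeholders, not values confirmed by this thread.
accelerate launch train_controlnet.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --output_dir="./controlnet-tile-out" \
  --train_data_dir="./my_tile_dataset" \
  --resolution=512 \
  --train_batch_size=8 \
  --gradient_accumulation_steps=32 \
  --gradient_checkpointing \
  --enable_xformers_memory_efficient_attention \
  --use_8bit_adam \
  --mixed_precision="fp16"
```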
from controlnet-v1-1-nightly.
The only description I could find was on the CN 1.1 front page and the comments linked from it. Note that the original tile was resized to 64x64, as opposed to your 128x128. I recently started CN training myself and documented everything in my article on Civitai. Right now I'm training an alternative edge-detection model, and I document every experiment; I think this could be useful for you. You might also want to take a look at the SD 2 CN models from Thibaud.
Some things I learned:
- Use a higher effective batch size, either by (1) increasing the batch size or (2) increasing gradient accumulation. (I haven't had any luck with batch sizes >64 so far, but my experiments are still running.) Thibaud said some higher batch sizes require A LOT more samples (possibly >500k).
- Run fastdup for quick and easy removal of faulty images (duplicates, corrupt images, etc.).
- Advanced: you might consider partial prompt dropping, since the very function of the tile model is to fill areas without relying on the explicit prompt. It's not a domain-specific model, so I think it makes sense here.
- Consider training the model for SD 1 first: it's faster and you will learn the same lessons. Once you have a working workflow, restart the training for SD 2.
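The partial prompt dropping mentioned above amounts to a per-sample coin flip during batch preparation. A minimal, framework-free sketch; the function name, drop probability, and captions are illustrative assumptions, not values from this thread:

```python
import random

def drop_prompts(captions, drop_prob=0.5, rng=None):
    """Replace each caption with an empty string with probability drop_prob,
    so the model learns to fill tiles without relying on the text prompt."""
    rng = rng or random.Random()
    return [c if rng.random() >= drop_prob else "" for c in captions]

rng = random.Random(0)
batch = ["a photo of a cat", "a mountain lake", "city street at night"]
print(drop_prompts(batch, drop_prob=0.5, rng=rng))
```

In a real training loop this would run inside the collate/preprocessing step; the unconditional (empty-prompt) samples are what let the tile model work without a matching text prompt.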
random crop original images to 512x512
Is this correct, given that SD 2 uses 768x768 images? (I don't know.)
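The random 512x512 crop mentioned above reduces to picking a random top-left corner inside the source image. A minimal stdlib sketch that computes the crop box (the coordinate order matches Pillow's (left, top, right, bottom) convention; the function name is my own):

```python
import random

def random_crop_box(width, height, crop=512, rng=None):
    """Return a (left, top, right, bottom) box for a random crop x crop
    window inside a width x height image; raises if the image is too small."""
    if width < crop or height < crop:
        raise ValueError("image smaller than crop size")
    rng = rng or random.Random()
    left = rng.randint(0, width - crop)
    top = rng.randint(0, height - crop)
    return (left, top, left + crop, top + crop)

rng = random.Random(0)
print(random_crop_box(768, 768, 512, rng))
```

With Pillow this box would be passed straight to `Image.crop()`; a 768x768 SD 2 source image still leaves 256 pixels of slack in each direction for a 512 crop.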
After that, I try to train the control net to generate the first image from the dataset with the controlling image from the second image from the dataset.
What does this mean? Could you please post some examples? The way you wrote it, it sounds as if you want to confuse your CN on purpose by showing it the wrong image :D
After 100k iterations the result is bad, and it feels like nothing has been learned.
With an effective batch size of 4 you should already see some effects after 25k images. Please provide:
- your parameters
- validation images of intermediate steps
- your evaluation images and prompts (avoid complex evaluation images; use something simple that is already proven to work)
because in materials I only found the training of other models
It should be pretty much the same.
please share your experiences!
Thanks for sharing your experience! I tried training with a bigger effective batch size, and it looks better!
is this correct as SD2 uses 768x768 images?
In my opinion it doesn't matter: if everything is fine with the training script, it should work at 512 as well.
What does this mean? Could you please post some examples? The way you wrote it, it sounds as if you want to confuse your CN on purpose by showing it the wrong image :D
During training I use an image with resizing artifacts as the control image, to help SD generate an image similar to the artifacted image but at a higher resolution.
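The control-image construction described above (downscale, then upscale back to the original size, so the control carries blocky resizing artifacts) can be illustrated with nearest-neighbor resampling on a plain 2-D list. In a real pipeline you would do this with an image library on real images, and the intermediate size (64x64 per the tile description quoted earlier) would be much larger than the toy value here:

```python
def resize_nearest(grid, new_w, new_h):
    """Nearest-neighbor resize of a 2-D list of pixel values."""
    old_h, old_w = len(grid), len(grid[0])
    return [
        [grid[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]

def make_control(grid, low=2):
    """Downscale to low x low, then upscale back to the original size,
    producing the blocky artifacts used as the conditioning image."""
    h, w = len(grid), len(grid[0])
    small = resize_nearest(grid, low, low)
    return resize_nearest(small, w, h)

image = [[0, 1, 2, 3],
         [4, 5, 6, 7],
         [8, 9, 10, 11],
         [12, 13, 14, 15]]
print(make_control(image, low=2))
```

The training pair is then (original image, control image): the model sees the degraded version as conditioning and learns to reproduce the clean, full-resolution target.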