tgxs002 / align_sd


Better Aligning Text-to-Image Models with Human Preference. ICCV 2023

Home Page: https://tgxs002.github.io/align_sd_web/

License: Apache License 2.0

Python 98.95% Shell 1.05%

stable-diffusion


align_sd's Issues

Count of non-preferred images

select_training_images.py gives 42201 non-preferred images, which is inconsistent with the 21108 mentioned in the paper. Is there any other parameter that can be used to exactly reproduce the counts of preferred and non-preferred images?

Reproducing table 2 in the paper

Hey, thanks for the great work!

I am interested in reproducing the numbers from Table 2 in your paper. Could you please advise on how to do that? Can I directly use the test.json provided in your repo? Also, which exact metric do you use for these results? It wasn't clear to me from the paper. Thanks!

Does it work for Stable Diffusion 2.1?

Hi, have you tested it on Stable Diffusion 2.1?

accelerate config

Thanks for your excellent work! I would like to know how you set up your accelerate config, for example whether you use DeepSpeed. I tried to use it but failed. I would be grateful if you could answer my question!

When will the training code be released?

Nice work, but when will the training code be released? I'm hoping for it.

Is there a V2 adapted_model.bin?

Hi, I noticed that you released adapted_model.bin for V1 of align_sd. Is there a new adapted model for V2?

Please, add a license

Dear Authors,

Thank you for your awesome work! Could you please add a license to your repo?

Conversion for stable-diffusion-webui

Hi, great research! Impressed by the results.

For your own interest, and in case anybody else comes across this: you can use the conversion script below to get the LoRA models working with AUTOMATIC1111/stable-diffusion-webui, the interface that the majority of the SD community uses. Credit to harrywang for the original script.

import argparse
import os
import re

import torch
from safetensors.torch import save_file


def main(args):
    # Load the LoRA checkpoint onto the GPU when available, otherwise the CPU.
    device = 'cuda' if torch.cuda.is_available() else 'cpu'
    checkpoint = torch.load(args.file, map_location=torch.device(device))

    # Rename every key to the lora_unet_* naming used for the webui.
    new_dict = dict()
    for key in checkpoint:
        new_key = re.sub(r'\.processor\.', '_', key)
        new_key = re.sub(r'mid_block\.', 'mid_block_', new_key)
        new_key = re.sub(r'_lora\.up\.', '.lora_up.', new_key)
        new_key = re.sub(r'_lora\.down\.', '.lora_down.', new_key)
        # Turn numeric path components such as ".0." into "_0_".
        new_key = re.sub(r'\.(\d+)\.', r'_\1_', new_key)
        new_key = re.sub('to_out', 'to_out_0', new_key)
        new_key = 'lora_unet_' + new_key

        new_dict[new_key] = checkpoint[key]

    # Save next to the input file as <name>_converted.safetensors.
    file_name = os.path.splitext(args.file)[0]
    new_lora_name = file_name + '_converted.safetensors'
    print("Saving " + new_lora_name)
    save_file(new_dict, new_lora_name)


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "--file",
        type=str,
        required=True,
        help="Path to the LoRA checkpoint to convert",
    )
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    main(args)
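
To see what the renaming does, the regex steps from the script can be applied to a single key in isolation. This is only a minimal sketch: `convert_key` is a hypothetical helper that copies the substitutions above, and the two sample keys are assumptions about what diffusers-style LoRA attention-processor keys look like, not keys read from the released adapted_model.bin.

```python
import re


def convert_key(key: str) -> str:
    """Apply the same substitutions as the conversion script to one key."""
    new_key = re.sub(r'\.processor\.', '_', key)
    new_key = re.sub(r'mid_block\.', 'mid_block_', new_key)
    new_key = re.sub(r'_lora\.up\.', '.lora_up.', new_key)
    new_key = re.sub(r'_lora\.down\.', '.lora_down.', new_key)
    new_key = re.sub(r'\.(\d+)\.', r'_\1_', new_key)
    new_key = re.sub('to_out', 'to_out_0', new_key)
    return 'lora_unet_' + new_key


# Hypothetical sample keys (assumed naming, for illustration only).
down_key = ('down_blocks.0.attentions.1.transformer_blocks.0'
            '.attn1.processor.to_q_lora.up.weight')
mid_key = ('mid_block.attentions.0.transformer_blocks.0'
           '.attn2.processor.to_out_lora.down.weight')

print(convert_key(down_key))
# lora_unet_down_blocks_0_attentions_1_transformer_blocks_0_attn1_to_q.lora_up.weight
print(convert_key(mid_key))
# lora_unet_mid_block_attentions_0_transformer_blocks_0_attn2_to_out_0.lora_down.weight
```

If a converted file does not load, comparing a few keys printed this way against the key names of a LoRA that is known to work is a quick way to spot a naming mismatch.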

When will the HPS classifier training code be released?

Thank you. I have successfully replicated the LoRA training with your code, and it does improve the performance of Stable Diffusion significantly. May I ask when the training code for the HPS classifier will be released? @tgxs002

About the validation prompt

Could the authors release the validation_prompt.json file? That way we could reproduce the visualization results and compare them with the results reported in the paper.

Thanks

About the regularization_images data

I am very interested in your work and am currently attempting to reproduce your results. I ran the script download_regularization_images.py to download the provided regularization_images data, which consists of image-text pairs. I would like to know how to preprocess it so that it can be incorporated into the LoRA training. Thanks. @tgxs002

Space Demo doesn't work

It seems to be having trouble "building".

About the training data and time cost of lora fine-tuning

Thanks for your great work!
How long does it take to train the LoRA on the dataset (1M DiffusionDB + subset of LAION-5B) on, for example, 4 GPUs?

Training dataset for the HPS classifier is too large to download

The training dataset for the HPS classifier is too large to download. I have tried to download it many times, but all attempts have failed. Would you be able to provide an alternative download link, such as Baidu Cloud or another platform? @tgxs002

positive and negative sample data for LoRA training

Hello, I am very interested in your work and I am trying to reproduce your results. Would it be possible for you to share positive and negative sample data you used for LoRA training? @tgxs002

Regarding the validity of the Human Preference Classifier

Hi @tgxs002, thanks for your work, and for open-sourcing the dataset and classifier!

As a sanity check, I evaluated your trained HPC on the examples in the training data that are preferred by humans (S1) and on the examples in the training data that are not preferred by humans (S2).

I found that the average HPS in setting S1 is about 21.0, whereas the average HPS in setting S2 is about 20.26. For a good classifier, I was expecting the scores in S2 to be much lower than in S1, but that is not the case. Does this mean that the HPC is not trained properly? That would seem to contradict the paper's claim that the HPC agrees well with humans.

If my evaluation numbers look off, can you let me know what you are getting on your end?
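
A gap between two global averages and agreement with human choices are different measurements, so it may help to also compare scores pair by pair for images that share a prompt. The following is a minimal sketch of such a check, assuming a score has already been computed for every image. The function name `pairwise_agreement` and all numbers are hypothetical, made up for illustration only; they are not results from the paper or from the released classifier.

```python
from statistics import mean


def pairwise_agreement(pairs):
    """Fraction of pairs in which the human-preferred image scores higher.

    pairs: list of (score_preferred, score_non_preferred) tuples, where both
    images in a tuple were generated from the same prompt.
    """
    wins = sum(1 for preferred, non_preferred in pairs if preferred > non_preferred)
    return wins / len(pairs)


# Hypothetical scores, invented for this example.
pairs = [(21.3, 20.6), (20.1, 19.5), (22.0, 21.2), (20.4, 20.7)]

# Difference between the two global averages (the S1 vs. S2 style comparison).
mean_gap = mean(p for p, _ in pairs) - mean(n for _, n in pairs)
print(round(mean_gap, 2))

# Per-prompt agreement with the human choice.
print(pairwise_agreement(pairs))
```

In this toy data the averages differ by less than half a point while the preferred image still wins in three of the four pairs, so a small mean gap does not by itself show that the ranking is poor.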