
DDA4210-project - SAM (Simpson Artistic Memory)

Group Name: Original Logic

Group Members: Huihan Yang; Jinrui Lin; Rongxiao Qu; Haoming Mo

NOT FOR COMMERCIAL USE

MODEL

Our models can be found at 🤗JerryMo/db-simpsons-asim-style and 🤗Foxintohumanbeing/simpson-lora.

JerryMo/db-simpsons-asim-style is fine-tuned with SAM, and Foxintohumanbeing/simpson-lora is fine-tuned with LoRA; its performance is also acceptable.
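
If the Hugging Face repos above host full diffusers pipelines, a minimal inference sketch (assuming a GPU and a recent diffusers release; the prompt is only illustrative) looks like this:

import torch
from diffusers import StableDiffusionPipeline

# Load the fine-tuned pipeline directly from the Hugging Face Hub
# (assumes JerryMo/db-simpsons-asim-style is stored as a full diffusers pipeline).
pipe = StableDiffusionPipeline.from_pretrained(
    "JerryMo/db-simpsons-asim-style",
    torch_dtype=torch.float16,
).to("cuda")

# 'Asim' is the keyword used in the DreamBooth captions (see the Data section below).
prompt = "a portrait of a scientist, Asim style"
image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
image.save("asim_sample.png")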

The QR code of our APP is here! Enjoy! 👋

Sample images can be found in Sample Generate Imags.

Model Checkpoint

The fine-tuned parameters are stored in stored_parameters_for_models. We provide three sets of fine-tuning results:

  • The fine-tuned LoRA parameters are stored in stored_parameters_for_models\sd-model-lora\pytorch_lora_weights.bin.

  • The fine-tuned DreamBooth parameters are stored on Google Drive due to their large size.

  • The fine-tuned SAM parameters are stored on Google Drive due to their large size.

Note: the checkpoints may not be directly usable in the inference stage via code. You may need to check the model structure on Hugging Face (as linked above) before using them further.

Data

  • We preprocessed the data (details of this step will be provided later) and built our own dataset 🤗JerryMo/image-caption-blip-for-training. The dataset contains around 2,500 images and is about 135 MB.

  • We also created a Dataset_App to help write better prompts for the pictures. We manually captioned 1,000 images to build the second dataset 🤗JerryMo/Modified-Caption-Train-Set.

  • The DreamBooth model needs very detailed captions but a smaller sample size, so we created the third dataset specifically for it: 🤗JerryMo/db-simpsons-dataset. Notice that we use 'Asim' as the keyword in the captions.

  • To reduce fine-tuning time, you may also want a smaller dataset. There are two ways to do this (a short loading sketch follows this list):

    1. Tune the parameter max_train_samples.

    2. Use 🤗Skiracer/simpsons_blip_captions, which is a relatively small dataset.
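
As a rough illustration of both options, the sketch below loads the captions with the 🤗 datasets library and truncates them to a small subset (the split and column names are assumptions; check the dataset cards on the Hub):

from datasets import load_dataset

# Option 2: use the smaller third-party dataset directly.
small_ds = load_dataset("Skiracer/simpsons_blip_captions", split="train")

# Option 1 (same effect as lowering max_train_samples): take a subset of our
# full caption dataset to shorten fine-tuning time.
full_ds = load_dataset("JerryMo/image-caption-blip-for-training", split="train")
subset = full_ds.shuffle(seed=42).select(range(500))

print(len(small_ds), len(subset))
# Each example is expected to contain an image and a caption, e.g.
# subset[0]["image"] and subset[0]["text"]; the exact column names may differ.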

Running Commands

Note: Before starting, we strongly recommend validating our results on Hugging Face rather than running the code directly. Diffusers updates frequently, so the code needs to be kept up to date. Moreover, to make the provided ckpt files easy to use, we already built an API on 🤗Hugging Face. If you want to run inference on your own, you may need to assemble the checkpoint folder yourself, following the instructions in the Hugging Face documentation.

Our model is fine-tuned on 🤗CompVis/stable-diffusion-v1-4.
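
For reference, a minimal sketch of running inference with the LoRA checkpoint on top of this base model might look as follows (the exact loading call depends on your diffusers version, which is why we recommend the Hugging Face API above):

import torch
from diffusers import StableDiffusionPipeline

# Base model that all of our fine-tunes start from.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
).to("cuda")

# Apply the LoRA weights from stored_parameters_for_models/sd-model-lora.
# Recent diffusers releases provide load_lora_weights(); older ones used
# pipe.unet.load_attn_procs() instead.
pipe.load_lora_weights("stored_parameters_for_models/sd-model-lora")

image = pipe("Springfield town square at sunset, Simpsons style").images[0]
image.save("lora_sample.png")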

Requirement

For preprocessing and package requirements, please refer to the installation instructions of 🤗huggingface/diffusers. If any problem occurs, please first check whether the versions of the Hugging Face libraries, diffusers, torch, and CUDA match.
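
A quick way to check the versions mentioned above (a small helper sketch, not part of the repo):

import torch
import diffusers
import transformers

# Print the versions that most often cause compatibility problems.
print("torch:", torch.__version__)
print("CUDA seen by torch:", torch.version.cuda)
print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)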

git clone https://github.com/foxintohumanbeing/DDA4210_Group_project.git
cd DDA4210_Group_project

Training Command

python fine_tuning_files/train/train_dreambooth_lora_unfreezed.py --config_path="configuration_file/config_train.json"

Testing Command

python fine_tuning_files/inference/inference_dreambooth_lora_unet.py --config_path="configuration_file/config_test.json"

PLEASE make sure that the parameters output_dir and pretrained_model_name_or_path in config_test.json are the SAME as output_dir and pretrained_model_name_or_path in config_train.json.
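
As a sanity check before running inference, you can verify that the two configuration files agree (a small sketch; it only assumes both JSON files contain the two keys named above):

import json

with open("configuration_file/config_train.json") as f:
    train_cfg = json.load(f)
with open("configuration_file/config_test.json") as f:
    test_cfg = json.load(f)

# output_dir and pretrained_model_name_or_path must match between the two files.
for key in ("output_dir", "pretrained_model_name_or_path"):
    assert train_cfg[key] == test_cfg[key], f"{key} differs between train and test configs"
print("config_train.json and config_test.json are consistent.")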

Measurement Command

  • Frechet Inception Distance (FID)

Instructions can be found in mseitzer/pytorch-fid; a short usage sketch follows this list.

  • Language Drifting Measurement (LDM)

We use 🤗openai/clip-vit-large-patch14. The implementation can be found in utils/LDM.py.
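
As a rough illustration, FID between a folder of real frames and a folder of generated samples can be computed with pytorch-fid's Python API (the folder paths are placeholders):

import torch
from pytorch_fid import fid_score

device = "cuda" if torch.cuda.is_available() else "cpu"

# Compare a directory of real images against a directory of generated images.
fid = fid_score.calculate_fid_given_paths(
    ["data/real_simpsons", "outputs/generated_samples"],  # placeholder paths
    batch_size=50,
    device=device,
    dims=2048,  # default InceptionV3 pool3 features
)
print("FID:", fid)

For the language-drift measurement, the sketch below shows a generic CLIP text-image score with openai/clip-vit-large-patch14; it illustrates the idea but is not necessarily the exact metric implemented in utils/LDM.py:

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-large-patch14")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-large-patch14")

image = Image.open("asim_sample.png")             # a generated image (placeholder path)
prompt = "a portrait of a scientist, Asim style"  # the prompt used to generate it

inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# A higher text-image similarity suggests the fine-tuned model has drifted
# less from the conditioning text.
print("CLIP text-image score:", outputs.logits_per_image.item())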

File Explanation

Fine-tuning Files

Train

Training codes.

  • train_dreambooth_lora_unfreezed.py: training code for the SAM model.

  • train_dreambooth_lora.py: training code for the model that uses LoRA within DreamBooth.

  • train_dreambooth.py: fine-tuning code using only the DreamBooth method.

  • train_text_to_image_lora.py: fine-tuning code using only the LoRA method.

  • train_text_to_image.py: fine-tuning code without any additional technique.

Inference

Inference codes, stored in fine_tuning_files.

  • inference_dreambooth_lora_unet.py: inference code for the SAM model.

  • inference_dreambooth_lora.py: inference code for the model that uses LoRA within DreamBooth.

  • inference_dreambooth.py: inference code for the model fine-tuned with only the DreamBooth method.

  • inference_lora.py: inference code for the model fine-tuned with only the LoRA method.

  • inference_simple.py: inference code for the model fine-tuned without any additional technique.

Configuration File

  • config_train.json stores the parameters you need to change during training.

  • config_test.json stores the parameters you need to change during testing.

Utils

  • Contains tools for training and evaluation.

For any questions, please CONTACT Huihan Yang ASAP!
