Leonardo

Recreate the image in "Futuristic style, trending on artstation"
Input Images | Output Images | Benchmarks

Source Image | Output Image

Introduction

This repo contains code that processes images using Hugging Face img2img pipelines. For each image, a diffusion model is selected at random and used to produce the output image.
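The per-image random selection can be sketched roughly as follows. This is an illustration only, not the repo's code: the identifiers in `MODELS` and the helper `assign_models` are hypothetical placeholders.

```python
import random
from pathlib import Path

# Hypothetical model identifiers; the models actually used by the repo are not listed here.
MODELS = ["model-a", "model-b", "model-c"]


def assign_models(image_paths, models=MODELS, seed=None):
    """Pair each image path with a randomly chosen model name."""
    rng = random.Random(seed)  # passing a seed makes the assignment reproducible
    return [(Path(p), rng.choice(models)) for p in image_paths]


if __name__ == "__main__":
    for path, model in assign_models(["a.png", "b.png", "c.png"], seed=0):
        print(f"{path.name} -> {model}")
```

Using a dedicated `random.Random` instance keeps the selection independent of any other code that touches the global random state.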

  • The Google Colab notebook used to generate the images in data/output can be found here.
  • The Colab environment above has enough GPU memory to load all models at once. In environments with less GPU memory, fetch pipelines with get_pipeline(model, low_memory=True); the model is then moved to the GPU only during inference, which roughly halves inference speed.
  • The CLI described below can be run locally and uses a GPU if one is available. Running with --output-width=128 lets the pipeline run on a CPU in reasonable time (~10 seconds per image), but the results will not look great.
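The low-memory behaviour described above (keeping a model off the GPU except while it is running) can be sketched with a small context manager. This is not the implementation behind `get_pipeline`; it only assumes the pipeline object exposes a `.to(device)` method, as diffusers pipelines and PyTorch modules do. `on_device` and `FakePipeline` are hypothetical names introduced here so the example runs without a GPU.

```python
from contextlib import contextmanager


@contextmanager
def on_device(pipeline, device="cuda", home="cpu"):
    """Move `pipeline` to `device` for the duration of the block, then back to `home`."""
    pipeline.to(device)
    try:
        yield pipeline
    finally:
        # Always release the accelerator, even if inference raises.
        pipeline.to(home)


class FakePipeline:
    """Stand-in that only records device moves; a real pipeline would move its weights."""

    def __init__(self):
        self.device = "cpu"
        self.moves = []

    def to(self, device):
        self.device = device
        self.moves.append(device)
        return self

    def __call__(self, prompt):
        return f"{prompt} (ran on {self.device})"


if __name__ == "__main__":
    pipe = FakePipeline()
    with on_device(pipe) as p:
        print(p("Futuristic style"))
    print(pipe.moves)
```

With a real model, each transfer to and from the GPU takes time, which is one plausible source of the slowdown mentioned above.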

Further notes:

  • Infrastructure: The setup.py/requirements.txt files are the only infrastructure pieces (no dockerization)
  • Scaling: In a production setting, we would not load all models in a single Python process. Instead, we would spawn separate containers, each hosting a single model behind a REST API (e.g. Flask or MLflow serve). Processing an image would then send a request to one of these endpoints, and containers would be scaled according to load.
  • Batch processing: It is possible to perform batch processing by supplying a list of images and prompts to pipe() instead of individual ones. However, the input images must all have the same dimensions for batching to be possible. Since the aspect ratios of the provided input images vary, little work could be batched in this case. In production, batching is worth doing to keep the GPU as fully utilized as possible.
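One way to work around the same-dimensions constraint is to group images by size first and batch within each group. The sketch below shows only that grouping step, using plain tuples; the helper names and the sample file names and sizes are hypothetical, and no diffusion pipeline is involved.

```python
from collections import defaultdict


def group_by_size(images):
    """Group (name, (width, height)) pairs so that each group shares one size."""
    groups = defaultdict(list)
    for name, size in images:
        groups[size].append(name)
    return dict(groups)


def batches(names, batch_size):
    """Yield consecutive slices of at most `batch_size` names."""
    for start in range(0, len(names), batch_size):
        yield names[start:start + batch_size]


if __name__ == "__main__":
    # Hypothetical file names and sizes, for illustration only.
    images = [("a.png", (512, 512)), ("b.png", (512, 768)), ("c.png", (512, 512))]
    for size, names in group_by_size(images).items():
        for batch in batches(names, batch_size=2):
            print(size, batch)
```

Each resulting batch contains images of a single size, so it could be passed to pipe() as a list together with a matching list of prompts. When nearly every image has a unique size, most groups hold one image and the gain is small, which matches the observation above.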

Usage

Clone the repo and install the app as the editable (-e) leonardo package:

git clone git@github.com:robin-vjc/leonardo.git
cd leonardo
pip install -e .

To run the pipeline on the set of images in data/images/:

# view options
>> python leonardo/cli.py --help
Usage: cli.py [OPTIONS] INPUT_PATH OUTPUT_PATH PROMPT

Arguments:
  INPUT_PATH   [required]
  OUTPUT_PATH  [required]
  PROMPT       [required]

Options:
  --output-width INTEGER          [default: 128]
  --strength FLOAT                [default: 0.2]
  --guidance-scale FLOAT          [default: 1.5]
  --low-memory / --no-low-memory  [default: no-low-memory]
  --install-completion [bash|zsh|fish|powershell|pwsh]
                                  Install completion for the specified shell.
  --show-completion [bash|zsh|fish|powershell|pwsh]
                                  Show completion for the specified shell, to
                                  copy it or customize the installation.
  --help                          Show this message and exit.

# run image processing
>> python leonardo/cli.py data/images/ data/output/ "Futuristic style, trending on artstation" --output-width=128 --strength=0.2 --guidance-scale=1.5

ToDos

  • upload repo to git
  • process images folder pipeline (input_folder, prompt, output_folder)
  • store images in data/
  • make repo pip-installable
  • check on colab GPU processing works correctly
  • track memory/cpu usage
  • refactor so we do a randomized chunk of images with one model, then process the other chunks
  • clean up docstrings everywhere
  • update README installation / usage
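The chunking refactor mentioned in the list above could look roughly like the sketch below: shuffle the images once, then hand each model its own chunk so that only one model needs to be loaded at a time. This is an assumption about the intended design, not existing repo code; `chunk_by_model` and the sample names are hypothetical, and model names are assumed to be distinct.

```python
import random


def chunk_by_model(image_paths, models, seed=None):
    """Shuffle the images and split them into one chunk per model."""
    rng = random.Random(seed)
    shuffled = list(image_paths)
    rng.shuffle(shuffled)
    n = len(models)
    # Slicing with a stride of n spreads images evenly (chunk sizes differ by at most 1).
    return {model: shuffled[i::n] for i, model in enumerate(models)}


if __name__ == "__main__":
    paths = [f"img{i}.png" for i in range(5)]
    for model, chunk in chunk_by_model(paths, ["model-a", "model-b"], seed=0).items():
        # In a real run: load `model`, process every image in `chunk`, then unload it.
        print(model, chunk)
```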
