
Comments (2)

Arcitec commented on August 27, 2024

Small bonus guide: Converting the *.ipynb "Notebook" files to normal scripts.

  1. Install the tools needed for the conversion. They pull in a lot of dependencies, so this takes a few minutes.
    pipenv install jupyter nbconvert
  1. Convert all notebooks to Python files. Run this from the top directory of your project: if you run it inside the cloned git repo, Pipenv treats that as a different project folder and creates another Pipfile there.
    cd ..  # return to the parent folder (if you're still in the src/ folder)
    pipenv run jupyter nbconvert --to script lib/glide-text2im/notebooks/*.ipynb
  1. Now you can move those .py files out of lib/glide-text2im/notebooks/ and into your src/ folder as a basis for your own project.
    mv lib/glide-text2im/notebooks/*.py src/
  1. The example code outputs its results to your "Notebook" (Jupyter etc.), so you have to edit the demos: remove the "IPython" stuff, such as the "get_ipython" call and the image-display code, and use something like OpenCV's image viewer to show the result instead. First, install OpenCV:
    pipenv install opencv-contrib-python
  1. Edit the demos to remove these lines:
    get_ipython().system('pip install git+https://github.com/openai/glide-text2im')
    from PIL import Image
    from IPython.display import display
  1. Add this line:
    import cv2
  1. Replace this line:

Old:

    display(Image.fromarray(reshaped.numpy()))

New:

    # Convert to BGR, resize to 4x larger, and display with OpenCV
    img = reshaped.numpy()
    img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)  # Convert to OpenCV's BGR color order.
    img = cv2.resize(img, None, fx=4, fy=4, interpolation=cv2.INTER_LANCZOS4)
    cv2.imshow("Result", img)
    cv2.waitKey(0)  # Necessary for OS threading/rendering of the GUI.
    cv2.destroyAllWindows()
  1. Very important: When a generated image appears, close the window by pressing a keyboard key. Don't close it with the "X" using your mouse, because that hangs Python in "waitKey", which keeps waiting for a key that never arrives. You can't skip waitKey either: cv2 relies on it to process the window's GUI events, and without it the OS thinks the window is dead. So for this demo, close the windows by pressing a keyboard key such as space!
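
If you'd rather tolerate the window's "X" button, one possible workaround (not part of the steps above) is to poll with a short waitKey timeout and check whether the window is still visible. This is only a sketch: it assumes that cv2.getWindowProperty with cv2.WND_PROP_VISIBLE reports a value below 1 once the window has been closed, which is the usual behavior but can vary by GUI backend. The helper name wait_for_key_or_close and its cv2_module parameter are hypothetical; the parameter only exists so the loop can be exercised without a display.

```python
def wait_for_key_or_close(window_name, poll_ms=100, cv2_module=None):
    """Block until a key is pressed or the window is closed.

    Returns the key code reported by waitKey, or -1 if the window
    was closed without a key press.

    `cv2_module` is a hypothetical testing hook (an assumption of this
    sketch); leave it as None to use the real OpenCV module.
    """
    if cv2_module is None:
        # Imported lazily so this helper can be loaded without OpenCV.
        import cv2 as cv2_module

    while True:
        # waitKey also processes GUI events, so the window keeps rendering.
        # It returns -1 when the timeout expires without a key press.
        key = cv2_module.waitKey(poll_ms)
        if key != -1:
            return key

        # Assumption: a closed window reports a visibility value below 1.
        visible = cv2_module.getWindowProperty(
            window_name, cv2_module.WND_PROP_VISIBLE
        )
        if visible < 1:
            return -1
```

In the replacement code above, this would stand in for cv2.waitKey(0): call wait_for_key_or_close("Result") right after cv2.imshow, then cv2.destroyAllWindows() as before.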

PS: "GLIDE (filtered)" is definitely a fun toy, but unfortunately the results from the public model are pretty bad: blurry and nonsensical (unrelated to what you wrote), as mentioned here:

#21 (comment)

Most of the output you're gonna get is useless. But some of it can be fun for inspiration/ideas for projects or art. The main benefit of this model is actually that it generates results extremely fast compared to previous CLIP-based generators.

I would honestly say that the old CLIP-based generators that are out there are much better and more usable. Sure, the coherence of the image itself and the objects is better in GLIDE (filtered), but it responds really poorly to your input most of the time.

If you decide that you want to use this project anyway ("GLIDE (filtered)"), I recommend the clip_guided code. It's better than text2im at understanding things with the limited free training data we've been given. See this topic: #19

The main issue with the free version of GLIDE is that the filtered training data seems to have been mostly "freaking dogs!!", which may explain why the default demo prompt is "an oil painting of a corgi"... It also produces extremely blurry output.


Arcitec commented on August 27, 2024

Bonus: If someone wants a full, more detailed guide about installing PyTorch in Pipenv correctly, then you can find that guide here:

pypa/pipenv#4961 (comment)

All relevant commands and most of the explanations from that guide are already here in this GLIDE guide, but if you want a deeper understanding of how Pipenv's 3rd party repo support works compared to Pip, you'll want to check out that guide too.
