vishal-v / stackgan

TensorFlow implementation of "Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" by Han Zhang, et al.

Home Page: https://arxiv.org/abs/1612.03242

License: MIT License

Python 74.53% Jupyter Notebook 25.47%
gans generative-adversarial-network stack-gan tensorflow-2 keras conditioning-augmentation cub-200

stackgan's Introduction

StackGAN

Text to Photo-Realistic Image Synthesis


Dependencies

tensorflow==2.1.0
numpy==1.16.4
absl_py==0.7.0
matplotlib==2.2.3
pandas==0.23.4
Pillow==6.1.0

Downloads

  • To install all the dependencies, run
pip install -r requirements.txt
  • To download the CUB-200 dataset, run data_download.py
python data_download.py
  • Download the Char-CNN-RNN embeddings (download link) and unzip the archive in place.
unzip birds.zip

Training

  • The model.py file contains the minimum code needed to run the Stage 1 and Stage 2 architectures. It saves the weights automatically once the specified (or default) number of epochs has completed. The weights are written to the same directory as model.py.
python model.py

Architecture

  • Stage 1
    • Text Encoder Network
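      • Maps a text description to a 1024-dimensional text embedding
      • Based on Learning Deep Representations of Fine-Grained Visual Descriptions [Arxiv Link]
    • Conditioning Augmentation Network
      • Adds randomness by sampling the conditioning vector from a Gaussian distribution derived from the text embedding
      • Effectively produces more image-text training pairs from the same captions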
    • Generator Network
    • Discriminator Network
    • Embedding Compressor Network
    • Outputs a 64x64 image
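
The conditioning augmentation step listed above can be sketched as follows. In the StackGAN paper, the conditioning vector is sampled as c = mu + sigma * eps with eps drawn from a standard normal, and a KL-divergence term keeps the distribution close to N(0, 1). This is a minimal sketch in plain Python, not the repository's TensorFlow code: the names `conditioning_augmentation`, `kl_to_standard_normal`, `mu`, and `log_sigma` are hypothetical, and the toy 4-dimensional values stand in for the outputs of a dense layer applied to the text embedding.

```python
import math
import random


def conditioning_augmentation(mu, log_sigma, rng=random):
    """Sample c = mu + sigma * eps element-wise, with eps ~ N(0, 1)."""
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mu, log_sigma)]


def kl_to_standard_normal(mu, log_sigma):
    """Mean KL divergence between N(mu, sigma^2) and N(0, 1) over dimensions."""
    terms = [
        -s + 0.5 * (math.exp(2.0 * s) + m * m - 1.0)
        for m, s in zip(mu, log_sigma)
    ]
    return sum(terms) / len(terms)


# Hypothetical mean and log standard deviation (assumed, not from the repository).
mu = [0.5, -0.2, 0.0, 1.0]
log_sigma = [0.0, -1.0, -0.5, 0.1]

random.seed(0)
c1 = conditioning_augmentation(mu, log_sigma)
c2 = conditioning_augmentation(mu, log_sigma)

# The same caption yields two different conditioning vectors.
print(c1)
print(c2)
print("KL regulariser:", kl_to_standard_normal(mu, log_sigma))
```

Because each call draws fresh noise, one caption can be paired with many slightly different conditioning vectors, which is what the "more image-text pairs" bullet refers to.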

  • Stage 2
    • Text Encoder Network
    • Conditioning Augmentation Network
    • Generator Network
    • Discriminator Network
    • Embedding Compressor Network
    • Outputs a 256x256 image
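
As a rough illustration of what the embedding compressor feeds into the discriminator: in the StackGAN paper, the text embedding is compressed by a fully connected layer, replicated across the spatial grid, and concatenated with the image feature map along the channel axis. The sketch below shows only the replicate-and-concatenate step on toy nested lists. The helper names and the tiny shapes are assumptions chosen for readability, not the repository's actual layers or dimensions.

```python
def replicate_spatially(embedding, height, width):
    """Tile a 1-D embedding into a height x width x len(embedding) nested list."""
    return [[list(embedding) for _ in range(width)] for _ in range(height)]


def concat_channels(feature_map, tiled):
    """Concatenate two H x W x C nested lists along the channel axis."""
    return [
        [f_px + t_px for f_px, t_px in zip(f_row, t_row)]
        for f_row, t_row in zip(feature_map, tiled)
    ]


# Hypothetical 3-dimensional compressed embedding and a toy 4x4x2 feature map.
compressed = [0.1, 0.2, 0.3]
features = [[[1.0, 2.0] for _ in range(4)] for _ in range(4)]

tiled = replicate_spatially(compressed, 4, 4)
joined = concat_channels(features, tiled)

# Height, width, and channel count of the joined tensor.
print(len(joined), len(joined[0]), len(joined[0][0]))
```

Every spatial position ends up carrying the same text information next to its own image features, so the discriminator can judge whether image and text match.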

Reference Papers

  1. StackGAN: Text to photo-realistic image synthesis [Arxiv Link]
  2. Improved Techniques for Training GANs [Arxiv Link]
  3. Generative Adversarial Text to Image Synthesis [Arxiv Link]
  4. Learning Deep Representations of Fine-Grained Visual Descriptions [Arxiv Link]

Note

This is the code I submitted to TensorFlow for Google Summer of Code. Hence the attributions and the license are for "TensorFlow Authors" and not "Vishal V". This code is under the MIT License.

stackgan's Issues

discriminator loss error

Data cardinality is ambiguous:
x sizes: 64, 1
y sizes: 64
Make sure all arrays contain the same number of samples.
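
Keras raises this error when the arrays passed as inputs and targets do not all have the same number of samples along the first axis. Here one input has 1 sample while the other input and the target have 64. The sketch below reproduces the check in plain Python so the mismatch can be spotted before calling the model. `check_cardinality` and the toy arrays are hypothetical, and which tensor is wrong in model.py is not known from the report.

```python
def check_cardinality(inputs, targets):
    """Raise ValueError if inputs and targets disagree on the number of samples."""
    x_sizes = [len(x) for x in inputs]
    y_sizes = [len(y) for y in targets]
    if len(set(x_sizes + y_sizes)) > 1:
        raise ValueError(f"x sizes: {x_sizes}, y sizes: {y_sizes}")
    return x_sizes[0]


batch_size = 64
images = [[0.0] * 8 for _ in range(batch_size)]  # toy stand-in for an image batch
embedding = [[0.0] * 4]                          # 1 sample instead of 64
labels = [1.0] * batch_size

try:
    check_cardinality([images, embedding], [labels])
except ValueError as err:
    print("mismatch:", err)

# One possible repair (an assumption): supply one embedding row per image.
embedding_batch = [list(embedding[0]) for _ in range(batch_size)]
print("samples:", check_cardinality([images, embedding_batch], [labels]))
```

Printing the shape of every array passed to the discriminator just before the failing call is usually enough to find the input whose batch dimension is 1.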

It can't train well

Thanks for your code. I have tried this program, but neither the discriminator loss nor the generator loss decreases. After training, the generated images are still noisy. How can I train it successfully?

Missing input files(?)

The bird data seems to be missing some files:
./birds/test/
./birds/train/
are both empty and model.py appears to be looking for pickle files. Is there a preprocessor I should have run?
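
A quick way to confirm this before training is to list what is actually present in the two folders. The sketch below uses only the standard library. The `birds/train` and `birds/test` paths come from the report above, while the `.pickle` extension and the helper name `list_pickles` are assumptions.

```python
from pathlib import Path


def list_pickles(data_dir):
    """Return the sorted .pickle file names found under each split of data_dir."""
    root = Path(data_dir)
    return {
        split: sorted(p.name for p in (root / split).glob("*.pickle"))
        for split in ("train", "test")
    }


found = list_pickles("birds")
for split, names in found.items():
    status = ", ".join(names) if names else "no pickle files found"
    print(f"birds/{split}: {status}")
```

If both splits report no files, the embeddings archive was probably not unzipped in place, or it was unzipped into a different directory.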

misc/preprocess_birds.py missing

Hello, when I run model.py it gives me this error:

FileNotFoundError: [Errno 2] No such file or directory: 'test/gen_1_0_0.png'

The readme in the birds folder says to run "python ./misc/preprocess_birds.py", but there is no such file, which I guess is needed to generate gen_1_0_0.png. Could you check again, please?
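
One possible cause, offered as an assumption because the traceback is not shown: if the error is raised while model.py is writing a generated sample, the `test` output directory may simply not exist yet. The sketch below creates the parent directory before writing. `ensure_parent_dir` is a hypothetical helper, and a temporary directory is used so the example leaves nothing behind.

```python
import os
import tempfile


def ensure_parent_dir(path):
    """Create the directory that will contain `path` if it does not exist yet."""
    parent = os.path.dirname(path)
    if parent:
        os.makedirs(parent, exist_ok=True)
    return parent


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "test", "gen_1_0_0.png")
    ensure_parent_dir(target)
    with open(target, "wb") as handle:
        handle.write(b"")  # placeholder for the real image bytes
    created = os.path.exists(target)

print("file written:", created)
```

If the error is instead raised while reading the file, this does not apply and the missing preprocessing step is the more likely explanation.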
