vishal-v / stackgan

TensorFlow implementation of "Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks" by Han Zhang, et al.

Home Page: https://arxiv.org/abs/1612.03242

License: MIT License

Python 74.53% Jupyter Notebook 25.47%

gans generative-adversarial-network stack-gan tensorflow-2 keras conditioning-augmentation cub-200

StackGAN

Text to Photo-Realistic Image Synthesis

Dependencies

tensorflow==2.1.0
numpy==1.16.4
absl_py==0.7.0
matplotlib==2.2.3
pandas==0.23.4
Pillow==6.1.0

Downloads

To install all the dependencies, run

pip install -r requirements.txt

To download the CUB 200 dataset, run the data_download.py script

python data_download.py

Download the char-CNN-RNN embeddings from the download link and unzip the archive in place.

unzip birds.zip
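
Before training, it can help to confirm that the archive actually unpacked into the expected folders. The snippet below is a minimal sketch, not part of this repository: it assumes the embeddings end up under `birds/train` and `birds/test` and that they use a `.pickle` extension, and the helper name `count_pickles` is hypothetical.

```python
from pathlib import Path


def count_pickles(root="birds", splits=("train", "test")):
    """Return {split: number of *.pickle files found under root/split}.

    The folder layout and the ".pickle" extension are assumptions.
    """
    counts = {}
    for split in splits:
        split_dir = Path(root) / split
        if split_dir.is_dir():
            # Search recursively, since files may sit in sub-folders.
            counts[split] = sum(1 for _ in split_dir.rglob("*.pickle"))
        else:
            counts[split] = 0
    return counts


if __name__ == "__main__":
    for split, n in count_pickles().items():
        status = "ok" if n else "empty, embeddings may not be unpacked here"
        print(f"birds/{split}: {n} pickle file(s) [{status}]")
```

If either split reports zero files, model.py will likely fail when it tries to load the data.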

Training

The model.py file contains the bare minimum code to run the Stage 1 and Stage 2 architectures. It automatically saves the weights once the specified (or default) number of epochs has completed. Note that the weights are stored at the same directory level as model.py.

python model.py

Architecture

Stage 1
- Text Encoder Network
  - Maps a text description to a 1024-dimensional text embedding
  - Based on Learning Deep Representations of Fine-Grained Visual Descriptions Arxiv Link
- Conditioning Augmentation Network
  - Adds randomness to the network
  - Produces more image-text pairs
- Generator Network
- Discriminator Network
- Embedding Compressor Network
- Outputs a 64x64 image
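
The Conditioning Augmentation step listed above can be sketched in a few lines of NumPy. This is an illustrative sketch of the idea from the StackGAN paper, not the code in model.py: the dense-layer weights `w` and `b`, the function name, and the 128-dimensional conditioning size are assumptions made for the example. The embedding is projected to a mean and a log standard deviation, a conditioning vector is sampled as `mu + sigma * eps`, and a KL term pulls the distribution toward a standard normal.

```python
import numpy as np


def conditioning_augmentation(embedding, w, b, rng):
    """Sample conditioning vectors from text embeddings.

    embedding: (batch, 1024) text embeddings
    w, b: weights of a hypothetical dense layer producing [mu, log_sigma]
    rng: a np.random.RandomState used to draw the noise
    """
    stats = embedding @ w + b                      # (batch, 2 * cond_dim)
    mu, log_sigma = np.split(stats, 2, axis=1)
    eps = rng.standard_normal(mu.shape)            # eps ~ N(0, I)
    c = mu + np.exp(log_sigma) * eps               # reparameterised sample
    # KL(N(mu, sigma^2) || N(0, I)) per sample, then averaged over the batch
    kl = 0.5 * np.sum(mu ** 2 + np.exp(2 * log_sigma) - 2 * log_sigma - 1, axis=1)
    return c, kl.mean()


rng = np.random.RandomState(0)
emb = rng.standard_normal((4, 1024))               # stand-in for real embeddings
w = rng.standard_normal((1024, 256)) * 0.01        # 256 = 2 * assumed cond_dim of 128
b = np.zeros(256)

c1, kl = conditioning_augmentation(emb, w, b, rng)
c2, _ = conditioning_augmentation(emb, w, b, rng)
print(c1.shape, float(kl), np.allclose(c1, c2))
```

Calling the function twice on the same embeddings gives different conditioning vectors, which is the sense in which the network "produces more image-text pairs".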

Stage 2
- Text Encoder Network
- Conditioning Augmentation Network
- Generator Network
- Discriminator Network
- Embedding Compressor Network
- Outputs a 256x256 image

Reference Papers

StackGAN: Text to photo-realistic image synthesis [Arxiv Link]
Improved Techniques for Training GANs [Arxiv Link]
Generative Adversarial Text to Image Synthesis [Arxiv Link]
Learning Deep Representations of Fine-Grained Visual Descriptions [Arxiv Link]

Note

This is the code I submitted to TensorFlow for Google Summer of Code. Hence the attribution and the license name "TensorFlow Authors" rather than "Vishal V". This code is under the MIT License.

stackgan's Issues

discriminator loss error

Data cardinality is ambiguous:
x sizes: 64, 1
y sizes: 64
Make sure all arrays contain the same number of samples.
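
Keras reports this message when the arrays passed to a training call do not share the same number of samples; here one input has 64 samples and another has 1. A minimal sketch of a check that can be run before the training call, assuming array-like objects that support `len()` (the helper name `check_cardinality` is hypothetical and not part of this repository):

```python
def check_cardinality(inputs, targets):
    """Raise ValueError unless every array has the same number of samples."""
    x_sizes = [len(a) for a in inputs]
    y_sizes = [len(a) for a in targets]
    if len(set(x_sizes + y_sizes)) > 1:
        raise ValueError(
            f"mismatched sample counts: x sizes {x_sizes}, y sizes {y_sizes}"
        )
    return x_sizes[0] if x_sizes else 0


images = [[0.0]] * 64          # stand-in for a batch of 64 images
labels = [1] * 64              # stand-in for 64 real/fake labels
print(check_cardinality([images], [labels]))  # prints 64

try:
    # A second input with a single sample reproduces the mismatch.
    check_cardinality([images, [[0.0]]], [labels])
except ValueError as err:
    print(err)
```

Printing the sizes this way shows which input is missing its batch dimension.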

It can't train well

Thanks for your code. I have tried this program, but neither the discriminator loss nor the generator loss decreases. After training, the resulting images are still messy. How can I train it successfully?

Missing input files(?)

The bird data seems to be missing some files:
./birds/test/
./birds/train/
are both empty and model.py appears to be looking for pickle files. Is there a preprocessor I should have run?

Generate image from my captions

After training the network, how can I generate images from my own captions?

misc/preprocess_birds.py missing

Hello, when I run model.py it gives me this error:

FileNotFoundError: [Errno 2] No such file or directory: 'test/gen_1_0_0.png'

The 'readme' in 'birds' folder says run "python ./misc/preprocess_birds.py" but there is no such file, which I guess is needed to generate gen_1_0_0.png. Can you check again please?

Thanks for your code. Have you ever run your code for 600 epochs and successfully generated both the small and the large images? Thanks.
