scrabblegan's Introduction

ScrabbleGAN - Handwritten Text Generation

A PyTorch implementation of the ScrabbleGAN: Semi-Supervised Varying Length Handwritten Text Generation paper. Parts of the code have been adapted from the official implementation of the paper. The purpose of this repository is to provide a clear and simple way to understand and replicate the results of the paper.

Image generated by our trained model

Requirements

  • PyTorch v1.6.0 - for all the deep learning components
  • PyTorch-FID - for FID score calculation
  • OpenCV 3 - for image processing (not required for generating new images)

A complete requirements.txt file will be added soon.

Steps for training the ScrabbleGAN model from scratch

  1. Download the IAM dataset or the RIMES database and place it in the /data/ directory as shown below:

    ├── data
    |   ├── IAM
    |       └──ascii
    |           └──words.txt
    |       └──words
    |           └──a01
    |           └──a02
    |           .
    |           .
    |       └──original_partition
    |           └──te.lst, tr.lst, va1.lst, va2.lst
    |   ├── RIMES
    |       └──ground_truth_training_icdar2011.txt
    |       └──training
    |           └──lot_1
    |           └──lot_2
    |           .
    |           .
    |       └──ground_truth_validation_icdar2011.txt
    |       └──validation
    |           └──lot_14
    |           └──lot_15
    |           └──lot_16
    |       .
    |       .
    |   └── prepare_data.py 
  2. Modify the /config.py file to change the dataset, model architecture, image height, etc. The default parameters are the ones used in the paper.

  3. From the data directory, run:

    python prepare_data.py

    This will process the ground-truth labels and images, and create a pickle file to be used for training.

  4. Start model training by running the command below from the main directory:

    python train.py

    This will start training the model. A sample generated image is saved in the output directory after every epoch. TensorBoard logging is also enabled.
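
Before running step 3, the directory layout from step 1 can be sanity-checked with a short script. This is a minimal sketch and not part of the repository: `EXPECTED` and `missing_paths` are hypothetical names, the listed paths simply mirror the tree shown in step 1, and the script assumes it is run from the main directory so that `data` resolves to the data directory.

```python
from pathlib import Path

# Paths copied from the directory tree in step 1, relative to the data directory.
# Only the entries named explicitly in the tree are checked (assumption: the
# per-writer folders such as a01 or lot_1 are not verified individually).
EXPECTED = {
    "IAM": [
        "IAM/ascii/words.txt",
        "IAM/words",
        "IAM/original_partition/te.lst",
        "IAM/original_partition/tr.lst",
        "IAM/original_partition/va1.lst",
        "IAM/original_partition/va2.lst",
    ],
    "RIMES": [
        "RIMES/ground_truth_training_icdar2011.txt",
        "RIMES/training",
        "RIMES/ground_truth_validation_icdar2011.txt",
        "RIMES/validation",
    ],
}


def missing_paths(data_dir, dataset):
    """Return the expected paths for `dataset` that do not exist under `data_dir`."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED[dataset] if not (root / rel).exists()]


if __name__ == "__main__":
    for name in EXPECTED:
        missing = missing_paths("data", name)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{name}: {status}")
```

Only one of the two datasets needs to be present, so a "missing" report for the dataset you are not using can be ignored.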

Steps for generating new images

The easiest way to generate images is to use this demo; it has options for generating random text, specific text, random styles, consistent style, etc. Another option is to download these files:

  1. Pretrained models for English (IAM) or French (RIMES).
  2. Character mapping for English (IAM) or French (RIMES).
  3. Lexicon files for English or French.

After downloading the required files, follow the steps below:

  1. Change the dataset and lexicon_file path in config.py.
  2. Run:
    python generate_images.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file'
    This will generate random images. You can also check the arguments in generate_images.py to see more options.

Steps to check FID score

Create the preprocessed data file as described in steps 1-3 of "Steps for training the ScrabbleGAN model from scratch". Also, either download the model checkpoints for English (IAM) or French (RIMES), or train your own model and save the checkpoints. To check the FID score, run:

    python calculate_metrics.py -c 'path_to_checkpoint_file'
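
For context, the Fréchet Inception Distance (FID) compares the feature statistics of real and generated images. The standard definition is reproduced below as a reminder; the symbols are introduced here for illustration and are not taken from the repository's code.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Here $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the mean and covariance of Inception features computed on real and generated images, respectively. Lower values indicate that the two feature distributions are closer.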

Steps for training HTR models

One of the motivations of the paper was to boost HTR performance using synthetic data generated by ScrabbleGAN. The code for HTR training is not provided in this repository, for consistency with the authors' approach of using this code for HTR training. You can follow the steps below for HTR training:

  1. Create your own models or download all the files listed in "Steps for generating new images". Also, create the preprocessed data file as described in steps 1-3 of "Steps for training the ScrabbleGAN model from scratch".
  2. If required, change dataset, partition, data_file, and lexicon_file in config.py.
  3. To create LMDB data files required for HTR training, run:
    python create_lmdb_dataset.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file'
    to create an LMDB dataset without any synthetic images, or
    python create_lmdb_dataset.py -c 'path_to_checkpoint_file' -m 'path_to_character_mapping_file' -n 100000
    to add generated images to the original dataset.
  4. Train the HTR model as described here
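
When comparing HTR performance with and without synthetic data, both LMDB variants from step 3 are needed. The sketch below builds the two command lines programmatically; `lmdb_command` is a hypothetical helper, and it uses only the `-c`, `-m`, and `-n` flags shown in the commands above.

```python
import shlex
import sys


def lmdb_command(checkpoint, char_map, num_synthetic=None):
    """Build the argument list for create_lmdb_dataset.py.

    Only the -c, -m and -n flags from the steps above are used.
    """
    cmd = [sys.executable, "create_lmdb_dataset.py", "-c", checkpoint, "-m", char_map]
    if num_synthetic is not None:
        # -n adds this many generated images to the original dataset
        cmd += ["-n", str(num_synthetic)]
    return cmd


if __name__ == "__main__":
    # Placeholder paths, exactly as in the commands above.
    real_only = lmdb_command("path_to_checkpoint_file", "path_to_character_mapping_file")
    augmented = lmdb_command("path_to_checkpoint_file", "path_to_character_mapping_file", 100000)
    for cmd in (real_only, augmented):
        # Print the command; to execute it, pass cmd to subprocess.run(cmd, check=True)
        print(shlex.join(cmd))
```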

scrabblegan's People

Contributors

arshjot

scrabblegan's Issues

Need help with calculating FID scores

Hey,

Your project is a marvellous representation of the working of the paper. I really appreciate your hard work.

I know it's been a while, but I would really appreciate it if you could guide me through this.
I just need a little help with the FID score calculation. I cannot replicate your process, and I wanted to ask whether you did anything differently there.
Running the command python calculate_metrics.py -c 'path_to_checkpoint_file' gives me a ValueError, as follows:

[Screenshot of the error, 2022-06-11]

Can you give me some advice on how to fix this issue? It would mean the world to me.

Thanks and have a great day ahead.

Generate a sentence

Hello,
When you generate images, each word is generated in its own style, but I want to generate all the words in the same style, for example for a sentence such as 'Have a nice day'.
How do you solve this problem?
