Tiling and Corruption (TACo)

TACo is a simple and effective data augmentation technique for Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) (see the Reference section).

taco-box is an implementation of the TACo algorithm. It is currently released under the Apache 2.0 license, so please feel free to use it in your project. Enjoy!

Installing

First, make sure Python 3 is installed on your system.

Next, install taco-box with pip or your favorite PyPI package manager.

pip install taco-box

Usage

Check out this Jupyter notebook for usage: Notebook

Here is an example:

from tacobox import Taco

# Create a Taco object. (Note: all parameters are shown at their default values.)
mytaco = Taco(cp_vertical=0.25,
              cp_horizontal=0.25,
              max_tw_vertical=100,
              min_tw_vertical=20,
              max_tw_horizontal=50,
              min_tw_horizontal=10
              )

# input_img is the image to augment; it must be loaded beforehand
# (it is not defined in this snippet).

# Apply random vertical corruption.
augmented_img = mytaco.apply_vertical_taco(input_img, corruption_type='random')
mytaco.visualize(augmented_img)

    -------Understanding Arguments--------
    :cp_vertical:        corruption probability for vertical tiles
    :cp_horizontal:      corruption probability for horizontal tiles
    :max_tw_vertical:    maximum tile width for vertical tiles, in pixels
    :min_tw_vertical:    minimum tile width for vertical tiles, in pixels
    :max_tw_horizontal:  maximum tile width for horizontal tiles, in pixels
    :min_tw_horizontal:  minimum tile width for horizontal tiles, in pixels
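
To make the arguments above more concrete, here is a minimal NumPy sketch of the tiling-and-corruption idea for the vertical case. It is not taco-box's actual implementation. The function name `tile_and_corrupt_vertical`, the choice of one tile width per call, and the use of uniform noise for "random" corruption are all assumptions made for illustration, as is the 2-D `uint8` grayscale input.

```python
import numpy as np


def tile_and_corrupt_vertical(img, cp=0.25, min_tw=20, max_tw=100, rng=None):
    """Illustrative sketch only (hypothetical helper, not the taco-box API).

    Splits a 2-D uint8 grayscale image into vertical tiles (column strips)
    and replaces each tile with uniform random noise with probability `cp`.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = img.copy()  # leave the caller's image untouched
    height, width = out.shape

    # Assumption: a single tile width is drawn per call from [min_tw, max_tw].
    tile_width = int(rng.integers(min_tw, max_tw + 1))

    for start in range(0, width, tile_width):
        end = min(start + tile_width, width)  # the last tile may be narrower
        if rng.random() < cp:
            # Corrupt this tile by overwriting it with random pixel values.
            out[:, start:end] = rng.integers(
                0, 256, size=(height, end - start), dtype=np.uint8
            )
    return out


# Usage on a synthetic all-white "image" (a stand-in for a real text-line image).
blank = np.full((32, 200), 255, dtype=np.uint8)
corrupted = tile_and_corrupt_vertical(blank, cp=0.25, rng=np.random.default_rng(0))
```

In this sketch a higher `cp` corrupts more tiles, while `min_tw` and `max_tw` bound how wide each corrupted strip can be. The horizontal case would do the same over row strips instead of column strips.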

Expected results

The picture below shows variations of the TACo augmentation produced by the current implementation:

Example Output

Contributing

This project is in the very early stages of development. If you find a bug or have a feature request, feel free to open an issue. PRs are always welcome.

Reference

The TACo algorithm is part of a research project on Handwritten Text Recognition. A link to the original paper will be posted soon.

Contributors

kartikgill
