chamanti_ocr's Introduction

చామంతి

Mission

This project aims to build an ambitious OCR framework that should work on any language. It will not rely on glyph-level segmentation algorithms, which makes it well suited to highly agglutinative scripts such as Arabic and Devanagari. We are starting with Telugu, however. The core technology is Recurrent Neural Networks trained with CTC (Connectionist Temporal Classification), taken from the repo rnn_ctc.
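To illustrate why CTC removes the need for glyph-level segmentation: the network emits one score vector per image column (frame) over all labels plus a special blank label, and the label sequence is recovered by collapsing that per-frame output. The following is a minimal sketch of greedy (best-path) decoding, not the code from rnn_ctc; the function name, the scores and the blank index are assumptions made up for this example.

```python
from itertools import groupby


def ctc_greedy_decode(frame_scores, blank):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label for each time step (image column).
    best_path = [
        max(range(len(scores)), key=scores.__getitem__)
        for scores in frame_scores
    ]
    # Merge runs of identical labels, then remove the blank label.
    return [label for label, _ in groupby(best_path) if label != blank]


# Hypothetical scores for 6 frames over labels {0, 1, 2}; label 2 is the blank.
frame_scores = [
    [0.8, 0.1, 0.1],  # label 0
    [0.7, 0.2, 0.1],  # label 0 again (merged with the previous frame)
    [0.1, 0.1, 0.8],  # blank
    [0.6, 0.2, 0.2],  # label 0 (a new occurrence, because a blank intervened)
    [0.1, 0.8, 0.1],  # label 1
    [0.2, 0.2, 0.6],  # blank
]
decoded = ctc_greedy_decode(frame_scores, blank=2)
print(decoded)  # [0, 0, 1]
```

No frame is ever assigned to a particular glyph boundary; the blank label is what lets the same label appear twice in a row in the decoded output.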

Dependencies

  1. numpy
  2. scipy
  3. theano
  4. libffi
  5. cffi
  6. cairocffi
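Before running the install script, it can help to see which of the dependencies listed above are already present. The sketch below is not part of this repo; the helper names are hypothetical. It uses only the standard library: `importlib.util.find_spec` for the Python packages, and `ctypes.util.find_library` for libffi, which is a C library rather than a Python package. A missing libffi result is only a hint, since the linker lookup may not see libraries installed in non-standard locations.

```python
import ctypes.util
import importlib.util

# Python packages from the list above; libffi is checked separately below.
PYTHON_DEPS = ["numpy", "scipy", "theano", "cffi", "cairocffi"]


def missing_python_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def has_shared_library(name):
    """True if the system can locate lib<name>, e.g. 'ffi' for libffi."""
    return ctypes.util.find_library(name) is not None


print("Missing Python packages:", missing_python_packages(PYTHON_DEPS))
print("libffi found:", has_shared_library("ffi"))
```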

Setup

Clone this repo and run the installation script.

git clone https://github.com/rakeshvar/chamanti_ocr
cd chamanti_ocr
./scripts/install.sh

Fonts

You will need many fonts for each language you want to train on. You can get numerous Telugu fonts from here. Copy all the fonts to your ~/.fonts directory.
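If the downloaded fonts are spread over several sub-folders, the copy step can be scripted. This is a minimal sketch under assumptions: the function name `install_fonts` and the download folder in the usage line are hypothetical, and only `.ttf` and `.otf` files are treated as fonts.

```python
import shutil
from pathlib import Path

FONT_SUFFIXES = {".ttf", ".otf"}


def install_fonts(source_dir, fonts_dir=None):
    """Copy every .ttf/.otf file under source_dir into fonts_dir.

    fonts_dir defaults to ~/.fonts. Returns the copied file names.
    """
    source_dir = Path(source_dir).expanduser()
    fonts_dir = Path(fonts_dir).expanduser() if fonts_dir else Path.home() / ".fonts"
    fonts_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    # Walk the source tree recursively so fonts in sub-folders are included.
    for path in sorted(source_dir.rglob("*")):
        if path.is_file() and path.suffix.lower() in FONT_SUFFIXES:
            shutil.copy2(path, fonts_dir / path.name)
            copied.append(path.name)
    return copied
```

For example, `install_fonts("~/Downloads/telugu_fonts")` would copy the fonts from that (hypothetical) folder into ~/.fonts. Files with the same name in different sub-folders overwrite each other in this sketch.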

Running

Checking

Given the complicated dependencies, first check that they are all installed and working:

cd tests
python3 test_scribe_random.py
python3 test_scribe_all_fonts.py <(echo 'క్రైః') > kraih.txt
# The output should contain the text rendered in various fonts

Training an RNN

You can now train an RNN to read Telugu, although you cannot save the trained model yet!

python3 train.py

Troubleshooting

Dependencies

You should have libffi, cffi and cairocffi installed. These are constantly changing and are works in progress. Moreover, you might need root privileges to install libraries such as libffi.

If cffi complains that it needs libffi, try installing it as follows:

Ubuntu

sudo apt-get install libffi-dev

RHEL, CentOS

sudo yum install libffi-devel

If you are not root on a RHEL machine (which is often the case on a shared server), build libffi locally instead:

mkdir ~/software/
cd ~/software/
wget ftp://sourceware.org/pub/libffi/libffi-3.2.1.tar.gz
tar -xvf libffi-3.2.1.tar.gz
cd libffi-3.2.1/
./configure --prefix=/home/<NAME>/usr
make -j4
make check
make install

Open your .bashrc file and add these lines:

export PATH=$PATH:~/usr/bin
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:~/usr/lib:~/usr/lib64
export C_INCLUDE_PATH=$C_INCLUDE_PATH:~/usr/include:~/usr/lib/libffi-3.2.1/include
export CPLUS_INCLUDE_PATH=$CPLUS_INCLUDE_PATH:~/usr/include:~/usr/lib/libffi-3.2.1/include

Then try installing those packages again:

LDFLAGS=-L/home/<NAME>/usr/lib64 pip3 install cffi
pip3 install cairocffi

Other Problems

  • If PIL / Pillow is not able to open TIFF image files, follow this: http://stackoverflow.com/a/10109941. If you do not have root privileges and are installing libtiff etc. locally, make sure your LD_LIBRARY_PATH points to a directory such as ~/usr/lib that contains libtiff etc.
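To confirm that a locally installed library is actually reachable through LD_LIBRARY_PATH, you can list the matching shared objects in each directory on that path. This is a rough sketch, not part of this repo; the function name is hypothetical and it only looks for files named `lib<name>.so*`, which is the usual Linux convention.

```python
import os
from pathlib import Path


def find_library_files(lib_name, path_list=None):
    """List files named lib<lib_name>.so* in a colon-separated directory list.

    path_list defaults to the current value of LD_LIBRARY_PATH.
    """
    if path_list is None:
        path_list = os.environ.get("LD_LIBRARY_PATH", "")
    matches = []
    for entry in path_list.split(os.pathsep):
        if not entry:
            continue  # skip empty entries such as a leading ':'
        directory = Path(entry).expanduser()
        if directory.is_dir():
            matches.extend(sorted(directory.glob(f"lib{lib_name}.so*")))
    return matches


# An empty list means no libtiff was found on LD_LIBRARY_PATH.
print(find_library_files("tiff"))
```

The same call with `"ffi"` checks the locally built libffi from the section above.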

Contributors

chillaranand, rakeshvar
