
binarization_2017's Introduction

Binarization

This repo contains the code and models described in Document Image Binarization with Fully Convolutional Neural Networks. There are two sets of 5 models: one trained on DIBCO images and the other trained on Palm Leaf Manuscripts (PLM). Additional info on these models can be found here. You may also be interested in my submission to the DIBCO 2017 competition, located here.

This code depends on a number of Python libraries: numpy, scipy, cv2 (the Python wrapper for OpenCV), and caffe (my custom fork).
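A quick way to see which of these libraries are visible to your interpreter is sketched below. It only checks that a module with each name can be located; it cannot tell whether an installed caffe is the custom fork or the stock distribution. The helper name find_missing is hypothetical and not part of this repo.

```python
import importlib.util

# Import names taken from the dependency list above.
REQUIRED_MODULES = ["numpy", "scipy", "cv2", "caffe"]


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located by this interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules were found.")
```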

Docker

For those who don't want to install the dependencies, I have created a docker image to run this code. You must have the nvidia-docker plugin installed to use it, though you can still run our models on CPU (not recommended).

The usage for the docker container is

nvidia-docker run -v $HOST_WORK_DIRECTORY:/data tensmeyerc/icdar2017:binarization_gpu python binarize_dibco.py /data/input_file.jpg /data/output_file.png $DEVICE_ID
nvidia-docker run -v $HOST_WORK_DIRECTORY:/data tensmeyerc/icdar2017:binarization_gpu python binarize_plm.py /data/input_file.jpg /data/output_file.png $DEVICE_ID

$HOST_WORK_DIRECTORY is a directory on your machine that is mounted on /data inside of the docker container (using -v). It's the only way to expose images to the docker container. $DEVICE_ID is the ID of the GPU you want to use (typically 0). If omitted, then the models are run in CPU mode. There is no need to download the containers ahead of time. If you have docker and nvidia-docker installed, running the above commands will pull the docker image (~2GB) if it has not been previously pulled.
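If you want to call the container from a script, the commands above can be assembled programmatically. The following is a minimal sketch, not part of this repo: the function names build_command and binarize are hypothetical, and the image name, script names, and argument order are copied from the usage lines above. Leaving device_id as None omits the last argument, which selects CPU mode as described.

```python
import subprocess

# Image name copied from the usage lines above.
IMAGE = "tensmeyerc/icdar2017:binarization_gpu"


def build_command(host_dir, input_name, output_name,
                  script="binarize_dibco.py", device_id=None):
    """Build the nvidia-docker argument list for one image.

    host_dir is mounted on /data, so input_name and output_name are
    file names relative to host_dir.
    """
    cmd = [
        "nvidia-docker", "run",
        "-v", f"{host_dir}:/data",
        IMAGE,
        "python", script,
        f"/data/{input_name}",
        f"/data/{output_name}",
    ]
    if device_id is not None:
        # Omitting the device ID runs the models in CPU mode.
        cmd.append(str(device_id))
    return cmd


def binarize(host_dir, input_name, output_name,
             script="binarize_dibco.py", device_id=None):
    """Run the container and raise CalledProcessError on a non-zero exit."""
    cmd = build_command(host_dir, input_name, output_name, script, device_id)
    return subprocess.run(cmd, check=True)
```

For example, binarize("/home/me/work", "input_file.jpg", "output_file.png", script="binarize_plm.py", device_id=0) would run the PLM models on GPU 0, assuming docker and nvidia-docker are installed.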

For some reason, the docker image gives an error (related to CuDNN) for binarize_plm.py with all 5 models, so the image was created using only 4 of the models (differences should be negligible).

Citation

If you find this code useful to your research, please cite our paper:

@inproceedings{tensmeyer2017_binarization,
  title={Document Image Binarization with Fully Convolutional Neural Networks},
  author={Tensmeyer, Chris and Martinez, Tony},
  booktitle={ICDAR},
  year={2017},
  organization={IEEE}
}

binarization_2017's Issues

Retrain models

Would it be possible to share the code to retrain the models?

caffe make all error

Hi Chris, I am trying to install caffe on Ubuntu 18.04 with Anaconda 3 and Python 2.7 (CPU only). I am using your caffe-master, but the following error really confuses me:
.build_release/tools/extract_features.o: In function ‘std::string* google::MakeCheckOpString<int, int>(int const&, int const&, char const*)’:
extract_features.cpp:(.text._ZN6google17MakeCheckOpStringIiiEEPSsRKT_RKT0_PKc[_ZN6google17MakeCheckOpStringIiiEEPSsRKT_RKT0_PKc]+0x43): undefined reference to ‘google::base::CheckOpMessageBuilder::NewString()’
.build_release/tools/extract_features.o: In function ‘std::string* google::MakeCheckOpString<unsigned int, int>(unsigned int const&, int const&, char const*)’:
extract_features.cpp:(.text._ZN6google17MakeCheckOpStringIjiEEPSsRKT_RKT0_PKc[_ZN6google17MakeCheckOpStringIjiEEPSsRKT_RKT0_PKc]+0x43): undefined reference to ‘google::base::CheckOpMessageBuilder::NewString()’
.build_release/tools/extract_features.o: In function ‘std::string* google::MakeCheckOpString<unsigned long, unsigned long>(unsigned long const&, unsigned long const&, char const*)’:
extract_features.cpp:(.text._ZN6google17MakeCheckOpStringImmEEPSsRKT_RKT0_PKc[_ZN6google17MakeCheckOpStringImmEEPSsRKT_RKT0_PKc]+0x44): undefined reference to ‘google::base::CheckOpMessageBuilder::NewString()’
.build_release/lib/libcaffe.so: undefined reference to ‘leveldb::DB::Open(leveldb::Options const&, std::string const&, leveldb::DB**)’
.build_release/lib/libcaffe.so: undefined reference to ‘leveldb::Status::ToString() const’
collect2: error: ld returned 1 exit status
Makefile:561: recipe for target '.build_release/tools/extract_features.bin' failed
make: *** [.build_release/tools/extract_features.bin] Error 1

The build always exits with this error. My gcc and g++ are 4.8.5 and protobuf is 2.5. I also have g++ 7.3, but I downgraded in order to build. Could you help me? I would appreciate it very much.

test failed

While testing the two models, I ran into a problem that causes a failure. The error is:
F0831 14:32:22.325904 15088 layer_factory.hpp:81] Check failed : registry.count(type) == 1(0 vs. 1) Unknown layer type: BilinearInterpolation(known type: AbsVal, Accuracy, AryMax, BNLL, ....., Tile, WindowData)

*** Check failure stack trace:***
Aborted

problem about nvidia-docker

xxx@xxx:~$ sudo nvidia-docker run -v /home/xxx:/data tensmeyerc/icdar2017:binarization_gpu python binarize_plm.py /data/input_file.jpg /data/output_file.png 0
libdc1394 error: Failed to initialize libdc1394
Traceback (most recent call last):
  File "binarize_plm.py", line 257, in <module>
    main(in_image, out_image)
  File "binarize_plm.py", line 191, in main
    rd_im = relative_darkness(image)
  File "binarize_plm.py", line 35, in relative_darkness
    if im.ndim == 3:
AttributeError: 'NoneType' object has no attribute 'ndim'
Loading Image
Computing RD features

Do you know how to fix it?
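One possible cause, not confirmed anywhere in this thread, is that the image could not be read at /data/input_file.jpg inside the container. Assuming the script loads images with cv2.imread, a missing or unreadable file yields None rather than an exception, which would match the traceback. The sketch below checks on the host side that the file the container will see actually exists in the mounted directory; the helper names host_path_for and input_is_readable are hypothetical.

```python
import os


def host_path_for(container_path, host_dir, mount_point="/data"):
    """Translate a path inside the container to the matching host path."""
    rel = os.path.relpath(container_path, mount_point)
    if rel == os.pardir or rel.startswith(os.pardir + os.sep):
        raise ValueError(f"{container_path} is not under {mount_point}")
    return os.path.join(host_dir, rel)


def input_is_readable(container_path, host_dir, mount_point="/data"):
    """Return True if the container path maps to a readable file on the host."""
    path = host_path_for(container_path, host_dir, mount_point)
    return os.path.isfile(path) and os.access(path, os.R_OK)
```

For the command above, input_is_readable("/data/input_file.jpg", "/home/xxx") should return True before the container is started; if it returns False, the file is not where the script expects it.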

binarization_dibco container fails

Hello!
I'm trying to run your binarize_dibco.py tool in an nvidia-docker container and am getting the following error:

F1021 20:23:03.879026 1 math_functions.cu:81] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
*** Check failure stack trace: ***
Loading Image
Computing RD features
Concating inputs
Preprocessing
(489, 2625, 4)
Tiling input
Starting predictions for network 1/5
Progress 2%
Progress 5%
Progress 7%
Progress 10%
Progress 12%
Progress 15%
Progress 17%
Progress 20%
Progress 23%
Progress 25%
Progress 28%
Progress 30%
Progress 33%
Progress 35%
Progress 38%
Progress 41%
Progress 43%
Progress 46%
Progress 48%
Progress 51%
Progress 53%
Progress 56%
Progress 58%
Progress 61%
Progress 64%
Progress 66%
Progress 69%
Progress 71%
Progress 74%
Progress 76%
Progress 79%
Progress 82%
Progress 84%
Progress 87%
Progress 89%
Progress 92%
Progress 94%
Progress 97%
Progress 100%
Reconstructing whole image from binarized tiles
Starting predictions for network 2/5.

binarize_plm.py doesn't fail and works great :)
