calamari_models_experimental's Introduction

calamari_models_experimental

Intermediate results from experiments on training mixed models for historical printed and handwritten material. The models collected here seem somewhat promising or useful.

Models

The deep3 prefix means that the model was trained with a considerably deeper network structure than the default one. See the upcoming selection of default networks for details.

Printings

Most of the models focus on printings in Latin script with Antiqua or Fraktur typefaces. The training data includes works from the 15th to the 19th century; see the corresponding paper for details. The base model is LSH-4 (Latin Script Historical), which also served as the starting point for refining the other Antiqua and Fraktur models.

Manuscripts

The "htr" models focus on recognising medieval German manuscripts written in Gothic and Bastarda scripts. More details about the training data will follow.

calamari_models_experimental's People

Contributors

chreul

calamari_models_experimental's Issues

Install Calamari

I can't get Calamari 2.2.2 working properly from my terminal. I tried the pip, source, and conda installs. I have Anaconda on a PC running Ubuntu 20.04, and I also want to use the GPU of my NVIDIA 3090 graphics card.

I followed the manual, but that does not work:

virtualenv -p python3 PATH_TO_VENV_DIR # (e.g. virtualenv -p python3 calamari_venv)
source PATH_TO_VENV_DIR/bin/activate
pip install calamari-ocr

To install the package without a virtual environment simply run
Installation from Source

To install the package from its source, download the source code and install it. Optionally (but recommended) install in a virtual env.

git clone https://github.com/calamari-OCR/calamari
cd calamari
python setup.py install

Conda users can alternatively call

conda env create -f environment_master.yml
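
When several install routes (pip, source, conda) have been tried on one machine, a common first check is which Python interpreter is active and whether the packages are visible to it. The following is a minimal sketch, not part of Calamari: check_environment is a hypothetical helper, and the import names calamari_ocr and tensorflow are assumptions about what the installs above provide.

```python
import importlib.util
import sys


def check_environment(modules=("calamari_ocr", "tensorflow")):
    """Report the active interpreter and whether each top-level module can be found.

    The default module names are assumptions; pass others as needed.
    """
    report = {"python": sys.executable}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        # in the environment of the running interpreter.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

If the interpreter path does not point into the virtualenv or conda environment that was just created, the environment is probably not activated in that terminal.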

Text regularisation: rules not applied

The codec contains too many characters:

"\u00a7",
"\u00ab",
"\u00b0",
"\u00b6",
"\u00b7",
"\u00b9",
"\u00bb",
"\u00bc",
"\u00bd",
"\u00be",
"\u00c0",
"\u00c4",
"\u00c6",
"\u00c7",
"\u00c8",
"\u00c9",
"\u00ca",
"\u00cb",
"\u00ce",
"\u00cf",
"\u00d2",
"\u00d3",
"\u00d4",
"\u00d6",
"\u00dc",
"\u00df",
"\u00e0",
"\u00e1",
"\u00e2",
"\u00e3",
"\u00e4",
"\u00e5",
"\u00e6",
"\u00e7",
"\u00e8",
"\u00e9",
"\u00ea",
"\u00eb",
"\u00ec",
"\u00ed",
"\u00ee",
"\u00ef",
"\u00f0",
"\u00f1",
"\u00f2",
"\u00f3",
"\u00f4",
"\u00f5",
"\u00f6",
"\u00f9",
"\u00fa",
"\u00fb",
"\u00fc",
"\u00ff",
"\u0101",
"\u0107",
"\u0111",
"\u0113",
"\u0115",
"\u0119",
"\u0129",
"\u012b",
"\u0133",
"\u013a",
"\u0142",
"\u014b",
"\u014d",
"\u0152",
"\u0153",
"\u0159",
"\u0169",
"\u016b",
"\u016f",
"\u017f",
"\u01b7",
"\u01bf",
"\u01f5",
"\u0207",
"\u0223",
"\u0233",
"\u0292",
"\u02d6",
"\u0301",
"\u0303",
"\u0304",
"\u0308",
"\u030a",
"\u0315",
"\u0342",
"\u0357",
"\u0364",
"\u0365",
"\u0366",
"\u0391",
"\u0392",
"\u0393",
"\u0394",
"\u0398",
"\u039b",
"\u03a0",
"\u03a3",
"\u03a4",
"\u03a6",
"\u03ac",
"\u03ad",
"\u03ae",
"\u03af",
"\u03b1",
"\u03b2",
"\u03b3",
"\u03b4",
"\u03b5",
"\u03b6",
"\u03b7",
"\u03b8",
"\u03b9",
"\u03ba",
"\u03bb",
"\u03bc",
"\u03bd",
"\u03be",
"\u03bf",
"\u03c0",
"\u03c1",
"\u03c2",
"\u03c3",
"\u03c4",
"\u03c5",
"\u03c6",
"\u03c7",
"\u03c8",
"\u03c9",
"\u03ca",
"\u03cb",
"\u03cc",
"\u03cd",
"\u03ce",
"\u03d1",
"\u03d6",
"\u03df",
"\u03e7",
"\u03f0",
"\u03f1",
"\u0451",
"\u0452",
"\u0454",
"\u05d0",
"\u05d1",
"\u05d4",
"\u05d5",
"\u05d6",
"\u05d7",
"\u05d9",
"\u05db",
"\u05de",
"\u05e0",
"\u05e1",
"\u05e4",
"\u05e7",
"\u05e8",
"\u05e9",
"\u05ea",
"\u1d10",
"\u1dce",
"\u1dd1",
"\u1dd3",
"\u1de3",
"\u1e7d",
"\u1e85",
"\u1e9c",
"\u1ebd",
"\u1ef9",
"\u1f00",
"\u1f04",
"\u1f08",
"\u1f10",
"\u1f11",
"\u1f30",
"\u1f38",
"\u1f40",
"\u1f41",
"\u1f50",
"\u1f70",
"\u1f74",
"\u1f76",
"\u1f78",
"\u1fbd",
"\u1fc6",
"\u1fd6",
"\u1fe4",
"\u1fe5",
"\u1fe6",
"\u1ff3",
"\u1ff6",
"\u201a",
"\u201b",
"\u201e",
"\u201f",
"\u2020",
"\u2022",
"\u2023",
"\u203a",
"\u204a",
"\u2081",
"\u2086",
"\u2114",
"\u2116",
"\u211e",
"\u2132",
"\u2133",
"\u214e",
"\u2154",
"\u2159",
"\u2183",
"\u2184",
"\u2218",
"\u2219",
"\u2292",
"\u2299",
"\u23d1",
"\u23d3",
"\u25a0",
"\u25bd",
"\u2609",
"\u263d",
"\u263f",
"\u2640",
"\u2642",
"\u2643",
"\u2644",
"\u2648",
"\u2649",
"\u264a",
"\u264b",
"\u264c",
"\u264d",
"\u264e",
"\u264f",
"\u2650",
"\u2651",
"\u2652",
"\u2653",
"\u2e17",
"\u2e4a",
"\ua751",
"\ua753",
"\ua757",
"\ua758",
"\ua759",
"\ua75d",
"\ua76b",
"\ua76d",
"\ua770",
"\ua776",
"\ue44d",
"\ue5b1",
"\ue5d2",
"\ue5dc",
"\ue665",
"\ue8bf",
"\ueada",
"\ueba1",
"\ueba3",
"\ueba5",
"\ueba6",
"\ueec5",
"\ueed9",
"\uf02f",
"\uf1bb",
"\uf1cc",
"\uf4f9",
"\ufb00",
"\ufb02",
"\ufb06",
"\ufe0e",
"\u271d",
"\u2720",
"\u29ec"

In particular, ligatures such as U+FB00 (ﬀ) should not be in there.
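
One way to spot such entries is to test which codec characters change under Unicode normalisation. The sketch below uses only the Python standard library; find_unnormalized is a hypothetical helper, and the choice of NFKC is an assumption for illustration, since the normalisation form and regularisation rules used in training are not stated here.

```python
import unicodedata


def find_unnormalized(chars, form="NFKC"):
    """Return (char, normalized) pairs for entries that change under normalisation."""
    flagged = []
    for ch in chars:
        normalized = unicodedata.normalize(form, ch)
        if normalized != ch:
            flagged.append((ch, normalized))
    return flagged


# A few entries taken from the codec list above.
sample = ["\u00e4", "\u017f", "\ufb00", "\ufb02", "\ufb06"]
for ch, normalized in find_unnormalized(sample):
    print(f"U+{ord(ch):04X} -> {normalized!r}")
```

Characters flagged this way are only candidates: whether a given one should be kept in the codec depends on the transcription guidelines of the ground truth.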
