GT-FRAKTUR

gt-fraktur is the Ground Truth (GT) data for Fraktur/Gothic prints from the 19th Century, released by UB, Uni-Tübingen as Open Data under the CC0 public license.


§1. GT Data

This repository contains transcriptions of selected pages from 19th Century books as listed below. The original TIFF images used for OCR transcription of the following publications are published on Archive.org under the CC0 public license.

§1.1. Shelfmark

The shelfmarks / DigitalIDs of the 19th Century Fraktur prints selected for transcription:

#    FolderName      NumberOfPages  URL-Shelfmark-DigitalID                                 Comments
01.  agtck_1834_02   15 pgs         http://idb.ub.uni-tuebingen.de/opendigi/agtck_1834_02
02.  akzs_1860       24 pgs         http://idb.ub.uni-tuebingen.de/opendigi/akzs_1860
03.  artl_001        20 pgs         http://idb.ub.uni-tuebingen.de/opendigi/artl_001
04.  artl_002        18 pgs         http://idb.ub.uni-tuebingen.de/opendigi/artl_002        Error in 1 image.
05.  drey1834        5 pgs          http://idb.ub.uni-tuebingen.de/opendigi/drey1834
06.  harless1834     7 pgs          http://idb.ub.uni-tuebingen.de/opendigi/harless1834
07.  kath_1830_035   18 pgs         http://idb.ub.uni-tuebingen.de/opendigi/kath_1830_035
08.  litrdsch_1875   38 pgs         http://idb.ub.uni-tuebingen.de/opendigi/litrdsch_1875   Errors in 2 images.
09.  stml_1871_01    22 pgs         http://idb.ub.uni-tuebingen.de/opendigi/stml_1871_01
10.  thlblb_1866     25 pgs         http://idb.ub.uni-tuebingen.de/opendigi/thlblb_1866     Errors in 3 images.
11.  zpkt_1832_01    8 pgs          http://idb.ub.uni-tuebingen.de/opendigi/zpkt_1832_01
12.  zpk_1838_01     7 pgs          http://idb.ub.uni-tuebingen.de/opendigi/zpk_1838_01
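
For readers who want to work with the listing above programmatically, here is a minimal Python sketch that parses rows of the form `NN. FolderName N pgs URL [Comments]` and totals the page counts. The `Entry` type, the `parse_listing` helper, and the embedded two-row sample are assumptions made for illustration; the repository itself does not ship such a script.

```python
import re
from typing import List, NamedTuple


class Entry(NamedTuple):
    folder: str   # folder name, which doubles as the shelfmark / DigitalID
    pages: int    # number of transcribed pages
    url: str      # shelfmark URL
    comment: str  # optional free-text comment ("" if absent)


# One row: running number, folder, page count followed by "pgs", URL, optional comment.
# Whitespace between columns may be a single space or column padding.
ROW = re.compile(r"^\s*(\d+)\.\s+(\S+)\s+(\d+)\s+pgs\s+(\S+)\s*(.*)$")


def parse_listing(text: str) -> List[Entry]:
    """Parse the shelfmark listing, skipping the header and any non-matching lines."""
    entries = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue
        _, folder, pages, url, comment = match.groups()
        entries.append(Entry(folder, int(pages), url, comment.strip()))
    return entries


# Two rows copied from the listing, used here only as sample input.
SAMPLE = """\
# FolderName NumberOfPages URL-Shelfmark-DigitalID Comments
04. artl_002 18 pgs http://idb.ub.uni-tuebingen.de/opendigi/artl_002 Error in 1 image.
05. drey1834 5 pgs http://idb.ub.uni-tuebingen.de/opendigi/drey1834
"""

entries = parse_listing(SAMPLE)
total_pages = sum(entry.pages for entry in entries)
flagged = [entry.folder for entry in entries if entry.comment]
```

Matching on whitespace runs rather than fixed column positions keeps the sketch tolerant of both single-space and padded layouts.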

§1.2. Quality Issues

Details of the page quality issues observed during the transcription process:

#   Shelfmark-DigitalID  Quality Bugs
1.  artl_002             artl_002_00010.tif has bad alignment.
2.  litrdsch_1875        Misprint
3.  litrdsch_1875        Misprint: litrdsch_1875_0146.tif (page 28); lines 6-38 in the left column.
4.  thlblb_1866          Image "thlblb_1866_00037.tif" has a crossed "o" (e.g. ø, Unicode U+00F8) in the word "Redaction" in multiple places on the page; these were manually corrected to a regular "o" during transcription.
5.  thlblb_1866          thlblb_1866_00121.tif, right column: the long ſ seems to have been corrected manually.
6.  thlblb_1866          thlblb_1866_00425.tif, left column: the word "fünfte" is blurred, so that there appear to be two "f".
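
Since the crossed "o" (U+00F8) was normalized to a regular "o" by hand, a quick scan can help confirm that no such character remains in a transcription. The following Python sketch reports the position of each suspect code point in a string. The `find_suspects` helper and the sample text are hypothetical, and the sketch works on plain strings because the source does not state the on-disk format of the transcriptions.

```python
import unicodedata

# Code points that should no longer occur after manual correction.
# U+00F8 (ø) comes from the quality notes; extend the set as needed.
SUSPECT = {"\u00f8"}


def find_suspects(text, suspects=SUSPECT):
    """Return (line number, column, Unicode name) for each suspect character.

    Line numbers and columns are 1-based.
    """
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in suspects:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits


# Made-up sample: the second line still contains an uncorrected ø.
sample = "Die Redaction\nDie Redacti\u00f8n"
hits = find_suspects(sample)
```

An empty result means none of the listed code points occur in the text; it says nothing about other transcription errors.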

§2. LICENSE

  • This data is released by UB, Uni-Tuebingen as Open Data under the CC0 public license.

Contributors

stweil, svaksha, obrandt, lena-hinrichsen
