

CDLI Daily Bulk Data Dump

The repository contains a daily dump of all public catalogue and text data from the Cuneiform Digital Library Initiative.

Format

Text Data

The CDLI transliterations dump is offered as plain-text, UTF-8 encoded ATF. For more information about ATF, visit http://oracc.museum.upenn.edu/doc/help/editinginatf/cdliatf/index.html (scroll down for an example).
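
As a rough illustration of working with the transliterations, the sketch below splits ATF text into one record per artifact. It assumes that every text begins with a header line starting with `&` (for example `&P000001 = ...`), as described in the ATF documentation linked above. The function name `split_atf` and the sample string are hypothetical and are not part of the CDLI data.

```python
def split_atf(atf_text):
    """Split ATF text into a dict mapping each text's ID to its lines.

    Assumption: every text starts with a header line of the form
    '&P123456 = designation'.
    """
    texts = {}
    current = None
    for line in atf_text.splitlines():
        if line.startswith("&"):
            # Header line: the ID sits between '&' and the first '='.
            current = line[1:].split("=", 1)[0].strip()
            texts[current] = []
        elif current is not None and line.strip():
            # Any non-empty line belongs to the most recent header.
            texts[current].append(line)
    return texts


# Made-up sample for illustration only (not real CDLI content).
sample = """&P000001 = Example A
#atf: lang sux
@tablet
@obverse
1. 1(N01) , GU4
&P000002 = Example B
#atf: lang sux
@tablet
@obverse
1. 2(N01) , UDU
"""

texts = split_atf(sample)
print(sorted(texts))           # ['P000001', 'P000002']
print(len(texts["P000001"]))   # 4
```

To run this on the real dump, read the file with `encoding="utf-8"` and pass its contents to `split_atf`.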

Catalogue Data

The catalogue is offered in UTF-8 comma-separated (CSV) format. Most fields are explained in detail here: https://cdli.ucla.edu/?q=cdli-search-information
Our data schema is currently being remodeled; get in touch if you would like a sneak peek!

To view a sample of the catalogue on a Unix machine, run the head command from the directory where the file is stored:

head cdli_catalogue_1of2.csv

With Windows PowerShell, try:

Get-Content *filename* -Head *n*
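
A platform-neutral alternative is to preview the file with Python's standard `csv` module, which also keeps a quoted field together when it contains a comma or a line break. This is a minimal sketch: the helper name `preview_rows` and the demo column names are invented for illustration and do not reflect the real catalogue schema.

```python
import csv
import io
from itertools import islice


def preview_rows(handle, n=5):
    """Return the first n rows of a CSV stream as lists of strings."""
    return list(islice(csv.reader(handle), n))


# Stand-in data so the sketch runs without the real file; the column
# names below are invented for illustration.
demo = io.StringIO('id,designation\n1,"Tablet A, fragment"\n2,Tablet B\n')
rows = preview_rows(demo, n=2)
print(rows)
```

With the real file, open it as `open("cdli_catalogue_1of2.csv", newline="", encoding="utf-8")` and pass the handle to `preview_rows`.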

File reconstitution

The catalogue file is split in two because of file size limitations on GitHub. To merge the catalogue files into one, use:

cat cdli_catalogue_1of2.csv cdli_catalogue_2of2.csv > cdli_catalogue.csv

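on the Unix command line. In the Windows Command Prompt, try the following (the /b switch forces a binary copy, so no end-of-file character is appended to the merged file):

copy /b cdli_catalogue_1of2.csv+cdli_catalogue_2of2.csv cdli_catalogue.csv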
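
The same merge can be sketched in Python for use on any platform. The helper `merge_parts` is hypothetical. It copies bytes unchanged and in order, as `cat` does, so it makes no assumption about whether the second part repeats a header row. The demonstration runs on small temporary stand-in files, not on the real catalogue.

```python
import shutil
import tempfile
from pathlib import Path


def merge_parts(part_paths, out_path):
    """Concatenate files byte for byte, in the order given."""
    with open(out_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


# Demonstration on small temporary stand-in files with invented content.
workdir = Path(tempfile.mkdtemp())
part1 = workdir / "cdli_catalogue_1of2.csv"
part2 = workdir / "cdli_catalogue_2of2.csv"
part1.write_bytes(b"id,designation\n1,Tablet A\n")
part2.write_bytes(b"2,Tablet B\n")

merged = workdir / "cdli_catalogue.csv"
merge_parts([part1, part2], merged)
print(merged.read_bytes())
```

For the real data, call `merge_parts` with the two downloaded part files from the directory where they are stored.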

Before October 18, 2017, the catalogue and transliterations were provided in .zip format.

EPP [email protected]
