Jarod Jacobs's Projects

ancient_hebrew_and_mal

The Menzerath-Altmann Law and Ancient Hebrew: Does the Bible Break the Law? In this paper, I explore the application of the Menzerath-Altmann law (MAL) to Ancient Hebrew. The MAL is named after Paul Menzerath and Gabriel Altmann; it was first proposed in 1928 and further developed in 1980. Essentially, the MAL is a linguistic law stating that an increase in the size of a linguistic construct corresponds to a decrease in the size of its constituents. For example, the longer a word (measured in syllables), the shorter its syllables (measured in sounds). For this paper, I examine the sentence, clause, phrase, and word levels to determine whether Ancient Hebrew conforms to the MAL. My corpus comprises the Masoretic Text of the Hebrew Bible as well as the War Scroll and the Community Rule. I use the Eep Talstra Centre for Bible and Computer (ETCBC) database of the Hebrew Bible and Dead Sea Scrolls as my source for data. While the application of the MAL has been studied in numerous languages, to my knowledge there is no published material focused on this law and Hebrew. The conclusions of this paper will thus be important both for general quantitative linguistics and for Ancient Hebrew studies.
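As a rough illustration of how such a relationship might be checked, the sketch below fits the simplified power-law form of the MAL, y = a·x^b, by least squares on log-transformed values. The function name `fit_power_law` and the sample numbers are invented for illustration; they are not taken from the paper or from the ETCBC data. Under this simplified form, a negative exponent b would be consistent with the law.

```python
import math

def fit_power_law(sizes, mean_constituent_lengths):
    """Fit y = a * x**b by ordinary least squares on (ln x, ln y)."""
    xs = [math.log(x) for x in sizes]
    ys = [math.log(y) for y in mean_constituent_lengths]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the regression line in log-log space is the exponent b.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    # The intercept in log-log space is ln(a).
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical values: word length in syllables vs. mean syllable length in sounds.
word_lengths = [1, 2, 3, 4, 5]
mean_syllable_lengths = [3.1, 2.6, 2.4, 2.3, 2.2]

a, b = fit_power_law(word_lengths, mean_syllable_lengths)
print(f"a = {a:.3f}, b = {b:.3f}")
```

The same function could be applied at each level (sentence, clause, phrase, word) by swapping in the corresponding size and mean constituent length measurements.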

datacamp_projects

This repository contains all of the DataCamp projects I have completed. DataCamp provides guided projects that help data scientists practice the skills they are learning in DataCamp courses.

ethiopic_psalster_analysis

Computational Codicology: An Exploration of Ethiopian Psalters. In his project “The Social Lives of Ethiopian Psalters,” Steve Delamarter has identified over 110 features of codicology, scribal practice, and book culture that scribes, craftsmen, and book users execute in different ways. He selected 29 of these for further study, created typologies of the variations, assigned a number to each type of variation, and built a database (containing more than 100 fields) to receive data from Psalters. He has analyzed over 700 Psalters, ranging from the earliest extant ones of the fourteenth century up through the twentieth century, and collected data on the variations in each. Each of the Psalters is dated; over 200 of them can be located in terms of the area of Ethiopia in which they were produced. The data produced by the project promise to open many windows on aspects of Ethiopian scribal and book culture, but this will only happen if the data are analyzed well. This presentation will focus on the ways in which we have applied specialized statistical analyses, such as cluster and random forest analysis, to the data. The raw data already suggest some fascinating stories, but a thorough statistical analysis is needed to draw firm conclusions. Codicologists have already noted a few features that have changed over time; this dataset promises to greatly increase those findings. But perhaps even more interesting is the potential for identifying features that differ depending on where the manuscript was produced. The recorded place of origin of the manuscripts is also taken into account in the analysis. It is to be expected that a comprehensive analysis of both chronological and geographical data will lead to a much better understanding of Ethiopic scribal culture.

ethiopic_psalters_ml

Computational Codicology: An Exploration of Ethiopian Psalters. In his project “The Social Lives of Ethiopian Psalters,” Steve Delamarter has identified over 110 features of codicology, scribal practice, and book culture that scribes, craftsmen, and book users execute in different ways. He selected 29 of these for further study, created typologies of the variations, assigned a number to each type of variation, and built a database (containing more than 100 fields) to receive data from Psalters. He has analyzed over 700 Psalters, ranging from the earliest extant ones of the fourteenth century up through the twentieth century, and collected data on the variations in each. Each of the Psalters is dated; over 200 of them can be located in terms of the area of Ethiopia in which they were produced. The data produced by the project promise to open many windows on aspects of Ethiopian scribal and book culture, but this will only happen if the data are analyzed well. This presentation will focus on the ways in which we have applied specialized statistical analyses, such as cluster and random forest analysis, to the data. The raw data already suggest some fascinating stories, but a thorough statistical analysis is needed to draw firm conclusions. Codicologists have already noted a few features that have changed over time; this dataset promises to greatly increase those findings. But perhaps even more interesting is the potential for identifying features that differ depending on where the manuscript was produced. The recorded place of origin of the manuscripts is also taken into account in the analysis. It is to be expected that a comprehensive analysis of both chronological and geographical data will lead to a much better understanding of Ethiopic scribal culture.

lbh17

Data for my presentation in the Linguistics and Biblical Hebrew section of the 2017 Annual Meeting of the Society of Biblical Literature.

mal_lbh_sbl_18

Code, Data, and Paper for Jarod Jacobs's SBL presentation in the Linguistics and Biblical Hebrew section of the Annual Meeting of the Society of Biblical Literature

poverty_in_america

Poverty in America - 2016: Thinkful Data Science Prep Course Capstone. In this report, I explore poverty rates in America. I first describe the dataset, then analyze the data, and conclude by proposing some further research.

pythonprogram_search_words_in_etcbc

This is a Python program that asks the user to input a Hebrew word, which is then searched for in the ETCBC database. The search results are written to a CSV file with the book, chapter, and verse of each hit.
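The workflow described above might look roughly like the following sketch. It does not use the real ETCBC database or the repository's code: the `records` list, its field names, and the functions `search_word` and `write_hits` are hypothetical stand-ins, and the sample entries are placeholders rather than real query results. The interactive prompt is left out so the example stays self-contained.

```python
import csv
import io

def search_word(records, word):
    """Return (book, chapter, verse) for every record whose word matches."""
    return [
        (r["book"], r["chapter"], r["verse"])
        for r in records
        if r["word"] == word
    ]

def write_hits(hits, out):
    """Write the hits to a CSV stream with a header row."""
    writer = csv.writer(out)
    writer.writerow(["book", "chapter", "verse"])
    writer.writerows(hits)

# Placeholder records standing in for the database (not real data).
records = [
    {"word": "מלך", "book": "BookA", "chapter": 1, "verse": 2},
    {"word": "דבר", "book": "BookA", "chapter": 1, "verse": 3},
    {"word": "מלך", "book": "BookB", "chapter": 4, "verse": 5},
]

hits = search_word(records, "מלך")

# An in-memory buffer is used here; a real run could instead pass
# open("hits.csv", "w", newline="", encoding="utf-8").
buffer = io.StringIO()
write_hits(hits, buffer)
print(buffer.getvalue())
```

Matching here is exact string equality; a real search would likely need to decide how to handle vowel pointing and inflected forms.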

rnns-lstm_pos_blog_post

Earlier this year, the CACCHT project (Creating Annotated Corpora of Classical Hebrew Texts), a joint project of the ETCBC and the Theological Seminary at Andrews University, got under way. The project participants are Jarod Jacobs, Martijn Naaijer, Robert Rezetko, Oliver Glanz, and Wido van Peursen, and the project focuses on statistically analyzing Ancient Hebrew texts. Of course we make use of the BHSA and the extrabiblical module, but for a comprehensive analysis we would like to use more texts, especially the Dead Sea Scrolls and Rabbinic texts. The first step has now been made, and you can find the results on the [ETCBC GitHub page](https://github.com/ETCBC/dss): a brand new Text-Fabric module containing the Dead Sea Scrolls with morphological encoding. The DSS texts and the morphological data connected with them were generously provided by Martin Abegg and consist of two foundational sets of data: transcriptions and morphological tagging. The transcriptions come from various sources, but primarily reflect what is found in the Discoveries in the Judaean Desert series. Abegg started morphologically tagging the Qumran texts in the mid-90s with the assistance of several people. Over the following decades, Abegg completed full morphological tagging of nearly every Hebrew and Aramaic scroll found in the Judaean Desert between 1947 and today. The data were converted to Text-Fabric by Dirk Roorda.

searchetcbc

This is a Python program that asks the user to input a Hebrew word, which is then searched for in the ETCBC database. The search results are written to a CSV file with book, chapter,…
