jarodjacobs,Jarod Jacobs,github

ancient_hebrew_and_mal

The Menzerath-Altmann Law and Ancient Hebrew: Does the Bible Break the Law? In this paper, I explore the application of the Menzerath-Altmann law (MAL) to Ancient Hebrew. The MAL is named after Paul Menzerath and Gabriel Altmann; it was first proposed in 1928 and further developed in 1980. Essentially, the MAL is a linguistic law stating that an increase in the size of a linguistic construct corresponds to a decrease in the size of its constituents. For example, the longer a word (measured in syllables), the shorter its syllables (measured in sounds). For this paper, I examine the sentence, clause, phrase, and word levels to determine whether Ancient Hebrew conforms to the MAL. My corpus consists of the Masoretic Text of the Hebrew Bible as well as the War Scroll and the Community Rule. I use the Eep Talstra Centre for Bible and Computer (ETCBC) database of the Hebrew Bible and Dead Sea Scrolls as my data source. While the application of the MAL has been studied in numerous languages, to my knowledge there is no published material focused on this law and Hebrew. The conclusions of this paper will thus be important both for general quantitative linguistics and for Ancient Hebrew studies.
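To make the law concrete: the MAL is commonly written as y = a * x^b * e^(-c*x), where x is the size of the construct and y the mean size of its constituents, and a simplified power-law form y = a * x^b (with negative b) is often used. The following is a minimal sketch of fitting that simplified form by least squares on log-transformed values. The function name and the sample numbers are hypothetical and are not taken from the paper, whose actual fitting procedure is not described here.

```python
import math


def fit_mal_power_law(construct_sizes, constituent_sizes):
    """Fit y = a * x**b by ordinary least squares on (log x, log y).

    Illustrative helper (not from the paper). All values must be positive.
    """
    if len(construct_sizes) != len(constituent_sizes):
        raise ValueError("both sequences must have the same length")
    if len(set(construct_sizes)) < 2:
        raise ValueError("need at least two distinct construct sizes")

    log_x = [math.log(x) for x in construct_sizes]
    log_y = [math.log(y) for y in constituent_sizes]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n

    # Slope and intercept of the regression line in log-log space.
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in zip(log_x, log_y))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


# Synthetic data generated from y = 3.0 * x**-0.2, NOT measurements from any corpus.
word_lengths_in_syllables = [1, 2, 3, 4, 5]
mean_syllable_lengths = [3.0 * x ** -0.2 for x in word_lengths_in_syllables]

a, b = fit_mal_power_law(word_lengths_in_syllables, mean_syllable_lengths)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A negative fitted b is what the law predicts: constituents shrink as the construct grows. A value of b near zero or above would suggest the data at that level do not follow the law.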

coursera_projects

datacamp_projects

This repository contains all of the DataCamp projects I have completed. DataCamp provides guided projects that help data scientists practice the skills they are learning in DataCamp courses.

dataprep

deploying-machine-learning-models

Example Repo for the Udemy Course "Deployment of Machine Learning Models"

drill_descriptive_statistics_and_normality

DRILL - Descriptive Statistics and Normality

drill_exploring_the_central_limit_theorem

DRILL - Exploring the Central Limit Theorem

ethiopic_psalster_analysis

Computational Codicology: An Exploration of Ethiopian Psalters. In his project “The Social Lives of Ethiopian Psalters,” Steve Delamarter has identified over 110 features of codicology, scribal practice, and book culture that scribes, craftsmen, and book users execute in varying ways. He selected 29 of these for further study, created typologies of those variations, assigned a number to each type of variation, and built a database (containing more than 100 fields) to receive data from Psalters. He has analyzed over 700 Psalters, ranging from the earliest extant ones of the fourteenth century up through the twentieth century, and collected data on the variations in each. Each of the Psalters is dated; over 200 of the Psalters can be located in terms of the area of Ethiopia in which they were produced. The data produced by the project promises to open many windows on aspects of Ethiopian scribal and book culture, but this will only happen if the data is analyzed well. This presentation will focus on the ways in which we have applied specialized statistical analyses, such as cluster and random forest analysis, to the data. The raw data already suggest some fascinating stories, but a thorough statistical analysis is needed to draw firm conclusions. Codicologists have already noted a few features that have changed over time; this dataset promises to greatly increase those findings. Perhaps even more interesting is the potential for identifying features that differ depending on where the manuscript was produced. The recorded place of origin of the manuscripts is also taken into account in the analysis. A comprehensive analysis of both chronological and geographical data is expected to lead to a much better understanding of Ethiopic scribal culture.

ethiopic_psalters_ml

Computational Codicology: An Exploration of Ethiopian Psalters. In his project “The Social Lives of Ethiopian Psalters,” Steve Delamarter has identified over 110 features of codicology, scribal practice, and book culture that scribes, craftsmen, and book users execute in varying ways. He selected 29 of these for further study, created typologies of those variations, assigned a number to each type of variation, and built a database (containing more than 100 fields) to receive data from Psalters. He has analyzed over 700 Psalters, ranging from the earliest extant ones of the fourteenth century up through the twentieth century, and collected data on the variations in each. Each of the Psalters is dated; over 200 of the Psalters can be located in terms of the area of Ethiopia in which they were produced. The data produced by the project promises to open many windows on aspects of Ethiopian scribal and book culture, but this will only happen if the data is analyzed well. This presentation will focus on the ways in which we have applied specialized statistical analyses, such as cluster and random forest analysis, to the data. The raw data already suggest some fascinating stories, but a thorough statistical analysis is needed to draw firm conclusions. Codicologists have already noted a few features that have changed over time; this dataset promises to greatly increase those findings. Perhaps even more interesting is the potential for identifying features that differ depending on where the manuscript was produced. The recorded place of origin of the manuscripts is also taken into account in the analysis. A comprehensive analysis of both chronological and geographical data is expected to lead to a much better understanding of Ethiopic scribal culture.

fizzbuzz

gittest

hebrewmorphology

A parser for Biblical Hebrew morphology.

lbh17

Data for my presentation in the Linguistics and Biblical Hebrew section of the 2017 Annual Meeting of the Society of Biblical Literature

mal_lbh_sbl_18

Code, Data, and Paper for Jarod Jacobs's SBL presentation in the Linguistics and Biblical Hebrew section of the Annual Meeting of the Society of Biblical Literature

menzerath-s_law_and_ancient_hebrew

Society of Biblical Literature - Linguistics and Biblical Hebrew 2018 presentation - Draft:

poverty_in_america

Poverty in America - 2016: Thinkful Data Science Prep Course Capstone. In this report, I explore poverty rates in America. I first describe the dataset, then analyze the data, and conclude by proposing further research.

psalter_tableau_dashboard

This is a Tableau Dashboard that displays the data for the Social Lives of the Ethiopic Psalter project

pythonprogram_search_words_in_etcbc

This is a Python program that asks the user to input a Hebrew word, which is then searched for in the ETCBC database. The search results are written to a CSV file with the book, chapter, and verse of each hit.
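The search-and-export flow described above can be sketched roughly as follows. This is a minimal illustration and not the repository's code: it assumes the corpus is already available as `(book, chapter, verse, word)` tuples, whereas the real program queries the ETCBC database. The function names and the hand-typed sample rows are hypothetical.

```python
import csv
import io


def find_hits(records, query):
    """Return (book, chapter, verse) for every record whose word equals the query."""
    return [
        (book, chapter, verse)
        for book, chapter, verse, word in records
        if word == query
    ]


def write_hits_csv(hits, out_file):
    """Write the hits to an open text file as CSV with a header row."""
    writer = csv.writer(out_file)
    writer.writerow(["book", "chapter", "verse"])
    writer.writerows(hits)


# Hand-typed sample rows for illustration only, NOT output of the ETCBC database.
sample_records = [
    ("Genesis", 1, 1, "ברא"),
    ("Genesis", 1, 1, "אלהים"),
    ("Genesis", 1, 27, "ברא"),
]

# In interactive use the query would come from input(); it is fixed here
# so the sketch runs without a prompt.
query = "ברא"
hits = find_hits(sample_records, query)

buffer = io.StringIO()  # stands in for open("results.csv", "w", newline="")
write_hits_csv(hits, buffer)
print(buffer.getvalue())
```

When writing to a real file, opening it with `newline=""` and an explicit `encoding="utf-8"` avoids blank rows on Windows and keeps Hebrew text intact if it is ever included in the output.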

rnns-lstm_pos_blog_post

Earlier this year the CACCHT project (Creating Annotated Corpora of Classical Hebrew Texts), a joint project of the ETCBC and the Theological Seminary at Andrews University, started. The project participants are Jarod Jacobs, Martijn Naaijer, Robert Rezetko, Oliver Glanz and Wido van Peursen, and the project focuses on statistically analyzing Ancient Hebrew texts. Of course we make use of the BHSA and the extrabiblical module, but for a comprehensive analysis we would like to use more texts, especially the Dead Sea Scrolls and Rabbinic texts. The first step has now been made, and you can find the results on the [ETCBC GitHub page](https://github.com/ETCBC/dss): a brand new Text-Fabric module containing the Dead Sea Scrolls with morphological encoding. The DSS texts and the morphological data connected with them were generously provided by Martin Abegg and consist of two foundational sets of data: transcriptions and morphological tagging. The transcriptions come from various sources, but primarily reflect what is found in the Discoveries in the Judaean Desert series. Abegg started morphologically tagging the Qumran texts in the mid-90s with the assistance of several people. Over the following decades, Abegg completed full morphological tagging of nearly every Hebrew and Aramaic scroll found in the Judaean Desert between 1947 and today. The data were converted to Text-Fabric by Dirk Roorda.
