Refactoring allows for something a little more elegant:
const heb = require("./dist/index");
const rules = require("./dist/rules");
const result = heb.transliterate("בְּרֵאשִׁ֖ית וַיַּבְדֵּל", {
ADDITIONAL_FEATURES: [
{
// matches any sheva in a syllable that is NOT preceded by a vowel character
HEBREW: "(?<![\u{05B1}-\u{05BB}\u{05C7}].*)\u{05B0}",
FEATURE: "syllable",
TRANSLITERATION: function (syllable, _hebrew, schema) {
const next = syllable.next;
// discrepancy here: in havarotjs SHEVA is simply the character
// whereas transliteration is concerned with a specific sheva, a vocal sheva
// guard against a missing next syllable before reading vowelName
const nextVowel = next && (next.vowelName === "SHEVA" ? "VOCAL_SHEVA" : next.vowelName);
if (nextVowel) {
const vowel = schema[nextVowel] || "";
// replaceAndTransliterate is an internal helper function
return rules.replaceAndTransliterate(syllable.text, new RegExp("\u{05B0}", "u"), vowel, schema);
}
return syllable.text;
}
}
]
});
// bērēʾšît wayyabdēl
Though the regex is a little more complicated, it ensures that the sheva being matched is likely a vocal one.
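To see what the lookbehind does in isolation, here is a minimal sketch. It assumes the HEBREW string is ultimately compiled as a regular expression with the u flag; the helper name matchesLeadingSheva is hypothetical and not part of hebrew-transliteration.

```javascript
// Sheva (U+05B0) that is NOT preceded, anywhere earlier in the string,
// by one of the vowel points U+05B1-U+05BB or U+05C7.
const leadingSheva = /(?<![\u05B1-\u05BB\u05C7].*)\u05B0/u;

// Hypothetical helper: true when the syllable text contains such a sheva.
function matchesLeadingSheva(syllableText) {
  return leadingSheva.test(syllableText);
}

// bet + sheva: no vowel point before the sheva, so it matches
console.log(matchesLeadingSheva("\u05D1\u05B0")); // true
// bet + patah + dalet + sheva: a vowel point precedes the sheva, so no match
console.log(matchesLeadingSheva("\u05D1\u05B7\u05D3\u05B0")); // false
```

This only shows why a syllable-final sheva after a full vowel is skipped; whether the remaining matches are always vocal is a separate question.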
Thinking out loud: the ADDITIONAL_FEATURES property was originally designed with orthographic features in mind. Perhaps ADDITIONAL_RULES could be a future property, where a rule could match on something simpler like syl.vowelName === "SHEVA".
from hebrew-transliteration.
Check out this branch for Tiberian.
Could you look through the tests and let me know what is incorrect?
Feel free to push changes or just comment here.
from hebrew-transliteration.
Yes, you are correct!
Forgot about DIVINE_NAME: "yhwh"; it was pronounced according to the vowels written:
יֱהוִה֙ [ʔɛloːˈhiːim]
יְהוָֹ֤ה [ʔaðo:ˈnɔ:j]
from hebrew-transliteration.
Splendid! Now we are even closer. Good work @charlesLoder
Congratulations on the baby!
from hebrew-transliteration.
Also, don't forget about the SHEVA rules when you have time.
In the Tiberian tradition, when vocalic shewa occurs before a guttural consonant or the letter yod, it was realized with a different quality through an assimilatory process
(i) before a guttural (אהחע) it was realized as a short vowel with the quality of the vowel on the guttural
e.g. בְּעֶרְכְּךָ [bɛʕɛʀ̟kʰaˈχɔː] "by your evaluation"
וְהָיָה [vɔhɔːˈjɔː] "and it became"
בְּאֵר [beˈʔeːeʀ̟] "well"
מְאוֹד [moˈʔoːoð] "very"
מְחִיר [miˈħiːiʀ̟] "price"
מְעוּכָה [muʕuːˈχɔː] "pressed"
(ii) before yod, it was realized as a short vowel with the quality of short ḥireq [i]
e.g. בְּיוֹם [biˈjoːom] "on the day"
לְיִשְׂרָאֵל [lijisrˁɔːˈʔeːel] "to Israel"
תְּדַמְּיוּן [tʰaðammiˈjuːun] "you liken (mpl)"
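As a rough illustration of the two rules above, the sketch below picks the quality of a vocalic shewa from the consonant that follows it. The function name vocalShevaQuality and its arguments are hypothetical, and the default [a] is taken from Khan's description quoted elsewhere in this thread; this is not the package's implementation.

```javascript
// alef, he, het, ayin
const GUTTURALS = new Set(["\u05D0", "\u05D4", "\u05D7", "\u05E2"]);
const YOD = "\u05D9";

/**
 * Hypothetical helper.
 * @param {string} nextConsonant the Hebrew consonant following the shewa
 * @param {string} nextVowelQuality the transcribed quality of that consonant's vowel, e.g. "ɛ"
 * @returns {string} the quality of the short vowel realizing the shewa
 */
function vocalShevaQuality(nextConsonant, nextVowelQuality) {
  // (i) before a guttural: same quality as the vowel on the guttural
  if (GUTTURALS.has(nextConsonant)) return nextVowelQuality;
  // (ii) before yod: the quality of short hireq
  if (nextConsonant === YOD) return "i";
  // default quality of vocalic shewa
  return "a";
}

console.log(vocalShevaQuality("\u05E2", "ɛ")); // "ɛ"
console.log(vocalShevaQuality("\u05D9", "o")); // "i"
```

The caller would still have to decide whether a given shewa is vocalic at all, which is the harder part.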
from hebrew-transliteration.
Another round of work.
More furtive tests
Take a look at these. They should be correct in terms of being preceded by a vav or yod. The long vowels aren't correct in this commit
db97c62
Epenthetic vowel
When a long vowel occurs in a closed syllable, an epenthetic vowel is inserted after the long vowel before the syllable final consonant
These long vowels are going to be tricky....
See the updated tests here
37bbdc1
Could you comment on each line whether it is correct or not. A simple 👍 if it's correct, and if it's not correct, then comment with the correct value.
from hebrew-transliteration.
Just updated the branch.
I'm struggling a bit with the vowel length stuff.
The most recent commit for Gen 1:1-5 produces:
- baʀ̟eːˈʃiːijθ bɔːˈʀ̟ɔːɔ ʔɛːloːˈhiːijm ˈʔeːeθ haʃɔːˈmaːjim vaˈʔeːeθ hɔːʔɔːˈʀ̟ɛːɛsˁ
- vahɔːˈʔɔːʀ̟ɛsˁ hɔjˈθɔːh ˈθoːˈhuː vɔːˈvoːhuː vaˈħoːʃɛχ ʕal-pʰaˈneːj θaˈhoːovm vaˈʀ̟uːwaħ ʔɛːloːˈhiːijm maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneːj hamɔːˈjiːim
- vaˈɟɟoː֥mɛʀ̟ ʔɛːloːˈhiːijm jaˈhiːj ˈʔoːovʀ̟ vaːjahiːj-ˈʔoːovʀ̟
- vaˈɟɟa֧ʀ̟ ʔɛːloːˈhiːijm ʔɛθ-hɔːˈʔoːovʀ̟ kʰiːj-ˈtˁoːovv vaɟɟavˈdeːel ʔɛːloːˈhiːijm ˈbeːejn hɔːˈʔoːovʀ̟ uːˈveːejn haːħoːˈʃɛχ
- vaɟɟiq̟ˈʀ̟ɔːɔ ʔɛːloːˈhiːijm lɔːˈʔoːovʀ̟ ˈjoːovm valaːˈħoːʃɛχ ˈq̟ɔːʀ̟ɔ ˈlɔjlɔːh vaːjahiːj-ˈʕɛːʀ̟ɛv vaːjahiːj-ˈvoːq̟ɛʀ̟ ˈjoːovm ʔɛːˈħɔːɔð
In some ways it's closer, in other ways it's way off.
from hebrew-transliteration.
Last round after comparing my output to Khan's transcriptions:
[
{ text: 'הָיְתָ֥ה', expected: 'hɔːɔjˈθɔː', received: 'hɔjˈθɔː' },
{ text: 'בֵּ֥ין', expected: 'beːen', received: 'ˈbeːen' },
{ text: 'קָ֣רָא', expected: 'ˈq̟ɔʀ̟ɔː', received: 'ˈq̟ɔːʀ̟ɔː' },
{ text: 'בֵּ֤ין', expected: 'beːen', received: 'ˈbeːen' },
{ text: 'עֹ֤שֶׂה', expected: 'ˈʕoːsɛˑ', received: 'ˈʕoːsɛː' },
{ text: 'פְּרִי֙', expected: 'ppʰaˈʀ̟iː', received: 'pʰaˈʀ̟iː' },
{ text: 'עֹֽשֶׂה', expected: 'ˈʕoːsɛˑ', received: 'ˌʕoːsɛː' },
{ text: 'פְּרִ֛י', expected: 'ppʰaˈʀ̟iː', received: 'pʰaˈʀ̟iː' },
{
text: 'וַֽיְהִי־עֶ֥רֶב',
expected: 'ˌvaˑjhiː-ˈʕɛːʀ̟ev',
received: 'ˌvaˑjhiː-ˈʕɛːʀ̟ɛv'
}
]
The simplest issues are those where we believe there may be a typo in the transcription — הָיְתָ֥ה, קָ֣רָא, וַֽיְהִי־עֶ֥רֶב — for וַֽיְהִי־עֶ֥רֶב the final segol seems to be incorrectly transcribed as 'e'.
The next simplest is בֵּ֤ין which should not be stressed, but it is in the output — I'm willing to live with that.
For פְּרִ֛י, see I.2.8.1.2. This will take some knowledge of word boundaries, and I'm not sure if the syllabification package has all that is needed. Either way, it's a smaller matter I can live w/o implementing.
I have to research עֹֽשֶׂה.
It feels about 90% there!
from hebrew-transliteration.
Left is ours, right is Khan's
The left hand side is from this package?
I've been using this script to compare the two:
const heb = require("./dist/index");
const tiberian = require("./dist/schemas/tiberian").tiberian;
const KGen1 = [
"baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˈʔɔːʀ̟ɛsˁ",
"vɔhɔːˈʔɔːʀ̟ɛsˁ hɔːɔjˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː hamˈmɔːjim",
"vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ˈʔoːoʀ̟ ˌvaˑjhiː-ˈʔoːoʀ̟",
"vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim ʔɛθ-hɔːˈʔoːoʀ̟ kʰiː-ˈtˁoːov vaɟɟavˈdeːel ʔɛloːˈhiːim beːen hɔːˈʔoːoʀ̟ wuˈveːen haːˈħoːʃɛχ",
"vaɟɟiq̟ˈʀ̟ɔː ʔɛloːˈhiːim lɔːˈʔoːoʀ̟ ˈjoːom valaːˈħoːʃɛχ ˈq̟ɔʀ̟ɔː ˈlɔːɔjlɔː ˌvaˑjhiː-ˈʕɛːʀ̟ɛv ˌvaˑjhiː-ˈvoːq̟ɛʀ̟ ˈjoːom ʔɛːˈħɔːɔð",
"vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ʀ̟ɔːˈq̟iːjaʕ baˈθoːoχ hamˈmɔːjim viːˈhiː mavˈdiːil ˈbeːen ˈmaːjim lɔːˈmɔːjim",
"vaɟˈɟaːʕas ʔɛloːˈhiːim ʔɛθ-hɔːʀ̟ɔːˈq̟iːjaʕ vaɟɟavˈdeːel beːen hamˈmaːjim ʔaˈʃɛːɛʀ̟ mitˈtʰaːħaθ lɔːʀ̟ɔːˈq̟iːjaʕ wuˈveːen hamˈmaːjim ʔaˈʃɛːɛʀ̟ meːˈʕaːal lɔːʀ̟ɔːˈq̟iːjaʕ ˌvaˑjhiː-ˈχeːen",
"vaɟɟiq̟ˈʀ̟ɔː ʔɛloːˈhiːim ˌlɔːʀ̟ɔːˈq̟iːjaʕ ʃɔːˈmɔːjim ˌvaˑjhiː-ˈʕɛːʀ̟ɛv ˌvaˑjhiː-ˈvoːq̟ɛʀ̟ ˈjoːom ʃeːˈniː",
"vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiq̟q̟ɔːˈvuː hamˈmaːjim mitˈtʰaːħaθ haʃʃɔːˈmaːjim ʔɛl-mɔːˈq̟oːom ʔɛːˈħɔːɔð vaθeːʀ̟ɔːˈʔɛː haɟɟabbɔːˈʃɔː ˌvaˑjhiː-ˈχeːen",
"vaɟɟiq̟ˈʀ̟ɔː ʔɛloːˈhiːim laɟɟabbɔːˈʃɔː ˈʔɛːʀ̟ɛsˁ wulmiq̟ˈveː hamˈmaːjim q̟ɔːˈʀ̟ɔː jamˈmiːim vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim kʰiː-ˈtˁoːov",
"vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim ˌtʰaˑðˈʃeː hɔːˈʔɔːʀ̟ɛsˁ ˈdɛːʃɛː ˈʕeːsɛv mɑzˈrˁiːjaʕ ˈzɛːʀ̟aʕ ˈʕeːesˁ pʰaˈʀ̟iː ˈʕoːsɛˑ ppʰaˈʀ̟iː lamiːˈnoː ʔaˈʃɛːɛʀ̟ zɑrˁʕoː-ˈvoː ʕal-hɔːˈʔɔːʀ̟ɛsˁ ˌvaˑjhiː-ˈχeːen",
"vattʰoːˈsˁeː hɔːˈʔɔːʀ̟ɛsˁ ˈdɛːʃɛː ˈʕeːsɛv mɑzˈrˁiːjaʕ ˈzɛːʀ̟aʕ lamiːˈneːhuː veˈʕeːesˁ ˈʕoːsɛˑ ppʰaˈʀ̟iː ʔaˈʃɛːɛʀ̟ zɑrˁʕoː-ˈvoː lamiːˈneːhuː vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim kʰiː-ˈtˁoːov",
"ˌvaˑjhiː-ˈʕɛːʀ̟ev ˌvaˑjhiː-ˈvoːq̟ɛʀ̟ ˈjoːom ʃaliːˈʃiː"
];
const HGen1 = [
"בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃",
"וְהָאָ֗רֶץ הָיְתָ֥ה תֹ֨הוּ֙ וָבֹ֔הוּ וְחֹ֖שֶׁךְ עַל־פְּנֵ֣י תְה֑וֹם וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם׃",
"וַיֹּ֥אמֶר אֱלֹהִ֖ים יְהִ֣י א֑וֹר וַֽיְהִי־אֽוֹר׃",
"וַיַּ֧רְא אֱלֹהִ֛ים אֶת־הָא֖וֹר כִּי־ט֑וֹב וַיַּבְדֵּ֣ל אֱלֹהִ֔ים בֵּ֥ין הָא֖וֹר וּבֵ֥ין הַחֹֽשֶׁךְ׃",
"וַיִּקְרָ֨א אֱלֹהִ֤ים לָאוֹר֙ י֔וֹם וְלַחֹ֖שֶׁךְ קָ֣רָא לָ֑יְלָה וַֽיְהִי־עֶ֥רֶב וַֽיְהִי־בֹ֖קֶר י֥וֹם אֶחָֽד׃",
"וַיֹּ֣אמֶר אֱלֹהִ֔ים יְהִ֥י רָקִ֖יעַ בְּת֣וֹךְ הַמָּ֑יִם וִיהִ֣י מַבְדִּ֔יל בֵּ֥ין מַ֖יִם לָמָֽיִם׃",
"וַיַּ֣עַשׂ אֱלֹהִים֮ אֶת־הָרָקִ֒יעַ֒ וַיַּבְדֵּ֗ל בֵּ֤ין הַמַּ֙יִם֙ אֲשֶׁר֙ מִתַּ֣חַת לָרָקִ֔יעַ וּבֵ֣ין הַמַּ֔יִם אֲשֶׁ֖ר מֵעַ֣ל לָרָקִ֑יעַ וַֽיְהִי־כֵֽן׃",
"וַיִּקְרָ֧א אֱלֹהִ֛ים לָֽרָקִ֖יעַ שָׁמָ֑יִם וַֽיְהִי־עֶ֥רֶב וַֽיְהִי־בֹ֖קֶר י֥וֹם שֵׁנִֽי׃",
"וַיֹּ֣אמֶר אֱלֹהִ֗ים יִקָּו֨וּ הַמַּ֜יִם מִתַּ֤חַת הַשָּׁמַ֙יִם֙ אֶל־מָק֣וֹם אֶחָ֔ד וְתֵרָאֶ֖ה הַיַּבָּשָׁ֑ה וַֽיְהִי־כֵֽן׃",
"וַיִּקְרָ֨א אֱלֹהִ֤ים לַיַּבָּשָׁה֙ אֶ֔רֶץ וּלְמִקְוֵ֥ה הַמַּ֖יִם קָרָ֣א יַמִּ֑ים וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֽוֹב׃",
"וַיֹּ֣אמֶר אֱלֹהִ֗ים תַּֽדְשֵׁ֤א הָאָ֙רֶץ֙ דֶּ֗שֶׁא עֵ֚שֶׂב מַזְרִ֣יעַ זֶ֔רַע עֵ֣ץ פְּרִ֞י עֹ֤שֶׂה פְּרִי֙ לְמִינ֔וֹ אֲשֶׁ֥ר זַרְעוֹ־ב֖וֹ עַל־הָאָ֑רֶץ וַֽיְהִי־כֵֽן׃",
"וַתּוֹצֵ֨א הָאָ֜רֶץ דֶּ֠שֶׁא עֵ֣שֶׂב מַזְרִ֤יעַ זֶ֙רַע֙ לְמִינֵ֔הוּ וְעֵ֧ץ עֹֽשֶׂה פְּרִ֛י אֲשֶׁ֥ר זַרְעוֹ־ב֖וֹ לְמִינֵ֑הוּ וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֽוֹב׃",
"וַֽיְהִי־עֶ֥רֶב וַֽיְהִי־בֹ֖קֶר י֥וֹם שְׁלִישִֽׁי׃"
];
/**
*
* @param {string[]} expected
* @param {string[]} hebrew
* @returns
*/
function getResults(expected, hebrew) {
const khan = expected
.map((x) => x.split(" "))
.flat()
.map((x) => x.trim());
return hebrew
.map((x) => x.split(" "))
.flat()
.map((x) => x.trim())
.map((v, i) => {
const t = heb.transliterate(v, tiberian);
if (t === khan[i]) return false;
return {
text: v,
expected: khan[i],
received: t
};
})
.filter(Boolean);
}
console.log(getResults(KGen1, HGen1));
That's how I got the results above.
I had to remove maqqefs from the Hebrew text as maqqefs aren't transcribed in Khan's.
For עֹֽשֶׂה, it only occurs in the transcriptions as עֹֽשֶׂה־פְּרִ֛י in the Hebrew.
I'm not sure if the half-length marker is due to it being in construct or to the meteg/gaya. If the latter, it seems odd, as the meteg/gaya is on the first syllable.
from hebrew-transliteration.
Long vowels in closed syllables have an inserted epenthetic. See the section on syllable structure.
Anyway, I'm lost... I don't know why qamets is long because in Tiberian Hebrew the vowels have quality not quantity, quantity comes from the accent and syllable (open or closed). Is qamets always long?
You're right in that the characters only differentiate quality.
On p268, he says:
Vowels represented by basic vowel signs are long when they are either (i) in a stressed syllable or (ii) in an unstressed open syllable.
I would think that the first qamats is in an unstressed closed syllable, which would not make it long, but I did notice in L that the tav is marked with a rafe.
That would indicate that the first syllable is open. It could be that the yod does not close the syllable, but that does not seem to be the case, as in לָ֑יְלָה [ˈlɔːɔj.lɔː] the first syllable has ɔːɔ, which would indicate it is closed and stressed (p269).
I'm going to keep researching, but eventually the best thing may be to release it and then let people point out where it's wrong!
from hebrew-transliteration.
This is some great research!
No doubt, הָֽיְתָ֥ה and שָֽׁמְרָ֣ה are analogous, but the latter is listed as an exception b/c of the merkha.
But, I think 347-8 hold our clues:
When shewa occurred within a word after a long vowel, it was as a general rule silent,116 e.g.
יֵשְׁבוּ֙ [jeːeʃˈvuː] (Gen. 47.6 ‘let them dwell)
...
As can be seen in the transcriptions above, we should assume that an epenthetic vowel of the same quality of long vowel occurred before the consonant with the silent shewa after the long vowel. The presence of the epenthetic in such word medial syllables is demonstrated by the fact that the first syllable can take a secondary stress in the form of a conjunctive accent,
Note, 1.2.2.4 he does not give any of the above as an example.
The main issue, however, is that it is impossible to know that the qamats, holem, and tsere are supposed to be long w/o prior lexical knowledge. I can explain that more if needed
I think I have a hacky workaround:
const longerVowels = ["HOLAM", "TSERE", "QAMATS"];
if (!isAccented && isClosed && !syllable.isFinal && longerVowels.includes(vowelName)) {
const syllableSeparator = schema["SYLLABLE_SEPARATOR"] || "";
const vowelRealization = determinePatachRealization(vowel);
return noMaterText.replace(
vowel,
`${vowelRealization + lengthMarker + syllableSeparator + vowelRealization}`
);
}
from hebrew-transliteration.
See preview branch — https://deploy-preview-77--hebrewtransliteration.netlify.app/#
from hebrew-transliteration.
See this latest preview deploy:
https://652fdfe114553543b0247089--hebrewtransliteration.netlify.app/#
In short, in order for the site to be customizable, the tiberian schema ends up breaking... a lot, and sometimes in strange ways.
The package works correctly, but not the site.
from hebrew-transliteration.
I tried with the tiberian schema (hebrew-transliteration/dist/schemas/tiberianKhan.js); still working on it, much to do:
"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
exports.tiberianKhan = void 0;
const additionalFeatureTransliteration = require("../rules").additionalFeatureTransliteration;
exports.tiberianKhan = {
VOCAL_SHEVA: "ǝ",
HATAF_SEGOL: "ɛ",
HATAF_PATAH: "a",
HATAF_QAMATS: "o",
HIRIQ: "i",
TSERE: "e",
SEGOL: "ɛ",
PATAH: "a",
QAMATS: "ɔ",
HOLAM: "o",
QUBUTS: "u",
DAGESH: "",
DAGESH_CHAZAQ: true,
MAQAF: "-",
PASEQ: "",
SOF_PASUQ: "",
QAMATS_QATAN: "ɔ",
FURTIVE_PATAH: "a",
HIRIQ_YOD: "i:",
TSERE_YOD: "e:",
SEGOL_YOD: "ɛ:",
SHUREQ: "u:",
HOLAM_VAV: "o:",
QAMATS_HE: "ɔ:",
SEGOL_HE: "ɛ:",
TSERE_HE: "e:",
MS_SUFX: "ɔw",
ALEF: "ʔ",
BET: "v",
BET_DAGESH: "b",
GIMEL: "ʁ",
GIMEL_DAGESH: "g",
DALET: "ð",
DALET_DAGESH: "d",
HE: "h",
VAV: "v",
ZAYIN: "z",
HET: "ħ",
TET: "tˁ",
YOD: "j",
FINAL_KAF: "χ",
KAF: "χ",
KAF_DAGESH: "kʰ",
LAMED: "l",
FINAL_MEM: "m",
MEM: "m",
FINAL_NUN: "n",
NUN: "n",
SAMEKH: "s",
AYIN: "ʕ",
FINAL_PE: "f",
PE: "f",
PE_DAGESH: "pʰ",
FINAL_TSADI: "sˁ",
TSADI: "sˁ",
QOF: "q̟",
RESH: "ʀ̟",
SHIN: "ʃ",
SIN: "s",
TAV: "θ",
TAV_DAGESH: "tʰ",
DIVINE_NAME: "yhwh",
STRESS_MARKER: { location: "before-syllable", mark: "ˈ" },
/*ADDITIONAL_FEATURES: [
{ FEATURE: "syllable", HEBREW: "[\u05D0]$", TRANSLITERATION: "" },
//{ FEATURE: "syllable", HEBREW: "[\u05B4]$", TRANSLITERATION: "i:" },
//{ FEATURE: "syllable", HEBREW: "[\u05B5]$", TRANSLITERATION: "e:" },
//{ FEATURE: "syllable", HEBREW: "[\u05B6]$", TRANSLITERATION: "ɛ:" },
//{ FEATURE: "syllable", HEBREW: "[\u05B7]$", TRANSLITERATION: "a:" },
//{ FEATURE: "syllable", HEBREW: "[\u05B8]$", TRANSLITERATION: "ɔ:" },
//{ FEATURE: "syllable", HEBREW: "[\u05B9]$", TRANSLITERATION: "o:" },
//{ FEATURE: "syllable", HEBREW: "[\u05BB]$", TRANSLITERATION: "u:" },
/////{ FEATURE: "cluster", HEBREW: "[\u05B1]", TRANSLITERATION: "ɛ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05B4", TRANSLITERATION: "iʔi" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05B5", TRANSLITERATION: "eʔe" },
//{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05B6", TRANSLITERATION: "ɛʔɛ" }, // !!! //
//{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05B7", TRANSLITERATION: "aʔa" }, // !!! //
{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05B8", TRANSLITERATION: "ɔʔɔ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05B9", TRANSLITERATION: "oʔo" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D0\u05BB", TRANSLITERATION: "uʔu" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05B4", TRANSLITERATION: "ihi" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05B5", TRANSLITERATION: "ehe" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05B6", TRANSLITERATION: "ɛhɛ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05B7", TRANSLITERATION: "aha" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05B8", TRANSLITERATION: "ɔhɔ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05B9", TRANSLITERATION: "oho" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D4\u05BB", TRANSLITERATION: "uhu" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05B4", TRANSLITERATION: "iħi" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05B5", TRANSLITERATION: "eħe" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05B6", TRANSLITERATION: "ɛħɛ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05B7", TRANSLITERATION: "aħa" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05B8", TRANSLITERATION: "ɔħɔ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05B9", TRANSLITERATION: "oħo" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D7\u05BB", TRANSLITERATION: "uħu" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05B4", TRANSLITERATION: "iʕi" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05B5", TRANSLITERATION: "eʕe" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05B6", TRANSLITERATION: "ɛʕɛ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05B7", TRANSLITERATION: "aʕa" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05B8", TRANSLITERATION: "ɔʕɔ" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05B9", TRANSLITERATION: "oʕo" },
{ FEATURE: "word", HEBREW: "\u05B0\u05E2\u05BB", TRANSLITERATION: "uʕu" },
{ FEATURE: "word", HEBREW: "\u05B0\u05D9", TRANSLITERATION: "i:" }
],*/
ADDITIONAL_FEATURES: [
{
FEATURE: "cluster",
HEBREW: "\u05B0",
TRANSLITERATION: (cluster, transliteration, schema) => {
const shewa = new RegExp(transliteration, "u");
const clusterText = cluster.text;
/**
* @type {Cluster}
*/
const next = cluster.next;
const gutturalYodVowel = /[אהחעי]([\u{05B1}-\u{05BB}\u{05C7}])/u;
// the last cluster has no next cluster, so guard before matching
const match = next ? next.text.match(gutturalYodVowel) : null;
if (shewa.test(clusterText) && match) {
return additionalFeatureTransliteration(clusterText, shewa, match[1], schema);
}
return clusterText;
}
}
],
longVowels: false,
qametsQatan: false,
sqnmlvy: true,
wawShureq: false,
article: true,
allowNoNiqqud: false,
strict: true
};
from hebrew-transliteration.
Sample for what we should accomplish:
Genesis 1:1-4
- baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˈʔɔːʀ̟ɛsˁ
- vɔhɔːˈʔɔːʀ̟ɛsˁ hɔːɔjˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː hamˈmɔːjim
- vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ˈʔoːoʀ̟ ˌvaˑjhiː-ˈʔoːoʀ
- vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim ʔɛθ-hɔːˈʔoːoʀ̟ kʰiː-ˈtˁoːov vaɟɟavˈdeːel ʔɛloːˈhiːim beːen hɔːˈʔoːoʀ̟ wuˈveːen haːˈħoːʃɛχ
from hebrew-transliteration.
Thanks for all this!
In the branch with the new callback function for additional features, the callback gives access to the Word, Syllable, or Cluster objects and their newly added properties in v0.13.x.
Right now, I'm running into a bit of a wall. Calling something like syllable.vowelName could return something that matches a schema property. I was envisioning it being used like this:
{
FEATURE: "syllable",
HEBREW: "\u{05B0}",
TRANSLITERATION: (syllable, hebrew, schema) => {
const next = syllable.next;
if(next && next.vowelName) {
// renamed function below from additionalFeatureTransliteration
return replaceAndTransliterate(syllable.text, new RegExp(hebrew, "u"), schema[next.vowelName], schema);
}
// fall back to the untransliterated text when there is no following vowel
return syllable.text;
}
}
The problem, however, is schema[next.vowelName], which lacks type safety...
Not totally sure how to resolve this other than merging these two packages into a monorepo or heavily refactoring the schema interface; probably the latter.
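Until the schema interface is refactored, one stopgap is a runtime guard around the lookup. The sketch below is only an assumption about how that could look: lookupSchemaString is a hypothetical helper, and it does nothing for compile-time type safety; it just avoids passing undefined or a non-string option (such as DAGESH_CHAZAQ: true) on as a transliteration.

```javascript
/**
 * Hypothetical helper: returns the schema value for `key` only when it is
 * an own property holding a string, otherwise an empty string.
 * @param {Record<string, unknown>} schema
 * @param {unknown} key e.g. the value of syllable.vowelName, which may be null
 * @returns {string}
 */
function lookupSchemaString(schema, key) {
  if (typeof key !== "string") return "";
  if (!Object.prototype.hasOwnProperty.call(schema, key)) return "";
  const value = schema[key];
  return typeof value === "string" ? value : "";
}

const exampleSchema = { QAMATS: "ɔ", DAGESH_CHAZAQ: true };
console.log(lookupSchemaString(exampleSchema, "QAMATS")); // "ɔ"
console.log(lookupSchemaString(exampleSchema, "DAGESH_CHAZAQ") === ""); // true
```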
from hebrew-transliteration.
Probably the latter I think too.
from hebrew-transliteration.
bērēʾšît wayyabdēl would be wrong because shewa is a short vowel and the b in the second word is spirantized to v. In the Tiberian transcription proposed by Khan we should have baʀ̟eːˈʃiːiθ waɟɟav'deːel, or, if you want, something like barē'šît wayyav'dēl.
from hebrew-transliteration.
Quite close!!! It needs a little more work, but we are almost there:
hebrew-transliteration output:
- bǝʀ̟eʔʃi:θ bɔˈʀ̟ɔʔ ʔɛloˈhi:m ˈʔeθ haʃʃˈmajim vǝˈʔeθ hɔʔɔʀ̟ɛsˁ
- vǝhɔˈʔɔʀ̟ɛsˁ hɔjˈθɔ: ˈθohu: vɔˈvohu: vǝˈħoʃɛχ ʕal-pʰǝˈne: θǝˈho:m vǝʀ̟u:aħ ʔɛloˈhi:m mǝʀ̟aˈħɛfɛθ ʕal-pʰǝˈne: hammɔjim
- vaˈjjoʔmɛʀ̟ ʔɛloˈhi:m jǝˈhi: ˈʔo:ʀ̟ vajǝhi:-ʔo:ʀ̟
- vaˈjjaʀ̟ʔ ʔɛloˈhi:m ʔɛθ-hɔˈʔo:ʀ̟ kʰi:-ˈtˁo:v vajjavˈdel ʔɛloˈhi:m ˈbe:n hɔˈʔo:ʀ̟ u:ˈve:n haħoʃɛχ
- vajjiq̟ˈʀ̟ɔʔ ʔɛloˈhi:m lɔˈʔo:ʀ̟ ˈjo:m vǝlaˈħoʃɛχ ˈq̟ɔʀ̟ɔʔ ˈlɔjlɔ: vajǝhi:-ˈʕɛʀ̟ɛv vajǝhi:-ˈvoq̟ɛʀ̟ ˈjo:m ʔɛħɔð
Geoffrey Khan:
- baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˈʔɔːʀ̟ɛsˁ
- vɔhɔːˈʔɔːʀ̟ɛsˁ hɔːɔjˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː hamˈmɔːjim
- vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ˈʔoːoʀ̟ ˌvaˑjhiː-ˈʔoːoʀ
- vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim ʔɛθ-hɔːˈʔoːoʀ̟ kʰiː-ˈtˁoːov vaɟɟavˈdeːel ʔɛloːˈhiːim beːen hɔːˈʔoːoʀ̟ wuˈveːen haːˈħoːʃɛχ
- vaɟɟiqˈʀ̟ɔː ʔɛloːˈhiːim lɔːˈʔoːoʀ̟ ˈjoːom valaːˈħoːʃɛχ ˈq̟ɔʀ̟ɔː ˈlɔːɔjlɔː ˌvaˑjhiː-ˈʕɛːʀ̟ɛv ˌvaˑjhiː-ˈvoːqɛ̟ʀ̟ ˈjoːom ʔɛːˈħɔːɔð
NOTES:
We should:
- amend in the schema VOCAL_SHEVA: "ǝ" to VOCAL_SHEVA: "a" (my bad!)
- YOD with DAGGESH is pronounced ɟɟ and not jj
- in Tiberian Hebrew vocalization the vowels represent qualitative distinctions, not quantitative ones; the vowels are long when:
(i) in a stressed syllable or
(ii) in an open unstressed syllable.
That's why vocal SHEVA even exists: it is a full vowel that can't really be accented or made long (it can't even form a syllable by itself. NOTE: we can make syllables with it, but as a strict Tiberian rule we shouldn't).
E.g., if vocal shewa hadn't been invented, they would have written בְּרֵאשִׁית as בַּרֵאשִׁית, but with the rule of vowel lengthening that would have given the reciter something like ba:ʀ̟eːˈʃiːiθ with a long PATACH in an open syllable; with SHEVA we have baʀ̟eːˈʃiːiθ.
- in a closed accented syllable the vowel is extra long (iːi in ʔɛloːˈhiːim or eːe in ˈʔeːeθ); in Khan's words, "when a long vowel occurs in a closed syllable, an epenthetic vowel is inserted after the long vowel before the syllable final consonant", e.g. דָּבָר [dɔːˈvɔ:ɔʀ̟], שָׁמַר [ʃɔːˈmɑːɑʀ̟].
- the epenthetic vowel in glide was pronounced like: רוּחַ [ˈʀ̟uːwaħ], שִׂיחַ [ˈsiːjaħ] etc.
- quiescent ALEPH in bǝʀ̟eʔʃi:θ bɔˈʀ̟ɔʔ (and elsewhere) should be dropped.
- last but not least, the rules of SHEVA. I quote Khan:
The shewa (שְׁוָא) sign (אְ) in the Tiberian vocalization system was read either as a vowel or as zero
When shewa was read as vocalic, its quality in the Tiberian tradition was by default the same as that of the pataḥ vowel sign, i.e., the maximally low vowel [a]
e.g. תְּכַסֶּה [tʰaχasˈsɛː] "you (ms) cover"
מְדַּבְּרִים [maðabbaˈʀ̟iːim] "speaking (mpl)"
In the Tiberian tradition, when vocalic shewa occurs before a guttural consonant or the letter yod, it was realized with a different quality through an assimilatory process
(i) before a guttural (אהחע) it was realized as a short vowel with the quality of the vowel on the guttural
e.g. בְּעֶרְכְּךָ [bɛʕɛʀ̟kʰaˈχɔː] "by your evaluation"
וְהָיָה [vɔhɔːˈjɔː] "and it became"
בְּאֵר [beˈʔeːeʀ̟] "well"
מְאוֹד [moˈʔoːoð] "very"
מְחִיר [miˈħiːiʀ̟] "price"
מְעוּכָה [muʕuːˈχɔː] "pressed"
(ii) before yod, it was realized as a short vowel with the quality of short ḥireq [i]
e.g. בְּיוֹם [biˈjoːom] "on the day"
לְיִשְׂרָאֵל [lijisrˁɔːˈʔeːel] "to Israel"
תְּדַמְּיוּן [tʰaðammiˈjuːun] "you liken (mpl)"
The shewa sign is combined with some of the basic vowel signs to form the so-called ḥaṭef signs
(i) ḥaṭef pataḥ (אֲ) [a]
(ii) ḥaṭef segol (אֱ) [ɛ]
(iii) ḥaṭef qameṣ (אֳ) [ɔ]
In such signs the vocalic reading of the shewa is made explicit and also its quality
The default pronunciation of vocalic shewa with the quality of [a] was equivalent to that of the ḥaṭef pataḥ sign (אֲ)
Both the vocalic shewa and the vowels expressed by ḥaṭef signs were short vowels that, in principle, had the same quantity as short vowels in closed unstressed syllables, which were represented in standard Tiberian vocalization by a simple vowel sign.
from hebrew-transliteration.
Let me take these a little at a time.
amend in the schema VOCAL_SHEVA: "ǝ" to VOCAL_SHEVA: "a" (my bad!)
Ok, that one is easy.
YOD with DAGGESH is pronounced ɟɟ and not jj
I think I got this correct, see test
quiescent ALEPH in bǝʀ̟eʔʃi:θ bɔˈʀ̟ɔʔ (and elsewhere) should be dropped.
That makes sense. See tests on the following lines, and let me know if they're correct at least in regards to the aleph:
and
The rest will take a little more time to get to.
from hebrew-transliteration.
the epenthetic vowel in glide was pronounced like: רוּחַ [ˈʀ̟uːwaħ], שִׂיחַ [ˈsiːjaħ] etc.
See test:
Forgot about DIVINE_NAME: "yhwh", it was pronounced according to the vowels written:
That one is easy enough:
hebrew-transliteration/src/schemas/tiberian.ts
Lines 66 to 67 in da40956
Still have to work on the long vowels and sheva.
Had a baby a few months ago, hence the stop-and-go work on this
from hebrew-transliteration.
Just realizing I forgot to add a test for שִׂיחַ [ˈsiːjaħ]
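For reference, the glide behaviour the furtive patah tests are checking can be sketched as below. This assumes, based on the examples in this thread, that the glide depends on a preceding vav or yod mater (w after vav, j after yod, nothing otherwise); furtiveGlide and withFurtivePatah are hypothetical names, not functions from the package.

```javascript
const VAV = "\u05D5";
const YOD = "\u05D9";

// Hypothetical helper: the glide inserted before a furtive patah.
function furtiveGlide(mater) {
  if (mater === VAV) return "w";
  if (mater === YOD) return "j";
  return "";
}

/**
 * Hypothetical helper: joins the already transcribed part of the word,
 * the glide (if any), the furtive patah, and the final consonant.
 * @param {string} beforeFurtive transcription up to and including the long vowel
 * @param {string|null} mater the Hebrew mater letter before the final consonant, if any
 * @param {string} finalConsonant transcription of the final guttural
 */
function withFurtivePatah(beforeFurtive, mater, finalConsonant) {
  return beforeFurtive + furtiveGlide(mater) + "a" + finalConsonant;
}

console.log(withFurtivePatah("ˈsiː", YOD, "ħ")); // "ˈsiːjaħ"
console.log(withFurtivePatah("ˈnoː", null, "ħ")); // "ˈnoːaħ"
```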
from hebrew-transliteration.
Take a look at all these, and let me know if I'm missing something.
hebrew-transliteration/test/schemas/tiberian.test.ts
Lines 52 to 64 in 16480b5
What about a vav/yod before a he (not even sure if that happens)?
from hebrew-transliteration.
All seem right, besides the long vowels of course.
גָּבֹ֗הַּ gɔˈvo:ah כִּשְׁמֹ֤עַ kʰiʃˈmo:aʕ נֹ֖חַ ˈno:aħ
Summary:
- SHEVA never long, never accented
- any vowel long when accented even if the syllable is closed
- any vowel long when in open syllable
NOTE:
A vowel in an unstressed closed syllable was, in principle, short. If, however, it was followed by a series of contiguous consonants of relatively weak articulation (e.g. אהעחינל ʾhʿḥynl), then the vowel was sometimes lengthened, even when not stressed. This occurred in certain prefixes of the verbs היה hyh ‘be’ and חיה ḥyh ‘live’, namely the ḥireq of prefixes before he or ḥet, e.g. יִהְיֶ֫ה [jiːhˈjɛː] ‘he will be’, and the pataḥ of the conjunctive prefix וַ wa- before yod, e.g. וַיְהִ֫י [vaːjˈhiː] ‘and it was’.
Such lengthening is occasionally found elsewhere and is marked by the gaʿya sign, e.g. הֲשָׁ֣מַֽע עָם֩ [haˈʃɔːmaːʕ ˈʕɔːm] ‘did any people hear?’ (Deut. 4.33), שְׁמַֽע־נָ֤א [ʃamaːʕ-ˈnɔː] ‘listen’ (1 Sam. 28.22). The intention of the lengthening of the unstressed vowel in such contexts was, it seems, to ensure that adjacent weak letters were not elided in the reading.
When a long vowel occurs in a closed syllable, an epenthetic vowel is inserted after the long vowel before the syllable final consonant
e.g. דָּבָר [dɔːˈvɔ:ɔʀ̟]
שָׁמַר [ʃɔːˈmɑːɑʀ̟]
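Putting the summary and the epenthetic rule together, a minimal sketch of the basic case might look like this. It assumes only the general rule quoted above (long when stressed or in an open syllable, epenthetic copy when a long vowel sits in a closed syllable); it deliberately ignores shewa and ḥaṭef vowels, gaʿya lengthening, and the other exceptions discussed in this thread. The name realizeVowel and the shape of its options are hypothetical.

```javascript
const LENGTH_MARK = "ː";

/**
 * Hypothetical helper for a basic vowel sign.
 * @param {string} quality transcribed vowel quality, e.g. "ɔ"
 * @param {{ stressed: boolean, closed: boolean }} syllable
 * @returns {string}
 */
function realizeVowel(quality, { stressed, closed }) {
  // long when (i) stressed or (ii) in an unstressed open syllable
  const isLong = stressed || !closed;
  if (!isLong) return quality;
  // a long vowel in a closed syllable is followed by an epenthetic copy
  return closed ? quality + LENGTH_MARK + quality : quality + LENGTH_MARK;
}

console.log(realizeVowel("ɔ", { stressed: false, closed: false })); // "ɔː"
console.log(realizeVowel("ɔ", { stressed: true, closed: true })); // "ɔːɔ"
console.log(realizeVowel("a", { stressed: false, closed: true })); // "a"
```

The first two calls correspond to the two syllables of דָּבָר.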
What about a vav/yod before a he (not even sure if that happens)?
Not sure I'm following.
NOTE:
Many words carry a secondary stress in addition to the main stress (fortunately this is noted with the cantillation marks), e.g. הָ֣אָדָ֔ם [ˌhɔːʔɔːˈðɔːm] ‘the man’ (Gen. 2.19), נִֽתְחַכְּמָ֖ה [ˌniːθḥakkaˈmɔː] ‘let us deal wisely’ (Exod. 1.10).
from hebrew-transliteration.
What about a vav/yod before a he (not even sure if that happens)?
Not sure I'm following.
The furtive patach tests have a vav or yod before a chet or ayin. I'm trying to think if there are any words with a furtive patach before a he (e.g. גָּבֹ֗הַּ), where the he is preceded by a vav or yod.
Many words carry a secondary stress in addition to the main stress (fortunately this is noted with the cantillation marks), e.g. הָ֣אָדָ֔ם [ˌhɔːʔɔːˈðɔːm] ‘the man’ (Gen. 2.19), נִֽתְחַכְּמָ֖ה [ˌniːθḥakkaˈmɔː] ‘let us deal wisely’ (Exod. 1.10).
This would be a feature to build out. I also really need to update the isAccented property on the Syllable object.
Will look at vowel length next
from hebrew-transliteration.
What about a vav/yod before a he (not even sure if that happens)?
It happens: מַגְבִּ֥יהַּ תַּגְבִּ֣יהַּ יַגִּ֥יהַּ יַגְבִּ֣יהַּ אֱלֹ֨והַּ
I will try to find with vav too, I think there are. EDIT: found in BHS only אֱלֹ֨והַּ.
Other patach furtives: מָנֹ֜וחַ לָשׂ֥וּחַ יֵשׁ֡וּעַ אֲבִישׁ֥וּעַ וּמַלְכִּישׁ֑וּעַ שְׁלִ֔יחַ רֵ֣יח
from hebrew-transliteration.
I have commented on the incorrect ones. I hope I didn't make any mistakes; I could ask Khan to correct them, but maybe a little later.
from hebrew-transliteration.
What's the latest branch with Tiberian Schema?
from hebrew-transliteration.
Tried on the latest. Genesis 1
- baʀ̟eʃiːθ bɔˈʀ̟ɔ ʔɛloˈhiːm ˈʔeθ haʃʃˈmajim vaˈʔeθ hɔʔɔʀ̟ɛsˁ
- vahɔˈʔɔʀ̟ɛsˁ hɔjˈθɔː ˈθoˈhuː vɔˈvohuː vaˈħoʃɛχ ʕal-pʰaˈneː θaˈhoːm vaˈʀ̟uːwaħ ʔɛloˈhiːm maʀ̟aˈħɛfɛθ ʕal-pʰaˈneː hammɔjim
- vaˈɟɟo֥mɛʀ̟ ʔɛloˈhiːm jaˈhiː ˈʔoːʀ̟ vajahiː-ʔoːʀ̟
- vaˈɟɟa֧ʀ̟ ʔɛloˈhiːm ʔɛθ-hɔˈʔoːʀ̟ kʰiː-ˈtˁoːv vaɟɟavˈdel ʔɛloˈhiːm ˈbeːn hɔˈʔoːʀ̟ uːˈveːn haħoʃɛχ
- vaɟɟiq̟ˈʀ̟ɔ ʔɛloˈhiːm lɔˈʔoːʀ̟ ˈjoːm valaˈħoʃɛχ ˈq̟ɔʀ̟ɔ ˈlɔjlɔː vajahiː-ˈʕɛʀ̟ɛv vajahiː-ˈvoq̟ɛʀ̟ ˈjoːm ʔɛħɔð
Khan:
- baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˈʔɔːʀ̟ɛsˁ
- vɔhɔːˈʔɔːʀ̟ɛsˁ hɔːɔjˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː hamˈmɔːjim
- vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ˈʔoːoʀ̟ ˌvaˑjhiː-ˈʔoːoʀ
- vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim ʔɛθ-hɔːˈʔoːoʀ̟ kʰiː-ˈtˁoːov vaɟɟavˈdeːel ʔɛloːˈhiːim beːen hɔːˈʔoːoʀ̟ wuˈveːen haːˈħoːʃɛχ
- vaɟɟiqˈʀ̟ɔː ʔɛloːˈhiːim lɔːˈʔoːoʀ̟ ˈjoːom valaːˈħoːʃɛχ ˈq̟ɔʀ̟ɔː ˈlɔːɔjlɔː ˌvaˑjhiː-ˈʕɛːʀ̟ɛv ˌvaˑjhiː-ˈvo:q̟ɛʀ̟ ˈjoːom ʔɛːˈħɔːɔð
from hebrew-transliteration.
Yes, way closer! We are on the right path :)
from hebrew-transliteration.
Same branch gave me this for Gen 1:1-5:
- baʀ̟eːʃiːθ bɔːˈʀ̟ɔːɔ ʔɛːloːˈhiːijm ˈʔeːeθ haʃʃˈmaːjim vaˈʔeːeθ hɔːʔɔːʀ̟ɛsˁ
- vahɔːˈʔɔːʀ̟ɛsˁ hɔjˈθɔːh ˈθoːˈhuː vɔːˈvoːhuː vaˈħoːʃɛχ ʕal-pʰaˈneːj θaˈhoːovm vaˈʀ̟uːwaħ ʔɛːloːˈhiːijm maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneːj hamɔːjim
- vaˈɟɟoː֥mɛʀ̟ ʔɛːloːˈhiːijm jaˈhiːj ˈʔoːovʀ̟ vaːjahiːj-ʔoːʀ̟
- vaˈɟɟa֧ʀ̟ ʔɛːloːˈhiːijm ʔɛθ-hɔːˈʔoːovʀ̟ kʰiːj-ˈtˁoːovv vaɟɟavˈdeːel ʔɛːloːˈhiːijm ˈbeːejn hɔːˈʔoːovʀ̟ uːˈveːejn haːħoːʃɛχ
- vaɟɟiq̟ˈʀ̟ɔːɔ ʔɛːloːˈhiːijm lɔːˈʔoːovʀ̟ ˈjoːovm valaːˈħoːʃɛχ ˈq̟ɔːʀ̟ɔ ˈlɔjlɔːh vaːjahiːj-ˈʕɛːʀ̟ɛv vaːjahiːj-ˈvoːq̟ɛʀ̟ ˈjoːovm ʔɛːħɔð
One note (or two): the prolonged vowel appears only in an accented closed syllable, so not hɔːʔɔːˈʀ̟ɛːɛsˁ, which should be hɔː'ʔɔːʀ̟ɛsˁ; in bɔːˈʀ̟ɔːɔ we should have only bɔːˈʀ̟ɔː, because Aleph is quiescent, so it doesn't prolong the already long vowel.
We should also get rid of the Yod as mater, e.g. uːˈveːejn should be wuˈveːen, and ʔɛːloːˈhiːijm should be ʔɛloːˈhiːim, etc.
Also the quality of the Sheva before gutturals and Yod: not vaˈħoːʃɛχ but voˈħoːʃɛχ, not vaˈʔeːeθ but veˈʔeːeθ, etc.
from hebrew-transliteration.
Ok, some more progress is being made, but now I'm hitting up against some deeper issues related to the syllabification package:
And some other issues I'm still trying to figure out.
I'm going to remove this from the v2.4.0 milestone so I can create another release and update the site.
Once I make more substantial changes to the syllabification package, I'll return to this.
It is, however, getting much closer! For Gen 1:1-5 I'm seeing a lot of the same issues occur, so much of it should be resolved soon.
I'm also working on a book project soon so that may take time away from this (too many irons in the fire! 🔥 )
from hebrew-transliteration.
I'm confused by "הָיְתָ֥ה" which is transcribed as "hɔːɔjˈθɔː". He says regarding the first vowel:
Insertion of epenthetic in closed syllable with a long vowel: §I.2.4.
In that section he says:
When long vowels with the main stress occur in closed syllables, there is evidence that an epenthetic with the same quality as that of the long vowel occurred before the final consonant in its phonetic realization
The first vowel, however, does not take the main stress.
Thoughts?
from hebrew-transliteration.
The latest commit produces "hɔjˈθɔː", which I think is what it should be.
from hebrew-transliteration.
Any response from Khan?
I've locally been slowly updating the tests to match the output, but only when I can infer that they are correct.
Finishing some projects this next week, then I can shift some attention back to this
from hebrew-transliteration.
The issue is "הָיְתָ֥ה", which is transcribed as "hɔːɔjˈθɔː" in the book, but definitely seems like it should be "hɔjˈθɔː".
from hebrew-transliteration.
Ok, I'll keep updating the tests
from hebrew-transliteration.
I made some updates to the tests, primarily with regard to vowel length.
There are, of course, some remaining issues:
- stress markers in the wrong place with doubled consonants
  - for a word like וַיֹּ֥אמֶר it is transliterated as vaˈɟɟoː֥mɛʀ̟
  - it should be vaɟˈɟoːmɛʀ̟
  - a fix is possible, I just haven't quite figured it out
- possible typos in transcriptions
  - we've already discussed הָיְתָ֥ה being transliterated hɔjˈθɔː instead of the transcribed hɔːɔjˈθɔː
  - another is וַֽיְהִי being transliterated as vaːjihiː instead of the transcribed vaˑjhiː
  - both of these issues contain a yod, so I'll research more
  - another is קָ֣רָא being transliterated as ˈq̟ɔːʀ̟ɔː instead of the transcribed ˈq̟ɔʀ̟ɔː
  - the resh may be an exception
Other than that, it's getting close!
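The doubled-consonant case could be handled as a post-processing step on the output string. A minimal sketch, assuming the stress marker is always emitted directly before the doubled consonant; `moveStressIntoGeminate` is a hypothetical helper, not part of the package:

```javascript
// Hypothetical post-processing helper (not part of hebrew-transliteration).
// Assumption: a geminate is written as the same consonant twice, where a
// consonant is one base character plus optional combining marks or ʰ / ˁ.
function moveStressIntoGeminate(ipa) {
  // ˈ + C + C  ->  C + ˈ + C
  return ipa.replace(/ˈ([^\sˈˌ][\u0300-\u036Fʰˁ]*)\1/gu, "$1ˈ$1");
}

console.log(moveStressIntoGeminate("vaˈɟɟoːmɛʀ̟")); // vaɟˈɟoːmɛʀ̟
console.log(moveStressIntoGeminate("bɔːˈʀ̟ɔː")); // unchanged, no doubled consonant
```

Doing this on the string keeps the syllable logic untouched, at the cost of not knowing anything about the underlying Hebrew.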
from hebrew-transliteration.
At the root of the repo, I have a test file like this:
const heb = require("./dist/index");
const tiberian = require("./dist/schemas/tiberian").tiberian;
// the first 5 verses of Gen 1
const khan = [
"baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˈʔɔːʀ̟ɛsˁ",
"vɔhɔːˈʔɔːʀ̟ɛsˁ hɔːɔjˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː hamˈmɔːjim",
"vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ˈʔoːoʀ̟ ˌvaˑjhiː-ˈʔoːoʀ",
"vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim ʔɛθ-hɔːˈʔoːoʀ̟ kʰiː-ˈtˁoːov vaɟɟavˈdeːel ʔɛloːˈhiːim beːen hɔːˈʔoːoʀ̟ wuˈveːen haːˈħoːʃɛχ",
"vaɟɟiqˈʀ̟ɔː ʔɛloːˈhiːim lɔːˈʔoːoʀ̟ ˈjoːom valaːˈħoːʃɛχ ˈq̟ɔʀ̟ɔː ˈlɔːɔjlɔː ˌvaˑjhiː-ˈʕɛːʀ̟ɛv ˌvaˑjhiː-ˈvo:q̟ɛʀ̟ ˈjoːom ʔɛːˈħɔːɔð"
]
.map((x) => x.split(" "))
.flat();
const inputs = [
"בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃",
"וְהָאָ֗רֶץ הָיְתָ֥ה תֹ֨הוּ֙ וָבֹ֔הוּ וְחֹ֖שֶׁךְ עַל־פְּנֵ֣י תְה֑וֹם וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם׃",
"וַיֹּ֥אמֶר אֱלֹהִ֖ים יְהִ֣י א֑וֹר וַֽיְהִי־אֽוֹר׃",
"וַיַּ֧רְא אֱלֹהִ֛ים אֶת־הָא֖וֹר כִּי־ט֑וֹב וַיַּבְדֵּ֣ל אֱלֹהִ֔ים בֵּ֥ין הָא֖וֹר וּבֵ֥ין הַחֹֽשֶׁךְ׃",
"וַיִּקְרָ֨א אֱלֹהִ֤ים לָאוֹר֙ י֔וֹם וְלַחֹ֖שֶׁךְ קָ֣רָא לָ֑יְלָה וַֽיְהִי־עֶ֥רֶב וַֽיְהִי־בֹ֖קֶר י֥וֹם אֶחָֽד׃"
]
.map((x) => x.split(" "))
.flat();
const results = inputs
.map((v, i) => {
const t = heb.transliterate(v, tiberian);
if (t === khan[i]) return false;
return {
text: v,
expected: khan[i],
received: t
};
})
.filter(Boolean);
console.log(results);
Which is helpful for finding all the incorrect ones:
[
{ text: 'הָיְתָ֥ה', expected: 'hɔːɔjˈθɔː', received: 'hɔjˈθɔː' },
{ text: 'תֹ֨הוּ֙', expected: 'ˈθoːhuː', received: 'ˈθoːˈhuː' },
{
text: 'הַמָּֽיִם׃',
expected: 'hamˈmɔːjim',
received: 'haˈmmɔːjim'
},
// ... etc.
]
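One thing that makes a strict `===` comparison like this brittle is that hand-typed transcriptions sometimes use an ASCII colon or apostrophe where the schema emits the IPA length and stress marks (for example ˈvo:q̟ɛʀ̟ in the list above). A minimal sketch of a normalizer that could be applied to both sides before comparing; `normalizeIpa` is a hypothetical helper, not part of the package:

```javascript
// Hypothetical helper, not part of hebrew-transliteration.
// Maps ASCII look-alikes to the IPA characters used in the schema output.
function normalizeIpa(text) {
  return text
    .replace(/:/g, "\u02D0") // ASCII colon -> ː (length mark)
    .replace(/'/g, "\u02C8") // ASCII apostrophe -> ˈ (primary stress)
    .normalize("NFC"); // keep combining marks in a canonical form
}

console.log(normalizeIpa("ˈvo:q̟ɛʀ̟") === "ˈvoːq̟ɛʀ̟"); // true
```

With this in place the comparison would read `normalizeIpa(t) === normalizeIpa(khan[i])`, so only real differences show up in the results.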
from hebrew-transliteration.
I thought that too, but ˈq̟ɔʀ̟ɔː
is the transcription at the end of the book...
I may reach out to Khan soon once I try to tackle the stress marker
from hebrew-transliteration.
Ok, so huge progress!
Using the script above, there are only a handful of mismatches:
[
{ text: 'הָיְתָ֥ה', expected: 'hɔːɔjˈθɔː', received: 'hɔjˈθɔː' },
{ text: 'תֹ֨הוּ֙', expected: 'ˈθoːhuː', received: 'ˈθoːˈhuː' },
{ text: 'בֵּ֥ין', expected: 'beːen', received: 'ˈbeːen' },
{ text: 'קָ֣רָא', expected: 'ˈq̟ɔʀ̟ɔː', received: 'ˈq̟ɔːʀ̟ɔː' }
]
There are 2 categories.
potential typos
We've already discussed them, but הָיְתָ֥ה and קָ֣רָא may potentially be typos in TPTBH and the results from the package are actually correct.
accents
The havarotjs package doesn't mark stress in the most rigorous way. If the syllable has a taam, it is marked as accented (i.e. stressed). This is why תֹ֨הוּ֙ has two stress markers as ˈθoːˈhuː and why בֵּ֥ין has a stress marker.
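Until the accent logic is more rigorous, words that end up with more than one primary stress marker are at least easy to flag. A minimal sketch that works only on the output strings; both function names are hypothetical and nothing here touches havarotjs:

```javascript
// Hypothetical diagnostic, independent of havarotjs and hebrew-transliteration.
function countPrimaryStress(ipaWord) {
  // count occurrences of the IPA primary stress mark ˈ (U+02C8)
  return (ipaWord.match(/\u02C8/g) || []).length;
}

function flagDoubleStress(ipaWords) {
  return ipaWords.filter((w) => countPrimaryStress(w) > 1);
}

console.log(flagDoubleStress(["ˈθoːˈhuː", "ˈbeːen", "vɔːˈvoːhuː"]));
// [ 'ˈθoːˈhuː' ]
```

This only reports the symptom; deciding which of the two markers to keep still needs knowledge of the taamim.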
next steps
I'm willing to let the accents be less than perfect for now.
I'll test a little more, push a dev release of this, make a preview site, and ask Khan for some feedback.
from hebrew-transliteration.
May actually be able to fix the "תֹ֨הוּ֙" in this
from hebrew-transliteration.
The next major hurdle has been the resh. There are two realizations — the standard "ʀ̟" and the pharyngealized "rˁ".
I have the pharyngealized almost figured out, except for one case when the resh
is in the same syllable, or at least the same foot, as a preceding alveolar
p229
He does not give an example, however, of a word like תְּפַר where the resh is in a different syllable but same foot as an alveolar, but the resh is not directly preceded by an alveolar. I believe this should be a ʀ̟ but I'm awaiting confirmation
from hebrew-transliteration.
I'm awaiting confirmation
They said they were unaware.
Given the examples and explanations in the book, I'm assuming he means that the resh must be in direct contact with the alveolar. I'll research a little more, but working off the assumption it should be a ʀ̟ for now
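Under that working assumption, the "direct contact" check is simple to express on the pointed text. A minimal sketch; the helper name is hypothetical, and the default list of alveolars (דזצתטסלן) is my reading of the book and should be treated as an assumption to verify:

```javascript
// Hypothetical helper, not part of hebrew-transliteration or havarotjs.
// Assumption: "direct contact" means no vowel point stands between the
// alveolar and the resh; sheva, dagesh, meteg, shin/sin dots and taamim
// are allowed in between.
const NON_VOWEL_MARKS = "\\u0591-\\u05AF\\u05B0\\u05BC\\u05BD\\u05C1\\u05C2";

function reshTouchesAlveolar(word, alveolars = "דזצתטסלן") {
  const pattern = new RegExp(`[${alveolars}][${NON_VOWEL_MARKS}]*\\u05E8`, "u");
  return pattern.test(word);
}

console.log(reshTouchesAlveolar("מַזְרִ֣יעַ")); // true: zayin + sheva + resh
console.log(reshTouchesAlveolar("תְּפַר")); // false: pe + patach stand in between
```

A word like תְּפַר therefore falls through to the default ʀ̟ under this reading.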
from hebrew-transliteration.
Ok, so I got the resh figured out.
The next step is figuring out the patach (I.2.1.2), which is very confusing. He says:
The back quality [ɑ] would have been induced in particular by the environment of consonants involving retraction of the tongue root, especially pharyngeals and pharyngealized consonants.
which makes sense at first glance.
But then on p249, he gives the transcription of [ˈbɑːʕaˌra] for בָּעֲרָ֥ה where the qamets, not patach, is transcribed with [ɑ]. Perhaps another typo? Are those transcriptions from Morag?
Then in a similar word on p621, בַּעֲצַת, he transcribes it as [baːʕɑˈsˁɑːɑθ] with the hatef-patach as [ɑ]. Is the ayin or the tsade causing the conditioned realization?
For the transcription of מַזְרִ֣יעַ on p618–19, he has [mɑzˈrˁiːjaʕ], with a note:
Pataḥ is pronounced as back [ɑ] in the environment of pharyngealized consonants: §I.2.1.3.
The first patach isn't even in the same syllable or foot as the resh, so why not [a], and the second patach precedes a pharyngeal, so why not [ɑ]???
I'm completely lost on this one...
from hebrew-transliteration.
So Khan said:
If I transcribed בָּעֲרָ֥ה as [ˈbɑːʕaˌra], this is a mistake. The transcription should be [bɔːʕaˌrɔː].
Which makes sense with regard to the qamets, but I still would have expected an [ɑ].
I asked a follow up regarding מַזְרִ֣יעַ but have not received an answer back.
from hebrew-transliteration.
With regard to the realization of the patach, it is very confusing and the rules are not well defined, but I think I may have found some sort of pattern.
On tiberianhebrew.com, the pronunciation of the patach in the transcriptions seems inconsistent.
It says:
In the environment of pharyngealized consonants (i.e., טצ and emphatic "heavy" ר), pataḥ was realized with a back allophone [ɑ]. In other words, when pataḥ is preceded or followed by טצ or "heavy" ר, pronounce it further back in the mouth.
which is narrower than Khan's "environment of consonants involving retraction of the tongue root, especially pharyngeals and pharyngealized consonants".
It seems patach is transcribed as [ɑ] in words with other pharyngealized consonants:
Hebrew | Transcription |
---|---|
חֲטָאָ֣ה | [ħɑtˁɔːˈʔɔː] |
BUT this may be misleading.
In short, "environment" does not seem well defined.
In the words בַּעֲצַת, מַזְרִ֣יעַ, and חֲטָאָ֣ה, the ט, צ, or ר is always in the next syllable.
Perhaps the [ɑ] realization applies if the ט, צ, or ר:
- is the onset or coda of a syllable w/ a patach
- or is the onset of a syllable following a syllable w/ a patach
This also applied to hatef-patach as a vocal sheva (e.g. צְרוּפַָ֔ה [sˁɑ.rˁuː.ˈfɔː] p.230)
The examples on pages 248–50 represent transcriptions from other traditions, not Tiberian. This makes sense in light of:
Indirect evidence for this is found in the modern reading traditions of Middle Eastern communities
But then he doesn't give any clear transcriptions of patach as ɑ in Tiberian.
Overall, I think the section is confusing.
To recap:
- the examples on p248–50 are not relevant to Tiberian transcription
- "environment of consonants involving retraction of the tongue root" seems to only be ט, צ, and ר (when pharyngealized); saying "especially pharyngeals" was misleading, as ayin and chet don't retract the tongue even though they are pharyngeals
- "environment" seems vague as well and may only apply when ט, צ, or ר is the onset or coda of a syllable w/ a patach or is the onset of a syllable following a syllable w/ a patach
Who knows, I may be totally wrong too...
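To make the hypothesis in the recap concrete, here is a minimal sketch over a simplified syllable model. The `{ onset, coda }` shape and the function name are hypothetical (this is not the havarotjs API), consonants are given by their IPA realization, and the rule itself is only the guess described above:

```javascript
// Hypothetical model: each syllable is { onset, coda } with consonants
// written in IPA. Not the havarotjs Syllable API.
const EMPHATICS = ["tˁ", "sˁ", "rˁ"]; // ט, צ, and pharyngealized ר

function patachRealization(syllables, index) {
  const current = syllables[index];
  const next = syllables[index + 1];
  // the emphatic is the onset or coda of the syllable with the patach
  const inSameSyllable =
    EMPHATICS.includes(current.onset) || EMPHATICS.includes(current.coda);
  // or the emphatic is the onset of the following syllable
  const opensNextSyllable = Boolean(next) && EMPHATICS.includes(next.onset);
  return inSameSyllable || opensNextSyllable ? "ɑ" : "a";
}

// חֲטָאָ֣ה, simplified to three open syllables: ħ | tˁ | ʔ
const word = [
  { onset: "ħ", coda: "" },
  { onset: "tˁ", coda: "" },
  { onset: "ʔ", coda: "" }
];
console.log(patachRealization(word, 0)); // ɑ
```

Because ayin and chet are not in `EMPHATICS`, a patach next to a plain pharyngeal stays [a] in this sketch, which is exactly the part of the hypothesis that needs confirming.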
from hebrew-transliteration.
For me our 'hɔjˈθɔː' is correct and Khan's 'hɔːɔjˈθɔː' is a typo. The same with 'ˈq̟ɔːʀ̟ɔː' against Khan's 'ˈq̟ɔʀ̟ɔː'. About עֹֽשֶׂה־פְּרִ֛י I need to study more.
EDIT:
I talked to Khan, I quote:
Sorry to disturb, following the rules in your book, as far as I can understant, but I can be mistaken, "הָיְתָ֥ה" should be "hɔjˈθɔː" but is transcribed as "hɔːɔjˈθɔː", any idea? Am I wrong? If so, why the qamas' is "ɔːɔ" here? Thank you!
Answer:
Long vowels in closed syllables have an inserted epenthetic. See the section on syllable structure.
Anyway, I'm lost... I don't know why qamets is long because in Tiberian Hebrew the vowels have quality not quantity, quantity comes from the accent and syllable (open or closed). Is qamets always long?
from hebrew-transliteration.
I would think that the first qamats is in an unstressed closed syllable, which would not make it long, but I did notice in L that the tav is marked with a rafe.
That would indicate that the first syllable is open. It could be that the yod does not close the syllable, but that does not seem to be the case, as in לָ֑יְלָה [ˈlɔːɔj.lɔː] the first syllable has ɔːɔ, which would indicate it is closed and stressed (p269).
I'm going to keep researching, but eventually, the best thing may be to release it and then let people point out where it's wrong!
100% agree!
P.S. Interesting thing: the Yemenite Jews read הָֽיְתָ֥ה as hɔjæ'θɔ: with mobile shewa; the Samaritan tradition reads it as ayyā̊tå, with a full vowel and with the accent on it (they don't have such a thing as shewa, and the accent is on the penultimate)
EDIT: I think the shewa is vocal here. The ת without a dagesh shows that a vocal shewa precedes, the meteg/gaya under הָֽ again signals a vocal shewa (there's even a rule for that "the meteg, or ga’ayah, has actually two functions: (1) It extends the sound of the vowel; (2) It makes any šewa that is written immediately after the vowel a mobile šewa") and its a 3FS qatal of הָיָה.
https://forums.accordancebible.com/topic/34597-transliteration-of-%D7%94%D6%B8%D7%99%D6%B0%D7%AA%D6%B8%D6%A5%D7%94-in-gen-12/?do=findComment&comment=172284
So הָיְתָ֥ה has two open vowels (hɔː and jaˈθɔː and not hɔːɔj and θɔː) and not, as Khan says, one closed and one open, so it should be: hɔːjaˈθɔː
Cf. הָיָ֥ה, the same verb with the same first vowel: hɔː'jɔː (in the image you can see a BIG shewa, to distinguish it from a silent one, in a Yemenite Miqra/Targum reading notation in Tiberian).
As a miscellaneous fact, Gen. 1:1-6 in the Yemenite reading tradition of Sanaa (transcribed by me from audio tape, so it must have errors):
1 bære:'ʃi:θ bɔ:'rɔː ʔælø:'hi:m ʔe:θ hæʃʃɔ'ma:jim we'ʔe:θ hɔ:'ʔɔ:ræsˤ
2 wɔhɔ:'ʔɔ:ræsˤ hɔ:jæ'θɔ: 'θøːhu: wɔ:'vøːhu: wø'ħøːʃæx ʕal-pæ'ne: θø'hø:m wæ'ru:wwæħ ʔælø:'hi:m mæræ'ħæ:fæθ ʕal-pæ'ne: hæmmɔ:'jim
3 wæj'jø:mær ʔælø:'hi:m ji'hi:-ʔø:r ˌwajhi:-'ʔø:r
4 wæj'jæ:r ʔælø:'hi:m ʔæθ-hɔ:'ʔøːr ki:-'tˤøːv wæjjæv'de:l ʔælø:'hi:m be:n hɔ:'ʔøːr u:ve:n hæ:'ħø:ʃæx
5 wajji:g'rɔː ʔælø:'hi:m lɔ:'ʔø:r 'jø:m wælæ'ħø:ʃæx gɔ:'rɔː 'lɔ:jlɔ: waj'hi:-'ʕæ:ræb waji'hi:-'vø:gær 'jø:m ʔæ:'ħɔːð
6 wæj'jø:mær ʔælø:'hi:m ji'hi: rɔ:'gi:jaʕ bæ'θøːx hæmmɔ:'jim wi:'hi: mæv'di:l 'be:n 'ma:jim lɔ:'mɔ:jim
The Yemenite reading tradition, being a continuation of the Babylonian notation, doesn't distinguish between patach and segol, both being pronounced as æ, but sometimes close to some pharyngeals it sounds more like an ɑ. Qof is realized as g, gimel with dagesh like d͡ʒ (the Arabic ج j - jim), gimel without dagesh as ɣ (it is the only Biblical Hebrew reading tradition that keeps intact the double realization of BGDKFT letters), holem like ø (German ö), etc.
from hebrew-transliteration.
the meteg/gaya under הָֽ again signals a vocal shewa
That makes sense, but I can't find any editions that have a gaya underneath the first qamets.
Where did you find the text you posted?
So הָיְתָ֥ה has two open vowels (hɔː and jaˈθɔː and not hɔːɔj and θɔː) and not as Khan says one closed and one open, so it should be: hɔːjaˈθɔː
That makes sense, but I have a hard time disagreeing with Khan!
And interesting facts about Yemenite too.
p.s. if you edit your comments, I don't get notifications!
from hebrew-transliteration.
That makes sense, but I can't find any editions that have a gaya underneath the first qamets.
Any reliable full edition of the Hebrew text with cantillation marks like https://mechon-mamre.org/c/ct/cu0101.htm
It is true that some resources you find online have the meteg removed.
Where did you find the text you posted?
https://nosachteiman.co.il
If you can't find it, I can happily share the PDFs with you.
That makes sense, but I have a hard time disagreeing with Khan!
Khan's work is a magnificent, pioneering work, but it needs corrections; I look forward to a second edition.
from hebrew-transliteration.
Some Torah recitations in Yemeni tradition: https://youtube.com/playlist?list=PLewLZK8IRsFA1LLBQKpjl4pTPSeYVH9uZ&si=XGg6NeX67t_golX1
EDIT:
Something clearer - https://www.masoret.co/%D7%A4%D7%A8%D7%A9%D7%AA-%D7%A9%D7%91%D7%95%D7%A2/%d7%91%d7%a8%d7%90%d7%a9%d7%99%d7%aa/
from hebrew-transliteration.
I've just checked a scan of Codex Leningradensis and היתה doesn't have a meteg under ה. The yemenite Taj, Mikraot Gedolot, Koren Jerusalem Bible, Mechon Mamre and other resources have a meteg. I'll look more to find the source. As far as I'm concerned I'm very sure for myself that there should be a meteg by all tiberian rules.
from hebrew-transliteration.
More sources on meteg under ה in היתה:
The Koren Jerusalem Bible
Mechon Mamre as mentioned above
Berlin Ms. or. fol. 1
Vatican Barb.or.161
from hebrew-transliteration.
I got a response from Khan, but I'm still not convinced.
Question:
I'm very confused about the word הָיְתָ֥ה in Genesis 1:2. You transcribe it as hɔːɔjˈθɔː why is that? I mean why is qamets long because the syllable is closed and not stressed. Maybe I'm wrong but I think the word should be transcribed as hɔ:ja'θɔ: and the syllables as hɔ:-ja'θɔ:, two open syllables. I support my hypothesis on the following facts:
- the shewa is vocal because follows a qamets with meteg/gaaya (not in Leningradensis), the tav is spirant and not plosive following the vocal shewa (not necessarily though, there are historically vocal shewa that became quiescent but the spirant realization remained), the word is 3fs qatal of הָיָה
- the yemenite tradition reads it as hɔ:jæ'θɔ:
- the samaritan tradition has ayyā̊tå
- I found many manuscripts or printed versions that not like Codex Leningradensis have meteg/gaaya under first he (Vatican Barb.or.161, Yemenite Taj's, Berlin Ms. or. fol. 1, the online text of Mechon Mamre, the Koren Jerusalem Bible etc.)
What do you think?
Answer:
The main basis of the transcription is that the Shewa is silent in this word in the Tiberian tradition. The ga'ya is a major ga'ya and does not indicate that the Shewa is vocalic. Other traditions have different realizations of the Shewa.
As far as I know from the masoretic treatises, "Meteg is primarily used in Biblical Hebrew to mark secondary stress and vowel length" and "Meteg is also sometimes used in Biblical Hebrew to mark a long vowel" etc.
Need to clarify this, I'll study further the matter.
from hebrew-transliteration.
Again הָֽיְתָ֥ה in all versions, so I think Leningradensis has a typo in Gen. 1:2:
- Gen. 9:16 וְהָֽיְתָ֥ה הַקֶּ֖שֶׁת בֶּֽעָנָ֑ן
- Gen. 38:21 וַיֹּ֣אמְר֔וּ לֹא־הָֽיְתָ֥ה בָזֶ֖ה קְדֵשָֽׁה
- Gen. 38:22 אַנְשֵׁ֤י הַמָּקֹום֙ אָֽמְר֔וּ לֹא־הָֽיְתָ֥ה
- Exod. 9:24 מֵאָ֖ז הָֽיְתָ֥ה לְגֹֽוי
- Exod. 29:9 וְהָֽיְתָ֥ה לָהֶ֛ם כְּהֻנָּ֖ה
- Exod. 36:7 וְהַמְּלָאכָ֗ה הָֽיְתָ֥ה דַיָּ֛ם
- Lev. 5:13 וְהָֽיְתָ֥ה לַכֹּהֵ֖ן
- Lev. 16:29 וְהָֽיְתָ֥ה לָכֶ֖ם לְחֻקַּ֣ת עֹולָ֑ם
- Num. 19:21 וְהָֽיְתָ֥ה לָּהֶ֖ם לְחֻקַּ֣ת עֹולָ֑ם
- Deut. 21:13 וְהָֽיְתָ֥ה לְךָ֖ לְאִשָּֽׁה
- Deut. 24:2 וְהָֽיְתָ֥ה לְאִֽישׁ־אַחֵֽר
- Josh. 17:6 וְאֶ֙רֶץ֙ הַגִּלְעָ֔ד הָֽיְתָ֥ה לִבְנֵֽי־מְנַשֶּׁ֖ה
- Esther 8:16 לַיְּהוּדִ֕ים הָֽיְתָ֥ה אֹורָ֖ה וְשִׂמְחָ֑ה וְשָׂשֹׂ֖ן וִיקָֽר
- Ezekiel 36:17 כְּטֻמְאַת֙ הַנִּדָּ֔ה הָיְתָ֥ה דַרְכָּ֖ם לְפָנָֽי
And so on, at least 212 occurrences in the Tanakh, all with meteg under qamets.
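Whether a given edition marks the qamets with a meteg can be checked mechanically, since the meteg is its own code point (U+05BD). A minimal sketch; `hasQametsWithMeteg` is a hypothetical helper that reports whether any letter in the word carries both marks:

```javascript
// Hypothetical helper, not part of hebrew-transliteration.
function hasQametsWithMeteg(word) {
  // split into clusters: one Hebrew letter plus the marks attached to it
  const clusters =
    word.match(/[\u05D0-\u05EA][\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]*/gu) || [];
  // qamets is U+05B8, meteg is U+05BD
  return clusters.some((c) => c.includes("\u05B8") && c.includes("\u05BD"));
}

const withMeteg = "\u05D4\u05B8\u05BD\u05D9\u05B0\u05EA\u05B8\u05A5\u05D4"; // הָֽיְתָ֥ה
const withoutMeteg = "\u05D4\u05B8\u05D9\u05B0\u05EA\u05B8\u05A5\u05D4"; // הָיְתָ֥ה
console.log(hasQametsWithMeteg(withMeteg)); // true
console.log(hasQametsWithMeteg(withoutMeteg)); // false
```

Running this over the same verse in several digital editions would show quickly which ones follow Leningradensis here.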
from hebrew-transliteration.
So my mind is dizzy... I'm going to take the lazy way out and just use a word-based rule:
{
FEATURE: "word",
HEBREW: /\u{5D4}\u{5B8}\u{5BD}?\u{5D9}\u{5B0}\u{5EA}\u{5B8}\u{5D4}/u,
TRANSLITERATION: "hɔːɔjˈθɔː"
}
Which now leaves the following "incorrect" words:
[
{ text: 'בֵּ֥ין', expected: 'beːen', received: 'ˈbeːen' },
{ text: 'עֹ֤שֶׂה', expected: 'ˈʕoːsɛˑ', received: 'ˈʕoːsɛː' },
{ text: 'פְּרִי֙', expected: 'ppʰaˈʀ̟iː', received: 'pʰaˈʀ̟iː' },
{
text: 'וַֽיְהִי־עֶ֥רֶב',
expected: 'ˌvaˑjhiː-ˈʕɛːʀ̟ev',
received: 'ˌvaˑjhiː-ˈʕɛːʀ̟ɛv'
}
]
The first, בֵּ֥ין shouldn't receive an accent; again, that's minor, so I'm fine leaving it.
For וַֽיְהִי־עֶ֥רֶב the expected ˌvaˑjhiː-ˈʕɛːʀ̟ev is most certainly a simple typo with a "e" being used for a segol instead of "ɛ".
For עֹ֤שֶׂה פְּרִי֙ I have an open issue for havarotjs which should make this doable.
So one more issue to go!
Well, actually there's another one too, but if it's not 100% accurate, I'm ok
from hebrew-transliteration.
Found the solution for the הָיְתָ֥ה headache:
Khan's "The Tiberian Pronunciation Tradition of Biblical Hebrew" vol. 1 pp 404-405
So here in Leningradensis הָיְתָ֥ה should be הָֽיְתָ֥ה, and the same rule applies as for שָֽׁמְרָ֣ה, which is 3fs qal too. In this case the meteg marks the qamets as long in an unstressed closed syllable, which is why it gets a furtive patach.
from hebrew-transliteration.
The main issue, however, is that it is impossible to know that the qamats, holem, and tsere are supposed to be long w/o prior lexical knowledge.
In the case of הָֽיְתָ֥ה we know qamats is long because of the meteg
Anyway, interesting fact that Yemeni, Sefardi and Ashkenazi read the shewa in הָֽיְתָ֥ה as vocal.
from hebrew-transliteration.
I ran into a small snafu where a word like קָדְשֵׁ֧י was being transcribed as q̟ɔːɔðˈʃeː instead of the correct q̟ɔðˈʃeː. This was because this code:
const longerVowels = ["HOLAM", "TSERE", "QAMATS"];
if (!isAccented && isClosed && !syllable.isFinal && longerVowels.includes(vowelName)) {
const syllableSeparator = schema["SYLLABLE_SEPARATOR"] || "";
const vowelRealization = determinePatachRealization(vowel);
return noMaterText.replace(
vowel,
`${vowelRealization + lengthMarker + syllableSeparator + vowelRealization}`
);
}
would match.
But most of the time, a qamets in a closed, unaccented syllable is a qamets qatan, which I can already account for!
I had to change the qametsQatan option to true, and it worked! Added some more tests to check.
I can't think of a tsere or holam that would occur in a closed, unaccented, non-final position. I'm pretty sure they can't anyways, but there's always an exception
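For reference, the effect of that branch can be reproduced on plain IPA syllables. A minimal standalone sketch (the helper name and the vowel inventory are assumptions, and it works on strings rather than havarotjs syllables): a long vowel followed only by consonants gets a copy of itself after the length mark, while a short vowel such as the ɔ of a qamets qatan is left alone.

```javascript
// Hypothetical string-level helper, not the package's implementation.
const IPA_VOWELS = "aeiouɔɛɑ"; // assumed vowel inventory for this sketch

function addEpenthetic(ipaSyllable) {
  // long vowel (V + ː) followed only by non-vowels up to the end,
  // i.e. a closed syllable with a long vowel
  const closedLong = new RegExp(`([${IPA_VOWELS}])ː(?=[^${IPA_VOWELS}ː]+$)`, "u");
  return ipaSyllable.replace(closedLong, "$1ː$1");
}

console.log(addEpenthetic("lɔːj")); // lɔːɔj
console.log(addEpenthetic("q̟ɔð")); // q̟ɔð (short vowel, unchanged)
console.log(addEpenthetic("bɔː")); // bɔː (open syllable, unchanged)
```

The real difficulty stays where it was: deciding from the pointing whether the vowel is long in the first place.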
from hebrew-transliteration.
This word is interesting וַיֵּ֨לְכ֜וּ it is transliterated as vaɟˈɟeːelˈχuː which is mostly right, except for the two stress markers, maybe.
from hebrew-transliteration.
This word is interesting וַיֵּ֨לְכ֜וּ it is transliterated as vaɟˈɟeːelˈχuː which is mostly right, except for the two stress markers, maybe.
The transliteration is right, but the accents should only be on the second syllable. Should be easily fixable
from hebrew-transliteration.
Very very good! Still some issues with accent and some double consonants showing as simple.
בְּרֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ
וְהָאָ֗רֶץ הָֽיְתָ֥ה תֹ֨הוּ֙ וָבֹ֔הוּ וְחֹ֖שֶׁךְ עַל־פְּנֵ֣י תְה֑וֹם וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם
baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˌʔɔːʀ̟ɛsˁ
vɔhɔːˈʔɔːʀ̟ɛsˁ ˌhɔːjaˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː haˌmɔːjim
EDIT!
Interesting enough, if I add more text the issue doesn't show up :-|
1 בְּ֯רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַשָּׁמַ֖יִם וְאֵ֥ת הָאָֽרֶץ׃ 2 וְהָאָ֗רֶץ הָֽיְתָ֥ה תֹ֙הוּ֙ וָבֹ֔הוּ וְחֹ֖שֶׁךְ עַל־פְּנֵ֣י תְהֹ֑ום וְר֣וּחַ אֱלֹהִ֔ים מְרַחֶ֖פֶת עַל־פְּנֵ֥י הַמָּֽיִם׃ 3 וַיֹּ֥אמֶר אֱלֹהִ֖ים יְהִ֣י אֹ֑ור וַֽיְהִי־אֹֽור׃ 4 וַיַּ֧רְא אֱלֹהִ֛ים אֶת־הָאֹ֖ור כִּי־טֹ֑וב וַיַּבְדֵּ֣ל אֱלֹהִ֔ים בֵּ֥ין הָאֹ֖ור וּבֵ֥ין הַחֹֽשֶׁךְ׃ 5 וַיִּקְרָ֨א אֱלֹהִ֤ים ׀ לָאֹור֙ יֹ֔ום וְלַחֹ֖שֶׁךְ קָ֣רָא לָ֑יְלָה וַֽיְהִי־עֶ֥רֶב וַֽיְהִי־בֹ֖קֶר יֹ֥ום אֶחָֽד׃ פ 6 וַיֹּ֣אמֶר אֱלֹהִ֔ים יְהִ֥י רָקִ֖יעַ בְּתֹ֣וךְ הַמָּ֑יִם וִיהִ֣י מַבְדִּ֔יל בֵּ֥ין מַ֖יִם לָמָֽיִם׃ 7 וַיַּ֣עַשׂ אֱלֹהִים֮ אֶת־הָֽרָקִיעַ֒ וַיַּבְדֵּ֗ל בֵּ֤ין הַמַּ֙יִם֙ אֲשֶׁר֙ מִתַּ֣חַת לָרָקִ֔יעַ וּבֵ֣ין הַמַּ֔יִם אֲשֶׁ֖ר מֵעַ֣ל לָרָקִ֑יעַ וַֽיְהִי־כֵֽן׃ 8 וַיִּקְרָ֧א אֱלֹהִ֛ים לָֽרָקִ֖יעַ שָׁמָ֑יִם וַֽיְהִי־עֶ֥רֶב וַֽיְהִי־בֹ֖קֶר יֹ֥ום שֵׁנִֽי׃ פ 9 וַיֹּ֣אמֶר אֱלֹהִ֗ים יִקָּו֨וּ הַמַּ֜יִם מִתַּ֤חַת הַשָּׁמַ֙יִם֙ אֶל־מָקֹ֣ום אֶחָ֔ד וְתֵֽרָאֶ֖ה הַיַּבָּשָׁ֑ה וַֽיְהִי־כֵֽן׃ 10 וַיִּקְרָ֨א אֱלֹהִ֤ים ׀ לַיַּבָּשָׁה֙ אֶ֔רֶץ וּלְמִקְוֵ֥ה הַמַּ֖יִם קָרָ֣א יַמִּ֑ים וַיַּ֥רְא אֱלֹהִ֖ים כִּי־טֹֽוב׃
1 baʀ̟eːˈʃiːiθ bɔːˈʀ̟ɔː ʔɛloːˈhiːim ˈʔeːeθ haʃʃɔːˈmaːjim veˈʔeːeθ hɔːˈʔɔːʀ̟ɛsˁ 2 vɔhɔːˈʔɔːʀ̟ɛsˁ ˌhɔːjaˈθɔː ˈθoːhuː vɔːˈvoːhuː voˈħoːʃɛχ ʕal-pʰaˈneː θoˈhoːom vaˈʀ̟uːwaħ ʔɛloːˈhiːim maʀ̟aːˈħɛːfɛθ ʕal-pʰaˈneː hamˈmɔːjim 3 vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ˈʔoːoʀ̟ ˌvaˑjihiː-ˈʔoːoʀ̟ 4 vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim ʔɛθ-hɔːˈʔoːoʀ̟ kʰiː-ˈtˁoːov vaɟɟavˈdeːel ʔɛloːˈhiːim ˈbeːen hɔːˈʔoːoʀ̟ wuˈveːen haːˈħoːʃɛχ 5 vaɟɟiq̟ˈʀ̟ɔː ʔɛloːˈhiːim lɔːˈʔoːoʀ̟ ˈjoːom valaːˈħoːʃɛχ ˈq̟ɔːʀ̟ɔː ˈlɔːɔjlɔː ˌvaˑjihiː-ˈʕɛːʀ̟ɛv ˌvaˑjihiː-ˈvoːq̟ɛʀ̟ ˈjoːom ʔɛːˈħɔːɔð f 6 vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiˈhiː ʀ̟ɔːˈq̟iːjaʕ baˈθoːoχ hamˈmɔːjim viːˈhiː mavˈdiːil ˈbeːen ˈmaːjim lɔːˈmɔːjim 7 vaɟˈɟaːʕas ʔɛloːˈhiːim ʔɛθ-ˌhɔːʀ̟ɔːq̟iːjaʕ vaɟɟavˈdeːel ˈbeːen hamˈmaːjim ʔaˈʃɛːɛʀ̟ mitˈtʰaːħaθ lɔːʀ̟ɔːˈq̟iːjaʕ wuˈveːen hamˈmaːjim ʔaˈʃɛːɛʀ̟ meːˈʕaːal lɔːʀ̟ɔːˈq̟iːjaʕ ˌvaˑjihiː-ˈχeːen 8 vaɟɟiq̟ˈʀ̟ɔː ʔɛloːˈhiːim ˌlɔːʀ̟ɔːˈq̟iːjaʕ ʃɔːˈmɔːjim ˌvaˑjihiː-ˈʕɛːʀ̟ɛv ˌvaˑjihiː-ˈvoːq̟ɛʀ̟ ˈjoːom ʃeːˈniː f 9 vaɟˈɟoːmɛʀ̟ ʔɛloːˈhiːim jiq̟q̟ɔːˈvuː hamˈmaːjim mitˈtʰaːħaθ haʃʃɔːˈmaːjim ʔɛl-mɔːˈq̟oːom ʔɛːˈħɔːɔð vaˌθeːʀ̟ɔːˈʔɛː haɟɟabbɔːˈʃɔː ˌvaˑjihiː-ˈχeːen 10 vaɟɟiq̟ˈʀ̟ɔː ʔɛloːˈhiːim laɟɟabbɔːˈʃɔː ˈʔɛːʀ̟ɛsˁ wulmiq̟ˈveː hamˈmaːjim q̟ɔːˈʀ̟ɔː jamˈmiːim vaɟˈɟaːaʀ̟ ʔɛloːˈhiːim kʰiː-ˈtˁoːov
from hebrew-transliteration.
Also, I'm realizing the cookies need to be cleared! The function which populates the modal from local storage is wacky now. Needless to say, no other schema has utilized the callbacks as much as tiberian, so there are edge cases all over the place
from hebrew-transliteration.
On the latest preview deploy, the word תִגְּע֖וּ got this: "Hmmm...it seems something went wrong. Check the Tips button for best practices."
Any idea why? (Gen. 3:3)
from hebrew-transliteration.
By the way, what's the latest branch of hebrew-transliteration with the Tiberian schema?
from hebrew-transliteration.
The latest commit on the Tiberian branch is this.
See the log here. If you pull it down locally, you'll have to use -f.
As for the site, in a different repo, here is the Tiberian branch and the open PR.
I plan on pushing some updates tonight, which should update the deploy branch.
I would use an incognito tab and then switch to Tiberian. The issues are not with the package, but the UI.
from hebrew-transliteration.
All good for what I tested till now, only some minor adjustments, I think you can publish then tweak on the road.
Some tweaks:
- remove accent from one-syllable word (ˈbeːen -> beːen)
- gemination after a vowel following dehiq, not hard since there's a dagesh after the vowel (ˈʕoːsɛˑ pʰaˈʀ̟iː -> ˈʕoːsɛˑ ppʰaˈʀ̟i), cf. Khan §I.2.8.1.2
I'll think of others while testing if there are.
from hebrew-transliteration.
Found another issue for זַרְעֹו־בֹ֖ו we got zaʀ̟ʕoː-voː and should be zɑrˁʕoː-ˈvoː - Gen. 1:12
EDIT: I was wrong, the issue is only on the web version.
from hebrew-transliteration.
Found another issue for זַרְעֹו־בֹ֖ו we got zaʀ̟ʕoː-voː and should be zɑrˁʕoː-ˈvoː - Gen. 1:12
EDIT: I was wrong, the issue is only on the web version.
Well, that's good to know but also frustrating! I can't seem to figure out how to get the UI to work with schema correctly
from hebrew-transliteration.
I can get it to work on the web app, but again, clearing cookies, etc.
from hebrew-transliteration.
Reproduction:
- clear cookies
- choose Tiberian
- transliterates זַרְעֹו־בֹ֖ו correctly as zɑrˁʕoː-ˈvoː
- refresh page
- incorrectly transliterates זַרְעֹו־בֹ֖ו as zaʀ̟ʕoː-voː
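One possible explanation, which is only a guess and not confirmed anywhere in this thread: if the UI saves the schema to local storage as JSON, any callback or RegExp in it would be lost on reload, since `JSON.stringify` omits function-valued properties and serializes a RegExp as an empty object. A minimal sketch with a made-up schema fragment:

```javascript
// Made-up schema fragment for illustration; the field names mirror the
// ones used in this thread, the values are placeholders.
const schema = {
  RESH: "ʀ̟",
  ADDITIONAL_FEATURES: [
    {
      FEATURE: "word",
      HEBREW: /\u05E8/u,
      TRANSLITERATION: (word) => word.text
    }
  ]
};

// Simulates saving to and loading from local storage.
const restored = JSON.parse(JSON.stringify(schema));
const feature = restored.ADDITIONAL_FEATURES[0];

console.log(typeof feature.TRANSLITERATION); // undefined
console.log(feature.HEBREW instanceof RegExp); // false
console.log(restored.RESH === schema.RESH); // true
```

That would match the symptom of plain string values surviving a refresh while the callback-driven rules silently disappear.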
from hebrew-transliteration.
Gen. 3:3 תִגְּע֖וּ getting an error: Error: Syllable גְּ has a sheva as a vowel, but the next syllable ע֖ does not have a vowel. But 'ayn has a vowel :)
Should be θigguˈʕuː I believe.
Gen. 3:5 וְנִפְקְח֖וּ getting an error Error: Syllable קְ has a sheva as a vowel, but the next syllable ח֖ does not have a vowel
Should be vanifq̟u'ħu: if I'm not mistaken.
Gen. 3:8 וַֽיִּשְׁמְע֞וּ Error: Syllable מְ has a sheva as a vowel, but the next syllable ע֞ does not have a vowel
Should be vaɟɟiʃmuˈʕu:
Gen. 3:16 וְה֖וּא Error: Syllable וְ has a sheva as a vowel, but the next syllable ה֖ does not have a vowel
Should be vuˈhu:
Gen. 4:18 אֶת־מְחֽוּיָאֵ֑ל Error: Syllable מְ has a sheva as a vowel, but the next syllable חֽ does not have a vowel
Should be ʔɛθ-muħu:jɔːˈʔe:el
I think it is a problem with Shureq not being defined as a vowel after pharyngeals and glottals, same with Holem? I didn't check.
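If that is the cause, the check would need to treat a shureq (vav + dagesh) as a vowel even though it is not a vowel point. A minimal sketch of such a check on raw text; `hasVowel` is a hypothetical helper and not the havarotjs implementation:

```javascript
// Hypothetical helper, not the havarotjs implementation.
const VOWEL_POINTS = /[\u05B1-\u05BB\u05C7]/u; // hatef segol through qubuts, plus qamets qatan
// vav followed by dagesh; a doubled consonantal vav also matches,
// but it carries its own vowel point anyway
const SHUREQ = /\u05D5\u05BC/u;

function hasVowel(text) {
  return VOWEL_POINTS.test(text) || SHUREQ.test(text);
}

const ayinOnly = "\u05E2\u0596"; // ע֖
const ayinShureq = "\u05E2\u0596\u05D5\u05BC"; // ע֖וּ
console.log(hasVowel(ayinOnly)); // false
console.log(hasVowel(ayinShureq)); // true
```

The first case is what the error message reports; the second is what the syllable actually contains once the vav is taken into account.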
from hebrew-transliteration.
My last comment about "Syllable ${syllable.text} has a sheva as a vowel, but the next syllable ${nextSylFirstCluster} does not have a vowel": I think it can be easily corrected in the schema, or is it havarotjs? SHUREQ isn't defined as a vowel?
from hebrew-transliteration.
Any ideas @charlesLoder?
from hebrew-transliteration.
@johnlockejrr sorry just moved! I have a sense of what the issue is. I'll take a look
from hebrew-transliteration.
or is it havarotjs? SHUREQ isn't defined as vowel?
This is fixed in this here.
but that's now causing some issues in this repo
from hebrew-transliteration.
Ok, one more time!
https://deploy-preview-77--hebrewtransliteration.netlify.app/#
from hebrew-transliteration.
Now the SHUREQ is seen as a vowel, but then the SHEWA is made a long vowel:
Genesis 3:3 תִגְּע֖וּ I get θigguːˈʕuː but should be θigguˈʕuː
Obadiah 7 שִׁלְּח֗וּךָ I get ʃilluːˈħuːχɔː but should be ʃilluˈħuːχɔː
Another anomaly I found just now:
Obadiah 7 הִשִּׁיא֛וּךָ I get hiʃʃiːˈuːχɔː but should be hiʃʃiːˈʔuːχɔː because the ALEPH is not silent.
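The expected forms for the shewa follow the rule that a vocal shewa before a guttural takes the quality of the following vowel but stays short. A minimal sketch of that step; the function name is hypothetical, and the default [a] and the [i] before yod reflect my understanding of the Tiberian rules rather than anything in the package:

```javascript
// Hypothetical helper, not part of hebrew-transliteration.
const GUTTURALS = ["ʔ", "h", "ħ", "ʕ"];

function realizeVocalShewa(nextConsonant, nextVowel) {
  if (GUTTURALS.includes(nextConsonant)) {
    // same quality as the following vowel, without its length marks
    return nextVowel.replace(/[ːˑ]/g, "");
  }
  if (nextConsonant === "j") return "i"; // assumption: [i] before yod
  return "a"; // assumption: default realization
}

console.log(realizeVocalShewa("ʕ", "uː")); // u (θigguˈʕuː, not θigguːˈʕuː)
console.log(realizeVocalShewa("ħ", "uː")); // u (ʃilluˈħuːχɔː)
console.log(realizeVocalShewa("m", "ɔː")); // a
```

The key point is stripping the length mark: copying the next vowel as-is is what produces the long uː.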
from hebrew-transliteration.
Branch is updated
There were some wonky versioning issues I caused with npm, so the latest version is 2.5.2, but the latest tiberian canary is 2.5.1-tiberian.6
The preview site is up-to-date. As always, try clearing cookies if something doesn't work. I still don't have all the kinks out
from hebrew-transliteration.
I'll test it and get back ;)
from hebrew-transliteration.
Just tested a little, and now a strange thing happens: the ALEPH with HATEF at the beginning of a word is dropped, at least in this example:
Obadiah 1
חֲזֹ֖ון עֹֽבַדְיָ֑ה כֹּֽה־אָמַר֩ אֲדֹנָ֨י יְהוִ֜ה לֶאֱדֹ֗ום שְׁמוּעָ֨ה שָׁמַ֜עְנוּ מֵאֵ֤ת יְהוָה֙ וְצִיר֙ בַּגֹּויִ֣ם שֻׁלָּ֔ח ק֛וּמוּ וְנָק֥וּמָה עָלֶ֖יהָ לַמִּלְחָמָֽה׃
ħɑˈzoːon ˌʕoːvaðˈjɔː ˌkʰkʰoː-ˈʔɔːmaʀ̟ ðoːˈnɔːɔj ʔaðoːˈnɔːj lɛːʔɛˈðoːom ʃamuːˈʕɔː ʃɔːˈmaːaʕnuː meːˈʔeːeθ ʔaðoːˈnɔːj vɑˈsˁiːiʀ̟ baggoːˈjiːim ʃulˈlɔːɔħ ˈq̟uːmuː vanɔːˈq̟uːmɔː ʕɔːˈlɛːhɔː lammilħɔːˈmɔː
אֲדֹנָ֨י יְהוִ֜ה should be read as ʔaðoːˈnɔːj ʔɛloːˈhiːim because YHWH bears the vowels of ʔɛloːˈhiːim (the hatef segol under Y is transformed to a simple SHEVA). This happens because it was considered bad to the ear to pronounce ʔaðoːˈnɔːj ʔaðoːˈnɔːj.
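The choice between the two readings can be made from the pointing itself. A minimal sketch, assuming the only signal needed is a hiriq somewhere on the tetragrammaton; the helper name is hypothetical and the logic is not taken from the package:

```javascript
// Hypothetical helper; the logic is an assumption, not the package's code.
// Matches yod-he-vav-he with any points or accents on each letter.
const TETRAGRAMMATON =
  /^\u05D9[\u0591-\u05C7]*\u05D4[\u0591-\u05C7]*\u05D5[\u0591-\u05C7]*\u05D4[\u0591-\u05C7]*$/u;

function divineNameReading(word) {
  if (!TETRAGRAMMATON.test(word)) return null; // not the divine name
  // a hiriq (U+05B4) means the word carries the vowels of Elohim
  return word.includes("\u05B4") ? "ʔɛloːˈhiːim" : "ʔaðoːˈnɔːj";
}

const withHiriq = "\u05D9\u05B0\u05D4\u05D5\u05B4\u059C\u05D4"; // יְהוִ֜ה
const withQamets = "\u05D9\u05B0\u05D4\u05D5\u05B8\u05D4"; // יְהוָה
console.log(divineNameReading(withHiriq)); // ʔɛloːˈhiːim
console.log(divineNameReading(withQamets)); // ʔaðoːˈnɔːj
```

A rule like this would have to run before the general consonant and vowel mapping, since the output has no letter-by-letter relation to the input.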
from hebrew-transliteration.
Confirmed, in the web version the same, אֲדֹנָ֨י is transliterated as ðoːˈnɔːɔj
from hebrew-transliteration.
The preview site has been updated to v2.5.1-tiberian.7 with updated UI.
Fingers crossed!
from hebrew-transliteration.