charlesloder / hebrew-transliteration

A tool for transliterating Hebrew

Home Page: https://www.npmjs.com/package/hebrew-transliteration

License: MIT License

JavaScript 0.23% TypeScript 99.67% Shell 0.10%

hebrew hebrew-bible hebrew-english transliteration

hebrew-transliteration's Issues

SHEVA in Schema

I'm trying to change the schema so that it produces an IPA representation. Could you recommend how to deal with sheva? Usually it should be represented as an empty sound, but in some words, like 'מְתוּקִים', 'נְקֻדָּה', and 'צְהֻבִּים', it should act as 'e'. I haven't found a way to differentiate these cases. Or is it very word-specific?

Also what is the best way to define j sound ('ג'ון') - as additional_feature?

Qamets Qatan/Gadol

It's really weak. Needs to be worked on.

Update furtive patach regex

The furtive patach regex needs to be updated with an optional check for a sof pasuq character, e.g. by adding (\u{05C3})? before the end anchor of each pattern.

hebrew-transliteration/src/rules.ts

Lines 227 to 242 in b6b7d96

  if (syl.isFinal && !syl.isClosed) {
    const furtiveChet = /\u{05D7}\u{05B7}$/mu;
    if (furtiveChet.test(clusterText)) {
      return replaceWithRegex(clusterText, furtiveChet, "\u{05B7}\u{05D7}");
    }

    const furtiveAyin = /\u{05E2}\u{05B7}$/mu;
    if (furtiveAyin.test(clusterText)) {
      return replaceWithRegex(clusterText, furtiveAyin, "\u{05B7}\u{05E2}");
    }

    const furtiveHe = /\u{05D4}\u{05BC}\u{05B7}$/mu;
    if (furtiveHe.test(clusterText)) {
      return replaceWithRegex(clusterText, furtiveHe, "\u{05B7}\u{05D4}\u{05BC}");
    }
  }
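
A minimal sketch of what the updated check could look like, assuming the three patterns can be merged into one and that the sof pasuq directly follows the patach (no meteg or accent in between). The name `moveFurtivePatach` is hypothetical and is not a function in rules.ts.

```typescript
// Hypothetical helper, not part of hebrew-transliteration's API.
// Group 1: chet, ayin, or he + mappiq. Then a patach.
// Group 2: an optional sof pasuq (U+05C3) before the end of the text.
const furtivePatach = /(\u{05D7}|\u{05E2}|\u{05D4}\u{05BC})\u{05B7}(\u{05C3}?)$/u;

function moveFurtivePatach(clusterText: string): string {
  // Put the patach before the consonant; $2 re-appends the sof pasuq if there was one.
  return clusterText.replace(furtivePatach, "\u{05B7}$1$2");
}
```

Text that does not end in one of the three furtive sequences is returned unchanged.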

Shin dot lost in remove()

Shin dots are lost using heb.remove().

In transliterate.ts, the cantillation is stripped using Remove() including the shin dots (U+05C1), but not the śin dots (U+05C2). This creates a dichotomy between ש (U+05E9) w/o a dot (i.e. shin) and w/ a dot (i.e. śin) used for transliteration.

However, the top level heb.remove() should not remove shin dot, especially if niqqud is present.

Will add the options to remove shin and sin dots, which is likely the best option
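
A rough sketch of how such options might behave, assuming hypothetical option names `removeShinDot` and `removeSinDot` that both default to `false`. This is not the library's actual remove() signature; it only illustrates keeping the dots unless the caller asks otherwise.

```typescript
// Hypothetical option names; the real API may differ.
interface RemoveOptions {
  removeShinDot?: boolean;
  removeSinDot?: boolean;
}

function removeTaamim(text: string, options: RemoveOptions = {}): string {
  const { removeShinDot = false, removeSinDot = false } = options;
  // U+0591 to U+05AF: the Hebrew accent (cantillation) marks.
  let result = text.replace(/[\u0591-\u05AF]/g, "");
  if (removeShinDot) result = result.replace(/\u05C1/g, "");
  if (removeSinDot) result = result.replace(/\u05C2/g, "");
  return result;
}
```

With the defaults, a shin keeps its dot after the accents are stripped, so shin and śin stay distinguishable.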

Update premade schemas in docs

Note: schemas are not endorsed by publishers.

The available schemas are:

brillAcademic
brillSimple
sblAcademicSpirantization
sblSimple

Just delete that

Alphabetic Presentation Block Incorrect Transliteration

The alphabetic presentation block in hebCharsTrans.js was not updated in v1.1.7.

All instances of 9 used in the older method need to be replaced with \u05BC.

Failing with two holem maters

In a word with two holem maters, it was failing.

For עֲוֹנֹותֵינוּ, sequenced as (ayin + patach) + (waw + holem) + (nun + holem + waw), it was correctly transliterated as ʿăwōnôtênû.

But for עֲוֹנוֹתֵינוּ, sequenced as (ayin + patach) + (waw + holem) + (nun + waw + holem), it was incorrectly transliterated as ʿăwōnwōtênû.

The sequence.js doesn't catch this because it chunks and sequences at each consonant, and nun and waw are both consonants though the waw isn't acting as one here.

A simpler change to testEach.js would have been:

if (/(?<!ǝ|ĕ|ă|ŏ|i|ē|e|a|ā|u)wō/.test(element)) {
    element = changeElementSplit(element, /(?<!ǝ|ĕ|ă|ŏ|i|ē|e|a|ā|u)wō/, 'ô');
}

But Firefox's lack of support for lookbehind assertions makes this not a good choice.
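
For comparison, a lookbehind can often be emulated with a capture group that is written back in the replacement. The sketch below assumes a plain string replace is acceptable; `fixHolemWaw` is a hypothetical name and does not use `changeElementSplit`.

```typescript
// Vowels that mark a following "wō" as a consonantal waw: ǝ ĕ ă ŏ i ē e a ā u
const precedingVowels = "\u01DD\u0115\u0103\u014Fi\u0113ea\u0101u";

// Group 1 captures the start of the string or any character that is NOT one of those vowels.
const holemWaw = new RegExp(`(^|[^${precedingVowels}])w\u014D`, "g");

// Hypothetical helper: rewrites "wō" as "ô" unless a vowel precedes it.
function fixHolemWaw(element: string): string {
  return element.replace(holemWaw, "$1\u00F4");
}
```

Because the preceding character is consumed by the match and then restored through `$1`, no lookbehind support is needed.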

Shebang Line

Should add a shebang line to index.js so package can be run globally

Add `PASS_THROUGH` option

For an ADDITIONAL_FEATURE, writing the callback can get messy:

const heb = require("./dist/index");
const rules = require("./dist/rules");

const result = heb.transliterate("בְּרֵאשִׁ֖ית וַיַּבְדֵּל", {
  ADDITIONAL_FEATURES: [
    {
      // matches any sheva in a syllable that is NOT preceded by a vowel character
      HEBREW: "(?<![\u{05B1}-\u{05BB}\u{05C7}].*)\u{05B0}",
      FEATURE: "syllable",
      TRANSLITERATION: function (syllable, _hebrew, schema) {
        const next = syllable.next;
        // discrepancy here: in havarotjs SHEVA is simply the character
        // whereas transliteration is concerned with a specific sheva, a vocal sheva
        const nextVowel = next?.vowelName === "SHEVA" ? "VOCAL_SHEVA" : next?.vowelName;

        if (next && nextVowel) {
          const vowel = schema[nextVowel] || "";
          // replaceAndTransliterate is an internal helper function
          return rules.replaceAndTransliterate(syllable.text, new RegExp("\u{05B0}", "u"), vowel, schema);
        }

        return syllable.text;
      }
    }
  ]
});

// bērēʾšît wayyabdēl

Namely, you have to import a rule and use it.

The PASS_THROUGH option could work like this:

const result = heb.transliterate("בְּרֵאשִׁ֖ית וַיַּבְדֵּל", {
  ADDITIONAL_FEATURES: [
    {
      // matches any sheva in a syllable that is NOT preceded by a vowel character
      HEBREW: "(?<![\u{05B1}-\u{05BB}\u{05C7}].*)\u{05B0}",
      FEATURE: "syllable",
      PASS_THROUGH: true,
      TRANSLITERATION: function (syllable, _hebrew, schema) {
        const next = syllable.next;
        // discrepancy here: in havarotjs SHEVA is simply the character
        // whereas transliteration is concerned with a specific sheva, a vocal sheva
        const nextVowel = next?.vowelName === "SHEVA" ? "VOCAL_SHEVA" : next?.vowelName;

        if (next && nextVowel) {
          const vowel = schema[nextVowel] || "";
          return syllable.text.replace(new RegExp("\u{05B0}", "u"), vowel);
        }

        return syllable.text;
      }
    }
  ]
});

This way no import is used, and it can continue to map characters in the rules as usual. No need to implement existing logic.

Option for `STRESS_MARKER` to be excluded on default accent

See #41

Also for STRESS_MARKER, maybe add a field to specify not to add the mark if it's at the expected/default location which is the last syllable. "always": false, for the default. "always": true, for the current behavior.

As an example

transliterate("שַׁבָּת אֶרֶץ", {
  STRESS_MARKER: {
    location: 'after-vowel',
    mark: '\u0301',
    always: false
  }
});

// šabbāt ʾéreṣ
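
The decision behind the proposed `always` field could be as small as the sketch below. It assumes that the default stress position is the last syllable (as stated above) and that omitting `always` keeps the current behaviour; `needsStressMark` is a hypothetical helper, not existing code.

```typescript
// Shape of the proposed option; `always` is the new, hypothetical field.
interface StressMarker {
  location: string;
  mark: string;
  always?: boolean;
}

function needsStressMark(stressedIndex: number, syllableCount: number, marker: StressMarker): boolean {
  // Assumption: leaving `always` out behaves like the current code (always mark).
  const always = marker.always ?? true;
  const isDefaultPosition = stressedIndex === syllableCount - 1;
  return always || !isDefaultPosition;
}
```

In the example above, šabbāt is stressed on its last syllable and gets no mark, while ʾéreṣ is stressed on its first syllable and does.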

Sequencing Issue

Bible programs like Logos and Accordance sequence text like:

Consonant + Vowel + Dagesh + Accent

The dagesh is messing up transliteration.

Needs to be sequenced correctly. Include new tests.
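
One possible approach is to reorder every run of combining marks by a fixed rank. The sketch below assumes the order shin/sin dot, dagesh or rafe, vowel, then everything else (accents, meteg); `sequenceMarks` and `rank` are hypothetical names and this is not the code in sequence.js.

```typescript
// Runs of Hebrew combining marks (points and accents), skipping punctuation
// such as maqaf (U+05BE), paseq (U+05C0), and sof pasuq (U+05C3).
const marks = /[\u0591-\u05BD\u05BF\u05C1\u05C2\u05C4\u05C5\u05C7]+/g;

function rank(mark: string): number {
  const cp = mark.charCodeAt(0);
  if (cp === 0x05c1 || cp === 0x05c2) return 0; // shin dot, sin dot
  if (cp === 0x05bc || cp === 0x05bf) return 1; // dagesh, rafe
  if ((cp >= 0x05b0 && cp <= 0x05bb) || cp === 0x05c7) return 2; // vowels
  return 3; // accents, meteg, anything else
}

function sequenceMarks(text: string): string {
  // The comparator returns a number, so the result does not depend on the
  // order in which an engine passes a and b.
  return text.replace(marks, (run) =>
    Array.from(run).sort((a, b) => rank(a) - rank(b)).join("")
  );
}
```

Since consonants are never moved, text in the Consonant + Vowel + Dagesh + Accent order only has its marks rearranged within each cluster.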

Improve documentation

The documentation is lacking.

Though the code is documented decently enough for intellisense, there should be better documentation.

Typedoc is nice for generating docs from code, but it is kind of a pain for making things look nice or easily navigable.

Look into docusaurus

Unicode flag breaking older browsers

Since the Hebrew characters are not a part of the astral plane the u flag is not really necessary.

// remove.js
// line 11
module.exports = (text, options = {removeVowels: false}) => !options.removeVowels ? text.replace(/[\u0591-\u05F4, \uFB1D-\uFB4F]/gu, i => hebCharsRC[i]) : text.replace(/[\u0591-\u05F4, \uFB1D-\uFB4F]/gu, i => hebCharsRV[i]);

Probably need to redo in the manner of titForTas.js:

module.exports = text => [...text].map(char => char in hebChars ? hebChars[char] : char)
                                  .reduce((a, c) => a + c)

Resh problem in Jerusalem

Getting odd result with transliteration
This word "יְרוּשָׁלִַם" from this sentence: כִּי הִנְנִי קֹרֵא, לְכָל-מִשְׁפְּחוֹת מַמְלְכוֹת צָפוֹנָה--נְאֻם-יְהוָה; וּבָאוּ וְנָתְנוּ אִישׁ כִּסְאוֹ פֶּתַח שַׁעֲרֵי יְרוּשָׁלִַם, וְעַל כָּל-חוֹמֹתֶיהָ סָבִיב, וְעַל, כָּל-עָרֵי יְהוּדָה.

Mid-word coda consonant without shva nah is silent (e.g. "יִשָּׂשכָר")

In my experience, if a consonant which is the coda of a non-final syllable does not have a sheva nah, then it is as if the consonant is not there – the only exception being when the letter is a mater, in which case the vowel lengthens (though still the consonant is not pronounced).

As a very dramatic example, the second shin in "יִשָּׂשכָר" is not pronounced – i.e. the word should be transliterated as "yissakhar", not "yissashkhar" as it is now. This example is from the Romanization of Hebrew Wikipedia page, which also mentions this rule about a coda with no sheva nah being ignored.

Another example is the alef in "פָּארָן", which I argue should not be transliterated – i.e. "paran" not "pa’ran". This is debatable, since the alef is silent anyway, but it really does bother me that the alef is transliterated here, since from my experience, it really ought not to be.

I'm curious what you think. I'd also be happy to implement a fix if you OK this change.

(Also I started to say this on a PR I just made on havarotjs, but this pair of projects is incredible – thank you for all your work here!)

Tiberian Issachar

See Khan's discussion about the name on p103. Both יִשָּׂשכָ֖ר and וְיִשָּׂשכָ֖ר should be transliterated as jissɔːχɔːɔʀ̟ and vajissɔːχɔːɔʀ̟

Text w/ taamim but no niqqud throws error

Text w/ taamim but no niqqud throws error

text: אֽנכ֖י יהו֣ה אלה֑יך
error: "Text must contain niqqud"

Originates in sequence

hebrew-transliteration/src/sequence.ts

Lines 3 to 23 in b0a26d8

export const vowels = /[\u{05B0}-\u{05BD}\u{05BF}\u{05C7}]/u;

/**
 * sequences Hebrew charactes according to the [SBL Hebrew Font Manual](https://www.sbl-site.org/Fonts/SBLHebrewUserManual1.5x.pdf)
 *
 * @param text - a string of Hebrew character
 * @param qametsQatan - option to convert regular qamets characters to qamets qatan
 * @returns a sequenced string of text
 * @remarks
 * seqeuncing follows the pattern of: consonant - dagesh - vowel - ta'am as defined in the {@link https://www.sbl-site.org/Fonts/SBLHebrewUserManual1.5x.pdf | SBL Hebrew Font Manual}
 *
 * @example
 *
 * ```ts
 * heb.sequence("\u{5D1}\u{5B0}\u{5BC}\u{5E8}\u{5B5}\u{5D0}\u{5E9}\u{5B4}\u{5C1}\u{596}\u{5D9}\u{5EA}");
 * "\u{5D1}\u{5BC}\u{5B0}\u{5E8}\u{5B5}\u{5D0}\u{5E9}\u{5C1}\u{5B4}\u{596}\u{5D9}\u{5EA}";
 * ```
 */
export const sequence = (text: string, qametsQatan = false): string => {
  return vowels.test(text) ? new Text(text, { qametsQatan }).text : text;
};

B/c it contains "vowels" (actually taam) it tries to create a Text object, but the "vowels" in havarot are different:

https://github.com/charlesLoder/havarot/blob/a824a06690b2b823f37c555aa734088ce27904e7/src/text.ts#L135-L141

  private validateInput(text: string): string {
    const niqqud = /[\u{05B0}-\u{05BC}\u{05C7}]/u;
    if (!niqqud.test(text)) {
      throw new Error("Text must contain niqqud");
    }
    return text;
  }

hebrew-transliteration should match.
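
The mismatch can be reproduced with the two character classes alone. The sketch below copies both regexes from the snippets above and uses a hypothetical helper name, `shouldBuildText`; the sample is the first word of the failing text, written with escapes.

```typescript
// Current check in hebrew-transliteration: also matches meteg (U+05BD) and rafe (U+05BF).
const vowels = /[\u{05B0}-\u{05BD}\u{05BF}\u{05C7}]/u;

// Check used by havarot's validateInput.
const niqqud = /[\u{05B0}-\u{05BC}\u{05C7}]/u;

// alef + meteg, nun, kaf + tipcha, yod: accents and a meteg, but no niqqud.
const sample = "\u{05D0}\u{05BD}\u{05E0}\u{05DB}\u{0596}\u{05D9}";

// Hypothetical guard: only build a Text object when havarot would accept the input.
function shouldBuildText(text: string): boolean {
  return niqqud.test(text);
}
```

With the current class the sample counts as having "vowels" because of the meteg, which is why a Text object is created and then throws.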

Note: may affect

hebrew-transliteration/src/transliterate.ts

Lines 70 to 76 in b0a26d8

  // prevents Text from throwing error when no vowels
  if (!isText && !vowels.test(text)) {
    const sin = new RegExp(transSchema.SHIN + "\u{05C2}", "gu");
    return mapChars(text, transSchema)
      .replace(sin, transSchema.SIN)
      .replace(/\u{05C1}/gu, "");
  }

Incorrect consonantal waw with holem

A consonantal waw with holem, as in עָוֹן, should be ʿāwōn, but the output is ʿāôn.

Divine Name in Simplified is Wrong

Input: יְהוָ֥ה אֱלֹהֵ֖ינוּ
Expected: yhvh elohenu
Received: yehvah elohenu

Shin + Sin Dot ligature incorrect without vowels

Without vowels, שׂגב is transliterated as šׂgb (note the š with a \u05C2 next to it).

With vowels, שָׂגַב becomes śāgab correctly.
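
A small sketch of one way to handle unpointed text, assuming the sin is resolved before the consonants are mapped. The three-letter map and the name `mapUnpointed` are made up for illustration and stand in for the real schema.

```typescript
// Tiny illustrative map: shin, gimel, bet. A real schema covers every consonant.
const consonants: Record<string, string> = {
  "\u05E9": "\u0161", // š
  "\u05D2": "g",
  "\u05D1": "b"
};

function mapUnpointed(text: string): string {
  return text
    .replace(/\u05E9\u05C2/g, "\u015B") // shin + sin dot becomes ś
    .replace(/\u05C1/g, "") // a shin dot carries no extra information once mapped
    .split("")
    .map((char) => consonants[char] ?? char)
    .join("");
}
```

Handling the sin dot first means no stray U+05C2 can survive next to a mapped š.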

`ADDITIONAL_FEATURES` leaving stray shin/sin dot characters

I was trying to create an ADDITIONAL_FEATURES entry which changes a final patah-yod or qamats-yod to "ai" and came up with:

{ FEATURE: "word",
  HEBREW: "([\u{05B7}\u{05B8}])י$",
  TRANSLITERATION: "$1i" },

However, this seems to have strange effects on the rest of the word when it gets applied.

The thing that is definitely an error is that when this rule is applied, any shin/sin dot characters are left in the final string! It's hard to see the tiny dot at first, but for example:

transliterate("שַׁדַּי", my_schema) === "shׁadai"
transliterate("שַׁדַּי", my_schema).charCodeAt(2).toString(16) === "5c1" // SHIN_DOT

However, as I continued to experiment, I noticed that when this rule is applied, the remaining word is often transliterated completely incorrectly in a number of other ways: all dageshes are ignored, all shevas are vocal, all shureqs are vavs, all other ADDITIONAL_FEATURES are ignored, and probably more. As a quick demonstration of this with nonsense words:

transliterate("בַּי", my_schema) === "vai"
transliterate("גַרְגַי", my_schema) === "garᵉgai"
transliterate("קוּמַי", my_schema) === "kwmai"

However, is it possible these latter observations are to be expected? Is it the case that the "word" feature is really only meant to be used for whole-word transliteration? If so, then this nonsensical behavior is okay, since I'm doing something that isn't supposed to be done. If this is the case though, how would I write the rule I'm looking for?

Suggestion: DAGESH_CHAZAQ character addition

Is it possible to overload DAGESH_CHAZAQ to accept a character, such as a combining circumflex? It would then be applied to any dagesh forte (but not lene) and to a mappiq he as well. Thank you.

Add Tiberian Transcription Schema

See discussion here

Will definitely need a test under test/schemas.

double marks

Not sure of settings but the following fail if inserted in the tests. Text from Sefaria.
produces double dagesh mark instead:

    ${"sin dagesh "}   | ${"הַשָּׂדֶֽה"} | ${"haśādê"}   | ${{ DAGESH_CHAZAQ: "\u0301" }}

produces double stress mark instead:

    ${"geresh"}   | ${"עֵ֝ינֶ֗יךָ"}   | ${"ʿênêˈkā"}   | ${{ STRESS_MARKER: { location: "after-syllable", mark: "ˈ" } }}

produces "ha" instead of "ah":

    ${"furtive patach, sof pasuq"}   | ${"רֽוּחַ׃"}    | ${"rûaḥ"}

does not separate maqaf:

    ${"psalms 2:12 maqaf"}  |  ${"נַשְּׁקוּ־בַ֡ר"}  |  ${"naššǝqû-bar"}

By the way is it possible to have SILENT_SHEVA and MAPPIQ settings (default to blank strings)? For example "שַׁוְעִ֗י" can become "shavi" if silent sheva is not marked, instead of "shav,i". And "הִ֛וא" occurs often enough that it would be great to have a setting for it instead of the cpu-consuming ADDITIONAL_FEATURES; right now it translates to "hiv'" instead of just "hi". Thank you for this project!

verse failing

attempt to transliterate
יִֽירָא֥וּךָ עִם־שָׁ֑מֶשׁ וְלִפְנֵ֥י יָ֜רֵ֗חַ דֹּ֣ור דֹּורִֽים׃

Book of Psalms, chapter 72, verse 5
your site gives "Hmmm...it seems something went wrong"
running the code both for transliterate and sequence are failing.
transliterate

/node_modules/havarotjs/dist/utils/syllabifier.js:238
throw new Error("Syllable should not precede a Cluster with a Mater");
^

Error: Syllable should not precede a Cluster with a Mater
at groupShureqs ( /node_modules/havarotjs/dist/utils/syllabifier.js:238:23)
at groupClusters ( /node_modules/havarotjs/dist/utils/syllabifier.js:260:26)
at syllabify ( /node_modules/havarotjs/dist/utils/syllabifier.js:344:29)
at Word.get syllables [as syllables] ( /node_modules/havarotjs/dist/word.js:67:44)
at /node_modules/hebrew-transliteration/dist/index.js:410:30
at Array.map ()
at transliterate ( /node_modules/hebrew-transliteration/dist/index.js:406:24)
at Object. ( /script/hebapp.js:33:20)
at Module._compile (node:internal/modules/cjs/loader:1103:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1157:10)

sequence

{}
/node_modules/hebrew-transliteration/dist/index.js:94
var mapChars = (text, schema) => [...text].map((char) => char in transliterateMap ? schema[transliterateMap[char]] : char).join("");
^

TypeError: text is not iterable
at mapChars ( /node_modules/hebrew-transliteration/dist/index.js:94:38)
at transliterate ( /node_modules/hebrew-transliteration/dist/index.js:403:12)
at Object. ( /script/hebapp.js:36:19)
at Module._compile (node:internal/modules/cjs/loader:1103:14)
at Object.Module._extensions..js (node:internal/modules/cjs/loader:1157:10)
at Module.load (node:internal/modules/cjs/loader:981:32)
at Function.Module._load (node:internal/modules/cjs/loader:822:12)
at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:77:12)
at node:internal/main/run_main_module:17:47

Sorting - Chrome/Edge

Chrome and Edge pass the first and second elements to the Array.sort() comparator in a different order than Firefox.

[...'כִּ֣י'].sort((a,b) => console.log('a is ' + a.charCodeAt(),  ' but b is ' + b.charCodeAt()))

// Chrome and Edge
>>> a is 1460  but b is 1499
>>> a is 1468  but b is 1460
>>> a is 1443  but b is 1468
>>> a is 1497  but b is 1443

//Firefox
>>> a is 1499  but b is 1460
>>> a is 1460  but b is 1468
>>> a is 1468  but b is 1443
>>> a is 1443  but b is 1497

This was causing sequence.js not to sort correctly.

Tiberian Jerusalem is incorrect

In the tiberian schema, יְרוּשָׁלִָם is incorrectly transliterated as jaʀ̟uːʃɔːˈlaːai͏m. Create a new rule so that it is jaʀ̟uːʃɔːˈlaːjim

Metheg should be removed with accents not vowels

Latin chars after Divine name dropping

See this issue

Add Sephardic IPA schema

It could be cool to have Sephardic but with IPA characters

Waw + Holem Sequence incorrect

v.1.1.2 introduced an issue where:

בוֹ (bet + waw + holem) > bwo instead of bô

The problem is most likely at:

// src/testEach.js 
// line 41

// Tests for waw as a holem-mater
if (/wō(?!ǝ|ĕ|ă|ŏ|i|ē|e|a|ā|u|9)/.test(element)) {
    if (/wō/.test(element)) {
        element = changeElementSplit(element, /wō(?!ǝ|ĕ|ă|ŏ|i|ē|e|a|ā|u|9)/, 'ô');
        // this is a workaround for lack of lookbehind support
        let rev = [...element].reverse().reduce((a, c) => a + c);
        if (/ōw(?!ǝ|ĕ|ă|ŏ|i|ē|e|a|ā|u|9)/.test(rev)) {
            element = changeElementSplit(element, /wō/, 'ô');
        }
    }
}

Need to make more robust tests.

Consider optimizing `ADDITIONAL_FEATURES`

The ADDITIONAL_FEATURES are not optimized in any way.

See one use:

hebrew-transliteration/src/rules.ts

Lines 204 to 218 in aadbfba

  for (const seq of seqs) {
    const heb = new RegExp(seq.HEBREW, "u");
    if (seq.FEATURE === "cluster" && heb.test(clusterText)) {
      const transliteration = seq.TRANSLITERATION;
      const passThrough = seq.PASS_THROUGH ?? true;
      if (typeof transliteration === "string") {
        return replaceAndTransliterate(clusterText, heb, transliteration, schema);
      }
      if (!passThrough) {
        return transliteration(cluster, seq.HEBREW, schema);
      }
      clusterText = transliteration(cluster, seq.HEBREW, schema);
    }
  }
}

Perhaps even a break statement would improve things.

Should have some benchmarks too
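
One cheap optimization, sketched below under the assumption that the feature objects are reused across calls: compile each HEBREW pattern once and cache it, instead of calling `new RegExp` for every cluster. The `Feature` interface is a pared-down stand-in and `getRegExp` is a hypothetical helper.

```typescript
// Pared-down stand-in for an ADDITIONAL_FEATURES entry.
interface Feature {
  HEBREW: string;
  FEATURE: "cluster" | "syllable" | "word";
}

// Keyed by the feature object, so entries vanish when a schema is garbage collected.
const compiled = new WeakMap<Feature, RegExp>();

function getRegExp(feature: Feature): RegExp {
  let re = compiled.get(feature);
  if (!re) {
    re = new RegExp(feature.HEBREW, "u");
    compiled.set(feature, re);
  }
  return re;
}
```

The regex has no `g` flag, so sharing one instance between calls to `test` carries no `lastIndex` state. Whether this is a measurable win would still need the benchmarks mentioned above.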

Maqqaf after shureq is dropped

When a maqqaf is after a shureq (e.g. "נַשְּׁקוּ־בַ֡ר"), it is dropped.

console.log(heb.transliterate(`נַשְּׁקוּ־בַ֡ר`));
// expected: naššǝqû-bar
// received: naššǝqûbar

See original issue.

Should return non-Hebrew text

Probably something like

module.exports = text => [...text].map(char => char in hebChars ? hebChars[char] : char).reduce((a, c) => a + c)

Problem with Resh ḥōlem

I expected output of bêt ḥôron, but got bêt ḥôrn for בֵּית חוֹרֹן .

Incorrect sequence with two vowels on one consonant

See charlesLoder/hebrewTransliteration#1

In cases of more than one vowel on a letter, the sequencing goes wrong,
e.g.: יְרְוּשָׁלִָֽם
is sequenced as יְרְוּשָׁלִָֽם
which is a mistake.
These are the two words in hex:
יְרְוּשָׁלִָֽם
יְרְוּשָׁלִָֽם
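
Since the two strings look the same when rendered, a small helper that prints code points makes the difference visible. This is only a debugging sketch; `toHex` is a hypothetical name.

```typescript
// Lists each code point of the text as four (or more) uppercase hex digits.
function toHex(text: string): string {
  return Array.from(text)
    .map((char) => (char.codePointAt(0) ?? 0).toString(16).toUpperCase().padStart(4, "0"))
    .join(" ");
}
```

Running it on the input and on the sequenced output shows where the order of the marks differs.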

Brill

Perhaps add a schema for Brill's transliteration

Add Journal of Semitic Studies schema

See @camilstaps original issue here

I don't think it's possible to recognize short/long vowels(?) to distinguish e.g. i and ī, so I used ī for hireq-yod and i for plain hireq, which may require some manual fixes from the user (what would be helpful for this is a separate field for hireq+meteg and qibbuts+meteg).

Could you provide examples? I think the ADDITIONAL_FEATURES may be able to account for that.

The style guide also prescribes that qamets before hatef qamets be transliterated as long qamets: בַּֽצָּהֳרָֽיִם should be baṣṣå̄hå̆rå̄yim, not baṣṣåhå̆rå̄yim. I am not sure if this can be specified in the current system.

Interesting, so they say the qamets under the tsade should be a qamets qatan, but they maintain a distinction between qamets qatan and qamets gadol in transliteration. The stylesheet says:

This transcription of the quality of the vowels corresponds to the Tiberian reading tradition of Biblical Hebrew,
with the exception of the shewa. The distribution of vocalic and silent shewa, however, follows the Tiberian
tradition.

Given that Khan is the editor, I would assume that means there is no distinction between qamets qatan and qamets gadol. Maybe I'll have to pry into this one.

Tsere-he is not recognized correctly, I'm not sure why: וְהִנֵּ֥ה should be wǝhinnē, not wǝhinnɛ.

I'll research that.

Let me know what you think of the two questions above.

Initial JSON

{
  "VOCAL_SHEVA": "ǝ",
  "HATAF_SEGOL": "ɛ̆",
  "HATAF_PATAH": "ă",
  "HATAF_QAMATS": "å̆",
  "HIRIQ": "i",
  "TSERE": "ē",
  "SEGOL": "ɛ",
  "PATAH": "a",
  "QAMATS": "å̄",
  "HOLAM": "ō",
  "QUBUTS": "u",
  "DAGESH": "",
  "DAGESH_CHAZAQ": true,
  "MAQAF": " ",
  "PASEQ": "",
  "SOF_PASUQ": "",
  "QAMATS_QATAN": "å",
  "FURTIVE_PATAH": "a",
  "HIRIQ_YOD": "ī",
  "TSERE_YOD": "ē",
  "SEGOL_YOD": "ɛ",
  "SHUREQ": "ū",
  "HOLAM_VAV": "ō",
  "QAMATS_HE": "å̄",
  "SEGOL_HE": "ɛ",
  "TSERE_HE": "ē",
  "MS_SUFX": "å̄yw",
  "ALEF": "ʾ",
  "BET_DAGESH": "b",
  "BET": "ḇ",
  "GIMEL": "ḡ",
  "GIMEL_DAGESH": "g",
  "DALET": "ḏ",
  "DALET_DAGESH": "d",
  "HE": "h",
  "VAV": "w",
  "ZAYIN": "z",
  "HET": "ḥ",
  "TET": "ṭ",
  "YOD": "y",
  "FINAL_KAF": "ḵ",
  "KAF": "ḵ",
  "KAF_DAGESH": "k",
  "LAMED": "l",
  "FINAL_MEM": "m",
  "MEM": "m",
  "FINAL_NUN": "n",
  "NUN": "n",
  "SAMEKH": "s",
  "AYIN": "ʿ",
  "FINAL_PE": "p̄",
  "PE": "p̄",
  "PE_DAGESH": "p",
  "FINAL_TSADI": "ṣ",
  "TSADI": "ṣ",
  "QOF": "q",
  "RESH": "r",
  "SHIN": "š",
  "SIN": "ś",
  "TAV": "ṯ",
  "TAV_DAGESH": "t",
  "DIVINE_NAME": "yhwh",
  "SYLLABLE_SEPARATOR": "",
  "ADDITIONAL_FEATURES": [],
  "STRESS_MARKER": {
    "location": "",
    "mark": ""
  },
  "longVowels": true,
  "qametsQatan": true,
  "sqnmlvy": true,
  "wawShureq": true,
  "article": true
}

Paseq should not be removed

Divine name with vowels to be read as Elohim is transliterated identically to YHWH

Where the text has אדני יהוה, the output shouldn't be yhvh yhvh.

Upgrade havarotjs

Upgrade to the latest version of havarotjs, v0.13.0.

Add callback to additional features

The additional features option should be able to take a string or a callback function:

ADDITIONAL_FEATURES: [{
    FEATURE: "word",
    HEBREW: "הָאָֽרֶץ",
    TRANSLITERATION: (word) => { /* do something here */ }
  }]

The callback type definition would have to change depending on the FEATURE selected.

Additionally, the HEBREW property already gets converted to a regex, so a regex should be allowed too
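
The two ideas could be typed roughly as below: a union keyed on FEATURE so the callback's first argument changes with it, and a HEBREW field that accepts either a string or a RegExp. All type names here (`WordLike`, `SyllableLike`, `ClusterLike`, `Schema`) are placeholders rather than the library's or havarotjs's real types.

```typescript
// Placeholder shapes; the real objects would come from havarotjs.
interface WordLike { text: string }
interface SyllableLike { text: string }
interface ClusterLike { text: string }
type Schema = Record<string, unknown>;

type Callback<T> = (unit: T, hebrew: string | RegExp, schema: Schema) => string;

// The callback's argument type follows from the FEATURE that was selected.
type AdditionalFeature =
  | { FEATURE: "word"; HEBREW: string | RegExp; TRANSLITERATION: string | Callback<WordLike> }
  | { FEATURE: "syllable"; HEBREW: string | RegExp; TRANSLITERATION: string | Callback<SyllableLike> }
  | { FEATURE: "cluster"; HEBREW: string | RegExp; TRANSLITERATION: string | Callback<ClusterLike> };

// Strings are compiled with the u flag; a RegExp is used as given.
function toRegExp(hebrew: string | RegExp): RegExp {
  return typeof hebrew === "string" ? new RegExp(hebrew, "u") : hebrew;
}
```

A caller could then pass a pre-built regex and keep its flags, while plain strings behave as they do now.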

Transliterating מִן־הַיְאֹ֗ר

The following Hebrew word:
מִן־הַיְאֹ֗ר

When using this module:
transliterate("מִן־הַיְאֹ֗ר", { isSimple: true, qametsQatan: true });

Outputs:
"min-hayor"

But I would expect it to output:
"min-hayeor"

"min" == "From"
"ha-yeor" == "The Nile"

ye-or ("Nile") is a two syllable word (NOT one syllable).

And here is my proof: Parashat Miketz

It is the second word of the second verse. You can click on that word to hear its pronunciation as vocalized by a native Jewish speaker.

Could you please advise?

${"multiple words and passeq"} | ${"רַ֛עַל ׀ רַ֛עַל"}             | ${"ˈʀ̟aʕal  ˈʀ̟aʕal"}

produces

"ˈʀ̟aʕal ˈ ˈʀ̟aʕal"

The paseq is receiving a stress marker.

It shouldn't.

Is there any clear way to solve this?

Thank you for making this library.
