
Comments (7)

aarppe commented on August 15, 2024

It depends on which script you use to compose the descriptive and normative FSTs.

If you compose it with the normative analyzer as in the crk.hfscript recipe in crk/inc, you attach it to the input side of the normative analyzer, and I believe that requires inverting it first:

echo 'Composing spelling relaxation transducer with normative analyser transducer to create descriptive analyser.'

hfst-compose -F -1 crk-orth.hfst -2 crk-anl-norm.hfst | hfst-minimize - -o crk-anl-desc.hfst

If you build it with the standard Giella Makefile, I believe that knows how to compile it properly (and to invert it when necessary).

The one annoying difference between HFST on the one hand and XFST and FOMA on the other is how they interpret the input/surface/lower and output/underlying/upper sides in some compositions. I've never fully understood this: to me, lookup means looking up, so the input would be the lower side and the analysis the upper side, which is also the underlying form, and that gets confusing. According to folks in the know, the HFST solution is the more proper one, but that is little consolation when XFST and FOMA operate differently.
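
To make the direction question concrete, here is a toy sketch in Python. It is not how HFST, XFST or FOMA represent transducers: a "transducer" here is just a finite set of (input, output) string pairs, and the relaxation pairs are hand-written stand-ins rather than the actual contents of spellrelax.regex. It assumes the relaxation is written in the standard-to-relaxed direction, and shows why it then has to be inverted before it can sit on the input side of the normative analyzer.

```python
# Toy model only: a transducer is a finite set of (input, output) pairs.
Relation = set[tuple[str, str]]


def invert(r: Relation) -> Relation:
    """Swap the input and output sides of every pair."""
    return {(out, inp) for inp, out in r}


def compose(first: Relation, second: Relation) -> Relation:
    """Feed the output side of `first` into the input side of `second`."""
    return {(x, z) for x, y in first for y2, z in second if y == y2}


def lookup(fst: Relation, word: str) -> list[str]:
    """Return every output paired with `word` on the input side."""
    return sorted(out for inp, out in fst if inp == word)


# Hypothetical relaxation, written as standard spelling -> relaxed spelling.
spellrelax: Relation = {("acâhkos", "acâhkos"), ("acâhkos", "acuhkos")}

# Normative analyzer: standard surface form -> analysis.
normative: Relation = {("acâhkos", "acâhkos+N+A+Sg")}

descriptive = compose(invert(spellrelax), normative)
uninverted = compose(spellrelax, normative)

print(lookup(descriptive, "acuhkos"))  # ['acâhkos+N+A+Sg']
print(lookup(uninverted, "acuhkos"))   # []
```

Without the inversion, the relaxed spelling ends up on the side that is fed into the analyzer's input, where nothing matches it, so only the standard spelling is analyzed.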

from plains-cree-fsts.

aarppe commented on August 15, 2024

We will probably also want a spell-relax rule for word-final -uh, so that neeyuh is recognized as nîya, and perhaps rules for some other kinds of interference from English.
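
As a rough illustration of what such a rule would have to do, here is a small Python sketch. It is not written in the regex formalism that spellrelax.regex uses, and both the function name and the choice to map final -uh to -a are assumptions made for the example. It only shows the optional rewrite: the original spelling is kept, and a variant without the final -uh is added as a further candidate.

```python
def final_uh_candidates(word: str) -> set[str]:
    """Return the word itself plus, if it ends in 'uh', a variant ending in 'a'.

    Hypothetical helper: the rewrite is optional, so the input spelling
    always stays in the candidate set.
    """
    candidates = {word}
    if word.endswith("uh"):
        candidates.add(word[:-2] + "a")
    return candidates


print(sorted(final_uh_candidates("neeyuh")))  # ['neeya', 'neeyuh']
print(sorted(final_uh_candidates("neeya")))   # ['neeya']
```

The candidate neeya is a spelling that the descriptive analyzer in this thread already handles, so a rule of this shape would only need to add the final -uh step.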


aarppe commented on August 15, 2024

The latest spellrelax file and how it is compiled (from crk.hfscript) should be the following:

hfst-regexp2fst -S -i $GTLANG_crk/src/orthography/spellrelax.regex | hfst-invert -o crk-orth.hfst

As far as I can remember, I've tried to have only one spellrelax file for all purposes.

For the record, I've also tried to specify exactly the same modifications in the error model for the spell-checker, though that requires the different formalism used in the *.default.txt files.


eddieantonio commented on August 15, 2024

It needs to be INVERTED?


eddieantonio commented on August 15, 2024

These are the results I get when the spell relax is INVERTED:

echo "acuhkos" | flookup crk-descriptive-analyzer.fomabin
acuhkos	acâhkos+N+A+Sg
acuhkos	ahcahk+N+A+Der/Dim+N+A+Sg
acuhkos	atâhk+N+A+Der/Dim+N+A+Sg
echo "neeya" | flookup crk-descriptive-analyzer.fomabin
neeya	niya+Pron+Pers+1Sg
neeya	niyâ+Ipc
echo "neeyu" | flookup crk-descriptive-analyzer.fomabin
neeyu	niyâ+Ipc
neeyu	niya+Pron+Pers+1Sg

Does this look right?


aarppe commented on August 15, 2024

Yup. I'm relying on the standard descriptive HFST in giella (our own script doesn't incorporate the regexes for the various permutations of capitalization, but that shouldn't matter here):

hfst-optimized-lookup -q src/analyser-gt-desc.hfstol

You should also get the same results with ucukos and ucuhkos.


eddieantonio commented on August 15, 2024

I inverted the FST as of 751f86f and now it works! Closing.

