giellalt / lang-fao

Finite state and Constraint Grammar based analysers and proofing tools, and language resources for the Faroese language

Home Page: https://giellalt.uit.no

License: GNU General Public License v3.0

Makefile 2.14% Shell 3.76% M4 2.73% Perl 0.06% Regular Expression 1.16% XML 0.21% YAML 9.07% Text 80.87%
finite-state-transducers constraint-grammar minority-language nlp language-resources proofing-tools giellalt-langs maturity-prod geo-nordic langfam-indoeuropean

lang-fao's Issues

fst adds ?? reading in some, but not all cases

"<Føroyum>"
        "Føroyar" N Prop Fem Pl Dat Indef <W:0.0>
        "Føroyum" ?? <W:0.0>
"<.>"
        "." CLB <W:0.0>
: 
"<Tá ið>"
        "tá ið" CS <W:0.0>
: 
"<ístíðirnar>"
        "ístíð" N Fem Pl Acc Def <W:0.0>
        "ístíð" N Fem Pl Nom Def <W:0.0>
        "ístíðirnar" ?? <W:0.0>
        "tíð" N Fem Pl Acc Def <W:10.0>
                "ísur" CmpNP/None N Msc Sg Acc Cmp <W:10.0>
        "tíð" N Fem Pl Nom Def <W:10.0>
                "ísur" CmpNP/None N Msc Sg Acc Cmp <W:10.0>
        "tíð" N Fem Pl Acc Def <W:10.0>
                "ísur" CmpNP/None N Msc Sg Gen Cmp <W:10.0>
        "tíð" N Fem Pl Nom Def <W:10.0>
                "ísur" CmpNP/None N Msc Sg Gen Cmp <W:10.0>
: 
"<vóru>"
        "vera" V Ind Prt Pl <W:0.0>
        "vóru" ?? <W:0.0>
: 
"<av>"
        "av" ?? <W:0.0>
        "av" Interj <W:0.0>
        "av" Pr <W:0.0>
: 
"<um>"
        "um" ?? <W:0.0>
        "um" CS <W:0.0>
        "um" Pr <W:0.0>
: 

... for the command

cat misc/freecorpus.txt |hfst-tokenize -cg tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst

I have no idea what is happening.
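
To see how widespread the problem is, the tokeniser output can be scanned for cohorts that carry both a ?? reading and at least one ordinary reading. The following is only a minimal sketch: the function name `mixed_unknown_cohorts` and the sample string are made up for illustration (the sample is copied from the output above, with tabs for indentation), and it assumes the usual CG stream layout, where cohort lines start with `"<` and reading lines are indented.

```python
import re

# A reading line: leading whitespace, a quoted lemma, then the tags.
READING_RE = re.compile(r'\s+"([^"]*)"\s*(.*)')

def mixed_unknown_cohorts(cg_text):
    """Return wordforms whose cohort has a ?? reading AND a regular reading."""
    results = []
    wordform = None
    has_unknown = has_known = False

    def flush():
        # Record the cohort just finished if it mixes both kinds of reading.
        if wordform is not None and has_unknown and has_known:
            results.append(wordform)

    for line in cg_text.splitlines():
        if line.startswith('"<'):
            flush()
            wordform = line.strip()[2:-2]  # strip the surrounding "< and >"
            has_unknown = has_known = False
            continue
        match = READING_RE.match(line)
        if match is None:
            continue  # blank lines, ":" separators, removed ";" readings
        if "??" in match.group(2).split():
            has_unknown = True
        else:
            has_known = True
    flush()
    return results

# Sample copied from the output in this issue (illustration only).
SAMPLE = (
    '"<Føroyum>"\n'
    '\t"Føroyar" N Prop Fem Pl Dat Indef <W:0.0>\n'
    '\t"Føroyum" ?? <W:0.0>\n'
    '"<.>"\n'
    '\t"." CLB <W:0.0>\n'
    ': \n'
    '"<Tá ið>"\n'
    '\t"tá ið" CS <W:0.0>\n'
    ': \n'
    '"<vóru>"\n'
    '\t"vera" V Ind Prt Pl <W:0.0>\n'
    '\t"vóru" ?? <W:0.0>\n'
)

print(mixed_unknown_cohorts(SAMPLE))  # ['Føroyum', 'vóru']
```

Running this over the full output of the command above would give a list of affected wordforms, which might help narrow down which part of the tokeniser adds the extra reading.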

CLB analysis lost after ABBR in grammar checker

echo kvæð. | ./tools/grammarcheckers/modes/trace-faogram-release3-mwe-split.mode 
"<kvæð.>"
	"kvæð" ABBR Gram/IAbbr N Abbr <W:0.0>
;	"." CLB <W:0.0> "<.>"
;		"kvæð" ABBR Gram/IAbbr N Abbr <W:0.0> "<kvæð>" REMOVE:1709:longest-match
;	"." CLB <W:0.0> "<.>"
;		"kvæða" V Imp Sg <W:0.0> "<kvæð>" REMOVE:1709:longest-match
:\n

The correct analysis should have been the second one, which would have given a separate .+CLB analysis after cg-mwesplit.

Please see SME and other Sami Languages for working examples.

dependency.cg3 - ! is not a comment in CG-3

make[3]: Entering directory '/build/giella-fao-0.2.0+g3666~3246a37f-1~sid1/src/cg3'
echo "! missing dependency dependency.cg3" > dependency.cg3
"/usr/bin/cg-comp" dependency.cg3 dependency.bin
dependency.cg3: Error: Garbage data encountered on line 1 near `! missing dependency`!

The ! should be #: CG-3 comments start with #, not !.

fao-dis.rle works on victorio, but not on the mac (

This issue was created automatically with bugzilla2github

Bugzilla Bug 707

Date: 2008-07-05T08:25:17+02:00
From: Trond Trosterud <<trond.trosterud>>
To: Trond Trosterud <<trond.trosterud>>
CC: gunnar.hrafn

Last updated: 2008-08-10T09:33:18+02:00

Adjectives lack the three-letter limit for compounding ⇒ mysterious suggestions

Compare this:

[Screenshot, 2020-08-27 at 09:26]

The first and third suggestions are fine, the second is not. Its analysis is:

echo eygaóð | hfst-lookup -q src/analyser-gt-norm.hfstol 
eygaóð	eyga+N+Neu+Sg+Gen+Cmp#óður+A+Fem+Sg+Nom+Indef
eygaóð	eyga+N+Neu+Sg+Gen+Cmp#óður+A+Neu+Pl+Acc+Indef
eygaóð	eyga+N+Neu+Sg+Gen+Cmp#óður+A+Neu+Pl+Nom+Indef

This is clearly not a compound we want. @Trondtr, is this something you can look at? @fo-raettstavari, any views?
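
One way to find more cases like this is to filter hfst-lookup output for compounds whose last part is an adjective. The sketch below does only that: it does not implement the three-letter limit itself, it just lists candidates for manual review. The function name `adjective_final_compounds` is hypothetical, the first three sample lines are the analyses quoted above, and the last sample line is a made-up unknown word added to show that non-compounds are skipped. It assumes tab-separated output with `#` as the compound boundary and `+` between tags, as in the analyses above.

```python
def adjective_final_compounds(lookup_output):
    """List (surface, first-part lemmas, last lemma) for compounds ending in +A."""
    hits = []
    for line in lookup_output.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank separator lines between lookups
        surface, analysis = fields[0], fields[1]
        parts = analysis.split("#")
        if len(parts) < 2:
            continue  # not a compound
        last_lemma, *last_tags = parts[-1].split("+")
        if "A" in last_tags:
            first_lemmas = [part.split("+")[0] for part in parts[:-1]]
            hits.append((surface, first_lemmas, last_lemma))
    return hits

SAMPLE = (
    "eygaóð\teyga+N+Neu+Sg+Gen+Cmp#óður+A+Fem+Sg+Nom+Indef\n"
    "eygaóð\teyga+N+Neu+Sg+Gen+Cmp#óður+A+Neu+Pl+Acc+Indef\n"
    "eygaóð\teyga+N+Neu+Sg+Gen+Cmp#óður+A+Neu+Pl+Nom+Indef\n"
    "\n"
    "xyz\txyz+?\tinf\n"  # made-up unknown word, not from the issue
)

for surface, first, last in adjective_final_compounds(SAMPLE):
    print(surface, first, last)
```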

C-deletion in verbs (

This issue was created automatically with bugzilla2github

Bugzilla Bug 708

Date: 2008-07-09T23:04:08+02:00
From: Gunnar Hrafn <<gunnar.hrafn>>
To: Trond Trosterud <<trond.trosterud>>

Last updated: 2011-08-09T23:19:02+02:00

... backtracking around the substring ... but ... no analyses.

There are many error messages like:
Warning: The analysis of "<t.d.>" has backtracking around the substring "<t.d>", but that substring has no analyses.
The lexc lexicon structure is:

t.d ab-dot-trab ;

LEXICON ab-dot-trab ab-dot-noun-trab ; ! assuming noun

LEXICON ab-dot-noun-trab    +ABBR+Gram/TAbbr:    ab-dot-noun ;

LEXICON ab-dot-noun   !!= * **@CODE@**  This is the lexicon for abbrs that must have a period.
+N+Abbr:  dot-infl ;

LEXICON dot-infl   !!= * **@CODE@**
!!= **@LEXNAME@**
DOT ;

LEXICON DOT   !!= * **@CODE@** - Adds the dot to dotted abbreviations.
!!= **@LEXNAME@**

 +Use/-PMatch:%. # ; ! We need the dot here for regular fsts
! Split the abbr + full stop in two segments, but only when using pmatch:
< "@P.Pmatch.Loc@" {.} "+CLB":0 "+Use/PMatch":0 > # ;
! Make a regular ABBR analysis AND backtrack to find alternative analyses:
< "+Use/PMatch":0 "@P.Pmatch.Backtrack@" 0:%. > # ;

The error message is correct: "t.d" (the version without the final dot) indeed has no analysis. That leaves two possibilities:

Either the warning points to a real problem, and we should change the fst (although I do not see what the problem would be),
or everything is fine, in which case I suggest we get rid of the warning.
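
To judge which of the two it is, it may help to know which abbreviations trigger the warning and how often. Below is a minimal sketch that counts the warnings in a captured log. The regular expression follows the wording of the warning quoted above. The function name `count_backtrack_warnings` and the sample log are made up for illustration, and how the log is captured (for example by redirecting the tokeniser's error output to a file) is an assumption.

```python
import re
from collections import Counter

# Pattern built from the warning text quoted in this issue.
WARNING_RE = re.compile(
    r'Warning: The analysis of "<(?P<form>.+?)>" has backtracking around '
    r'the substring "<(?P<sub>.+?)>", but that substring has no analyses\.'
)

def count_backtrack_warnings(log_text):
    """Count warnings per (full form, unanalysed substring) pair."""
    counts = Counter()
    for match in WARNING_RE.finditer(log_text):
        counts[(match.group("form"), match.group("sub"))] += 1
    return counts

# Illustrative log: the warning from this issue twice, plus an unrelated line.
LOG = (
    'Warning: The analysis of "<t.d.>" has backtracking around the '
    'substring "<t.d>", but that substring has no analyses.\n'
    'some unrelated output line\n'
    'Warning: The analysis of "<t.d.>" has backtracking around the '
    'substring "<t.d>", but that substring has no analyses.\n'
)

for (form, sub), n in count_backtrack_warnings(LOG).most_common():
    print(f"{n}\t{form}\t{sub}")
```

If every reported substring turns out to be a dotted abbreviation without its final dot, that would support the second reading, that the warning is harmless for these entries.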

Preprocessor does not give newline for words. (

This issue was created automatically with bugzilla2github

Bugzilla Bug 653

Date: 2008-02-10T20:42:18+01:00
From: Trond Trosterud <<trond.trosterud>>
To: Saara Huhmarniemi <<saara.huhmarniemi>>

Last updated: 2008-03-14T09:34:26+01:00
