Comments (3)

GoogleCodeExporter commented on June 15, 2024
I created a method, but it has a small flaw: a word also gets compounded 
with itself, which is nonsense. The idea is:
{{{
LEXICON Root
Noun1 ;

LEXICON Noun1
cat   Noun2;
city  Noun2;
fox   Noun2;
panic Noun2;
try   Noun2;
watch Noun2;
      Noun2;

LEXICON Noun2
0:cat   Ninf;
0:city  Ninf;
0:fox   Ninf;
0:panic Ninf;
0:try   Ninf;
0:watch Ninf;
}}}

The result I get:
{{{
catsnék
catnak
catt
cat
catcatsnék
catcatnak
catcatt
catcat
catwatchnak
catwatcht
catwatch
catwatchesnék
catpanicsnék
catpanicnak
catpanict
catpanic
catfoxnak
catfoxt
catfox
catfoxesnék
}}}

where catcat is nonsense.

Does anybody have an idea how to avoid compounding a word with itself?

In reality, Noun1 and Noun2 should contain the same word set, 
around 50,000 words, and I am also considering a third and fourth lexicon for
triple and quadruple compounds.
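
The intended result can be pictured with a minimal Python sketch (not lexc). The function name `compounds` and the three-word list are made up for illustration, and for triple compounds the sketch assumes that no word may occur twice anywhere in the compound, which is only one possible reading of the requirement.

```python
from itertools import permutations

def compounds(words, parts=2):
    """Join every ordered selection of `parts` distinct words."""
    # permutations() never reuses an element, so "catcat" cannot occur.
    return ["".join(combo) for combo in permutations(words, parts)]

words = ["cat", "fox", "watch"]  # toy stand-in for the real word list
print(compounds(words))           # two-part compounds
print(len(compounds(words, 3)))   # number of three-part compounds
```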

Original comment by [email protected] on 28 Sep 2012 at 1:58

from foma.

GoogleCodeExporter commented on June 15, 2024
I have found a solution for filtering out identical elements. Maybe this could go 
into the documentation.
{{{
! eq4.lexc: the first parts of the compound words; these words do not get
! any ending.

Multichar_Symbols +Noun 
LEXICON Root
+Noun:0     Nouns ;

LEXICON Nouns
cat   #;
dog   #;
horse #;

! eq41.lexc: the second parts of the compound words; these words get all
! inflectional endings.

Multichar_Symbols +Noun +Def +Indef +Nom +Acc +Gen +Plur
   +Prep+ +Art+ uN aN iN
LEXICON Root
+Noun:0     Nouns ;

LEXICON Nouns
cat   AddNoun;
dog   AddNoun;
horse AddNoun;
rat   AddNoun;
nyuszi AddNoun;

LEXICON AddNoun
+Acc:#%^t   #;
+Plur:#%^s  #;

#
# eq4.foma: reads in the lexc files,
#  adds delimiters, finds the compounds whose two parts are identical,
#  and filters them out by taking the difference
#
read lexc eq4.lexc
define Lexicon
read lexc eq41.lexc
define Lexicon2
# add delimiters
define Lex1  %< Lexicon %# %< Lexicon2 ;
# get the compounds whose two parts are identical, using _eq
define Dlex [_eq( Lex1 , %< , %#)];
# remove the delimiters >, < and #
define CleanupTags %> -> 0 ,,
                   %< -> 0 ,,
                   %# -> 0;
# Grammar: subtract the identical compounds, then clean up
define Grammar Lex1 - Dlex .o.
               CleanupTags
                        ;
regex Grammar;


Run result:
$ foma -l eq4.foma
...
foma[1]: lower-words
horsedog^s
horsedog^t
horsecat^s
horsecat^t
horserat^s
horserat^t
horsenyuszi^s
horsenyuszi^t
doghorse^s
doghorse^t
dogcat^s
dogcat^t
dograt^s
dograt^t
dognyuszi^s
dognyuszi^t
cathorse^s
cathorse^t
catdog^s
catdog^t
catrat^s
catrat^t
catnyuszi^s
catnyuszi^t
foma[1]: 
}}}
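
In plain set terms, the script builds every delimited pair, picks out the pairs whose two parts are equal, subtracts them, and strips the delimiters. The following Python sketch models those four steps on the lower-side strings only. It is not foma code: the helper names are invented for illustration, and the regular expression merely imitates what `_eq` is used for here.

```python
import re

FIRST = ["cat", "dog", "horse"]                    # words from eq4.lexc
SECOND = ["cat", "dog", "horse", "rat", "nyuszi"]  # words from eq41.lexc
ENDINGS = ["#^t", "#^s"]                           # lower sides of AddNoun

def build_lex1():
    """Model of Lex1: '<' first '#' '<' second ending."""
    return {f"<{a}#<{b}{e}" for a in FIRST for b in SECOND for e in ENDINGS}

def identical_parts(strings):
    """Model of Dlex: keep strings whose two delimited parts are equal."""
    same = re.compile(r"<([^<#]*)#<\1#")
    return {s for s in strings if same.match(s)}

def cleanup(s):
    """Model of CleanupTags: delete the delimiters '<', '>' and '#'."""
    return re.sub(r"[<>#]", "", s)

lex1 = build_lex1()
grammar = sorted(cleanup(s) for s in lex1 - identical_parts(lex1))
print(len(grammar))
print("\n".join(grammar))
```

With the word lists from the two lexc files this yields 24 forms, the same count as in the run result above, and none of them repeats a word.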

Original comment by [email protected] on 7 Oct 2012 at 12:13

GoogleCodeExporter commented on June 15, 2024

Original comment by [email protected] on 7 Oct 2012 at 12:15
