Comments (3)
I created a method, but it has a small flaw: a word also gets compounded
with itself, which is nonsense. The idea is:
{{{
LEXICON Root
Noun1 ;
LEXICON Noun1
cat Noun2;
city Noun2;
fox Noun2;
panic Noun2;
try Noun2;
watch Noun2;
Noun2;
LEXICON Noun2
0:cat Ninf;
0:city Ninf;
0:fox Ninf;
0:panic Ninf;
0:try Ninf;
0:watch Ninf;
}}}
This is the result I get:
{{{
catsnék
catnak
catt
cat
catcatsnék
catcatnak
catcatt
catcat
catwatchnak
catwatcht
catwatch
catwatchesnék
catpanicsnék
catpanicnak
catpanict
catpanic
catfoxnak
catfoxt
catfox
catfoxesnék
}}}
where catcat is nonsense.
Does anybody have an idea how to avoid using the same word twice?
In reality Noun1 and Noun2 should contain the same word set,
around 50,000 words, and I am also thinking of a third and a fourth lexicon for
triple and quadruple compounds.
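To illustrate why catcat shows up: Noun1 and Noun2 list the same stems, and every Noun1 entry continues into every Noun2 entry, so the lexicon describes the full cross product of the two lists. The following is only a plain Python sketch of that set logic, not lexc; it uses the stems from the lexicon above and ignores the endings and the empty `Noun2;` entry.

```python
from itertools import product

# Stems that appear in both Noun1 and Noun2 in the lexicon above.
stems = ["cat", "city", "fox", "panic", "try", "watch"]

# Every first part combined with every second part:
# this is what the Noun1 -> Noun2 continuation describes.
compounds = {first + second for first, second in product(stems, stems)}

# The unwanted forms are the pairs whose two parts are identical.
self_compounds = {stem + stem for stem in stems}

print("catcat" in compounds)
print(sorted(self_compounds))
```

Removing `self_compounds` from `compounds` is the set operation that the lexicon alone cannot express, because a continuation class does not know which entry led into it.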
Original comment by [email protected]
on 28 Sep 2012 at 1:58
I have found a solution for filtering out identical elements. Maybe this could go
into the documentation.
{{{
! eq4.lexc: here are the first parts of the compound words;
! these words do not get any ending.
Multichar_Symbols +Noun
LEXICON Root
+Noun:0 Nouns ;
LEXICON Nouns
cat #;
dog #;
horse #;
}}}
{{{
! eq41.lexc: the second parts of the compound words;
! these words get all the inflectional endings.
Multichar_Symbols +Noun +Def +Indef +Nom +Acc +Gen +Plur
+Prep+ +Art+ uN aN iN
LEXICON Root
+Noun:0 Nouns ;
LEXICON Nouns
cat AddNoun;
dog AddNoun;
horse AddNoun;
rat AddNoun;
nyuszi AddNoun;
LEXICON AddNoun
+Acc:#%^t #;
+Plur:#%^s #;
}}}
{{{
#
# eq4.foma: reads in the lexc files,
# adds delimiters, finds the identical words, takes the difference,
# and filters out the delimiters
#
#
read lexc eq4.lexc
define Lexicon
read lexc eq41.lexc
define Lexicon2
# add delimiters
define Lex1 %< Lexicon %# %< Lexicon2 ;
# get identical words using _eq
define Dlex [_eq( Lex1 , %< , %#)];
# remove the delimiters >, < and #
define CleanupTags %> -> 0 ,,
%< -> 0 ,,
%# -> 0;
# Grammar: Lex1 minus the identical pairs, with the delimiters removed
define Grammar Lex1 - Dlex .o.
CleanupTags
;
regex Grammar;
}}}
Run result:
{{{
$ foma -l eq4.foma
...
foma[1]: lower-words
horsedog^s
horsedog^t
horsecat^s
horsecat^t
horserat^s
horserat^t
horsenyuszi^s
horsenyuszi^t
doghorse^s
doghorse^t
dogcat^s
dogcat^t
dograt^s
dograt^t
dognyuszi^s
dognyuszi^t
cathorse^s
cathorse^t
catdog^s
catdog^t
catrat^s
catrat^t
catnyuszi^s
catnyuszi^t
foma[1]:
}}}
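The steps of eq4.foma can be mirrored in a few lines of plain Python to check the expected output. This is only an illustrative sketch of the set logic and does not use foma; the word lists and endings are copied from the two lexc files above, and `delimited_parts` is a hypothetical helper that stands in for the role `_eq` plays in the script.

```python
import re

first_parts = ["cat", "dog", "horse"]                    # eq4.lexc
second_parts = ["cat", "dog", "horse", "rat", "nyuszi"]  # eq41.lexc
endings = ["^t", "^s"]                                   # AddNoun lower sides after the '#'

# Lex1: '<' first part '#' '<' second part '#' ending
lex1 = {
    f"<{a}#<{b}#{e}"
    for a in first_parts
    for b in second_parts
    for e in endings
}

def delimited_parts(s):
    """Return the substrings enclosed between '<' and '#'."""
    return re.findall(r"<([^<#]*)#", s)

# Dlex: the strings whose delimited parts are all identical.
dlex = {s for s in lex1 if len(set(delimited_parts(s))) == 1}

# Grammar: the difference, with the delimiters stripped.
grammar = sorted(re.sub(r"[<>#]", "", s) for s in lex1 - dlex)

for word in grammar:
    print(word)
```

The sketch yields the same 24 forms as the foma run above (in a different order), with catcat, dogdog and horsehorse filtered out.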
Original comment by [email protected]
on 7 Oct 2012 at 12:13
Original comment by [email protected]
on 7 Oct 2012 at 12:15