
Special cases do not need e to é conversion. How to get that done?

GoogleCodeExporter commented on August 22, 2024

Special cases do not need e to é conversion. How to get that done?

from foma.

Comments (5)

GoogleCodeExporter commented on August 22, 2024

I found a solution:
define Etoee e -> é || _ "^" [ \0 & \k ] ; # \0: not zero; \k: kor and ként excluded

and 
# ként excluded
define HarmRuleC C -> á // BackVowel \Vowel*  _ %^  [ \0 & \k ] .o.
                 C -> é // FrontVowel \Vowel* _ %^  [ \0 & \k  ] .o.
                 C -> a // BackVowel \Vowel*  _ %^ [ 0 ] .o. 
                 C -> e // FrontVowel \Vowel* _ %^ [ 0 ] ;

That works, because both special cases start with 'k'.

Is there a way to say directly that case '+For' and case '+Tem' should be 
excluded from a rule?

Original comment by [email protected] on 4 Jan 2012 at 11:29


GoogleCodeExporter commented on August 22, 2024

A brief comment: usually, if a rule is phonologically conditioned, it's a good 
idea to capture it with a rewrite rule, like you've done. 

On the other hand, if you're dealing with an exception, sometimes it's easier 
to mark it so in the lexicon, and have the rules bypass the exception. For 
example, in this instance, you could have marked those words where e does not 
alternate with é as, say E in the lexicon. That is, something like regE 
instead of rege. Then the rule won't affect that word, and you can place a rule 
like `E -> e` after the other rules.  Note that generally, it's most convenient 
to place the `E` only on the lower side (because you want the original form on 
the lexical side), so the entry should read something like:

{{{
rege:regE 
}}}
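
The ordering described above can be modelled outside foma. What follows is a minimal Python sketch, not foma code: the function name `surface` and the simplified rule (e becomes é whenever a non-empty suffix follows the `^` boundary) are assumptions made for illustration only.

```python
import re


def surface(lower_side: str) -> str:
    """Apply three ordered rewrite steps to a lower-side string such as 'rege^t'."""
    # Step 1: e -> é directly before "^" when at least one symbol follows.
    # The placeholder E is a different symbol, so marked entries are skipped.
    out = re.sub(r"e(?=\^.)", "é", lower_side)
    # Step 2: ordered after step 1, the placeholder surfaces as plain e.
    out = out.replace("E", "e")
    # Step 3: remove the morpheme boundary.
    return out.replace("^", "")


if __name__ == "__main__":
    print(surface("rege^t"))  # regular entry
    print(surface("regE^t"))  # entry marked as an exception in the lexicon
```

Because the `E -> e` step runs last, a marked entry never presents an `e` to the lengthening rule, which is the point of marking the exception on the lower side.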

Minor detail, in `[ \0 & \k ]` the `\0`-part is redundant.

Original comment by [email protected] on 4 Jan 2012 at 2:18


GoogleCodeExporter commented on August 22, 2024

The word rege is NOT an exception, but completely regular. The two endings 
(+Tem and +For) are the exceptions.

 #regét- Acc
 #regéhez- All
 # and so on...
 BUT
 #regekor- Tem
 #regeként - For
 #rege  - Nom

I cannot find out how to say this in proper regular-expression form:
no ending, the ending "ként" and the ending "kor" do not need e->é; all
other endings do.

This is ok:
define Etoee e -> é || _ "^"  [ \k ]  ; # kor and ként excluded

but every attempt to expand \k into \(ként) and \(kor), such as:
define Etoee e -> é || _ "^"  [ \k \é \n \t | \k \o \r]  ; 
define Etoee e -> é || _ "^"  \[ k é n t | k o r ]  ; 
fails, because

 #apply down> rege+Noun+Acc
 #reget

comes out wrong.
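
One reading of why the first attempt yields `reget`: in foma, `\k` denotes exactly one symbol other than `k`, so `\k \é \n \t` demands four symbols after the boundary and `\k \o \r` demands three. A one-symbol ending such as `t` is too short to satisfy either branch, so the rule never fires. The Python sketch below models that reading only; `matches_attempt` is a hypothetical helper, and taking `t` as the accusative ending is an assumption based on the output above.

```python
def matches_attempt(suffix: str) -> bool:
    """Model the right context of the first attempt, applied after the boundary."""
    for excluded in ("ként", "kor"):
        # Each backslash term consumes exactly one symbol that differs from
        # the named one, so the suffix needs at least len(excluded) symbols
        # and must differ from `excluded` at every one of those positions.
        if len(suffix) >= len(excluded) and all(
            s != x for s, x in zip(suffix, excluded)
        ):
            return True
    return False


if __name__ == "__main__":
    for ending in ("t", "hez", "kor", "ként"):
        print(ending, matches_attempt(ending))
```

Under this model the context is a position-by-position test on a fixed number of symbols, not a test for "anything except the string ként or kor".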

How can I say: If no ending or ending = ként or ending = kor, no e->é rule, 
otherwise e->é rule?

I am worried that \k is a bit too imprecise.
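
The condition asked for here, and the way `\k` only approximates it, can be written down as two predicates. This is a Python sketch rather than foma code; the function names are made up, and the ending `k` mentioned in the note after the code is a hypothetical example that does not occur in the thread.

```python
NO_LENGTHENING = {"", "kor", "ként"}  # no ending, +Tem, +For


def wanted(suffix: str) -> bool:
    """Desired behaviour: e -> é unless the ending is empty, kor or ként."""
    return suffix not in NO_LENGTHENING


def backslash_k(suffix: str) -> bool:
    """What a bare not-k context tests: a first symbol exists and is not k."""
    return suffix != "" and suffix[0] != "k"


if __name__ == "__main__":
    for ending in ("", "t", "hez", "kor", "ként", "k"):
        print(repr(ending), wanted(ending), backslash_k(ending))
```

The two predicates agree on every ending quoted in the thread (empty, `t`, `hez`, `kor`, `ként`) and differ only for an ending that begins with `k` without being `kor` or `ként`, which is where the imprecision would show up.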

Original comment by [email protected] on 5 Jan 2012 at 9:28


GoogleCodeExporter commented on August 22, 2024

I also tried appending +Abl, +Acc ... +Tem to the lower side of each form and 
then triggering the rule on those tags, without success.

Lexc:
LEXICON Case
+Abl:^tUl+Abl      #;
+Acc:^Gt+Acc       #;
...

.foma:

define Grammar Lexicon            .o.
               Etoee             ; #.o.   Here I stop

Etoee looks like this:
define Etoee e -> é || .#. \"^"+ _ "^" ?* [ "+" A b l | "+" A c c | "+" {Ade} |
    "+" {All} | "+" {Cau} | "+" {Dat} | "+" {Del} | "+" {Ela} | "+" {Fac} |
    "+" {For} | "+" {Ill} | "+" {Ine} | "+" {Ins} | "+" {Nom} | "+" {Sub} |
    "+" {Sup} | "+" {Ter} ] ?* ;

I tried both the A b l and the {Ade} notation; neither works.

Results:
foma[1]: down
apply down> rege+Noun+Abl
rege^tUl+Abl
apply down> rege+Noun+Acc
rege^Gt+Acc
apply down> rege+Noun+Ade
rege^nDl+Ade
apply down> 

 No e->é anywhere :-(
foma[1]: lower-words
rege^ig+Ter
rege^Pn+Sup
rege^rF+Sub
rege^+Nom
rege^VFl+Ins
rege^bFn+Ine
rege^bF+Ill
rege^ként+For
rege^VD+Fac
rege^bUl+Ela
rege^rUl+Del
rege^nFk+Dat
rege^ért+Cau
rege^hIz+All
rege^nDl+Ade
rege^Gt+Acc
rege^tUl+Abl

The strange thing is that I had made the same modification to the English 
lexc/foma files before:
in lexc:
LEXICON Vinf
+V+PresPart:^ing+PP #;

in foma:
define ConsonantDoubling g -> g g ||  .#. \"^"+ _ "^" ?* [ "+" {PP} | e d ] ?*;
...
define CleanupPP [ "+" {PP} ] -> 0;

define Grammar Lexicon           .o. 
               ConsonantDoubling .o. 
...
               CleanupPP         .o.
               Cleanup;

regex Grammar;


That works perfectly well:
lower-words
beg
begs
begging
begged
begged

I attach both the English and the Hungarian files here.
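
The attachments are not reproduced here, so the cause cannot be confirmed. One hypothesis consistent with the output: if `+Abl` and the other case tags are declared under `Multichar_Symbols` in the lexc file, each tag is a single symbol on the lower side, and a context written as `"+" A b l` or `"+" {Abl}` (four separate symbols) cannot match it, whereas `+PP` in the English file would match if it was never declared. The Python sketch below only illustrates that kind of mismatch; `tokenize`, `contains`, and the declared-symbol sets are assumptions, not taken from the attached files.

```python
def tokenize(text, multichar):
    """Split text into symbols, preferring the longest declared multi-character symbol."""
    symbols, i = [], 0
    ordered = sorted(multichar, key=len, reverse=True)
    while i < len(text):
        # Fall back to a single character when no declared symbol starts here.
        match = next((m for m in ordered if text.startswith(m, i)), text[i])
        symbols.append(match)
        i += len(match)
    return symbols


def contains(symbols, pattern):
    """True if the symbol list `pattern` occurs contiguously in `symbols`."""
    n = len(pattern)
    return any(symbols[i:i + n] == pattern for i in range(len(symbols) - n + 1))


if __name__ == "__main__":
    hungarian = tokenize("rege^tUl+Abl", {"+Abl"})  # assumed declaration
    english = tokenize("beg^ing+PP", set())         # assumed: +PP not declared
    print(hungarian, contains(hungarian, ["+", "A", "b", "l"]))
    print(english, contains(english, ["+", "P", "P"]))
```

If this hypothesis holds, writing the tag as one quoted symbol in the rule context would be the thing to try; the thread itself does not say which change made the final version work.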

Original comment by [email protected] on 5 Jan 2012 at 7:56

Attachments:


GoogleCodeExporter commented on August 22, 2024

I have found a solution that looks quite good. I modified the English file 
step by step until it handled the Hungarian nouns as it should. We can close 
this issue.

Original comment by [email protected] on 6 Jan 2012 at 8:41

Attachments:
