hunglish-webapp's People

Contributors

d0lphin, bpgergo

Stargazers

dlacko

hunglish-webapp's Issues

empty query

http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=&qr=&source=all

VelocityServlet: Error processing the template

search exception
javax.servlet.ServletException: search exception
    at mokk.nlp.dictweb.handlers.BiCorpusSearchHandler.handleRequest(BiCorpusSearchHandler.java:146)
    at mokk.nlp.dictweb.handlers.BiCorpusSearchHandler$BCELWrapper.handleRequest(Unknown Source)
    at mokk.nlp.dictweb.MainServlet.handleRequest(MainServlet.java:60)
    at org.apache.velocity.servlet.VelocityServlet.doRequest(VelocityServlet.java:358)
    at org.apache.velocity.servlet.VelocityServlet.doGet(VelocityServlet.java:317)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:200)
    at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:51)
    at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:129)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:125)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:209)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:144)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.catalina.core.StandardContext.invoke(StandardContext.java:2358)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:133)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.valves.ErrorDispatcherValve.invoke(ErrorDispatcherValve.java:118)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:116)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:127)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.coyote.tomcat4.CoyoteAdapter.service(CoyoteAdapter.java:152)
    at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:300)
    at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:374)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:743)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:675)
    at org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:866)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
    at java.lang.Thread.run(Thread.java:595)

Original issue reported on code.google.com by bpgergo on 25 Nov 2009 at 3:21

wrong translation or bug

http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=Azt%E1n+nemsok%E1ra+felt%FBnt+kis+kocsij%E1val+a+b%FCf%E9s+boszork%E1ny&qr=&source=all

hu sen: Aztán nemsokára feltűnt kis kocsijával a büfés boszorkány
en sen: The sky was so dark and the windows so steamy that the lanterns were lit by midday.

Original issue reported on code.google.com by bpgergo on 2 Mar 2010 at 6:19

search for this "-+tuesday"

http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=&qr=-%2Btuesday&source=all

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:47

  • Merged into: #6

fileupload

New feature is needed: users should be able to upload files.

Original issue reported on code.google.com by bpgergo on 11 Dec 2009 at 11:28

switch to up to date release of Lucene

The project uses an unofficial release of Lucene (namely 1.5 rc1 dev)

Let's just switch to the actual release (2.4.1)

Original issue reported on code.google.com by bpgergo on 5 Nov 2009 at 3:05

Documentation / code cleanup

1) write detailed documentation to guide a new user how to build the whole 
thing from source and how to run it on a certain platform (probably only 
Debian/Ubuntu will be supported)
2) Review, clean up code


Original issue reported on code.google.com by bpgergo on 1 Mar 2011 at 5:55

highlight for multiple words

Highlighting should also work for multi-word expressions.

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:49

build issues

build complete webapp from source

Original issue reported on code.google.com by bpgergo on 1 Dec 2009 at 2:32

unicode-capable sentence segmentation needed

A surprise after all these years: huntoken encodes characters that are
not in the Latin-2 table as HTML entities.
Once more, slowly: the output of hu.sen.one.sh is Latin-2 encoded text
augmented with HTML entities. Fortunately hunalign does not completely
fall over on these, and the frontend then displays them nicely.
Still, it would not be nice to throw out the sentences containing entities
during quality filtering, because there are about 220,000 of them.
An example of a file that is full of them:
datasources/hunglish2/en/hemingway-across_the_river_and_into_the_trees.txt
, the reason being that the apostrophes were OCR-ed badly.
Once the pipeline is UTF-8, huntoken will probably not survive it.

Original issue reported on code.google.com by [email protected] on 2 Mar 2011 at 2:57

doc-to-text encoding problem

- Another surprise: there are one or two documents, for example
whole.top1000.sav/harness.data/hu/doc/469.hu.doc
that is,
/big3/Work/HunglishMondattar/datasources/hunglish2/hu/christie-nyaralo_gyilkosok.doc
, on which the hacked tcg/scripts/catdoc_latin2.sh does not work.
Why not? Because "catdoc -dutf-8" is run instead of "catdoc -dISO-8859-2",
and the former for some reason puts Latin-1 versions of ő and ű into the text,
which iconv then throws away.
(Strange, but Word displays the document correctly.)

Who is affected? Roughly these files, although this will have to be checked more precisely:
ls whole.top1000.sav/harness.data/hu/doc/* | while read f ; do echo -n "$f " ; 
cat $f | catdoc -dutf-8 | ( iconv --f utf8 --t latin2 -c || true ) | grep -c " 
n " ; done | grep -v " 0$"

Original issue reported on code.google.com by [email protected] on 2 Mar 2011 at 2:55

search for this "- tuesday"

http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=&qr=-+tuesday&source=all 

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:47

FileUpload issues

1) the file name format should be 192.en.pdf instead of 192_EN.pdf
2) the default author should be null (not the first in the list)
3) "All" should not appear in Genres

Original issue reported on code.google.com by bpgergo on 21 Feb 2011 at 10:15

improve ranking

An idea:
if there are hits from several sources,
the first few hits should preferably come from different sources,
to increase the diversity of the results.
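
One way to read this idea is a round-robin interleave over the sources. The sketch below is only an illustration under that assumption: the class name `SourceDiversifier` and the `sourceOf` accessor are hypothetical, and the issue does not say how many leading hits should be diversified, so this version reorders the whole list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class SourceDiversifier {
    /**
     * Reorders hits round-robin across sources. The relative (relevance)
     * order of hits that share a source is preserved.
     */
    static <T> List<T> diversify(List<T> ranked, Function<T, String> sourceOf) {
        // Group hits by source, remembering the order in which sources first appear.
        Map<String, Deque<T>> bySource = new LinkedHashMap<>();
        for (T hit : ranked) {
            bySource.computeIfAbsent(sourceOf.apply(hit), k -> new ArrayDeque<>()).add(hit);
        }
        // Take one hit from each source in turn until every queue is empty.
        List<T> out = new ArrayList<>(ranked.size());
        while (out.size() < ranked.size()) {
            for (Deque<T> queue : bySource.values()) {
                T next = queue.poll();
                if (next != null) {
                    out.add(next);
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical hits written as "source:rank".
        List<String> hits = List.of("law:1", "law:2", "lit:1", "law:3", "sub:1");
        System.out.println(diversify(hits, h -> h.substring(0, h.indexOf(':'))));
    }
}
```

A full interleave trades some relevance for variety; limiting it to the first page of results would be a gentler variant.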

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:44

html-to-text encoding problem

The stupid html2text leaves genuinely UTF-8 text as UTF-8,
but eagerly converts HTML entities to Latin-1. For the legal texts
the result is a mixture where the header is Latin-1
and the body is UTF-8. For CELEX we could still hard-wire UTF-8
and be done, but out in the wide world there are of course Latin-2 HTML files.
UPDATE: For CELEX I did a manual pre-conversion to Latin-2.

Original issue reported on code.google.com by [email protected] on 2 Mar 2011 at 2:59

indexer testing

test the indexer on the output coming from the new pipeline

Original issue reported on code.google.com by bpgergo on 26 Nov 2009 at 3:00

Disappearing quotation marks bug.

Disappearing quotation marks bug.

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:49

jmorph stemmer does not return unknown words as stems

Bug report: jmorph does not record unknown words as their own stems.
As a result, every unknown word automatically yields zero hits.
http://kozel.mokk.bme.hu:8080/hunglish/search?huSentence=keletkezett&enSentence=happened&doc.genre=-10
http://kozel.mokk.bme.hu:8080/hunglish/search?huSentence=Daala&enSentence=&doc.genre=-10
This applies to both Hungarian and English.


Original issue reported on code.google.com by bpgergo on 24 Jan 2011 at 8:12

clean up absolute-relative paths in .properties files

Absolute paths should appear in as few places as possible in the properties files.
All other paths should be given relative to those absolute paths.

The application's directory layout has stabilized by now, so more than three
absolute paths are almost certainly unnecessary and a sign that something has
not been thought through yet. The three I know of:
1. where the language resources are.
2. where the so-called deployment directory is, under which all data directories
live. From the Java webapp's point of view only three of these are relevant:
fileUpload,hunglishIndex,hunglishIndexTmp.
3. where harness_cronjob.sh is.

The code itself should of course contain no paths at all, neither relative nor
absolute.
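
A minimal sketch of resolving a relative entry against one absolute base, using only `java.nio.file`. The property keys `deployment.dir` and `index.dir` are hypothetical; the issue names the directories but not the keys.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Objects;
import java.util.Properties;

final class DeploymentPaths {
    /**
     * Resolves the relative path stored under relativeKey against the
     * absolute path stored under baseKey.
     */
    static Path resolve(Properties props, String baseKey, String relativeKey) {
        Path base = Paths.get(Objects.requireNonNull(props.getProperty(baseKey), baseKey));
        Path rel = Paths.get(Objects.requireNonNull(props.getProperty(relativeKey), relativeKey));
        if (!base.isAbsolute()) {
            throw new IllegalArgumentException(baseKey + " must be absolute");
        }
        if (rel.isAbsolute()) {
            throw new IllegalArgumentException(relativeKey + " must be relative");
        }
        return base.resolve(rel).normalize();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("deployment.dir", "/srv/hunglish");   // hypothetical key and value
        props.setProperty("index.dir", "hunglishIndex");        // hypothetical key
        System.out.println(resolve(props, "deployment.dir", "index.dir"));
    }
}
```

Rejecting an absolute value for a relative key makes a stray absolute path fail loudly at startup instead of silently bypassing the deployment directory.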

Original issue reported on code.google.com by [email protected] on 1 Mar 2011 at 9:03

duplicates

Duplicate filtering:
we compute a hash from the sentences with punctuation removed and store it
in an extra field.
At indexing time, Lucene should check whether the sentence has already occurred
(we search for the hash),
and if so, we note in an extra field that it is a duplicate.
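
The hashing step could look roughly like the sketch below. It is not the project's code: the class name `DuplicateKey`, the choice of SHA-1, and hashing both sides of the sentence pair together are assumptions, and an in-memory `HashSet` stands in for the index lookup.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashSet;
import java.util.Set;

final class DuplicateKey {
    /** Replaces Unicode punctuation with spaces and collapses whitespace. */
    static String stripPunctuation(String sentence) {
        return sentence.replaceAll("\\p{P}+", " ").replaceAll("\\s+", " ").trim();
    }

    /** Hex SHA-1 of the punctuation-free Hungarian and English sides. */
    static String of(String hu, String en) {
        String normalized = stripPunctuation(hu) + "\t" + stripPunctuation(en);
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest(normalized.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>();   // stands in for searching the index by hash
        String[][] pairs = { {"Szia, vilag!", "Hello, world!"}, {"Szia vilag", "Hello world"} };
        for (String[] p : pairs) {
            boolean duplicate = !seen.add(of(p[0], p[1]));
            System.out.println(p[1] + " duplicate=" + duplicate);
        }
    }
}
```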

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:46

possible memory leak when stopping webapp

Oct 26, 2010 9:23:21 PM org.apache.catalina.loader.WebappClassLoader clearReferencesJdbc
SEVERE: The web application [/hunglish-0.1.0-SNAPSHOT] registered the JBDC driver [com.mysql.jdbc.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.
Oct 26, 2010 9:23:21 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/hunglish-0.1.0-SNAPSHOT] appears to have started a thread named [Timer-1] but has failed to stop it. This is very likely to create a memory leak.
Oct 26, 2010 9:23:21 PM org.apache.catalina.loader.WebappClassLoader clearReferencesThreads
SEVERE: The web application [/hunglish-0.1.0-SNAPSHOT] appears to have started a thread named [MySQL Statement Cancellation Timer] but has failed to stop it. This is very likely to create a memory leak.
Oct 26, 2010 9:23:21 PM org.apache.catalina.loader.WebappClassLoader clearThreadLocalMap
SEVERE: The web application [/hunglish-0.1.0-SNAPSHOT] created a ThreadLocal with key of type [org.aspectj.runtime.internal.cflowstack.ThreadStackFactoryImpl.ThreadStackImpl] (value [org.aspectj.runtime.internal.cflowstack.ThreadStackFactoryImpl$ThreadStackImpl@2ff3fb]) and a value of type [java.util.Stack] (value [[]]) but failed to remove it when the web application was stopped. This is very likely to create a memory leak.
log4j:ERROR LogMananger.repositorySelector was null likely due to error in class reloading, using NOPLoggerRepository.
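
For the first warning, a common remedy is to deregister the JDBC drivers that the webapp's own class loader registered. The sketch below shows only that step with standard `java.sql` calls; the class name `JdbcCleanup` is hypothetical, and it assumes the method would be called from a shutdown hook such as `ServletContextListener.contextDestroyed`. It does not address the timer threads or the ThreadLocal.

```java
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Enumeration;

final class JdbcCleanup {
    /**
     * Deregisters every JDBC driver whose class was loaded by the given
     * class loader and returns how many drivers were removed.
     */
    static int deregisterDriversOf(ClassLoader loader) {
        int removed = 0;
        Enumeration<Driver> drivers = DriverManager.getDrivers();
        while (drivers.hasMoreElements()) {
            Driver driver = drivers.nextElement();
            // Leave drivers owned by the container or by other webapps alone.
            if (driver.getClass().getClassLoader() != loader) {
                continue;
            }
            try {
                DriverManager.deregisterDriver(driver);
                removed++;
            } catch (SQLException e) {
                System.err.println("could not deregister " + driver + ": " + e.getMessage());
            }
        }
        return removed;
    }
}
```

Inside a webapp the argument would typically be `Thread.currentThread().getContextClassLoader()`.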

Original issue reported on code.google.com by bpgergo on 26 Oct 2010 at 7:24

Shortcomings of the JMorph stemmer

e.g.:
http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=leg%E1ltal%E1nosabb&qr=&source=all



Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:43

start page error

Go to this page:
http://szotar.mokk.bme.hu/hunglish/search
The result is:
VelocityServlet: Error processing the template

java.lang.NullPointerException
    at mokk.nlp.dictweb.DefaultRequestDispatcher.dispatch(DefaultRequestDispatcher.java:132)
    at mokk.nlp.dictweb.DefaultRequestDispatcher$BCELWrapper.dispatch(Unknown Source)
    at mokk.nlp.dictweb.MainServlet.handleRequest(MainServlet.java:58)
    at org.apache.velocity.servlet.VelocityServlet.doRequest(VelocityServlet.java:358)
    at org.apache.velocity.servlet.VelocityServlet.doGet(VelocityServlet.java:317)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
    at java.lang.Thread.run(Thread.java:619)


Original issue reported on code.google.com by bpgergo on 14 Oct 2009 at 11:53

&size=10000 is allowed on admin pages

http://kozel.mokk.bme.hu:8080/hunglish/bisen?page=1&size=10000
This would pretty much bring the system down, so we should not allow it. On the
other hand, we would allow a much larger size than the 25 suggested by Roo,
say up to and including 200.
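
The limit can be enforced with a small clamp before the value reaches the query. In this sketch the cap of 200 comes from the issue; falling back to 25 for a missing or non-positive `size` is an assumption, and `PageSize` is a hypothetical helper name.

```java
final class PageSize {
    static final int DEFAULT_SIZE = 25;   // Roo's suggested value; used as the fallback (assumption)
    static final int MAX_SIZE = 200;      // upper bound proposed in the issue, inclusive

    /** Returns a safe page size for a possibly missing or oversized request parameter. */
    static int clamp(Integer requested) {
        if (requested == null || requested < 1) {
            return DEFAULT_SIZE;
        }
        return Math.min(requested, MAX_SIZE);
    }

    public static void main(String[] args) {
        System.out.println(clamp(10000));   // capped at MAX_SIZE
        System.out.println(clamp(null));    // falls back to DEFAULT_SIZE
    }
}
```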

Original issue reported on code.google.com by [email protected] on 2 Mar 2011 at 1:13

húzzál el

http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=h%FAzz%E1l+el&qr=&source=eulaw

VelocityServlet: Error processing the template

right_stemmed in doc #414281does not have any term position data stored
java.lang.IllegalArgumentException: right_stemmed in doc #414281does not have any term position data stored
    at org.apache.lucene.search.highlight.TokenSources.getTokenStream(TokenSources.java:190)
    at mokk.nlp.bicorpus.index.lucene.LuceneBiCorpusSearcher.search(LuceneBiCorpusSearcher.java:283)
    at mokk.nlp.bicorpus.index.lucene.LuceneBiCorpusSearcher$BCELWrapper.search(Unknown Source)
    at mokk.nlp.dictweb.handlers.BiCorpusSearchHandler.handleRequest(BiCorpusSearchHandler.java:143)
    at mokk.nlp.dictweb.handlers.BiCorpusSearchHandler$BCELWrapper.handleRequest(Unknown Source)
    at mokk.nlp.dictweb.MainServlet.handleRequest(MainServlet.java:60)
    at org.apache.velocity.servlet.VelocityServlet.doRequest(VelocityServlet.java:358)
    at org.apache.velocity.servlet.VelocityServlet.doGet(VelocityServlet.java:317)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:740)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:200)
    at org.apache.catalina.core.ApplicationFilterChain.access$000(ApplicationFilterChain.java:51)
    at org.apache.catalina.core.ApplicationFilterChain$1.run(ApplicationFilterChain.java:129)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:125)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:209)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:144)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.catalina.core.StandardContext.invoke(StandardContext.java:2358)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:133)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.valves.ErrorDispatcherValve.invoke(ErrorDispatcherValve.java:118)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:116)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:594)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:127)
    at org.apache.catalina.core.StandardPipeline$StandardPipelineValveContext.invokeNext(StandardPipeline.java:596)
    at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:433)
    at org.apache.catalina.core.ContainerBase.invoke(ContainerBase.java:948)
    at org.apache.coyote.tomcat4.CoyoteAdapter.service(CoyoteAdapter.java:152)
    at org.apache.jk.server.JkCoyoteHandler.invoke(JkCoyoteHandler.java:300)
    at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:374)
    at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:743)
    at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:675)
    at org.apache.jk.common.SocketConnection.runIt(ChannelSocket.java:866)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:683)
    at java.lang.Thread.run(Thread.java:595)

Original issue reported on code.google.com by bpgergo on 26 Oct 2009 at 1:25

scheduling of harness/indexing

Currently Quartz blocks the second harness run. Task: it should schedule it
instead.

We call a job one run of control_harness followed by one indexing run.

Quartz must guarantee that only one job runs at a time. (If several run at once,
that can wreck consistency so badly that it is best if control_harness and the
indexer are themselves on guard and do some kind of locking.)

Quartz schedules jobs in the following trivial way: it collects job requests in
a queue. When the workbench is free, it puts the job at the head of the queue
on it.

Who can send job requests? First, a timer that does so every x minutes
(harnessCronPeriodMinutes=x property). Second, if the configuration allows it
(instantHarnessSchedule=True property), the fileUploadController as well.

Notes:

1. If more than one job request is waiting, they can currently be merged into
one without harm, because for now they carry no properties.

2. Distant future: later we may distinguish at least three kinds of work item,
namely control_harness, duplumfilter, indexing. There will be the same number
of workbenches for these, and considerably more complex scheduling. In principle
each of them is fully consumer-producer. Their startup overhead is the only
reason why the order in which they are invoked matters.
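
The queueing rule and note 1 can be sketched with plain `java.util.concurrent` types. This is not the Quartz API and not the project's code: `CoalescingJobQueue` is a hypothetical class, and it relies on the caller supplying an executor that runs tasks strictly one at a time.

```java
import java.util.concurrent.Executor;
import java.util.concurrent.atomic.AtomicBoolean;

final class CoalescingJobQueue {
    private final Executor worker;   // must run tasks one at a time (the single "workbench")
    private final Runnable job;      // one control_harness run followed by one indexing run
    private final AtomicBoolean waiting = new AtomicBoolean(false);

    CoalescingJobQueue(Executor singleWorker, Runnable job) {
        this.worker = singleWorker;
        this.job = job;
    }

    /**
     * Requests a job run. Returns true if a run was queued, false if the
     * request was merged into a run that is already waiting to start.
     */
    boolean request() {
        if (!waiting.compareAndSet(false, true)) {
            return false;
        }
        worker.execute(() -> {
            waiting.set(false);   // requests arriving from now on queue one further run
            job.run();
        });
        return true;
    }
}
```

With `Executors.newSingleThreadExecutor()` as the worker, at most one job runs and at most one more waits, whether requests come from the timer or from the upload controller.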

Original issue reported on code.google.com by [email protected] on 1 Mar 2011 at 8:36

manual approvement by moderators

APPROVEMENT

A document comes in and goes through to the end of duplicate filtering. If
upload.is_approved=True, we set the state to I and indexing can begin. If
upload.is_approved=False, we set it to A, which means 'waiting for
(A)pprovement'. Two new controls on the admin interface support this:
- You can query which docs are still unapproved.
- All non-duplicate bisens of a doc can be listed in order. Each has a checkbox
next to it. A simple, customary JavaScript control: a master checkbox that
checks all of them at once. On submit, bisen.is_approved is set according to
the checkbox, and depending on its value we move to state I or N.


Original issue reported on code.google.com by [email protected] on 1 Mar 2011 at 10:01

categories

Webapp. The user should be able to specify more precisely which categories to
search in and which to leave out of the search.
The current system handles a hierarchy of arbitrary depth.
It should be possible to extend the hierarchy. (How?)
The UI should offer two ways of narrowing the search.
One filters on a few hard-wired node sets,
such as: informal, formal, literary, legal.
In the other the whole tree is flattened, and an arbitrary subset can be
selected with ctrl-click.

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:48

character encoding bug

search for 'idő'
http://kozel.mokk.bme.hu:8080/hunglish/search/?ql=id%F5&qr=&source=all
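
The issue does not state the cause, but the URL is a hint: `%F5` is the ISO-8859-2 (Latin-2) byte for ő, so the result depends on which charset the server uses to decode the query string. The sketch below only demonstrates that difference with the standard `URLDecoder`; the class name `QueryDecoding` is hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class QueryDecoding {
    /** Percent-decodes a query value using the named charset. */
    static String decode(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Latin-2 maps 0xF5 to o with double acute (U+0151), the intended letter.
        System.out.println(decode("id%F5", "ISO-8859-2").equals("id\u0151"));
        // Latin-1 maps the same byte to o with tilde (U+00F5), a different letter.
        System.out.println(decode("id%F5", "ISO-8859-1").equals("id\u00F5"));
        // As UTF-8 the lone byte 0xF5 is malformed, so the word is not recovered.
        System.out.println(decode("id%F5", "UTF-8").equals("id\u0151"));
    }
}
```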

Original issue reported on code.google.com by bpgergo on 4 Jan 2010 at 1:56

english stemming

English stemming should come back; it was removed because of some crashing
bug.

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:49

get back to trunk

We're developing in a tag.
Copy it to trunk.

Original issue reported on code.google.com by bpgergo on 1 Mar 2011 at 10:53

user downvote/upvote

Webapp. User feedback (downvote, upvote) on hits.
(Implementation: an extra field in the database for the votes; this needs
more thought!)
It would be good if a "Report" function offered at least like/dislike.
These votes could then be inspected by hand (command line)
to see which hits many people liked and which they did not.
If after a while we decide it is worthwhile,
it should be possible to switch the application from a config file so that
ranking moves up what many people liked and moves down what many disliked.

Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:48

JDK problem with sun.nio

java.lang.NoClassDefFoundError: Could not initialize class sun.nio.ch.FileChannelImpl
    java.io.RandomAccessFile.getChannel(RandomAccessFile.java:270)
    org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.<init>(NIOFSDirectory.java:88)
    org.apache.lucene.store.NIOFSDirectory.openInput(NIOFSDirectory.java:67)
    org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:336)
    org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:583)
    org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:69)
    org.apache.lucene.index.IndexReader.open(IndexReader.java:309)
    org.apache.lucene.index.IndexReader.open(IndexReader.java:195)
    mokk.nlp.bicorpus.index.lucene.BisMapper.initialize(BisMapper.java:110)
    org.apache.avalon.framework.container.ContainerUtil.initialize(ContainerUtil.java:244)
    org.apache.avalon.fortress.impl.handler.ComponentFactory.newInstance(ComponentFactory.java:213)
    org.apache.avalon.fortress.impl.factory.WrapperObjectFactory.newInstance(WrapperObjectFactory.java:92)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.newComponent(AbstractComponentHandler.java:278)
    org.apache.avalon.fortress.impl.handler.ThreadSafeComponentHandler.doPrepare(ThreadSafeComponentHandler.java:72)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.prepareHandler(AbstractComponentHandler.java:179)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.get(AbstractComponentHandler.java:209)
    org.apache.avalon.fortress.impl.handler.LEAwareComponentHandler.get(LEAwareComponentHandler.java:128)
    org.apache.avalon.fortress.impl.lookup.FortressServiceManager.lookup(FortressServiceManager.java:129)
    mokk.nlp.bicorpus.index.lucene.LuceneBiCorpusSearcher.initialize(LuceneBiCorpusSearcher.java:207)
    org.apache.avalon.framework.container.ContainerUtil.initialize(ContainerUtil.java:244)
    org.apache.avalon.fortress.impl.handler.ComponentFactory.newInstance(ComponentFactory.java:213)
    org.apache.avalon.fortress.impl.factory.WrapperObjectFactory.newInstance(WrapperObjectFactory.java:92)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.newComponent(AbstractComponentHandler.java:278)
    org.apache.avalon.fortress.impl.handler.ThreadSafeComponentHandler.doPrepare(ThreadSafeComponentHandler.java:72)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.prepareHandler(AbstractComponentHandler.java:179)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.get(AbstractComponentHandler.java:209)
    org.apache.avalon.fortress.impl.handler.LEAwareComponentHandler.get(LEAwareComponentHandler.java:128)
    org.apache.avalon.fortress.impl.lookup.FortressServiceManager.lookup(FortressServiceManager.java:129)
    mokk.nlp.dictweb.handlers.BiCorpusSearchHandler.initialize(BiCorpusSearchHandler.java:97)
    org.apache.avalon.framework.container.ContainerUtil.initialize(ContainerUtil.java:244)
    org.apache.avalon.fortress.impl.handler.ComponentFactory.newInstance(ComponentFactory.java:213)
    org.apache.avalon.fortress.impl.factory.WrapperObjectFactory.newInstance(WrapperObjectFactory.java:92)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.newComponent(AbstractComponentHandler.java:278)
    org.apache.avalon.fortress.impl.handler.ThreadSafeComponentHandler.doPrepare(ThreadSafeComponentHandler.java:72)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.prepareHandler(AbstractComponentHandler.java:179)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.get(AbstractComponentHandler.java:209)
    org.apache.avalon.fortress.impl.handler.LEAwareComponentHandler.get(LEAwareComponentHandler.java:128)
    org.apache.avalon.fortress.impl.lookup.FortressServiceManager.lookup(FortressServiceManager.java:129)
    mokk.nlp.dictweb.DefaultRequestDispatcher.initialize(DefaultRequestDispatcher.java:109)
    org.apache.avalon.framework.container.ContainerUtil.initialize(ContainerUtil.java:244)
    org.apache.avalon.fortress.impl.handler.ComponentFactory.newInstance(ComponentFactory.java:213)
    org.apache.avalon.fortress.impl.factory.WrapperObjectFactory.newInstance(WrapperObjectFactory.java:92)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.newComponent(AbstractComponentHandler.java:278)
    org.apache.avalon.fortress.impl.handler.ThreadSafeComponentHandler.doPrepare(ThreadSafeComponentHandler.java:72)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.prepareHandler(AbstractComponentHandler.java:179)
    org.apache.avalon.fortress.impl.handler.AbstractComponentHandler.get(AbstractComponentHandler.java:209)
    org.apache.avalon.fortress.impl.handler.LEAwareComponentHandler.get(LEAwareComponentHandler.java:128)
    org.apache.avalon.fortress.impl.lookup.FortressServiceManager.lookup(FortressServiceManager.java:129)
    mokk.nlp.dictweb.MainServlet.init(MainServlet.java:109)
    org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
    org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
    java.lang.Thread.run(Thread.java:636)

Original issue reported on code.google.com by bpgergo on 1 Dec 2009 at 2:36

pipeline issues

This is a placeholder for pipeline issues.

One particular problem is that quality measurement numbers are at the end of 
the bisentences and those numbers can be seen on the result page. Note that 
it is probably correct to have these numbers in the bisentences so this is 
actually a file parsing/indexing problem.

Other pipeline issues will be listed here and all pipeline changes will be 
committed on this ticket. 
Also it is a question whether all the pipeline related source should be 
committed into this project. 

Original issue reported on code.google.com by bpgergo on 26 Nov 2009 at 10:15

feedback for uploaders

I have finally worked out how to avoid copyright trouble arising from the
alignment being accessible. The UploadController hashes the concatenation of
the id and the author (in testing the unobfuscated id is enough instead) and
builds a URL from it. After the upload has completed, it redirects to this
uploadFeedback page.

- As long as upload.is_processed=N, the page only shows a "sorry, please be
patient" message.
- If processed, it checks whether now()-harnessed_timestamp is more than one
day.
-- If so, it says "resource expired".
-- If not, it dumps the metadata of the alignment in question and the bisens
onto a single web page. (If we did not accept the document, then
of course only the metadata.)

A minor question is how slow the select is that outputs all bisentences of the
document.
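
The token and the page logic described above might be sketched as follows. Everything named here is hypothetical (`UploadFeedback`, the `View` values, the `:` separator, and SHA-256 as the hash); the issue only specifies hashing the id and the author together and the one-day expiry.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.time.Duration;
import java.time.Instant;

final class UploadFeedback {
    enum View { PLEASE_WAIT, EXPIRED, METADATA_ONLY, FULL_ALIGNMENT }

    /** Hex SHA-256 of "id:author", used as the hard-to-guess part of the feedback URL. */
    static String token(long uploadId, String author) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest((uploadId + ":" + author).getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be supported by every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }

    /** Chooses what the feedback page shows; harnessedAt is ignored until processed. */
    static View decide(boolean processed, boolean accepted, Instant harnessedAt, Instant now) {
        if (!processed) {
            return View.PLEASE_WAIT;
        }
        boolean olderThanOneDay =
                Duration.between(harnessedAt, now).compareTo(Duration.ofDays(1)) > 0;
        if (olderThanOneDay) {
            return View.EXPIRED;
        }
        return accepted ? View.FULL_ALIGNMENT : View.METADATA_ONLY;
    }
}
```

An unsalted hash of a small numeric id and a known author name can be guessed by enumeration, so a real deployment would probably want a secret mixed in.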

Original issue reported on code.google.com by [email protected] on 1 Mar 2011 at 7:16

new features specification

1) Filtering the search results

There will be two multi-select lookups, one for the source filter and one for
the author filter. (Currently we have one single-select lookup as the source
filter.)

Both of these lookups will be populated from two corresponding txt files 
when initializing the webapp.

These files will be regenerated by application at the end of the indexing 
phase.

2) Uploading new documents

User can upload two files or can paste text into two textboxes.
The user must also provide a title with the documents and must select the 
author and the source from the lookups.

The webapp will check that the Author+Title does not already appear in the
index and will not let the user upload a new document pair with an existing
author+title.

The webapp will save the uploaded docs in a specified folder (name 
containing date) and a third textfile will be saved next to the two docs 
which will have the format:
Author<NL>
Title<NL>
Source<NL>
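
A small sketch of writing and reading this three-line format, kept to string handling so that it stays independent of where the files are stored. It assumes `<NL>` means a single `\n`; `UploadMetadata` is a hypothetical helper name.

```java
final class UploadMetadata {
    /** Renders the metadata as "Author\nTitle\nSource\n". */
    static String format(String author, String title, String source) {
        for (String field : new String[] {author, title, source}) {
            if (field.contains("\n") || field.contains("\r")) {
                throw new IllegalArgumentException("fields must be single-line");
            }
        }
        return author + "\n" + title + "\n" + source + "\n";
    }

    /** Parses the file content and returns {author, title, source}. */
    static String[] parse(String content) {
        // A limit of -1 keeps the trailing empty string after the last newline.
        String[] lines = content.split("\n", -1);
        if (lines.length != 4 || !lines[3].isEmpty()) {
            throw new IllegalArgumentException("expected exactly three newline-terminated lines");
        }
        return new String[] {lines[0], lines[1], lines[2]};
    }
}
```

Rejecting embedded line breaks on write keeps a pasted multi-line title from shifting the source onto the wrong line.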

It will be possible to start the pipeline manually for the new docs or it 
will be run scheduled by cron.

The result of the pipeline is the aligned doc.

It will be possible to start the indexing phase for the last result of the 
pipeline (that is indexing the newly aligned docs) or it will be run 
scheduled by cron.


Original issue reported on code.google.com by bpgergo on 5 Jan 2010 at 6:35

case sensitivity of search terms

The right query term (that is the English term) is somewhat case sensitive.
Consider these searches:
1) http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=sas&qr=&source=all
2) http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=Sas&qr=&source=all
3) http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=&qr=eagles&source=all
4) http://szotar.mokk.bme.hu/hunglish/search/corpus?ql=&qr=Eagles&source=all

The first three each return 257 results (Results 1 - 20 of 257).
But the fourth query, where the English term is capitalized, returns only 177
results (Results 1 - 20 of 177).


Original issue reported on code.google.com by bpgergo on 14 Oct 2009 at 12:29

usability issue

After a search, focus should be on the input box
(the one that was used; if both were used, the Hungarian one)
and the search expression should be selected,
so that a new search can be typed immediately.


Original issue reported on code.google.com by bpgergo on 2 Oct 2009 at 5:45

org.hibernate.exception.GenericJDBCException: Cannot release connection

What steps will reproduce the problem?
1. do not use the application for several hours (possibly 8 hours)
2. load whatever page
instead of loading the page, you get 
org.hibernate.exception.GenericJDBCException: Cannot release connection

probable cause: mysql time-outs idle connections from the pool
see: http://mrather.blogspot.com/2008/09/hibernate-and-connection-pools.html

Original issue reported on code.google.com by bpgergo on 23 Jan 2011 at 1:04

a module at the end of the harness that keeps out low-quality data

New module for the harness:
It should check whether the document is still worth including after qf. For
example, whether it has become too small compared to the original, or whether
its hunalign score is very low. (We cannot dig the latter out of standard
error, but we can easily average the per-sentence-pair quality fields.)
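
The two checks could be combined roughly as in the sketch below. The thresholds are parameters because the issue gives no values, and `QualityGate` is a hypothetical name; the only inputs are the sentence-pair count before qf and the quality fields of the pairs that survived it.

```java
final class QualityGate {
    /**
     * Accepts a document if enough sentence pairs survived quality filtering
     * and the mean per-pair quality of the survivors is high enough.
     */
    static boolean accept(int pairsBeforeFilter, double[] keptQualities,
                          double minKeptRatio, double minMeanQuality) {
        if (pairsBeforeFilter <= 0 || keptQualities.length == 0) {
            return false;
        }
        double keptRatio = (double) keptQualities.length / pairsBeforeFilter;
        double sum = 0.0;
        for (double quality : keptQualities) {
            sum += quality;
        }
        double meanQuality = sum / keptQualities.length;
        return keptRatio >= minKeptRatio && meanQuality >= minMeanQuality;
    }
}
```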

New module for the harness:
It should check whether there is some characteristic trap that we run into
with accented characters. (Perhaps we should whitelist rather than blacklist,
that is, essentially run a tiny language detector on the text.)

Original issue reported on code.google.com by [email protected] on 9 Mar 2010 at 7:25

use date instead of timestamp in tables

- We use the timestamp concept incorrectly. In some places, such as
bisen.indexed_timestamp, it should be eliminated;
in others, such as job_queue and upload, it should be replaced by a proper date.

This problem is also a consequence of that:
- The webapp cannot create the upload object from the record
unless machine_upload fakes a filled-in harnessed_timestamp field.

Original issue reported on code.google.com by [email protected] on 1 Mar 2011 at 6:51
