ufal / mtmonkey
Distributed infrastructure for Machine Translation web services (using Moses, Python, JSON-RPC/web interface)
License: Other
I am not sure about the use of the schema in the screenshot below:
Sometimes XML and JSON files need to be translated. I am not sure whether this function is meant to allow different types of input files to be translated; if so, very cool. What I mean is that the toolkit can let us perform multiple translations within a single call by passing a JSON object. Exactly the same JSON object will be returned, but with translated values.
I think the API should allow more reporting than just errors: warnings, debug info, and other meta information. But how? Perhaps a single mechanism combined with the error reporting?
msgLevel: 0 = none, 1 = information, 2 = warning, 3 = error
msgText: "OK" (for msgLevel 0)
I use a simpler way, just warnings here, but this is not ideal:
http://zardoz.service.rug.nl:9070/rpc?action=translate&sourceLang=en&targetLang=nl&text=This+is+a+test.&nBestSize=3
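To make the msgLevel / msgText idea above concrete, here is a minimal sketch of how a single reporting mechanism might look. The `MsgLevel` enum, the `build_response` helper and the response layout are hypothetical names chosen for illustration; they are not part of the current MTMonkey API.

```python
from enum import IntEnum


class MsgLevel(IntEnum):
    """Severity levels as proposed above (0 = none ... 3 = error)."""
    NONE = 0
    INFORMATION = 1
    WARNING = 2
    ERROR = 3


def build_response(translation, messages=()):
    """Attach the most severe message to a response dict.

    `messages` is an iterable of (MsgLevel, text) pairs. The helper name
    and the response layout are assumptions, not MTMonkey's actual API.
    """
    if messages:
        # Report only the highest-severity message in the top-level fields.
        level, text = max(messages, key=lambda m: m[0])
    else:
        level, text = MsgLevel.NONE, "OK"
    return {"translation": translation, "msgLevel": int(level), "msgText": text}
```

A client could then branch on `msgLevel` alone, instead of checking for errors and warnings through separate fields.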
We should add an option that will force the system to treat the request as a single sentence. I suggest segmentSentences=False.
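A rough sketch of how such a flag could behave is below. The regular-expression splitter is only a stand-in for whatever sentence splitter the workers really use, and `prepare_segments` is a hypothetical helper name.

```python
import re


def split_sentences(text):
    """Naive splitter for illustration only: break after ., ! or ?"""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def prepare_segments(text, segment_sentences=True):
    """Return the list of segments that would be sent to the translator.

    With segment_sentences=False the whole request is kept as one segment.
    """
    if not segment_sentences:
        return [text.strip()]
    return split_sentences(text)
```

With the flag switched off, the text reaches the translator as one unit even if it contains sentence-final punctuation.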
Hi, when trying to run this script, I get the following error:
mkdir: cannot create directory `/virtualenv': Permission denied
ln: failed to create symbolic link `./virtualenv': File exists
New python executable in /home/elav01/virtualenv/bin/python
Installing setuptools............done.
Installing pip...............done.
install_virtualenv.sh: line 32: virtualenv/bin/activate: No such file or directory
The problem does not occur if I run the script commands one by one from the command line.
From the API:
nBestSize: integer -- maximum number of translation options
Should these be unique translations? You can specify the option nbest-distinct to mosesserver. If you don't, you may get the same solution ten times, but with different scores (and perhaps different alignments?).
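If passing nbest-distinct to mosesserver is not an option, duplicates could also be filtered after the fact. The sketch below assumes the hypotheses arrive as (text, score) pairs sorted best-first; `distinct_nbest` is a hypothetical helper, not existing MTMonkey code.

```python
def distinct_nbest(hypotheses, n_best_size):
    """Keep the first (best-scoring) occurrence of each distinct translation.

    `hypotheses` is a list of (text, score) pairs, assumed sorted best-first.
    At most `n_best_size` unique entries are returned.
    """
    seen = set()
    result = []
    for text, score in hypotheses:
        if len(result) >= n_best_size:
            break
        if text in seen:
            # Same surface string with a different score or alignment: skip it.
            continue
        seen.add(text)
        result.append((text, score))
    return result
```

Note that filtering afterwards can return fewer than nBestSize entries unless a larger list is requested from Moses first, which is one reason to prefer the server-side option.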
Output actual translation scores.
Nice to have: get scores using some baseline quality estimation approach instead of the rather uninformative translation probability output by Moses.
Workers should have an option to use no recasers (in case the "main" translation Moses model also handles recasing).
It might be useful to have timing info in the result. See, for example:
http://zardoz.service.rug.nl:9070/rpc?action=translate&sourceLang=nl&targetLang=en&text=Dit+is+een+test.
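One simple way to collect such timing info is to wrap the translation call with a monotonic clock. In this sketch `translate_fn` stands for whatever callable performs the translation, and the `timeTotal` field name is an assumption rather than an agreed part of the API.

```python
import time


def translate_with_timing(translate_fn, text):
    """Call translate_fn(text) and report the elapsed wall-clock time.

    The "timeTotal" key is a hypothetical field name used for illustration.
    """
    start = time.perf_counter()
    result = translate_fn(text)
    elapsed = time.perf_counter() - start
    return {"translation": result, "timeTotal": round(elapsed, 4)}
```

The same pattern could be applied per stage (tokenization, Moses call, detokenization) if finer-grained numbers are wanted.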
Implement outputting of n-best translations.
In the translation output, src-tokenized should always be included if multiple sentences are translated.
If detokenize=false, should the result (still tokenized) be in "text" or in "tgt-tokenized"?
Currently, GET requests handle a small subset of arguments accepted by POST. We should unify this (should we?).
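If we do unify them, one option is to normalize both request types into the same argument dictionary before any validation happens. This is only a sketch: `parse_request_args` is a hypothetical helper, and it assumes POST bodies are JSON objects.

```python
import json
from urllib.parse import parse_qs


def parse_request_args(method, query_string="", body=""):
    """Turn a GET query string or a POST JSON body into one argument dict."""
    if method == "GET":
        # parse_qs returns a list per key; keep the last value given.
        return {k: v[-1] for k, v in parse_qs(query_string).items()}
    if method == "POST":
        return json.loads(body) if body else {}
    raise ValueError("unsupported method: %s" % method)
```

Values coming from GET are always strings (for example nBestSize arrives as "3"), so a shared validation step would still have to convert types.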
The API description puts the elements of the n-best list inside translated, but the implementation puts them inside translation. translated then contains the individual sentences of a single n-best list member.
We should unify the behavior of the code with the API description or the other way round, whichever is more reasonable.
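To illustrate the difference, the sketch below converts the layout the implementation is described as producing (one translation entry per n-best member, each with its sentences in translated) into a sentence-major layout. It assumes that the API description intends one translation entry per sentence with the n-best alternatives inside translated, and that every n-best member has the same number of sentences; both are assumptions, and the helper name is hypothetical.

```python
def nbest_major_to_sentence_major(response):
    """Regroup an n-best-major response into a sentence-major one.

    Input:  {"translation": [{"translated": [sent1, sent2, ...]}, ...]}
            with one outer entry per n-best member.
    Output: {"translation": [{"translated": [alt1, alt2, ...]}, ...]}
            with one outer entry per sentence.
    """
    sentence_lists = [member["translated"] for member in response["translation"]]
    # zip(*...) transposes: it yields one tuple of alternatives per sentence.
    return {
        "translation": [{"translated": list(alts)} for alts in zip(*sentence_lists)]
    }
```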
In the API, it says:
alignmentInfo: string -- request alignment information (optional, default = "false")
Shouldn't this be type boolean, just like detokenize?
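If the type is changed to boolean, the server could still accept the old string form for a while. A minimal sketch of such lenient parsing follows; `parse_bool` and the set of accepted spellings are assumptions, not existing MTMonkey behavior.

```python
def parse_bool(value, default=False):
    """Accept real booleans as well as the strings "true" and "false"."""
    if value is None:
        return default
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        lowered = value.strip().lower()
        if lowered in ("true", "1", "yes"):
            return True
        if lowered in ("false", "0", "no"):
            return False
    raise ValueError("not a boolean value: %r" % (value,))
```

The same helper could be applied to detokenize, so that both options behave identically for GET requests, where every value is a string.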