jimregan / apertium-eu-fr
This project is forked from apertium/apertium-eu-fr.
Basque and French

These are the linguistic data for the Apertium Basque--French machine translator. It translates only in the direction Basque--French.

You need apertium-3.0 and lttoolbox-3.0 to use this translator. To compile the linguistic data, simply run

    $ ./configure

to generate a Makefile, and then

    $ make

inside this directory.

TAGGER

To use this language-pair package with Apertium YOU DO NOT NEED TO RETRAIN THE TAGGER. Probabilities and auxiliary data are provided for the Basque tagger; they should be acceptable for most applications and should keep working even if you change the dictionaries in a reasonable way.

If for some reason you need to retrain the tagger (for example, because you have made really extensive changes to the dictionaries, such as creating new lexical categories), you have three alternatives:

* Supervised training. A tagged corpus is provided for this purpose (eu-tagger-data/eu.tagged), but it may be obsolete for some words. If so, the tagger training program will show you where the problems are, and you will need to solve them by hand. Be sure to solve the problems by modifying ONLY the .tagged file, NEVER the .untagged file, which is generated automatically. The supervised training is done by typing:

    make -f eu-es-supervised.make   (for the Basque part-of-speech tagger)

* Unsupervised training. For this you need to assemble a large plain-text corpus (hundreds of thousands of words) for each language, for example by using a robot to harvest text from online newspapers, and put it in the proper place, for instance eu-tagger-data/eu.crp.txt. This type of training needs no human intervention but, as expected, gives less adequate results than supervised training. It is carried out with the iterative Baum-Welch algorithm. By default the number of iterations is set to 8; you can change it by editing the Makefile and changing the value of TAGGER_UNSUPERVISED_ITERATIONS. The unsupervised training is done by typing:

    make -f eu-es-unsupervised.make   (for the Basque part-of-speech tagger)

  This is the method that was used to train the Basque tagger.

* Unsupervised training using target-language information and the rest of the modules of the Apertium MT engine. For this you need large plain-text corpora in both languages. Please download the apertium-tagger-training-tools package and follow the instructions provided there.
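The unsupervised option above depends on the corpus being large enough, so it can help to count its words before starting a training run. The following is only a minimal sketch and is not part of the package: the helper names corpus_words and corpus_is_large_enough are hypothetical, and the MIN_WORDS threshold of 200000 is an assumed reading of "hundreds of thousands of words", not a value taken from the package.

```shell
#!/bin/sh
# Sketch: sanity-check a plain-text corpus before unsupervised tagger training.
# The helper names and MIN_WORDS are illustrative assumptions, not package settings.

# Print the number of whitespace-separated words in the file given as $1.
corpus_words() {
    wc -w < "$1" | tr -d ' '
}

# Succeed if the file $1 contains at least $2 words.
corpus_is_large_enough() {
    [ "$(corpus_words "$1")" -ge "$2" ]
}

CORPUS="eu-tagger-data/eu.crp.txt"   # corpus location mentioned in this README
MIN_WORDS=200000                     # assumed threshold

if [ -f "$CORPUS" ]; then
    if corpus_is_large_enough "$CORPUS" "$MIN_WORDS"; then
        echo "corpus looks large enough: $(corpus_words "$CORPUS") words"
    else
        echo "corpus may be too small: $(corpus_words "$CORPUS") words" >&2
    fi
else
    echo "no corpus found at $CORPUS" >&2
fi
```

Run it from the package directory. The script only reports the word count and does not start any training; the make commands described above are unchanged.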