jslingua's Introduction

Abdelkrime Aries

Hi there 👋, call me "Karim"

  • 📖 Assistant professor at Ecole nationale supérieure d'informatique (ESI, ex. INI), Algiers, Algeria
  • 💻 I develop useless programs (unless someone can use them somehow)
  • 🍔🌮🍙🍝🍰 I like food and cooking. Cooking is like programming; recipes are like automata
  • 🎯 For me, programming is not a language you use; it is an art and a logic. Good programmers are those who master the abstract form of algorithms, not a specific language or, worse, a toolkit
  • 👥 Not social, definitely insensitive, and sarcasm is my gem. Talking to me is like communicating with an alien 👽 or even a robot 🤖
  • 🐱 Lazy like a cat
  • 👿 Evil

jslingua's People

Contributors

dependabot[bot], greenat92, kariminf, npmcdn-to-unpkg-bot, shilik

jslingua's Issues

how to use Morpho.Feature.POS

Hi, I would like to use Morpho.Feature.POS, but I am not sure how to implement it. Could someone please point me in the right direction?

transliteration of punctuation

  • After the Morse special characters were refactored, some Arabic and Japanese punctuation marks get lost.
  • Also, the Japanese character ん has the same Morse code as +
  • The Japanese space has some issues
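The ん / + collision can be illustrated with a minimal sketch. The table layout and the names `MORSE_TABLES` and `decodeMorse` are hypothetical and are not taken from JsLingua; only the two colliding entries are listed. The point is that decoding needs to know the target script, since the code alone is ambiguous.

```javascript
// Hypothetical per-script Morse tables; only the colliding entries are shown.
const MORSE_TABLES = {
  latin: { "+": ".-.-." },
  japanese: { "ん": ".-.-." }, // Wabun code for ん, same signal as "+"
};

// Decode a single Morse symbol. The script must be supplied by the caller,
// because the same code maps to different characters in different scripts.
function decodeMorse(code, script) {
  const table = MORSE_TABLES[script];
  if (!table) {
    throw new Error("Unknown script: " + script);
  }
  for (const [char, morse] of Object.entries(table)) {
    if (morse === code) return char;
  }
  return null; // no character with this code in the chosen script
}

console.log(decodeMorse(".-.-.", "latin"));
console.log(decodeMorse(".-.-.", "japanese"));
```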

Re-Documenting

  • README: add more to the project description
  • Check the API documentation (add more content)
  • Rename CREDITS.md to CONTRIBUTING.md (GitHub recognizes that name): https://github.com/kariminf/jslingua/community
  • Add a CODE_OF_CONDUCT, whatever that is
  • Add coding specifications (more helpful than a code of conduct)

Add a function which returns different conjugation forms

The problem:

Not every language has the same conjugation pattern, for instance:

Pronoun   past   present   future
I         did    do        will do
He        did    does      will do

This is an example of tenses. If we incorporate mood, we also have conditional, indicative, etc.; if we introduce aspect, we have simple, perfect, continuous, and perfect continuous. Then come negation and voice (active, passive). Other languages add further categories, such as formality in Japanese (plain, polite, formal).

Japanese doesn't need personal pronouns in its verb conjugation.
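One possible shape for such a function is sketched below. Everything here is an assumption made for illustration: the table `CONJUGATION`, the language codes, the form names, and the functions `getConjugationForms` and `needsPronouns` are hypothetical and do not describe the existing JsLingua API.

```javascript
// Hypothetical per-language description of the available conjugation forms.
const CONJUGATION = {
  eng: {
    usesPronouns: true,
    forms: {
      "past": { tense: "past", aspect: "simple" },
      "present": { tense: "present", aspect: "simple" },
      "future": { tense: "future", aspect: "simple" },
      "present perfect": { tense: "present", aspect: "perfect" },
    },
  },
  jpn: {
    usesPronouns: false, // Japanese verbs are not conjugated by person
    forms: {
      "plain present": { tense: "present", formality: "plain" },
      "polite present": { tense: "present", formality: "polite" },
      "polite past": { tense: "past", formality: "polite" },
    },
  },
};

// Return the forms a language supports as { formName: options }.
// Unknown languages give an empty object.
function getConjugationForms(lang) {
  const entry = CONJUGATION[lang];
  return entry ? entry.forms : {};
}

// Tell the caller whether a pronoun must accompany each form.
function needsPronouns(lang) {
  const entry = CONJUGATION[lang];
  return entry ? entry.usesPronouns : false;
}

console.log(Object.keys(getConjugationForms("jpn")));
```

A caller could iterate over the returned object to build a conjugation table, adding a pronoun column only when `needsPronouns` is true.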

JsLingua default stemmer

It would be great to have a default stemmer designed for JsLingua.

  • Arabic
  • English
  • French
  • Japanese

ML tools

Logistic regression (multiclass)

Will be used for MEMM (PoS tagging)

Will be used for parsing
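As a rough illustration of the prediction step of multiclass logistic regression, here is a minimal sketch. The function names `softmax` and `predict` and the parameter layout (one weight vector and one bias per class) are assumptions for this example; training is not shown.

```javascript
// Softmax over raw class scores, shifted by the maximum for numerical stability.
function softmax(scores) {
  const max = Math.max(...scores);
  const exps = scores.map((s) => Math.exp(s - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// weights: one weight vector per class; biases: one bias per class;
// features: numeric feature vector of the same length as each weight vector.
function predict(weights, biases, features) {
  const scores = weights.map((w, k) =>
    w.reduce((acc, wi, i) => acc + wi * features[i], biases[k])
  );
  const probs = softmax(scores);
  let best = 0;
  probs.forEach((p, k) => {
    if (p > probs[best]) best = k;
  });
  return { label: best, probs };
}

console.log(predict([[1, 0], [0, 1]], [0, 0], [2, 0]));
```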

newStemmer method

Morpho.newStemmer = function (stemmerName, stemmerDesc, stemmerFct) {

When I create a new stemmer and then create a second one, the second overwrites the first:

    Morpho.newStemmer.call(this, "porterStemmer", "English Porter stemmer", porterStemmer);
    Morpho.newStemmer.call(this, "lancasterStemmer", "English Lancaster stemmer", lancasterStemmer);

How can I add a new stemmer to the English Morpho module without overwriting the previous stemmers?
You'll find the code in #35
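One way to avoid the overwrite is to keep the stemmers in a map keyed by name, so each registration adds an entry instead of replacing shared state. The sketch below is not the JsLingua implementation: the class `StemmerRegistry`, its methods, and the toy stemmers are hypothetical and only illustrate the idea.

```javascript
// Hypothetical registry: each stemmer is stored under its own name.
class StemmerRegistry {
  constructor() {
    this.stemmers = {}; // name -> { desc, fct }
    this.current = null; // name of the stemmer used by stem()
  }

  // Register a stemmer; the first one registered becomes the current one.
  newStemmer(name, desc, fct) {
    if (typeof fct !== "function") {
      throw new TypeError("stemmer must be a function");
    }
    this.stemmers[name] = { desc, fct };
    if (this.current === null) this.current = name;
  }

  listStemmers() {
    return Object.keys(this.stemmers);
  }

  setCurrentStemmer(name) {
    if (!(name in this.stemmers)) {
      throw new Error("Unknown stemmer: " + name);
    }
    this.current = name;
  }

  stem(word) {
    if (this.current === null) return word; // nothing registered yet
    return this.stemmers[this.current].fct(word);
  }
}

// Toy stemmers standing in for the real Porter and Lancaster functions.
const registry = new StemmerRegistry();
registry.newStemmer("porterStemmer", "English Porter stemmer", (w) => w.replace(/ing$/, ""));
registry.newStemmer("lancasterStemmer", "English Lancaster stemmer", (w) => w.replace(/s$/, ""));
console.log(registry.listStemmers());
```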

Add normalize function

Languages like Arabic need some normalization in information retrieval tasks. For example, diacritics are removed.
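A minimal sketch of diacritic removal is given below. It assumes that normalization means stripping the tashkeel marks (U+064B to U+0652) and the tatweel (U+0640); the function name `normalizeArabic` is hypothetical, and a real normalizer might also unify letter variants, which is not shown here.

```javascript
// Tashkeel marks (U+064B..U+0652) plus tatweel (U+0640).
const ARABIC_MARKS = /[\u064B-\u0652\u0640]/g;

// Remove diacritics and elongation characters; other characters are untouched.
function normalizeArabic(text) {
  return text.replace(ARABIC_MARKS, "");
}

// "كَتَبَ" (with fatha marks) becomes the bare letters "كتب".
console.log(normalizeArabic("\u0643\u064E\u062A\u064E\u0628\u064E"));
```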

Add Arabic stemmers

I prefer those based on regular expressions (rule-based):

I don't care if the char-by-char methods are faster; in the end, running the task on the user side is more profitable than calling a server (because of the exchange time).

Japanese transliteration

Use a standard transliteration:

  • Hepburn romanization
  • Kunrei-shiki romanization (ISO 3602)
  • Nihon-shiki romanization (ISO 3602 Strict)
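The three systems agree on most kana and differ on a handful. The sketch below lists only some of the hiragana where they diverge; the table name `ROMANIZATION`, the system keys, and the function `romanizeKana` are hypothetical and cover far less than a full transliterator would.

```javascript
// A few hiragana on which the three systems differ (not a complete table).
const ROMANIZATION = {
  "し": { hepburn: "shi", kunrei: "si", nihon: "si" },
  "ち": { hepburn: "chi", kunrei: "ti", nihon: "ti" },
  "つ": { hepburn: "tsu", kunrei: "tu", nihon: "tu" },
  "ふ": { hepburn: "fu", kunrei: "hu", nihon: "hu" },
  "じ": { hepburn: "ji", kunrei: "zi", nihon: "zi" },
  "ぢ": { hepburn: "ji", kunrei: "zi", nihon: "di" },
  "づ": { hepburn: "zu", kunrei: "zu", nihon: "du" },
};

const SYSTEMS = ["hepburn", "kunrei", "nihon"];

// Romanize character by character; kana missing from the table are kept as-is.
function romanizeKana(text, system) {
  if (!SYSTEMS.includes(system)) {
    throw new Error("Unknown romanization system: " + system);
  }
  return Array.from(text)
    .map((ch) => (ROMANIZATION[ch] ? ROMANIZATION[ch][system] : ch))
    .join("");
}

console.log(romanizeKana("つづ", "hepburn"));
console.log(romanizeKana("つづ", "nihon"));
```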

English verb conjugation

The past participle of some irregular verbs is wrong.

For example: override

The prefix over- must be stripped before looking up the verb.
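The prefix handling could look like the sketch below. The irregular table, the prefix list, and the function name `pastParticiple` are assumptions made for this example, and the regular fallback is deliberately naive (it ignores consonant doubling and final -y).

```javascript
// Small hypothetical table of irregular past participles.
const IRREGULAR_PP = new Map([
  ["ride", "ridden"],
  ["take", "taken"],
  ["do", "done"],
  ["come", "come"],
  ["write", "written"],
]);

// Prefixes that can be stripped before looking up the base verb.
const PREFIXES = ["over", "under", "out", "re", "un", "mis"];

function pastParticiple(verb) {
  // 1. The verb itself may be irregular.
  if (IRREGULAR_PP.has(verb)) return IRREGULAR_PP.get(verb);
  // 2. Otherwise strip a known prefix and look up the remaining base verb.
  for (const prefix of PREFIXES) {
    if (verb.startsWith(prefix)) {
      const base = verb.slice(prefix.length);
      if (IRREGULAR_PP.has(base)) return prefix + IRREGULAR_PP.get(base);
    }
  }
  // 3. Naive regular fallback.
  return verb.endsWith("e") ? verb + "d" : verb + "ed";
}

console.log(pastParticiple("override"));
```

Checking the whole verb first keeps words such as "read" from being split into re- plus a nonexistent base.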

English verb conjugation

Complete the task:
A conjugation method which doesn't rely on any database, even if the results are not 100% accurate. It can have a small table for irregular verbs.

[Syntax] Improve API

Force the use of MEMM for all languages.

Then each language only has to implement word encoding, i.e. feature extraction.

[Morpho] French verb conjugation

It is mostly done using patterns rather than rules. I have never found a resource (either software or book) that uses rules.

There is a set of model verbs, plus a dictionary that links every other verb to the model verb whose conjugation pattern it shares.
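The pattern approach can be sketched as two tables: one with the endings of each model verb and one linking verbs to their model. The names `MODELS`, `VERB_TO_MODEL`, and `conjugatePresent`, as well as the choice of aimer and finir as models, are assumptions for this illustration; only the present indicative is covered.

```javascript
// Model verbs: how many final letters to strip, and the present-tense endings
// in the order je, tu, il/elle, nous, vous, ils/elles.
const MODELS = {
  aimer: { strip: 2, present: ["e", "es", "e", "ons", "ez", "ent"] },
  finir: { strip: 2, present: ["is", "is", "it", "issons", "issez", "issent"] },
};

// Dictionary linking each verb to the model verb whose pattern it follows.
const VERB_TO_MODEL = {
  aimer: "aimer",
  parler: "aimer",
  chanter: "aimer",
  finir: "finir",
  choisir: "finir",
};

// Return the six present-tense forms, or null if the verb is not in the dictionary.
function conjugatePresent(verb) {
  const modelName = VERB_TO_MODEL[verb];
  if (!modelName) return null;
  const model = MODELS[modelName];
  const stem = verb.slice(0, verb.length - model.strip);
  return model.present.map((ending) => stem + ending);
}

console.log(conjugatePresent("parler"));
console.log(conjugatePresent("choisir"));
```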

info: variable information

Information such as population, countries, and dialects changes over time, so it can't be provided by the module itself.

Mark the three functions as obsolete; delete their documentation and test.

Arabic number in letter of 1000s

When we want to say 103000, it should be:
مائة وثلاثة آلاف
but we get:
مائة وثلاثة ألف
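The example suggests that the form of the word for "thousand" should follow the last two digits of the count rather than the whole count. Below is a minimal sketch of that selection under stated assumptions: the function name `thousandNoun` is hypothetical, counts below 3 are rejected, and compound counts ending in 01 or 02 (which need special phrasing) simply fall back to the singular.

```javascript
const THOUSAND_SINGULAR = "ألف"; // after round hundreds, e.g. 100 thousand
const THOUSAND_PLURAL = "آلاف"; // after 3..10
const THOUSAND_ACCUSATIVE = "ألفًا"; // after 11..99

// Pick the form of "thousand" for a given number of thousands (count >= 3).
function thousandNoun(count) {
  if (!Number.isInteger(count) || count < 3) {
    throw new RangeError("count must be an integer >= 3");
  }
  const lastTwo = count % 100; // only the last two digits govern the noun
  if (lastTwo >= 3 && lastTwo <= 10) return THOUSAND_PLURAL;
  if (lastTwo >= 11) return THOUSAND_ACCUSATIVE;
  // Simplification: 0, 1 and 2 all fall back to the singular here.
  return THOUSAND_SINGULAR;
}

// 103000 has 103 thousands, so the plural form is selected.
console.log(thousandNoun(103));
```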

Arabic verb conjugation

Complete the task:
A conjugation method which doesn't rely on any database, even if the results are not 100% accurate. It can have a small table for irregular verbs.

Arabic conjugation complexity

  • Create modules for each type
  • a function to detect the type of addition: ist, in-, etc.
  • fix voice and negation for some types

Arabic conj: Ajwaf waw and yaa list conflict

The lists provided by Shereen Khoja for weak-middle waw and yaa verbs have some verbs in common.

For example: طار
It exists in both the waw and the yaa list.

It should be in the yaa list only.

So, the lists must be revised.

Refactoring code

Just arrange it as:

Constructor:

Data:

Groups of functionalities
