winkjs / wink-eng-lite-model
English lite language model for wink-nlp.
Home Page: https://winkjs.org/wink-nlp/
License: MIT License
The entity name patterns used to recognize numeric entities appear to be lacking: I can't recognize fractions, and decimals don't seem to be picked up correctly either. For example, to extract the parts of a sentence that describe ingredients, I'm using these custom patterns:
const patterns = [
{ name: 'number', patterns: [ '[|PERCENT] [|CARDINAL] [|ORDINAL]' ] },
{ name: 'measurement', patterns: [ 'cup' ] },
{ name: 'food', patterns: [ 'butter' ] },
{ name: 'adjective', patterns: [ 'ADJ' ] },
{ name: 'adverbPhrase', patterns: [ '[|ADV] [|VERB]' ] }
];
Then I try this string:
"1/2 cup yellow butter, softened." ---> 1/2 was not picked up:
[Object: null prototype] { value: 'cup', type: 'measurement' },
[Object: null prototype] { value: 'yellow', type: 'adjective' },
[Object: null prototype] { value: 'butter', type: 'food' },
[Object: null prototype] { value: 'softened', type: 'adverbPhrase' }
".5 cup yellow butter, softened." ---> .5 was changed to 5
[Object: null prototype] { value: '5', type: 'number' },
However, 0.5 worked, and "one half" also worked.
I expected the NUM tag to be defined, but apparently the model does not use it. https://universaldependencies.org/u/pos/all.html#al-u-pos/NUM
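As a possible workaround for the fraction and leading-dot decimal cases above, the quantities could be picked out with a plain regular expression before or alongside the custom entities. The sketch below is only an illustration: `findQuantities` is a hypothetical helper, not part of wink-nlp or this model, and it assumes quantities are written with ASCII digits.

```javascript
// Hypothetical helper (not a wink-nlp API): finds numeric quantities such as
// "1/2", ".5", "0.5", "1 1/2", or "2" using a regular expression only.
function findQuantities(text) {
  // Alternatives are ordered from most to least specific:
  // mixed number, simple fraction, decimal (leading digit optional), integer.
  // The lookbehind and lookahead keep digits inside words (e.g. "a12b") out.
  const quantity = /(?<![\w.])(?:\d+\s+\d+\/\d+|\d+\/\d+|\d*\.\d+|\d+)(?![\w/])/g;
  return Array.from(text.matchAll(quantity), (m) => ({
    value: m[0],
    type: 'number',
    index: m.index
  }));
}

console.log(findQuantities('1/2 cup yellow butter, softened.'));
console.log(findQuantities('.5 cup yellow butter, softened.'));
```

The returned `index` could be used to merge these matches with the entities reported by the custom patterns, though how best to combine them depends on the application.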
I realize this package is meant for Node.js, but I was wondering how hard it would be to use in the browser. The first error that I get when trying to bundle it is related to this package:
error - ./node_modules/wink-eng-lite-model/dist/load-ner-model.js:1:0
Module not found: Can't resolve 'fs'
It seems like it would be possible to rewrite that file to directly require() the JSON file instead of lazy loading it, but I'm guessing the lazy loading was done on purpose?
In Wikitext, headings are marked up between multiple = characters and are separated from the text by newlines. When breaking the text into sentences, wink-nlp doesn't treat the heading as a separate sentence.
const text = `He spoke of a five-year freeze in domestic spending, eliminating
tax breaks for oil companies and reversing tax cuts for the wealthiest Americans,
banning congressional earmarks, and reducing healthcare costs. He promised the
United States would have one million electric vehicles on the road by 2015 and
be 80% reliant on \"clean\" electricity.\n\n\n==== LGBT rights ====\nOn October
8, 2009, Obama signed the Matthew Shepard and James Byrd Jr. Hate Crimes
Prevention Act, a measure that expanded the 1969 United States federal hate-crime
law to include crimes motivated by a victim's actual or perceived gender, sexual
orientation, gender identity, or disability.On October 30, 2009, Obama lifted the
ban on travel to the United States by those infected with HIV, which was celebrated
by Immigration Equality.On December 22, 2010, Obama signed the Don't Ask, Don't
Tell Repeal Act of 2010, which fulfilled a key promise made in the 2008
presidential campaign to end the Don't ask, don't tell policy of 1993 that had
prevented gay and lesbian people from serving openly in the United States Armed
Forces. In 2016, the Pentagon also ended the policy that barred transgender
people from serving openly in the military.`;
const doc = nlp.readDoc( text );
console.log( doc.sentences().itemAt(2).out() );
The output for this was:
==== LGBT rights ====
On October 8, 2009, Obama signed the Matthew Shepard and James Byrd Jr. Hate Crimes Prevention Act, a measure that expanded the 1969 United States federal hate-crime law to include crimes motivated by a victim's actual or perceived gender, sexual orientation, gender identity, or disability.
The expected outcome would be that ==== LGBT rights ==== and the rest of the text are in two separate sentences. This might be too specific a use case to actually solve for.
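Until the segmenter handles this, one option is to separate heading lines from body text before the text reaches wink-nlp. The following is a minimal sketch under stated assumptions: `splitWikitextHeadings` is a hypothetical helper, not part of wink-nlp, and it assumes each heading sits on its own line with the same number of = characters on both sides.

```javascript
// Hypothetical pre-processing step (not a wink-nlp API): splits Wikitext into
// heading blocks and text blocks so each can be processed separately.
function splitWikitextHeadings(text) {
  // One to six "=" characters, the title, then the same run of "=" again.
  const heading = /^(={1,6})\s*(.+?)\s*\1\s*$/;
  const blocks = [];
  let body = [];

  // Emit the accumulated body lines as one text block, skipping empty ones.
  const flush = () => {
    const joined = body.join('\n').trim();
    if (joined) blocks.push({ type: 'text', value: joined });
    body = [];
  };

  for (const line of text.split('\n')) {
    const match = heading.exec(line);
    if (match) {
      flush();
      blocks.push({ type: 'heading', level: match[1].length, value: match[2] });
    } else {
      body.push(line);
    }
  }
  flush();
  return blocks;
}

const sample = 'Intro sentence.\n\n\n==== LGBT rights ====\nOn October 8, 2009, Obama signed the act.';
console.log(splitWikitextHeadings(sample));
```

Each block of type 'text' could then be passed to nlp.readDoc on its own, so a heading never ends up merged into the first sentence of the section that follows it.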
The wink-nlp install script does not work with pnpm monorepos...
❯ node -e "require( 'wink-nlp/models/install' )" wink-eng-lite-model
npm uninstall https://github.com/winkjs/wink-eng-lite-model/releases/download/1.3.1/wink-eng-lite-model-1.3.1.tgz
npm ERR! code EUNSUPPORTEDPROTOCOL
npm ERR! Unsupported URL Type "workspace:": workspace:*
npm ERR! A complete log of this run can be found in:
npm ERR! /home/brian/.npm/_logs/2022-08-11T14_46_21_368Z-debug-0.log
node:child_process:926
throw err;
^
Error: Command failed: npm uninstall https://github.com/winkjs/wink-eng-lite-model/releases/download/1.3.1/wink-eng-lite-model-1.3.1.tgz
at checkExecSyncError (node:child_process:851:11)
at Object.execSync (node:child_process:923:15)
at Object.<anonymous> (/project/node_modules/.pnpm/[email protected]/node_modules/wink-nlp/models/install.js:53:14)
at Module._compile (node:internal/modules/cjs/loader:1120:14)
at Module._extensions..js (node:internal/modules/cjs/loader:1174:10)
at Module.load (node:internal/modules/cjs/loader:998:32)
at Module._load (node:internal/modules/cjs/loader:839:12)
at Module.require (node:internal/modules/cjs/loader:1022:19)
at require (node:internal/modules/cjs/helpers:102:18) {
status: 1,
signal: null,
output: [ null, null, null ],
pid: 882155,
stdout: null,
stderr: null
}
Node.js v18.7.0
Distributing the wink-eng-lite-model as an npm package should fix this issue.
Hi.
Thanks for the nice tool!
Please consider the code below. It freezes on my machine at the nlp.readDoc call.
const winkNLP = require('wink-nlp')
const model = require('wink-eng-lite-model')
const nlp = winkNLP(model)
const content = '138375720109463900845220131105025504431resources094639008452'
nlp.readDoc(content)
I use Node.js v16 and these dependencies:
"wink-eng-lite-model": "https://github.com/winkjs/wink-eng-lite-model/releases/download/1.3.0/wink-eng-lite-model-1.3.0.tgz",
"wink-nlp": "^1.7.1"