n-gram

Get n-grams.

What is this?

This package gets you bigrams, trigrams, all the n-grams!

When should I use this?

If you’re here, you’re probably working with natural language and already know you need n-grams.

Install

This package is ESM only. In Node.js (version 12.20+, 14.14+, 16.0+), install with npm:

npm install n-gram

In Deno with esm.sh:

import {nGram} from 'https://esm.sh/n-gram@2'

In browsers with esm.sh:

<script type="module">
  import {nGram} from 'https://esm.sh/n-gram@2?bundle'
</script>

Use

import {bigram, trigram, nGram} from 'n-gram'

bigram('n-gram') // ['n-', '-g', 'gr', 'ra', 'am']
nGram(2)('n-gram') // ['n-', '-g', 'gr', 'ra', 'am']

trigram('n-gram') // ['n-g', '-gr', 'gra', 'ram']

nGram(6)('n-gram') // ['n-gram']
nGram(7)('n-gram') // []

// Anything with a `.length` and `.slice` works: arrays too.
bigram(['alpha', 'bravo', 'charlie']) // [['alpha', 'bravo'], ['bravo', 'charlie']]

API

This package exports the identifiers nGram, bigram, and trigram. There is no default export.

nGram(n)

Create a function that converts a given value to n-grams.

Want padding (to include partial matches)? Use something like the following: nGram(2)(' ' + value + ' ')
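
The padding tip can be tried without installing anything. The `nGramSketch` helper below is a hypothetical stand-in written for this example, not the package's source. It assumes only the behavior documented above: a sliding window over any value with `.length` and `.slice`.

```javascript
// Hypothetical stand-in for the package's `nGram`, for illustration only.
function nGramSketch(n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new Error('`n` must be a positive integer')
  }

  return function (value) {
    const grams = []
    if (value === null || value === undefined) return grams
    // Slide a window of size `n` across the value.
    for (let index = 0; index + n <= value.length; index++) {
      grams.push(value.slice(index, index + n))
    }
    return grams
  }
}

const plain = nGramSketch(2)('ab')
const padded = nGramSketch(2)(' ' + 'ab' + ' ')

console.log(plain) // ['ab']
console.log(padded) // [' a', 'ab', 'b ']
```

Padding adds the partial matches at both edges (`' a'` and `'b '`), which can help when the start and end of a value should count as features.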

bigram(value)

Shortcut for nGram(2).

trigram(value)

Shortcut for nGram(3).

Types

This package is fully typed with TypeScript. It exports no additional types.

Compatibility

This package is at least compatible with all maintained versions of Node.js. As of now, that is Node.js 14.14+, 16.0+, and 18.0+. It also works in Deno and modern browsers.

Contribute

Yes please! See How to Contribute to Open Source.

Security

This package is safe.

License

MIT © Titus Wormer

n-gram's People

Contributors

greenkeeperio-bot, maffoobristol, wooorm

n-gram's Issues

N-grams of longer arrays?

Is it possible to get the bigrams of: [alpha bravo bravo, charlie delta, alpha bravo]

Ideally I'd like a return of

2 alpha bravo
1 bravo bravo
1 charlie delta
0 bravo charlie
0 delta alpha

Perhaps my inputs are wrong, but it seems to be crossing the boundaries to return

2 alpha bravo
1 bravo bravo
1 charlie delta
1 bravo charlie
1 delta alpha
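
One possible way to avoid crossing boundaries is to compute bigrams per phrase and merge the counts afterwards. The sketch below assumes the input is an array of three phrase strings (the issue does not say so explicitly), and `countBigramsPerPhrase` is a hypothetical helper, not part of the package.

```javascript
// Hypothetical helper: count word bigrams without crossing phrase boundaries.
function countBigramsPerPhrase(phrases) {
  const counts = new Map()
  for (const phrase of phrases) {
    const words = phrase.split(/\s+/).filter(Boolean)
    // Window over this phrase only, so pairs never span two phrases.
    for (let index = 0; index + 2 <= words.length; index++) {
      const key = words.slice(index, index + 2).join(' ')
      counts.set(key, (counts.get(key) || 0) + 1)
    }
  }
  return counts
}

const result = countBigramsPerPhrase([
  'alpha bravo bravo',
  'charlie delta',
  'alpha bravo'
])

console.log(Object.fromEntries(result))
// {'alpha bravo': 2, 'bravo bravo': 1, 'charlie delta': 1}
```

Pairs such as `bravo charlie` never appear because each phrase is windowed on its own. With the package installed, the inner loop could presumably be replaced by `bigram(words)`, since the Use section shows that arrays are accepted.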

Usage without a module?

I'm trying to use const { bigram, trigram, nGram } = require('n-gram');

but I'm getting the error

Error [ERR_REQUIRE_ESM]: require() of ES Module C:\Users\x\x\x\x\node_modules\n-gram\index.js from C:\Users\x\x\x\x\server.js not supported. Instead change the require of index.js in C:\Users\x\x\x\x\server.js to a dynamic import() which is available in all CommonJS modules. at Object.<anonymous> (C:\Users\x\x\x\x\server.js:8:36) { code: 'ERR_REQUIRE_ESM'
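
As the error message itself suggests, a CommonJS file can still load an ESM-only package through dynamic `import()`. A minimal sketch, assuming `n-gram` is installed; `loadNGram` is a hypothetical wrapper name chosen for this example:

```javascript
// CommonJS file (for example `server.js`): `require('n-gram')` fails because
// the package is ESM only, but dynamic `import()` is still available.
async function loadNGram() {
  const {bigram, trigram, nGram} = await import('n-gram')
  return {bigram, trigram, nGram}
}

// Usage, assuming `n-gram` is installed:
// loadNGram().then(({bigram}) => console.log(bigram('n-gram')))
```

Because `import()` returns a promise, the functions are only available asynchronously. The alternative is to convert the calling file to ESM (for example by renaming it to `.mjs`) and use the static `import` shown in the Use section.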

Avoid duplicates

It doesn't seem to avoid duplicates... In an index, duplicates should be avoided to improve the performance.
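
Nothing in the README above suggests the package deduplicates, so callers who need unique n-grams can do it themselves. A minimal sketch for string input, using a `Set`; `bigramSketch` is a hypothetical stand-in for the package's `bigram` so the example runs on its own:

```javascript
// Hypothetical stand-in for the package's `bigram`, for illustration only.
function bigramSketch(value) {
  const grams = []
  for (let index = 0; index + 2 <= value.length; index++) {
    grams.push(value.slice(index, index + 2))
  }
  return grams
}

const all = bigramSketch('banana')
// A `Set` keeps the first occurrence of each string and preserves order.
const unique = [...new Set(all)]

console.log(all) // ['ba', 'an', 'na', 'an', 'na']
console.log(unique) // ['ba', 'an', 'na']
```

This only works as-is for string n-grams. For array input each n-gram is itself an array, and a `Set` compares arrays by reference, so a string key (for example the joined words) would be needed.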

Support for arrays

I tried putting an array rather than a string in, but it still treated it like a joined string. Is there any way in which n-grams of words rather than characters can be supported?

Thanks!
