minisearch's Introduction

MiniSearch

MiniSearch is a tiny but powerful in-memory fulltext search engine written in JavaScript. It is respectful of resources, and it can comfortably run both in Node and in the browser.

Try out the demo application.

Find the complete documentation and API reference here, and more background about MiniSearch, including a comparison with other similar libraries, in this blog post.

MiniSearch follows semantic versioning, and documents releases and changes in the changelog.

Use case

MiniSearch addresses use cases where full-text search features are needed (e.g. prefix search, fuzzy search, ranking, boosting of fields…), but the data to be indexed can fit locally in the process memory. While you won't index the whole Internet with it, there are surprisingly many use cases that are served well by MiniSearch. By storing the index in local memory, MiniSearch can work offline, and can process queries quickly, without network latency.

A prominent use-case is real time search "as you type" in web and mobile applications, where keeping the index on the client enables fast and reactive UIs, removing the need to make requests to a search server.

Features

  • Memory-efficient index, designed to support memory-constrained use cases like mobile browsers.

  • Exact match, prefix search, fuzzy match, field boosting.

  • Auto-suggestion engine, for auto-completion of search queries.

  • Modern search result ranking algorithm.

  • Documents can be added and removed from the index at any time.

  • Zero external dependencies.

MiniSearch strives to expose a simple API that provides the building blocks to build custom solutions, while keeping a small and well tested codebase.

Installation

With npm:

npm install minisearch

With yarn:

yarn add minisearch

Then require or import it in your project:

// If you are using import:
import MiniSearch from 'minisearch'

// If you are using require:
const MiniSearch = require('minisearch')

Alternatively, if you prefer to use a <script> tag, you can load MiniSearch from a CDN:

<script src="https://cdn.jsdelivr.net/npm/minisearch@<version>/dist/umd/index.min.js"></script>

In this case, MiniSearch will appear as a global variable in your project.

Finally, if you want to manually build the library, clone the repository and run yarn build (or yarn build-minified for a minified version + source maps). The compiled source will be created in the dist folder (UMD, ES6 and ES2015 module versions are provided).

Usage

Basic usage

// A collection of documents for our examples
const documents = [
  {
    id: 1,
    title: 'Moby Dick',
    text: 'Call me Ishmael. Some years ago...',
    category: 'fiction'
  },
  {
    id: 2,
    title: 'Zen and the Art of Motorcycle Maintenance',
    text: 'I can see by my watch...',
    category: 'fiction'
  },
  {
    id: 3,
    title: 'Neuromancer',
    text: 'The sky above the port was...',
    category: 'fiction'
  },
  {
    id: 4,
    title: 'Zen and the Art of Archery',
    text: 'At first sight it must seem...',
    category: 'non-fiction'
  },
  // ...and more
]

let miniSearch = new MiniSearch({
  fields: ['title', 'text'], // fields to index for full-text search
  storeFields: ['title', 'category'] // fields to return with search results
})

// Index all documents
miniSearch.addAll(documents)

// Search with default options
let results = miniSearch.search('zen art motorcycle')
// => [
//   { id: 2, title: 'Zen and the Art of Motorcycle Maintenance', category: 'fiction', score: 2.77258, match: { ... } },
//   { id: 4, title: 'Zen and the Art of Archery', category: 'non-fiction', score: 1.38629, match: { ... } }
// ]

Search options

MiniSearch supports several options for more advanced search behavior:

// Search only specific fields
miniSearch.search('zen', { fields: ['title'] })

// Boost some fields (here "title")
miniSearch.search('zen', { boost: { title: 2 } })

// Prefix search (so that 'moto' will match 'motorcycle')
miniSearch.search('moto', { prefix: true })

// Search within a specific category
miniSearch.search('zen', {
  filter: (result) => result.category === 'fiction'
})

// Fuzzy search, in this example with a max edit distance of 0.2 * term length,
// rounded to the nearest integer. The misspelled 'ismael' will match 'ishmael'.
miniSearch.search('ismael', { fuzzy: 0.2 })

// You can set the default search options upon initialization
miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  searchOptions: {
    boost: { title: 2 },
    fuzzy: 0.2
  }
})
miniSearch.addAll(documents)

// It will now by default perform fuzzy search and boost "title":
miniSearch.search('zen and motorcycles')
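
To make the fuzzy option more concrete: per the comment above, a fractional value is multiplied by the term length and rounded to the nearest integer. A minimal sketch of that arithmetic follows. The helper name maxEditDistance is hypothetical and is not part of the MiniSearch API; it only mirrors the rule as described here.

```javascript
// Hypothetical helper: mirrors the rule described above
// (fuzzy fraction * term length, rounded to the nearest integer).
function maxEditDistance (term, fuzzy) {
  return Math.round(term.length * fuzzy)
}

console.log(maxEditDistance('ismael', 0.2)) // 6 * 0.2 = 1.2, rounds to 1
console.log(maxEditDistance('zen', 0.2)) // 3 * 0.2 = 0.6, rounds to 1
console.log(maxEditDistance('ab', 0.2)) // 2 * 0.2 = 0.4, rounds to 0
```

Under this rule, very short terms get no fuzziness at all, while longer terms tolerate proportionally more edits.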

Auto suggestions

MiniSearch can suggest search queries given an incomplete query:

miniSearch.autoSuggest('zen ar')
// => [ { suggestion: 'zen archery art', terms: [ 'zen', 'archery', 'art' ], score: 1.73332 },
//      { suggestion: 'zen art', terms: [ 'zen', 'art' ], score: 1.21313 } ]

The autoSuggest method takes the same options as the search method, so you can get suggestions for misspelled words using fuzzy search:

miniSearch.autoSuggest('neromancer', { fuzzy: 0.2 })
// => [ { suggestion: 'neuromancer', terms: [ 'neuromancer' ], score: 1.03998 } ]

Suggestions are ranked by the relevance of the documents that would be returned by that search.

Sometimes, you might need to filter auto suggestions to, say, only a specific category. You can do so by providing a filter option:

miniSearch.autoSuggest('zen ar', {
  filter: (result) => result.category === 'fiction'
})
// => [ { suggestion: 'zen art', terms: [ 'zen', 'art' ], score: 1.21313 } ]

Field extraction

By default, documents are assumed to be plain key-value objects with field names as keys and field values as simple values. In order to support custom field extraction logic (for example for nested fields, or non-string field values that need processing before tokenization), a custom field extractor function can be passed as the extractField option:

// Assuming that our documents look like:
const documents = [
  { id: 1, title: 'Moby Dick', author: { name: 'Herman Melville' }, pubDate: new Date(1851, 9, 18) },
  { id: 2, title: 'Zen and the Art of Motorcycle Maintenance', author: { name: 'Robert Pirsig' }, pubDate: new Date(1974, 3, 1) },
  { id: 3, title: 'Neuromancer', author: { name: 'William Gibson' }, pubDate: new Date(1984, 6, 1) },
  { id: 4, title: 'Zen in the Art of Archery', author: { name: 'Eugen Herrigel' }, pubDate: new Date(1948, 0, 1) },
  // ...and more
]

// We can support nested fields (author.name) and date fields (pubDate) with a
// custom `extractField` function:

let miniSearch = new MiniSearch({
  fields: ['title', 'author.name', 'pubYear'],
  extractField: (document, fieldName) => {
    // If field name is 'pubYear', extract just the year from 'pubDate'
    if (fieldName === 'pubYear') {
      const pubDate = document['pubDate']
      return pubDate && pubDate.getFullYear().toString()
    }

    // Access nested fields
    return fieldName.split('.').reduce((doc, key) => doc && doc[key], document)
  }
})

The default field extractor can be obtained by calling MiniSearch.getDefault('extractField').

Tokenization

By default, documents are tokenized by splitting on Unicode space or punctuation characters. The tokenization logic can be easily changed by passing a custom tokenizer function as the tokenize option:

// Tokenize splitting by hyphen
let miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  tokenize: (string, _fieldName) => string.split('-')
})

Upon search, the same tokenization is used by default, but it is possible to pass a tokenize search option in case a different search-time tokenization is necessary:

// Tokenize splitting by hyphen
let miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  tokenize: (string) => string.split('-'), // indexing tokenizer
  searchOptions: {
    tokenize: (string) => string.split(/[\s-]+/) // search query tokenizer
  }
})

The default tokenizer can be obtained by calling MiniSearch.getDefault('tokenize').

Term processing

Terms are downcased by default. No stemming is performed, and no stop-word list is applied. To customize how the terms are processed upon indexing, for example to normalize them, filter them, or to apply stemming, the processTerm option can be used. The processTerm function should return the processed term as a string, or a falsy value if the term should be discarded:

let stopWords = new Set(['and', 'or', 'to', 'in', 'a', 'the', /* ...and more */ ])

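// Perform custom term processing (here downcasing and discarding stop words)
let miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  processTerm: (term, _fieldName) => {
    // Downcase first, so that 'The' is also recognized as the stop word 'the'
    const lowered = term.toLowerCase()
    return stopWords.has(lowered) ? null : lowered
  }
})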

By default, the same processing is applied to search queries. In order to apply a different processing to search queries, supply a processTerm search option:

let miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  processTerm: (term) => {
    // index term processing: downcase first, so 'The' matches the stop word 'the'
    const lowered = term.toLowerCase()
    return stopWords.has(lowered) ? null : lowered
  },
  searchOptions: {
    processTerm: (term) => term.toLowerCase() // search query processing
  }
})

The default term processor can be obtained by calling MiniSearch.getDefault('processTerm').

API Documentation

Refer to the API documentation for details about configuration options and methods.

Browser compatibility

MiniSearch natively supports all modern browsers implementing JavaScript standards, but requires a polyfill when used in Internet Explorer, as it makes use of functions like Object.entries, Array.includes, and Array.from, which are standard but not available on older browsers. The package core-js is one such polyfill that can be used to provide those functions.

Contributing

Contributions to MiniSearch are welcome! Please read the contribution guidelines. Reading the design document is also useful to understand the project goals and the technical implementation.

minisearch's People

Contributors

alessandrobardini, dependabot[bot], emilianox, graphman65, grimmen, lucaong, mister-hope, nilclass, rolftimmermans, ryan-codingintrigue, samuelmeuli, sandstrom, stalniy, th317erd, vreshch

minisearch's Issues

Is it possible to search multiple separate objects?

In all of the examples I've encountered, minisearch does its thing against an object that contains different books, song titles, etc.

What I have been doing is using minisearch to find sections within a large document. That works.
Format is:
[
{ id: 1,
name: "Section Title",
link_name: "an id tag to search in the HTML doc",
description: "full text of that section to search against" },
...
]

What I want to do next is search over multiple documents, ideally each composed of their own separate object. This would be a lot nicer than combining all my document objects into one giant object.

Is that possible? Thanks in advance.

miniSearch.toJSON() + miniSearch.loadJSON() combination doesn't work

The miniSearch.toJSON() + miniSearch.loadJSON() combination doesn't work because the options are absent from the toJSON() output. As a result, loading fails with an error about the missing options.fields property.

A second issue is that miniSearch.loadJSON(jsonObj) requires jsonObj to be a JSON.stringify-ed string, whereas miniSearch.toJSON() returns a non-stringified object.
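
The string-versus-object mismatch described above follows from how JSON.stringify treats a toJSON() method in JavaScript. The sketch below uses a stand-in class (FakeIndex, hypothetical) rather than the real library, so it only illustrates the language behavior; the commented-out loadJSON call with a second options argument is an assumption about how the options would be supplied again.

```javascript
// Stand-in for an index object: only the toJSON() hook matters for this sketch
class FakeIndex {
  constructor (documentCount) {
    this.documentCount = documentCount
  }

  // Returns a plain object, not a string
  toJSON () {
    return { documentCount: this.documentCount }
  }
}

const index = new FakeIndex(2)
console.log(typeof index.toJSON()) // 'object'

// JSON.stringify() calls toJSON() implicitly and produces the string form
const json = JSON.stringify(index)
console.log(json) // {"documentCount":2}

// Options are not part of the serialized data, so they travel separately
const options = { fields: ['title', 'text'] }
// Assumption: with the real library the string and the options would be
// passed together, along the lines of MiniSearch.loadJSON(json, options)
```

In other words, calling JSON.stringify on the index (rather than calling toJSON() directly) yields the string, and the same options used at creation have to be kept around for loading.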

Allow search query to use a different tokenizer

I'm evaluating MiniSearch for a project where I need to search through a fairly small set of documents (about 10-100 documents, only titles). Prefix searching is quite important in this use-case. One issue I don't know how to deal with though is the fact that if I have a tokenizer that removes single char terms (which is fine for the tokenization of the documents), then I don't get a result on the first key (which in this case is desirable). Could the search/indexing be allowed to use different tokenizers? Or maybe just a second argument to the tokenizer indicating what kind of tokenization it's currently doing would be better?
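
A minimal sketch of the two tokenizers this request has in mind, written as plain functions. The names splitWords, indexTokenizer and searchTokenizer are hypothetical, and the splitting rule is illustrative rather than MiniSearch's default. The README describes a tokenize option for indexing and a tokenize search option for queries, which is where such functions would be passed.

```javascript
// Illustrative splitting rule (whitespace and hyphens), not MiniSearch's default
const splitWords = (text) => text.split(/[\s-]+/).filter((term) => term.length > 0)

// Index-time tokenizer: drops single-character terms
const indexTokenizer = (text) => splitWords(text).filter((term) => term.length > 1)

// Search-time tokenizer: keeps single-character terms, so that the very first
// key press already yields a term to prefix-search with
const searchTokenizer = (text) => splitWords(text)

console.log(indexTokenizer('a tale of two cities')) // [ 'tale', 'of', 'two', 'cities' ]
console.log(searchTokenizer('t')) // [ 't' ]

// These could then be supplied as:
//   tokenize: indexTokenizer,
//   searchOptions: { tokenize: searchTokenizer, prefix: true }
```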

can i use it with a cordova app?

i have a cordova app, a drug reference with about 200 html pages (total size of them is about 40mb), is there a way that i can use minisearch to search the whole app (the whole static site) off-line? and if yes can you illustrate how

Crash when searching "constructor"

Hello! First, thanks for Minisearch, that's a useful, well-designed lib.

I've found an interesting bug though: I use Minisearch for technical markdown documents, and some of them contain the word "constructor". Minisearch crashes when I'm searching for any term that begins the word "constructor" (like "c", "con", "const", "construct"...)

The code I'm using for the search:

const results = this.miniSearch.search(search, { prefix: true, boost: { title: 10 } })

Here's the (webpack-mangled) outputted error:

TypeError: results[documentId].match[term].push is not a function
    at eval (index.js?7f7f:1732)
    at Array.forEach (<anonymous>)
    at eval (index.js?7f7f:1714)
    at Array.reduce (<anonymous>)
    at termResults (index.js?7f7f:1703)
    at MiniSearch.executeQuery (index.js?7f7f:1521)
    at eval (index.js?7f7f:1295)
    at Array.map (<anonymous>)
    at MiniSearch.search (index.js?7f7f:1294)
    at FilesNotesRepository._callee6$ (notes-files-repository.ts?700d:132)

index.js:1732 refers to results[documentId].match[term].push(field); in this code:

    Object.entries(ds).forEach(function (_ref15) {
      var documentId = _ref15[0],
          tf = _ref15[1];
      var docBoost = boostDocument ? boostDocument(self._documentIds[documentId], term) : 1;

      if (!docBoost) {
        return;
      }

      var normalizedLength = self._fieldLength[documentId][fieldId] / self._averageFieldLength[fieldId];
      results[documentId] = results[documentId] || {
        score: 0,
        match: {},
        terms: []
      };
      results[documentId].terms.push(term);
      results[documentId].match[term] = results[documentId].match[term] || [];
      results[documentId].score += docBoost * score(tf, df, self._documentCount, normalizedLength, boost, editDistance);
      results[documentId].match[term].push(field);
    });

Thank you
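
One plausible explanation, judging only from the snippet above: match is created as a plain object literal, and every plain object inherits a constructor property from Object.prototype. The expression match[term] || [] then finds that inherited function instead of falling back to an empty array. The sketch below reproduces this in isolation and shows one possible guard; it is not taken from the library's source.

```javascript
const term = 'constructor'

// A plain object literal inherits `constructor` from Object.prototype
const match = {}
match[term] = match[term] || []
console.log(typeof match[term]) // 'function'
console.log(typeof match[term].push) // 'undefined', hence "push is not a function"

// A prototype-less object (or a Map) has no inherited keys
const safeMatch = Object.create(null)
safeMatch[term] = safeMatch[term] || []
safeMatch[term].push('title')
console.log(safeMatch[term]) // [ 'title' ]
```

The same concern would apply to other inherited names such as toString or hasOwnProperty when they are used as keys on plain objects.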

Use different field as id

In my case each record has only "name" and "url" fields. I'd like name to act as id as it's unique.

Is it possible to configure minisearch so it uses name field and doesn't complain that documents don't have this field?

Typescript - MiniSearch.default is not a constructor

Hi there! I am using minisearch with my TypeScript project and I am really liking minisearch. However, I am having some difficulties with instantiating a MiniSearch using the built-in types.

I have the latest version: 3.0.0.
The code looks something like this:

import * as MiniSearch from 'minisearch';

const miniSearch = new MiniSearch({
    fields: fieldsWithLabel,
    idField: 'id',
  });

I get the error:

This expression is not constructable.
  Type 'typeof import("/PROJECTPATH/node_modules/minisearch/dist/types/index")' has no construct signatures.ts(2351)

If I try to do:

import * as MiniSearch from 'minisearch';

const miniSearch = new MiniSearch.default({
    fields: fieldsWithLabel,
    idField: 'id',
  });

I get this error:

TypeError: MiniSearch.default is not a constructor

For now, my workaround is that I added my own types at types/miniSearch/index.d.ts with:

declare module 'minisearch';

But I would much rather use the extensive typing built into the module. Do you know how to fix this?

Bug when removing documents whose fields are indexed via extractField

I reproduced this on 2.5.0 today:

const MiniSearch = require('minisearch');

const miniSearch = new MiniSearch({
  fields: ['data.title'],
  storeFields: ['id'],
  extractField: (document, fieldName) => fieldName.split('.').reduce((doc, key) => doc && doc[key], document)
})

const
  item1 = {id: 1, data: { title: "foo"} },
  item2 = {id: 2, data: { title: "bar"} };

miniSearch.addAll([item1, item2]);

console.log(miniSearch.search("foo"));

miniSearch.remove(item1);

console.log(miniSearch.search("foo")); // should be an empty list, but contains:

/*
[
  {
    id: undefined,
    terms: [ 'foo' ],
    score: 0,
    match: { foo: [Array] }
  }
]
*/

I couldn't repro without extractField config.

It's not critical since I can either normalize items into flatter structures, or filter the result for undefined in my ids, but I thought you'd like to know.

Repro'd interactively for your convenience.

Partial Searches

Great library. I wanted to ask about partial searches. What if you index the following titles:

fox trot bax

And you search for 'fo' however the term 'fox' is not returned. How can I accomplish something like this?

How large can an index be?

Any sense of how large an index's source-collection file size may be before performance becomes an issue?

How can I know how relevant a result is to my search input?

First I would like to thank you for this great contribution and amazing work you provided for the open community.

I have a case where I need to know how relevant a result is to my search input. I know there is a score, but I can't rely on it to judge the quality of the results, because the score formula depends on many factors.

Is there a way to say, for example, that a score above "x" means the result quality is excellent and my clients actually found what they were searching for, while a score below "x" means the client may or may not have found it?

Thank you so much again.

Type definitions

Hi @lucaong,

Thanks a lot for your work on this awesome library! :)

Type definitions would be a great feature to have when using this library in a TypeScript project.

Is minisearch able to work with nested object fields?

Hi, I came across this library via googling for browser-compatible Elasticsearch alternatives. It looks great and I'm excited to try it out!

My use case is that I am retrieving a subset of results from an Elasticsearch endpoint, that I would like to be searchable in an offline environment. These documents have a nested structure:

const docs = [
  {
    field: [
      { subfield: 'some value' }
    ]
  },
  {
    field: [
      { subfield: 'some other value' }
    ]
  },
  // etc.
]

Elasticsearch allows for specifying searching subfields, like so:

"query": {
  "bool": {
    "must": [
      {
        "multi_match": {
          "query": "other",
          "fields": [
            "field.subfield"
          ],
          "fuzziness": "AUTO"
        }
      }
    ]
  }
},

Is it possible to achieve a similar search with this library?
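
A sketch of how the extractField option described in the README might cover this shape of document. The function name extractNested is hypothetical. It follows a dotted path, steps through arrays along the way, and joins whatever it finds into a single string, on the assumption that the result is then tokenized like any other field value.

```javascript
// Hypothetical extractor: follows a dotted path, flattening arrays along the way
const extractNested = (document, fieldName) => {
  const values = fieldName.split('.').reduce(
    // flatMap spreads array values and passes single values through unchanged
    (current, key) => current.flatMap((item) => (item == null ? [] : item[key])),
    [document]
  )
  return values.filter((value) => value != null).join(' ')
}

const doc = {
  id: 1,
  field: [
    { subfield: 'some value' },
    { subfield: 'some other value' }
  ]
}

console.log(extractNested(doc, 'field.subfield')) // 'some value some other value'
console.log(extractNested(doc, 'field.missing')) // ''

// It could then be supplied as:
//   fields: ['field.subfield'], extractField: extractNested
```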

Suffix search

Hello, I would like to know if a suffix search exists, or is there any easy "hack" to implement it.

I have a use case where I search through documents whose codes all start the same way; the only differences between them are the last few characters. To avoid typing the many leading characters, which don't help, I would like to type only the last few.

Here is a stackblitz showing my use case https://stackblitz.com/edit/angular-ivy-ygmjz9
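
One possible workaround, offered as a sketch rather than a built-in feature: if each term is stored reversed, a suffix of the original term becomes a prefix of the stored one, so ordinary prefix search can be reused. The codes below are made up for illustration, and reverse and processTerm are hypothetical names; the latter is meant as something that could be passed as the processTerm option from the README.

```javascript
// Reverse a string (Array.from keeps surrogate pairs intact)
const reverse = (term) => Array.from(term).reverse().join('')

// Term processor: downcase, then reverse
const processTerm = (term) => reverse(term.toLowerCase())

// Made-up codes sharing a long common beginning
const indexed = ['ABC0001X17', 'ABC0001Y42'].map(processTerm)
console.log(indexed) // [ '71x1000cba', '24y1000cba' ]

// The query goes through the same processing, then is matched as a prefix
const query = processTerm('Y42')
console.log(indexed.filter((term) => term.startsWith(query))) // [ '24y1000cba' ]
```

The trade-off is that regular prefix search no longer works on terms processed this way, so a separate field or a separate index for the reversed terms may be needed.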

Suggestion: Add the ability to index an array field

Hi,
I currently need to index all the strings found in a field that is an array. Right now I'm creating an object with several fields (item1, item2... item9), assigning the array values to each field, and then indexing that document. Of course this has several limitations, but right now I don't see any other way to do it.

Could you add support to index an array of strings as value of a field? Thanks in advance!
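
Until something like this is supported directly, one way to approximate it with the extractField option from the README is to join array values into one string before tokenization. A minimal sketch, in which the function name extractArrayAware and the sample document are assumptions for illustration:

```javascript
// Hypothetical extractor: joins arrays of strings, passes other values through
const extractArrayAware = (document, fieldName) => {
  const value = document[fieldName]
  return Array.isArray(value) ? value.join(' ') : value
}

const doc = { id: 1, title: 'Fruit basket', items: ['red apple', 'green pear'] }

console.log(extractArrayAware(doc, 'items')) // 'red apple green pear'
console.log(extractArrayAware(doc, 'title')) // 'Fruit basket'

// It could then be supplied as:
//   fields: ['title', 'items'], extractField: extractArrayAware
```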

How to limit the number of results

Hi Luca,

Excellent tool! I am testing it and it works great. I would like to ask how to limit the number of results returned: for example, I only want the first 10 results of a search.

I really appreciate the help.

Regards from Ecuador
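
Since search returns a plain array (as shown in the README examples), one simple approach is to slice it. The results array below is a stand-in built for this sketch, assumed to be already sorted by score:

```javascript
// Stand-in for the array returned by miniSearch.search(...)
const results = Array.from({ length: 25 }, (_, i) => ({ id: i + 1, score: 25 - i }))

// Keep only the first 10 results
const topTen = results.slice(0, 10)

console.log(topTen.length) // 10
console.log(topTen[0].id) // 1
console.log(topTen[9].id) // 10
```

Note that this trims the output only; the full search still runs before the slice.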

Usage with Redux

Is there a way to use this with Redux?

I have a couple of React components creating their own minisearch stores using the same data from Redux. I'd rather put the minisearch store into Redux and have one instance of that data rather than two.

It's better to store plain objects in Redux and I'm wondering is there a way to retrieve the underlying data structure for a minisearch store, store it in Redux, and then 'rehydrate' into minisearch again to actually search the data?

Score enhancements

Hi Luca,

Congratulations, and thank you, for writing such great code and creating such an excellent client side search solution.

Minisearch is awesome!

I'm very keen to implement it in an application that searches approx 6000 food products and I'm hoping that you may be able to give me some advice on the best way to improve some of the score results that I'm getting on my data.

My customers search by product codes, or product descriptions and/or product brands so those are the 3 fields that I'm searching on.

I'm experimenting with fuzzy settings of around .5 to catch spelling issues on words like broccoli, for which I'm using test cases of 'br', 'bro', 'broc', 'broco', 'brocol', 'brocoli' etc

One of my other main test cases is 'cheese', e.g. 'ch', 'che', 'chee', 'chees', 'cheese'

I'm using the following boost settings - product code (2.1) product description (2) product brand (1.5)

I've put together a Google sheet to show 4 examples of where I would like to get different score results. The sheet is at https://docs.google.com/spreadsheets/d/1gKS2nbeF4TivgRcXDDdc6LmLc6-Q0dRksnUSWNvIbZo/edit?usp=sharing

I can provide a json file of the full product data if that helps.

Thanks again for creating and sharing minisearch.

Regards
Norm Archibald.

New `resultField` option to group stored fields into the result objects.

Hi,
thank you for your awesome library! Using a new resultField option, the search result could return the stored fields grouped together (to avoid conflicts on documents with the reserved properties score, terms, match):

const miniSearch = new MiniSearch({
  fields: ['title', 'text'],
  storeFields: ['title', 'category'],
  resultField: 'doc' // proposed option
})
miniSearch.addAll(documents)
let results = miniSearch.search('zen art motorcycle')
// => [
//   { id: 2, doc: { title: 'Zen and the Art of Motorcycle Maintenance', category: 'fiction' }, score: 2.77258, terms: [ ... ], match: { ... } },
//   { id: 4, doc: { title: 'Zen and the Art of Archery', category: 'non-fiction' }, score: 1.38629, terms: [ ... ], match: { ... } }
// ]

All the best!

Suggestion: pass in document as 3rd param to tokenize to access nested objects

I have documents with nested fields e.g.

[{ id: '', name: '', location: { postal: '11111', placename: '', unit: '07-03' } }]

i would like to access the nested field within e.g. 'location.postal'

currently i can use tokenize to return [postal,unit]
e.g.

tokenize(value, key) {
  if (key === 'location') return [value.postal, value.unit]
}

but both results are under the same key "location"

minisearch.search('query', { fields: ['location'] });  // this will search both "postal" & "unit" even if i just need "postal" field

if tokenize simply take in a 3rd parameter, the entire document, i could parse the field key myself to access the nested value
e.g.

tokenize(value, key, document) {
  if (key === 'location.postal') return document.location.postal;
  if (key === 'location.unit') return document.location.unit;
}

Use extractField to extract document id field

I've found that using a document structure with an id that is nested, like this for example where tracker.id is the document id:

{
  tracker : {
    id : 'documentId',
    ...
  },
  asset : { ... }
}

is not possible. I'm not sure if there are other considerations preventing this, but it seems as if using extractField to extract the idField in addition to regular fields might solve the issue. I would like to avoid changing the document structure if possible.

Ship ES6 version of the library

Today, major browsers support ES6 and ESM, so it makes sense to ship an ES6 version of your library together with the UMD version.

Usually there are 3 types of versions which are good to have in your package:

  • ES6 + ESM (ES modules)
  • UMD
  • ES5 + ESM (es5m or esm5)

To make this job easier I'd recommend you to use rollup instead of webpack. The resulting bundle will be smaller and without internal module system (which webpack adds in the bundle).

You can check https://github.com/stalniy/rollup-plugin-content/blob/master/rollup.config.js to see how I did this in one of my libraries.

Let me know if you need help with this, I can submit a PR.

Expose default tokenizer and term post-processor

I've written code to compose different tokenizers and term post-processors together; however, I would like to be able to fall back to the built-in ones. Today, I do this by copying the regex/logic into my application, but it would be much better if they were available as exports from the minisearch package so that they can be built upon.

Feature request: expose SearchableMap

I love MiniSearch, but a small change could make it a lot more usable for my use case. I pre-index a company's user directory on the server side and then send it to the browser--works great. What I'd like to do, however, is have a web page which lists all of the users. Without access to "_index" (the root SearchableMap), I can't really get documents outside of the search functionality. If you exported SearchableMap, and provided an accessor I think MiniSearch could handle all of my client-side needs... and then when I need detail I can fetch the full record from the server.

Array fields like tags or list

Does minisearch support also searching on array fields like a list of tags or a list of authors?

For example, find where tags contains both "a" and "b" values.
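
For the "contains both" part, one option is the filter search option shown in the README, which receives each result. The sketch below assumes that tags is listed in storeFields so that it is available on the result objects. The helper name hasAllTags and the sample results are hypothetical.

```javascript
// Hypothetical filter factory: keeps results whose tags include every required tag
const hasAllTags = (required) => (result) =>
  Array.isArray(result.tags) && required.every((tag) => result.tags.includes(tag))

// Stand-in for search results carrying a stored `tags` field
const results = [
  { id: 1, tags: ['a', 'b', 'c'] },
  { id: 2, tags: ['a'] },
  { id: 3, tags: ['b', 'a'] },
  { id: 4 }
]

console.log(results.filter(hasAllTags(['a', 'b'])).map((r) => r.id)) // [ 1, 3 ]

// It could then be supplied as:
//   miniSearch.search(query, { filter: hasAllTags(['a', 'b']) })
```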

Better support for property mangling

The issue is that I cannot mangle properties in my app due to destructuring of private properties in minisearch.

I configure terser to mangle all properties that start from _ but due to destructuring on this line search in my app doesn't work.

As a temporary workaround I marked _tree and _prefix as reserved props.

I'd appreciate it if you could change this and not use properties that start with _ in destructuring assignments.

Highlighting Matches

Hey there 👋

What is the preferred strategy to get the index of a hit within the original body in order to highlight it?
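
One approach that does not rely on positions from the index: take the matched terms from a result (the README shows results carrying a match object, and the stack trace in an earlier issue suggests its keys are the matched terms) and locate them again in the original text with a regular expression. The helper names and the sample result below are assumptions for illustration.

```javascript
// Escape characters that have a special meaning in regular expressions
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

// Wrap every occurrence of any term in <mark> tags (case-insensitive)
const highlight = (text, terms) => {
  if (terms.length === 0) return text
  const pattern = new RegExp(`(${terms.map(escapeRegExp).join('|')})`, 'gi')
  return text.replace(pattern, '<mark>$1</mark>')
}

// Stand-in for one search result
const result = { id: 2, match: { zen: ['title'], motorcycle: ['title'] } }
const title = 'Zen and the Art of Motorcycle Maintenance'

console.log(highlight(title, Object.keys(result.match)))
// <mark>Zen</mark> and the Art of <mark>Motorcycle</mark> Maintenance
```

This sketch matches terms anywhere, including inside longer words, so word-boundary handling may be needed depending on the tokenizer in use.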

example feedback

I was excited to find minisearch since it has no dependencies, and reading the code of the example.
But the example is actually a demo, not an example, and it uses React!
You can't really see how to use autosuggest or the filter etc. if the code is minimized, and there is no link to the repository.
The demo is pretty nice, but it's named "example", not "demo". I was expecting to see some code so I can follow along to do my own. I went to the docs, but that didn't help at all.
Also, why are there two sets of CSS for the demo?
Also, why is the body background color not set (I see my browser default color)?
Also, your repository folder is called "examples" (plural), yet there is only one.

Undeclared unfetch dependency on examples

I've got this error running $ yarn run build on examples folder of v1.1.2.

ERROR in ./src/app.js
Module not found: Error: Can't resolve 'unfetch' in '/Users/dario/projects/minisearch/examples/src'
 @ ./src/app.js 61:0-28 108:6-11
 @ ./src/index.js
error Command failed with exit code 2.
info Visit https://yarnpkg.com/en/docs/cli/run for documentation about this command.

I solved it by adding unfetch to package.json dependencies:

"dependencies": {
    "unfetch": "^4.0.1"
  }

Filter Auto-Suggestions?

Hello,

I'm wondering about the possibility/best practice for filtering autosuggestions. Sorry in advance if this question is better suited for StackOverflow or a different forum.

In my app, I have a search input field and several filter selects. These filters are applied to the results returned from MiniSearch. I'm displaying search term autosuggestions as users type into the search input. So in some scenarios, autosuggestions are displayed for search results that will get filtered out.

I'm wondering about the best way to handle this. I could re-create the MiniSearch index each time a filter is changed, but wanted to see if there was a different approach.

Thanks for the plugin!

Don't publish minified sources to npm

Minified sources make it incredibly annoying to debug what happens within minisearch. Libraries, in general, should not be published minified, that's an issue for applications to deal with (if they want to). I recommend simply using rollup to bundle the sources together or running babel on the sources directly. I can help with this setup if wanted.

Serialization?

I can't find any mention of this in the docs, but is there any way to save/export/serialize an index once created? It would be nice to be able to save an index to a file and then load that string via an ajax call instead of having to retrieve and load all the documents for each request.

Removing items by id

I've got documents that look like this:

const tasks = [
{ id: 1, title: "clean the house" }, 
{ id: 2, title: "eat food" }
]

If the title of the task with id = 1 changes, I'd like to update that change in the index. However, in my current application, I don't have access to the entire old version of the document. I just know the id and the new values for the title field.

In order to remove an item in Minisearch, it looks like I need to pass the whole document that I originally added. Is there a way I can remove an item by id? If so, I can just remove by id and then add the new document.

Spread operator doesn't work on Safari.

I'm using MiniSearch for a front-end web app and it's been great so far so thanks for your work! Unfortunately when I started testing other browsers to make sure things worked I discovered the spread operator doesn't work on Safari. I changed maybe 10 or 12 lines of code to instead use Object.assign() and it seems to have fixed the issue. I'm relatively new to Javascript so there may be a better way to handle this but things seem to be working properly after my changes so I didn't look into it any further.
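
For reference, the kind of substitution described here looks roughly like the sketch below. The option names are made up for illustration and are not the lines that were actually changed.

```javascript
const defaults = { prefix: false, fuzzy: false }
const overrides = { fuzzy: 0.2 }

// With object spread this would be: const options = { ...defaults, ...overrides }
// Object.assign with an empty target gives the same merged copy
const options = Object.assign({}, defaults, overrides)

console.log(options) // { prefix: false, fuzzy: 0.2 }
console.log(defaults) // { prefix: false, fuzzy: false } (left unchanged)
```

Passing an empty object as the first argument matters: without it, Object.assign would modify defaults in place.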

Import error on typescript environment

I'm using:

  • node v14.5.0,
  • typescript v3.9.6
  • ts-node v8.10.2

on a Windows 10 system, and when I only import Minisearch like:

import Minisearch from 'minisearch';

I get the following error when trying to run it with ts-node:

const searcher = new Minisearch({
                 ^
TypeError: minisearch_1.default is not a constructor
    at Object.<anonymous> (C:\Users\farin\atla\bercario\bot\src\internals\Searcher.ts:8:18)
    at Module._compile (internal/modules/cjs/loader.js:1201:30)
    at Module.m._compile (C:\Users\farin\atla\bercario\bot\node_modules\ts-node\src\index.ts:858:23)
    at Module._extensions..js (internal/modules/cjs/loader.js:1221:10)
    at Object.require.extensions.<computed> [as .ts] (C:\Users\farin\atla\bercario\bot\node_modules\ts-node\src\index.ts:861:12)       
    at Module.load (internal/modules/cjs/loader.js:1050:32)
    at Function.Module._load (internal/modules/cjs/loader.js:938:14)
    at Module.require (internal/modules/cjs/loader.js:1090:19)
    at require (internal/modules/cjs/helpers.js:75:18)

Can anyone help me? I don't know if I'm doing something wrong, and I can't figure out how to fix it.

cyrillic/unicode search not working. Is it intended?

Hey,
I just played a bit with your library. Is it intended not to work with Cyrillic/Unicode strings? Perhaps support was left out for performance reasons?

Simple snippet to reproduce:

const documents = [
    {id: 1, title: 'София'},
    {id: 2, title: 'Пловдив'},
    {id: 3, title: 'Sofia'},
    {id: 4, title: 'Plovdiv'},
];   
let miniSearch = new MiniSearch({ fields: ['title']});
miniSearch.addAll(documents);

console.log(miniSearch.autoSuggest('so')); // Works!
console.log(miniSearch.autoSuggest('со')); // Nope :(
console.log(miniSearch.autoSuggest('пло')); // Nope :(
console.log(miniSearch.autoSuggest('Plo')); // Works!
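
One plausible cause, which is an assumption rather than something confirmed in this report, is a tokenizer that splits on ASCII-only character classes such as `\W`. The sketch below contrasts that with a Unicode-aware split; both tokenizer functions are hypothetical and are not the library's own.

```javascript
// ASCII-only tokenizer: \W treats every character outside [A-Za-z0-9_]
// as a separator, so Cyrillic letters are discarded entirely.
const asciiTokenize = (text) => text.split(/\W+/).filter(Boolean)

// Unicode-aware tokenizer: split on anything that is not a letter or a number.
// Requires a runtime with Unicode property escapes (the `u` flag with \p{...}).
const unicodeTokenize = (text) => text.split(/[^\p{L}\p{N}]+/u).filter(Boolean)

console.log(asciiTokenize('София Sofia')) // [ 'Sofia' ]
console.log(unicodeTokenize('София Sofia')) // [ 'София', 'Sofia' ]
```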

Irrational behaviour in the demo app?

Hi!

Thank you for an interesting project! I found it via the Codrops mailing list.
I was trying out some searches in the demo application and noticed something that I find odd.

If you start typing 'eve' you get a list of potential matches including 'eve', 'everlast', and 'everclear'.
If I add an 'r' to the search so that the search term is 'ever', then 'everlast' and 'everclear' disappear and the list is replaced with 'ever never'.
Shouldn't the search term 'ever' still match 'everlast' and 'everclear'?

Partial matching inside words?

First, Thank You for minisearch, it's a great tool! 👍

My question: is there a way to include results that would find an item containing the word Raymond when searching for mon (i.e. the same kind of results that String.prototype.includes() yields)?
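
A common way to get this behaviour from an index that only supports prefix search is to index every suffix of each word as well. The sketch below shows the idea on its own; `suffixes` is a hypothetical helper, and how it would be wired into the index is left open.

```javascript
// Hypothetical helper: every suffix of `term` that is at least `minLength` long.
const suffixes = (term, minLength = 3) => {
  const result = []
  for (let i = 0; i + minLength <= term.length; i++) {
    result.push(term.slice(i))
  }
  return result
}

const indexed = suffixes('raymond')
console.log(indexed) // [ 'raymond', 'aymond', 'ymond', 'mond', 'ond' ]

// A prefix match against any suffix is an "includes"-style match on the word.
console.log(indexed.some((s) => s.startsWith('mon'))) // true
```

The cost is a larger index, since each word contributes several terms instead of one.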

Exclude newline characters

Shouldn't newline characters (\n and \r) also be part of the tokenizer regex?

Otherwise, words at the start of a new line won't be matched by a search since the preceding newline chars are added to the start of that term.
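
The effect can be reproduced in isolation. The two regular expressions below are illustrative stand-ins, not the tokenizer regex actually used by the library.

```javascript
const text = 'first line\nsecond line'

// Splitting on spaces only leaves the newline glued to a term.
const spaceOnly = text.split(/ +/)
console.log(spaceOnly) // [ 'first', 'line\nsecond', 'line' ]

// \s also covers \n, \r and tabs, so every word becomes its own term.
const anyWhitespace = text.split(/\s+/)
console.log(anyWhitespace) // [ 'first', 'line', 'second', 'line' ]
```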

Remove all documents

Luca, first let me say Minisearch is great. I've been battling with FlexSearch for a while (which seems to be stagnant) and finally gave up and moved to Minisearch for our Clibu Notes app.

I did try Minisearch earlier and had some issues which I couldn't reproduce this time around. 😀

I need to be able to remove all documents from the search index and wonder what the most efficient way to do this is. I don't want to iterate all the docs and call remove().

Is it safe to just discard the searchEngine instance and create a new MiniSearch() to start over, or do you have another suggestion?

[Question] Why an id in the input dataset is required?

I may ask a dumb question but I didn't see the answer in the documentation or existing issues.

If the documents passed to miniSearch.addAll() contain data that doesn't have an id field, this error occurs:

Uncaught Error: MiniSearch: document does not have ID field "id"

Question 1: Why is the id field required in the input?
I understand we need a reference for the results, but there are several options other than requiring an id field in the input data:

  • As the input is an array, use the array index as the id for the results
  • Use a unique field already existing on the data, eg. if the name field is unique so it can be used as a reference of the results

I already have a JSON dataset that is used for several purposes, e.g. as a pseudo API. I'm not a big fan of adding an extra id field with arbitrary values to each item, as it would make the file bigger and give me extra work when I could actually reuse the name field or the array index as an id.

Question 2: Is it possible to make the id requirement optional & use another unique field as reference/index?
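
Until such an option exists, both alternatives can be approximated at load time without touching the stored dataset. The sketch below uses made-up data and field names (`name`, `description`); the resulting arrays would then be passed to the index in place of the raw data.

```javascript
// Illustrative dataset without an id field.
const data = [
  { name: 'alpha', description: 'first item' },
  { name: 'beta', description: 'second item' }
]

// Option 1: use the array index as the id.
const withIndexIds = data.map((doc, i) => Object.assign({ id: i }, doc))

// Option 2: reuse an existing unique field as the id.
const withNameIds = data.map((doc) => Object.assign({ id: doc.name }, doc))

console.log(withIndexIds[1].id, withNameIds[1].id)
```

The ids exist only in memory, so the JSON file itself does not grow.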

Feature request: Result highlighting

Are there plans to support highlighting of matched characters in the results? Other search engines do this by returning match indexes in the result.

Defining documents by array indexes instead of object keys?

Hi there: is it possible to define objects (and search through them) by array indexes instead of object keys? For example, instead of defining:

{
    "id": "song-123",
    "title": "Time to Shine"
}

Defining something like:

["song-123", "Time to Shine"]
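
If the index itself only accepts objects, one option is to convert each array into an object just before indexing. In the sketch below, `fieldNames` and `toDocument` are hypothetical names, and the field order is an assumption about how the arrays are laid out.

```javascript
// Assumed layout of each row: position 0 is the id, position 1 is the title.
const fieldNames = ['id', 'title']

// Hypothetical helper: pair each position in the row with its field name.
const toDocument = (row) =>
  Object.fromEntries(fieldNames.map((name, i) => [name, row[i]]))

const rows = [['song-123', 'Time to Shine']]
const documents = rows.map(toDocument)
console.log(documents) // [ { id: 'song-123', title: 'Time to Shine' } ]
```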
