nextapps-de / flexsearch
Next-Generation full text search library for Browser and Node.js
License: Apache License 2.0
Thanks for this capability. I am excited to learn how this works for several use cases I have.
I ran your 'best practice' example with some modifications. I cannot find the source of the error:
"c is not a function"
Here is my code:
const FlexSearch = require("flexsearch")
const bookstore = new FlexSearch();
const pizzashop = new FlexSearch();
const votingbooth = new FlexSearch();
let settings = {
action: "score",
adventure: {
encode: "extra",
tokenize: "strict",
depth: 5,
threhold: 5,
doc: {
id: "id",
field: ["intent", "text"]
} },
comedy: {
encode: "advanced",
tokenize: "forward",
threshold: 5
}
}
let index = {}
const add = (id, cat, intent, text) => {
console.log(gr(`Starting on Index ${id}`))
console.log(`for ${cat}, ${intent}, ${text}`)
try {
(index[cat] || (
index[cat] = new FlexSearch(settings[cat])
)).add(id, intent, text);
} catch(error) {
console.log(error)
}
}
const search = (cat, query) => {
return index[cat] ? index[cat].search(query) : [];
}
let x = 0
training.map((t) => {
console.log(b(`Creating index ${x}`))
x++
add(x, "bookstore", t.intent, t.text);
add(x, "pizzashop", t.intent, t.text);
add(x, "votingbooth", t.intent, t.text);
})
//add(1, "action", "Movie Title");
//add(2, "adventure", "Movie Title");
//add(3, "comedy", "Movie Title");
console.log(r(`THIS SHOULD EXECUTE LAST`))
//index.update(10025, "Road Runner");
//index.remove(10025);
var result1 = search("bookstore", "i am searching for a book"); // --> [1]
var result2 = search("pizzashop", "howdy"); // --> [1]
var result3 = search("votingboooth", "i need directions"); // --> [1]
console.log(`========== FAST SEARCH TEST ==========`)
console.log(result1)
console.log(result2)
console.log(result3)
The log shows an empty array.
Pretty neat. Performs really well.
I read in #7 "Flexsearch is a micro library whose complexity we want to keep as low as possible in the core."
What about sorting? We are currently considering replacing our list filters with flexsearch. It would be nice to use the same index for sorting as well.
Which kind of expression do you prefer?
var results = index.search([{
field: "title",
query: "foobar",
presence: "required"
},{
field: "body",
query: "content",
presence: "optional"
},{
field: "blacklist",
query: "xxx",
presence: "prohibited"
}]);
var results = index.search([{
field: "title",
query: "foobar",
bool: "and"
},{
field: "body",
query: "content",
bool: "or"
},{
field: "blacklist",
query: "xxx",
bool: "not"
}]);
var results = index.search([{
field: "+title",
query: "foobar"
},{
field: "body",
query: "content"
},{
field: "-blacklist",
query: "xxx"
}]);
Hi @ts-thomas ,
I am a beginner at open source contribution. I want to work on a port of this library to Ruby. If possible, can you point me towards any reference, article, or blog post related to the scoring algorithm and the other techniques implemented in this library? If anyone is already working on a Ruby port, please let me know; I would love to contribute to that project as well.
for example: "PostgreSQL快速入门".
The benchmarks for the query and memory tests use different presets, but compare against the same configuration of the other libraries.
It would be helpful to be able to compare flexsearch performance between presets while showing a full, unbiased picture.
I'm trying to load the language files to use with the stemmer, for example, but I'm getting a TypeError: Cannot read property 'registerLanguage' of undefined error.
var FlexSearch = require('flexsearch')
require('flexsearch/lang/en')
The error seems to indicate that the flexsearch object is not in scope, but when I pass it as a global variable I get the same error. Am I missing something here?
Is it possible to know the total number of items found, in order to know how many pages should be displayed in the pagination?
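A minimal sketch of how a page count could be derived, assuming the complete list of matching IDs is available (for example from a search with a sufficiently high limit). `pageCount` and `getPage` are hypothetical helpers written for this illustration, not FlexSearch APIs.

```javascript
// Hypothetical helper: number of pages needed for `total` matches.
function pageCount(total, pageSize) {
  return Math.ceil(total / pageSize);
}

// Hypothetical helper: slice one zero-based page out of the full ID list.
function getPage(allIds, page, pageSize) {
  const start = page * pageSize;
  return allIds.slice(start, start + pageSize);
}

// `allIds` stands in for the complete array of matching IDs.
const allIds = [3, 8, 15, 16, 23, 42, 4];
console.log(pageCount(allIds.length, 3)); // --> 3
console.log(getPage(allIds, 2, 3));       // --> [4]
```

The trade-off is that every match has to be collected once before the first page can be rendered.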
I tried to build this library with 'npm run build-compact' and got errors like the ones below:
/bin/sh: -c: line 0: unexpected EOF while looking for matching '' /bin/sh: -c: line 1: syntax error: unexpected end of file { Error: Command failed: java -jar node_modules/google-closure-compiler-java/compiler.jar --compilation_level=ADVANCED_OPTIMIZATIONS --use_types_for_optimization=true --new_type_inf=true --jscomp_warning=newCheckTypes --generate_exports=true --export_local_property_definitions=true --language_in=ECMASCRIPT6_STRICT --language_out=ECMASCRIPT6_STRICT --process_closure_primitives=true --summary_detail_level=3 --warning_level=VERBOSE --emit_use_strict=true --output_manifest=log/manifest.log --output_module_dependencies=log/module_dependencies.log --property_renaming_report=log/renaming_report.log' --js='flexsearch.js' --js='lang/**.js' --js='!lang/**.min.js' --define='RELEASE=compact' --define='DEBUG=false' --define='PROFILER=false' --define='SUPPORT_WORKER=false' --define='SUPPORT_ENCODER=true' --define='SUPPORT_CACHE=false' --define='SUPPORT_ASYNC=true' --define='SUPPORT_PRESETS=true' --define='SUPPORT_SUGGESTIONS=false' --define='SUPPORT_SERIALIZE=false' --define='SUPPORT_INFO=false' --define='SUPPORT_DOCUMENTS=true' --define='SUPPORT_WHERE=false' --define='SUPPORT_LANG_DE=false' --define='SUPPORT_LANG_EN=false' --js_output_file='dist/flexsearch.compact.js' && exit 0
and found a simple error in 'compile.js(116:92)':
exec("java -jar node_modules/google-closure-compiler-java/compiler.jar" + parameter + "' --js='flexsearch.js' --js='lang/**.js' --js='!lang/**.min.js'" + flag_str + " --js_output_file='dist/flexsearch." + (options["RELEASE"] || "custom") + ".js' && exit 0", function(){
After removing the unnecessary single quotation mark after parameter + ", the build process worked fine.
I think it's just a typo... maybe. 😓
flexsearch version 0.5.1
Can't destroy index instance in the browser because of the error.
Here is test HTML:
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>Benchmark Presets</title>
<style>
body{
font-family: sans-serif;
}
table td{
padding: 1em 2em;
}
button{
padding: 5px 10px;
}
</style>
</head>
<body>
<div id="container"></div>
<script src="../dist/flexsearch.min.js"></script>
<script>
(function(){
var index = new FlexSearch({
doc: {
id: 'id',
field: 'title'
}
});
index.add([
{ id: 1, title: 'foo' },
{ id: 2, title: 'bar' }
])
console.log(index.search('foo'))
index.destroy()
})();
</script>
</body>
</html>
Window console displays error:
TypeError: a is undefined
flexsearch.min.js:33:45
I would expect that if I search for a term, and that term appears once in document A but several times in document B, that B would have a higher position in the results than A. But that does not seem to be the case.
Example:
const FlexSearch = require(`flexsearch`)
const index = new FlexSearch({
tokenize: `strict`,
encode: `advanced`,
cache: false,
doc: {
id: `id`,
field: {
content: {
threshold: 9,
resolution: 10,
},
},
},
})
index.add([{
id: 1,
content: `billy bob thorton`,
}, {
id: 2,
content: `billy who now what billy okay so what now thorton?`,
}])
console.log(
index.search(`billy`)
)
// => [ { id: 1, content: 'billy bob thorton' },
// { id: 2,
// content: 'billy who now what billy okay so what now thorton?' } ]
I would expect that a search for billy would give document id 2 a higher score than document id 1, but the search returns document id 1 as the top result.
Tested with [email protected].
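As a possible workaround, the returned documents could be re-ranked by how often the term occurs. This is only a sketch of post-processing outside the library; `countOccurrences` and `rerankByFrequency` are hypothetical helpers and say nothing about how FlexSearch scores internally.

```javascript
// Count non-overlapping, case-insensitive occurrences of `term` in `text`.
function countOccurrences(text, term) {
  const haystack = text.toLowerCase();
  const needle = term.toLowerCase();
  if (needle === "") return 0;
  let count = 0;
  let pos = haystack.indexOf(needle);
  while (pos !== -1) {
    count++;
    pos = haystack.indexOf(needle, pos + needle.length);
  }
  return count;
}

// Re-rank documents so that those with more occurrences come first.
function rerankByFrequency(docs, field, term) {
  return docs
    .map((doc) => ({ doc, hits: countOccurrences(doc[field], term) }))
    .sort((a, b) => b.hits - a.hits)
    .map((entry) => entry.doc);
}

// The two documents from the report above, in the order the search returned them.
const found = [
  { id: 1, content: "billy bob thorton" },
  { id: 2, content: "billy who now what billy okay so what now thorton?" }
];
console.log(rerankByFrequency(found, "content", "billy").map((d) => d.id)); // --> [2, 1]
```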
I tried setting the "worker" option to false and everything worked very well. But when I enable this option and set it to any value other than false, my console prints "Uncaught (in promise) TypeError: Cannot read property 'length' of undefined".
I have around 30,000 items, which is why I want to use the web worker feature.
Any ideas? I can provide more information if necessary.
Please make suggestions or give some feedback.
The extraction of the core functionality is basically required for many upcoming features as well as for existing ones, like:
These existing features have to remain core functionality:
The basic core API should have these methods:
These missing features also need to be integrated as core functionality:
These functions should be extracted as optional tooling:
The plugin API is required to provide additional tooling and features in a modular and extendable manner. The plugin API should have these capabilities:
There are several requests for a TypeScript port. The advantage of TypeScript over plain JavaScript may be too small, since TypeScript also compiles to JavaScript and its output is less optimized than that of the Google Closure Compiler.
Technically there are two targets:
Browsers are already covered, as is Node.js. A TypeScript port would not cover any additional ecosystem. Only the formal codebase would differ, and in the end it is just a different pattern for the same result. That's why I prefer a browser-less, system-wide port over TypeScript. The language Rust is pretty close to TypeScript/JavaScript and covers 2., so this might be a better candidate for a port.
There is no final decision at the moment, so let us discuss the pros and cons here.
I'm trying to create an index over a large dataset and I want to separate the script that's creating the index from the script that's using the index. The index creation seems to work very well, but when I use index.export(), I'm getting a RangeError: Invalid string length error. Is there a way to export the index as a file without getting this error? A possible solution would be to allow exporting via a stream that could be written to a file directly.
Thanks!
If the same word appears in different doc fields, the search returns no results.
Demo (open console):
https://stackblitz.com/edit/flexsearch
Are there any plans to make a React component from flexsearch?
Hi, I've been using https://github.com/jeancroy/FuzzySearch, which I've found to be very quick. How does FuzzySearch compare to flexsearch?
It would be great if you could add some more usage examples.
What is the best way to handle documents with multi-value attributes?
For example, a document with an m:n relation to another entity.
Does the library support serializing/deserializing the flexsearch object as JSON?
I'd love to create the index in Node, but deserialize the object in the browser for client-side searching.
Hi Thomas
Using the following example
const FlexSearch = require('./flexsearch')
const fs = new FlexSearch({
encode: 'extra',
tokenize: 'full',
threshold: 1,
depth: 4,
resolution: 9,
async: false,
worker: 1,
cache: true,
suggest: true,
doc: {
id: 'id',
field: [ 'intent', 'text' ]
}
})
fs.add([
{
id: 0,
intent: 'intent',
text: 'text'
}, {
id: 1,
intent: 'intent',
text: 'howdy - how are you doing'
}
])
console.log('INFO', fs.info())
const result = fs.search('howdy', { bool: 'or' })
console.log('RESULT', result)
const result2 = fs.search('howdy -', { bool: 'or' })
console.log('RESULT', result2)
An exception is thrown when using 'howdy -' as the search parameter. When setting suggest to false, the search is successful, but the search for howdy - does not find any results.
The exception thrown is
.../search/flexsearch.js:3308
z = suggestions.length;
^
TypeError: Cannot read property 'length' of undefined
at intersect (.../servers/search/flexsearch.js:3308:37)
at FlexSearch.merge_and_sort (.../servers/search/flexsearch.js:1393:22)
at FlexSearch.search (.../servers/search/flexsearch.js:1561:43)
at Object.<anonymous> (.../servers/search/test2.js:33:19)
at Module._compile (internal/modules/cjs/loader.js:734:30)
at Object.Module._extensions..js (internal/modules/cjs/loader.js:745:10)
at Module.load (internal/modules/cjs/loader.js:626:32)
at tryModuleLoad (internal/modules/cjs/loader.js:566:12)
at Function.Module._load (internal/modules/cjs/loader.js:558:3)
at Function.Module.runMain (internal/modules/cjs/loader.js:797:12)
In flexsearch on line 3068
function intersect(arrays, limit, cursor, suggest, bool, has_not) {
let result = [];
let suggestions;
const length_z = arrays.length;
suggestions is not being assigned, because the condition of the while loop on line 3133 is false
while(++z < length_z){
so the assignment of the suggestions variable on line 3211 is bypassed
let found = false;
i = 0;
suggestions = [];
while(i < length){
The reason the search for howdy - is unsuccessful when suggest is false is probably the options passed in. Should I implement my own tokenizer if I would like to match queries like howdy -?
Thanks in advance
Regards
William
var index = FlexSearch.create({
doc: {
id: "url",
field: [
"title",
"content"
]
}
});
Invoke:
index.search(
"test",
{
page: true,
limit: 5
})
Result:
{
"page": "0",
"next": "5",
"result": [
{
"title": "Load Testing V. 1.0.1",
"content": "test",
"url": "/Project_Management/validations/validation2"
},
{
"title": "Pre Test Inpsection Report",
"content": "test",
"url": "/V_and_V/5016-09-F21"
},
{
"title": "Packaging Validaiton Test Report",
"content": "test",
"url": "/V_and_V/5016-09-F19"
},
{
"title": "EMC 60601 Test Plan",
"content": "test",
"url": "/V_and_V/5016-09-F23"
},
{
"title": "Third Party Testing",
"content": "test",
"url": "/3rd_Party_Testing"
}
]
}
Invoke:
index.search(
[
{
field: "title",
query: "test",
boost: 1
},
{
field: "content",
query: "test",
boost: 0.5
}
],
{
page: true,
limit: 5
});
Result:
{
"page": "0",
"next": null,
"result": [
]
}
I need to be able to page the results while also searching multiple fields with different boost values.
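Until both options work together, one conceivable workaround is to search each field separately, merge the ID lists with a weight per field, and page over the merged list. The sketch below assumes each per-field result is an array of IDs ordered best-first; `mergeBoosted` and `pageOf` are hypothetical helpers, and the rank-based score is an illustrative choice rather than FlexSearch's own boost semantics.

```javascript
// perField: [{ boost, ids }] where `ids` is ordered best-first.
function mergeBoosted(perField) {
  const scores = new Map();
  for (const { boost, ids } of perField) {
    ids.forEach((id, rank) => {
      // Earlier ranks contribute more; the boost scales the contribution.
      const score = boost * (ids.length - rank);
      scores.set(id, (scores.get(id) || 0) + score);
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// Return one page in the same { page, next, result } shape shown above.
function pageOf(ids, offset, limit) {
  const end = offset + limit;
  return {
    page: String(offset),
    next: end < ids.length ? String(end) : null,
    result: ids.slice(offset, end)
  };
}

const merged = mergeBoosted([
  { boost: 1, ids: ["a", "b"] },        // stand-in for matches in "title"
  { boost: 0.5, ids: ["c", "b", "a"] }  // stand-in for matches in "content"
]);
console.log(merged);               // --> ["a", "b", "c"]
console.log(pageOf(merged, 0, 2)); // first page, next cursor is "2"
```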
Hi
I am trying to run the example you posted in issue #30 without any luck.
Here is the code:
const FlexSearch = require('flexsearch')
// provide a document descriptor for each index
// the field "id" and at least one "field" is mandatory.
const settings = {
'bookstore': {
preset: 'score',
doc: {
id: 'id',
field: ['intent', 'text']
}
},
'pizzashop': {
encode: 'extra',
tokenize: 'strict',
depth: 5,
threshold: 5,
doc: {
id: 'id',
field: ['intent', 'text']
}
},
'votingbooth': {
encode: 'advanced',
tokenize: 'forward',
threshold: 5,
doc: {
id: 'id',
field: ['intent', 'text']
}
}
}
const index = {}
const add = (cat, doc) => {
const i = index[cat] || (
index[cat] = new FlexSearch(settings[cat])
)
i.add(doc)
}
const search = (cat, query) => {
return index[cat] ? index[cat].search(query) : []
}
// provide documents which have the same structure as defined in the document descriptor above
const bookstore = [{
id: 0,
intent: 'intent',
text: 'text'
}, {
id: 1,
intent: 'intent',
text: 'i am searching for a book'
}]
const pizzashop = [{
id: 0,
intent: 'intent',
text: 'text'
}, {
id: 1,
intent: 'intent',
text: 'howdy'
}]
const votingbooth = [{
id: 0,
intent: 'intent',
text: 'text'
}, {
id: 1,
intent: 'intent',
text: 'i need directions'
}]
// add a full document or an array of documents to the index
add('bookstore', bookstore)
add('pizzashop', pizzashop)
add('votingbooth', votingbooth)
console.log('INFO', index['bookstore'].info())
console.log('INFO', index['pizzashop'].info())
console.log('INFO', index['votingbooth'].info())
console.log('INFO', index['bookstore'])
// search
const result1 = search('bookstore', 'i am searching for a book') // --> [1]
const result2 = search('pizzashop', 'howdy') // --> [1]
const result3 = search('votingbooth', 'i need directions') // --> [1]
console.log('========== FAST SEARCH TEST ==========')
console.log(result1)
console.log(result2)
console.log(result3)
and the output I get is:
INFO { id: 0,
memory: 0,
items: 0,
sequences: 0,
chars: 0,
cache: false,
matcher: 0,
worker: undefined,
threshold: 1,
depth: 4,
contextual: true }
INFO { id: 3,
memory: 0,
items: 0,
sequences: 0,
chars: 0,
cache: false,
matcher: 0,
worker: undefined,
threshold: 5,
depth: 5,
contextual: true }
INFO { id: 6,
memory: 0,
items: 0,
sequences: 0,
chars: 0,
cache: false,
matcher: 0,
worker: undefined,
threshold: 5,
depth: 0,
contextual: 0 }
INFO k {
id: 0,
o: [],
f: 'strict',
w: false,
async: false,
threshold: 1,
b: 9,
depth: 4,
C: false,
m: false,
s: [Function: bound ],
a:
{ id: [ 'id' ],
field: [ [Array], [Array] ],
index: { intent: [k], text: [k] },
keys: [ 'intent', 'text' ] },
h:
[ [Object: null prototype] {},
[Object: null prototype] {},
[Object: null prototype] {},
[Object: null prototype] {},
[Object: null prototype] {},
[Object: null prototype] {},
[Object: null prototype] {},
[Object: null prototype] {} ],
i: [Object: null prototype] {},
c: [Object: null prototype] {},
g:
[Object: null prototype] {
'0': { id: 0, intent: 'intent', text: 'text' },
'1':
{ id: 1, intent: 'intent', text: 'i am searching for a book' } },
v: true,
cache: false,
j: false }
========== FAST SEARCH TEST ==========
[]
[]
[]
Am I missing something?
Node version: 11.9.0
Thanks in advance...
Idea: implement funding of this project using something like issuehunt
Hello,
I've encountered the error: TypeError: Cannot convert object to primitive value
It is thrown in the .add method.
Version: 0.6.21
Version 0.6.2 works as expected.
Hello,
I've encountered the following behaviour.
This example works as expected:
const FlexSearch = require('flexsearch');
const index = new FlexSearch();
index.add(1, 'Foobar')
console.log(index.search('Foobar'));
// [ 1 ]
But this one shows no results.
const FlexSearch = require('flexsearch');
const index = new FlexSearch();
index.add(1, 'Фообар')
console.log(index.search('Фообар'));
// []
I've tested in node and in browser.
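One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that the text gets split on an ASCII-only "non-word" pattern. The sketch below is plain JavaScript and does not call FlexSearch; it only shows how `\W` discards a Cyrillic word while a whitespace split keeps it. `asciiSplit` and `whitespaceSplit` are names made up for the illustration.

```javascript
const latin = "Foobar";
const cyrillic = "Фообар";

// \W is defined relative to the ASCII word characters [A-Za-z0-9_],
// so every Cyrillic letter counts as a separator and nothing is left over.
const asciiSplit = (text) => text.split(/\W+/).filter(Boolean);

// Splitting on whitespace keeps words from any script intact.
const whitespaceSplit = (text) => text.split(/\s+/).filter(Boolean);

console.log(asciiSplit(latin));         // --> ["Foobar"]
console.log(asciiSplit(cyrillic));      // --> []
console.log(whitespaceSplit(cyrillic)); // --> ["Фообар"]
```

If this is indeed the cause, a custom split pattern or tokenizer function would be the place to look, provided the library version in use exposes such an option.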
Hello. Are there any plans to support offset in addition to limit for implementing pagination? Thanks in advance.
The next page is not a problem, but the previous one is. When I request the previous page, I get an array instead of an object, and the page fields are also missing.
Could you give a simple example of paginating back and forth?
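Back-and-forth paging can be sketched by remembering the cursors of the pages already visited. The example assumes a function that takes a cursor and returns an object of the shape `{ page, next, result }`, as in the paged result quoted elsewhere in this thread; `createPager` and `fakeSearch` are hypothetical and only stand in for a real paged search.

```javascript
// Hypothetical pager: `fetchPage(cursor)` must return { page, next, result }.
function createPager(fetchPage) {
  const history = []; // cursors of the pages visited before the current one
  let current = fetchPage(null);
  return {
    get current() { return current; },
    next() {
      if (current.next === null) return current; // already on the last page
      history.push(current.page);
      current = fetchPage(current.next);
      return current;
    },
    prev() {
      if (history.length === 0) return current; // already on the first page
      current = fetchPage(history.pop());
      return current;
    }
  };
}

// Stand-in for a real paged search over a fixed list of IDs.
const ids = [1, 2, 3, 4, 5];
const limit = 2;
function fakeSearch(cursor) {
  const start = cursor === null ? 0 : Number(cursor);
  const end = start + limit;
  return {
    page: String(start),
    next: end < ids.length ? String(end) : null,
    result: ids.slice(start, end)
  };
}

const pager = createPager(fakeSearch);
console.log(pager.next().result); // --> [3, 4]
console.log(pager.next().result); // --> [5]
console.log(pager.prev().result); // --> [3, 4]
```

Because the previous page is fetched again with its stored cursor, it comes back in the same object shape as any other page.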
I say weird, but I should rather say… unconventional.
Like:
while(i < length){
tmp = arr[i++];
const index = "@" + tmp;
if(check[index]){
Are they on purpose?
If so, what is their purpose?
If not, could using tools like Prettier (or Prettier + ESLint) help?
Are there any plans to support typescript?
I’d love to read more on this scoring strategy.
I am looking for the paper that is cited in the README. I cannot find it on the web.
Any help?
Only object type stemmer is supported
Hello, is it possible to count the distinct values of a field and/or get the distinct values for some fields? For example, when searching products in a catalog, it's good to know the distinct category IDs of the results.
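If the search returns full documents, distinct values can be collected after the fact. This is a sketch of such post-processing, not a built-in feature; `distinctValues`, `countByValue`, and the `categoryId` field are hypothetical.

```javascript
// Distinct values of `field` across the result documents, in first-seen order.
function distinctValues(docs, field) {
  return [...new Set(docs.map((doc) => doc[field]))];
}

// Number of result documents per value of `field`.
function countByValue(docs, field) {
  const counts = new Map();
  for (const doc of docs) {
    counts.set(doc[field], (counts.get(doc[field]) || 0) + 1);
  }
  return counts;
}

// Stand-in for documents returned by a product search.
const products = [
  { id: 1, name: "red shoe", categoryId: 10 },
  { id: 2, name: "red hat", categoryId: 20 },
  { id: 3, name: "red boot", categoryId: 10 }
];
console.log(distinctValues(products, "categoryId"));       // --> [10, 20]
console.log(countByValue(products, "categoryId").get(10)); // --> 2
```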
The readme includes the line
Note: This feature is actually not enabled by default. Read here how to enable.
but the "here" link doesn't go to any page, and I can't find the intended target in the repo :-o
I don't know enough fulltext index terminology to infer what these two settings actually mean.
I'm guessing from context that "depth" is the maximum number of words/tokens away a term can be and still be considered relevant.
I have no idea what the "threshold" number implies. :-x
I know I want that sweet contextual searching, so I'd love to figure this out so I can pick numbers appropriate to my use case.
First, thanks for the great library, but it's not working with React Native :(
When I set a depth, I would expect that if I search for multiple terms, documents that contain those terms near each other would score higher.
Example:
const FlexSearch = require(`flexsearch`)
const index = new FlexSearch({
tokenize: `strict`,
encode: `advanced`,
cache: false,
doc: {
id: `id`,
field: {
content: {
threshold: 9,
resolution: 10,
depth: 2,
},
},
},
})
index.add([{
id: 1,
content: `billy who now what billy okay so what now thorton?`,
}, {
id: 2,
content: `billy bob thorton`,
}])
console.log(
index.search(`billy thorton`)
)
// => [ { id: 1,
// content: 'billy who now what billy okay so what now thorton?' },
// { id: 2, content: 'billy bob thorton' } ]
I would expect document id 2 to be the top result, since it contains "billy" and "thorton" within two words of each other, but the top result is actually document id 1.
Tested in [email protected].
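The expected ordering can be reproduced outside the library by re-ranking on the distance between the two terms. This is a sketch of a post-processing step under the assumption that full documents are returned; `minWordDistance` and `rerankByProximity` are hypothetical helpers, and the ASCII-only word split is a simplification.

```javascript
// Smallest distance (in words) between any occurrence of termA and termB.
// Returns Infinity when one of the terms is missing.
function minWordDistance(text, termA, termB) {
  const words = text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  let lastA = -1;
  let lastB = -1;
  let best = Infinity;
  words.forEach((word, i) => {
    if (word === termA) lastA = i;
    if (word === termB) lastB = i;
    if (lastA !== -1 && lastB !== -1) {
      best = Math.min(best, Math.abs(lastA - lastB));
    }
  });
  return best;
}

// Documents whose terms are closer together come first.
function rerankByProximity(docs, field, termA, termB) {
  return docs
    .map((doc) => ({ doc, distance: minWordDistance(doc[field], termA, termB) }))
    .sort((a, b) => a.distance - b.distance)
    .map((entry) => entry.doc);
}

// The two documents from the report above, in the order the search returned them.
const hits = [
  { id: 1, content: "billy who now what billy okay so what now thorton?" },
  { id: 2, content: "billy bob thorton" }
];
console.log(rerankByProximity(hits, "content", "billy", "thorton").map((d) => d.id)); // --> [2, 1]
```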
I tried to use code example from unit test, but got the following error:
Code to reproduce:
const FlexSearch = require('flexsearch')
// tslint:disable
;(async () => {
const index = new FlexSearch({
async: true,
doc: {
id: 'id',
field: [ 'data:name' ]
}
})
const data = [{
id: 2,
data: {
title: 'Title 3',
body: 'Body 3'
}
}, {
id: 1,
data: {
title: 'Title 2',
body: 'Body 2'
}
}, {
id: 0,
data: {
title: 'Title 1',
body: 'Body 1'
}
}]
await index.add(data)
console.log(index.search)
const result = await index.search({
field: 'data:body',
query: 'body'
})
console.dir(result)
})()
Output:
[Function]
(node:10016) UnhandledPromiseRejectionWarning: TypeError: Cannot read property 'search' of undefined
at h.search (C:\Users\User\Documents\Projects\test\node_modules\flexsearch\dist\flexsearch.node.js:24:281)
at C:\Users\User\Documents\Projects\test\index.js:38:30
at process._tickCallback (internal/process/next_tick.js:43:7)
at Function.Module.runMain (internal/modules/cjs/loader.js:778:11)
at startup (internal/bootstrap/node.js:300:19)
at bootstrapNodeJSCore (internal/bootstrap/node.js:826:3)
(node:10016) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 1)
(node:10016) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
Environment: Node
Node version: v11.2.0
Flexsearch version: "^0.5.2"
I am thinking about removing these features:
index.find() (getting a document by ID will remain)
index.where()
The main reasons for this may be:
What do you think?
I tried to activate the suggestion function but it does not change anything in the result. How does it work?
thanks.
For each item that matches a query, I'd like to be able to get unindexed arbitrary data — not just its ID.
For example: for matches when searching Shakespeare plays, I'd like to be able to return the text of an individual line (which is indexed) but also play name, location, speaker, etc.
What's the best way to achieve this?
I can do this in Elasticlunr (for example) like this:
const index = elasticlunr(function() {
this.addField('text'); // doc property to be indexed
this.setRef('id'); // doc property that is the ID of each item
for (const doc of docs) {
// doc includes additional arbitrary data for each item: play, speaker, location, etc.
this.addDoc(doc);
}
})
Would I simply need to create an object that maps IDs with item data, or is there a better way to do this?
Great project by the way — thanks so much for building this.
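The mapping approach mentioned in the question can be sketched as follows, assuming the search returns an array of IDs. `docsById` and `hydrate` are hypothetical names, and the sample documents are made up for the illustration.

```javascript
// Full records, including fields that are never indexed.
const lines = [
  { id: 1, text: "To be, or not to be", play: "Hamlet", speaker: "Hamlet" },
  { id: 2, text: "All the world's a stage", play: "As You Like It", speaker: "Jaques" }
];

// Lookup table from ID to the full record.
const docsById = new Map(lines.map((doc) => [doc.id, doc]));

// Turn an array of matching IDs into full records, skipping unknown IDs.
function hydrate(matchedIds, store) {
  return matchedIds
    .map((id) => store.get(id))
    .filter((doc) => doc !== undefined);
}

// `[2]` stands in for the IDs a search would return.
console.log(hydrate([2], docsById).map((doc) => doc.speaker)); // --> ["Jaques"]
```

Only the `text` property would need to be added to the index; everything else stays in the lookup table.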
Can someone do a benchmark between this library and Algolia?
I just want to know if I should drop algolia for a better copycat?
Thank you ;)
For example, in Chinese, 一个单词 consists of two words. How can I make sure I get the correct result when searching for 单词?
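A common approach for text without spaces between words is to index single characters instead of words. The sketch below shows that idea in plain JavaScript; it is an assumption about how such a tokenizer could look, not FlexSearch's implementation, and `tokenizeCjk` and `containsSequence` are hypothetical helpers.

```javascript
// Drop ASCII characters and treat every remaining character as its own token.
function tokenizeCjk(text) {
  return Array.from(text.replace(/[\x00-\x7F]/g, ""));
}

// True when the query tokens occur in the document tokens as a contiguous run.
function containsSequence(docTokens, queryTokens) {
  if (queryTokens.length === 0) return false;
  for (let i = 0; i + queryTokens.length <= docTokens.length; i++) {
    if (queryTokens.every((token, j) => docTokens[i + j] === token)) return true;
  }
  return false;
}

const docTokens = tokenizeCjk("一个单词");
console.log(docTokens);                                        // --> ["一", "个", "单", "词"]
console.log(containsSequence(docTokens, tokenizeCjk("单词"))); // --> true
console.log(containsSequence(docTokens, tokenizeCjk("词单"))); // --> false
```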
I expected the matching documents to be unique within the result. What is the reason for repeating them?
Example:
const f = new FlexSearch({
doc: {
id: 'id',
field: ['field1', 'field2']
}
})
const docs = [
{id: 1, field1: 'phrase', field2: 'phrase'}
]
f.add(docs)
console.log(f.search('phrase'))
// Result = [{id: 1, field1: "phrase", field2: "phrase"}, {id: 1, field1: "phrase", field2: "phrase"}]
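Until the duplicates are removed by the library, the result can be filtered by ID on the caller's side. A minimal sketch, with `uniqueById` as a hypothetical helper:

```javascript
// Keep only the first occurrence of each ID, preserving result order.
function uniqueById(docs, idKey = "id") {
  const seen = new Set();
  return docs.filter((doc) => {
    if (seen.has(doc[idKey])) return false;
    seen.add(doc[idKey]);
    return true;
  });
}

// Stand-in for the duplicated search result shown above.
const duplicated = [
  { id: 1, field1: "phrase", field2: "phrase" },
  { id: 1, field1: "phrase", field2: "phrase" }
];
console.log(uniqueById(duplicated).length); // --> 1
```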
NOTE: I've rewritten the entire issue because I've found a way to reproduce my issue on a very small dataset.
I've noticed that I'm missing search results depending on the order of fields that I provide when creating the index.
In the following example, there are two objects where notation:0 matches the search term WW 8840, and one object where prefLabel:de matches WW 8840. In the first example, only the latter object is returned as a search result even though all fields are supposed to be searched. The second example returns the correct search results just by reordering the fields (putting notation:0 at the end). Note that when specifying notation:0 as the only field to search, it will return the correct results in both cases.
Non-working example (prints 1 and 2 even though the first query should return 3 results):
const FlexSearch = require("flexsearch")
let index = new FlexSearch({
doc: {
id: "uri",
field: [
"prefLabel:de",
"notation",
"editorialNote:de",
]
},
profile: "score"
})
// Example dataset
let concepts = [
{"@context":"https://gbv.github.io/jskos/context.json","broader":[{"uri":"http://rvk.uni-regensburg.de/nt/WW%208720%20-%20WW%209239"}],"created":"2012-07-05","editorialNote":{"de":"(Blutgruppen s. XD 3200)"},"http://www.w3.org/2004/02/skos/core#closeMatch":[{"uri":"http://d-nb.info/gnd/4130604-1"},{"uri":"http://d-nb.info/gnd/4022814-9"},{"uri":"http://d-nb.info/gnd/4070945-0"},{"uri":"http://d-nb.info/gnd/4074195-3"}],"identifier":["152145:13422"],"inScheme":[{"uri":"http://uri.gbv.de/terminology/rvk/"}],"modified":"2018-12-14","notation":"WW 8840 - WW 8879","prefLabel":{"de":"Blutkörperchen (Erythrozyt, Leukozyt), Hämoglobin"},"type":["http://www.w3.org/2004/02/skos/core#Concept"],"uri":"http://rvk.uni-regensburg.de/nt/WW%208840%20-%20WW%208879"},
{"@context":"https://gbv.github.io/jskos/context.json","broader":[{"uri":"http://rvk.uni-regensburg.de/nt/WD%205000%20-%20WD%205970"}],"created":"2012-07-05","editorialNote":{"de":"(Antibiotika s. XI 3500)"},"http://www.w3.org/2004/02/skos/core#closeMatch":[{"uri":"http://d-nb.info/gnd/4155845-5"},{"uri":"http://d-nb.info/gnd/4276935-8"},{"uri":"http://d-nb.info/gnd/4176522-9"},{"uri":"http://d-nb.info/gnd/4175383-5"},{"uri":"http://d-nb.info/gnd/4148701-1"}],"identifier":["148204:"],"inScheme":[{"uri":"http://uri.gbv.de/terminology/rvk/"}],"modified":"2018-12-14","notation":"WD 5380","prefLabel":{"de":"Pyrrolfarbstoffe, Cytochrome, Chromoproteine (Hämoglobin s. WW 8840)"},"type":["http://www.w3.org/2004/02/skos/core#Concept"],"uri":"http://rvk.uni-regensburg.de/nt/WD%205380"},
{"@context":"https://gbv.github.io/jskos/context.json","broader":[{"uri":"http://rvk.uni-regensburg.de/nt/WW%208840%20-%20WW%208879"}],"created":"2012-07-05","editorialNote":{},"identifier":["152145:13423"],"inScheme":[{"uri":"http://uri.gbv.de/terminology/rvk/"}],"modified":"2018-12-14","notation":"WW 8840","prefLabel":{"de":"Allgemeines"},"type":["http://www.w3.org/2004/02/skos/core#Concept"],"uri":"http://rvk.uni-regensburg.de/nt/WW%208840"}
]
index.add(concepts)
let results
results = index.search("WW 8840")
console.log(results.length) // only matches the second concept (which mentions "WW 8840" in label)
results = index.search("WW 8840", {
field: "notation"
})
console.log(results.length) // correctly matches two concepts
// with large dataset, also correctly matches the two concepts
Working example (prints 3 and 2 as expected, just by reordering fields):
const FlexSearch = require("flexsearch")
let index = new FlexSearch({
doc: {
id: "uri",
field: [
"prefLabel:de",
"editorialNote:de",
"notation",
]
},
profile: "score"
})
// Example dataset
let concepts = [
{"@context":"https://gbv.github.io/jskos/context.json","broader":[{"uri":"http://rvk.uni-regensburg.de/nt/WW%208720%20-%20WW%209239"}],"created":"2012-07-05","editorialNote":{"de":"(Blutgruppen s. XD 3200)"},"http://www.w3.org/2004/02/skos/core#closeMatch":[{"uri":"http://d-nb.info/gnd/4130604-1"},{"uri":"http://d-nb.info/gnd/4022814-9"},{"uri":"http://d-nb.info/gnd/4070945-0"},{"uri":"http://d-nb.info/gnd/4074195-3"}],"identifier":["152145:13422"],"inScheme":[{"uri":"http://uri.gbv.de/terminology/rvk/"}],"modified":"2018-12-14","notation":"WW 8840 - WW 8879","prefLabel":{"de":"Blutkörperchen (Erythrozyt, Leukozyt), Hämoglobin"},"type":["http://www.w3.org/2004/02/skos/core#Concept"],"uri":"http://rvk.uni-regensburg.de/nt/WW%208840%20-%20WW%208879"},
{"@context":"https://gbv.github.io/jskos/context.json","broader":[{"uri":"http://rvk.uni-regensburg.de/nt/WD%205000%20-%20WD%205970"}],"created":"2012-07-05","editorialNote":{"de":"(Antibiotika s. XI 3500)"},"http://www.w3.org/2004/02/skos/core#closeMatch":[{"uri":"http://d-nb.info/gnd/4155845-5"},{"uri":"http://d-nb.info/gnd/4276935-8"},{"uri":"http://d-nb.info/gnd/4176522-9"},{"uri":"http://d-nb.info/gnd/4175383-5"},{"uri":"http://d-nb.info/gnd/4148701-1"}],"identifier":["148204:"],"inScheme":[{"uri":"http://uri.gbv.de/terminology/rvk/"}],"modified":"2018-12-14","notation":"WD 5380","prefLabel":{"de":"Pyrrolfarbstoffe, Cytochrome, Chromoproteine (Hämoglobin s. WW 8840)"},"type":["http://www.w3.org/2004/02/skos/core#Concept"],"uri":"http://rvk.uni-regensburg.de/nt/WD%205380"},
{"@context":"https://gbv.github.io/jskos/context.json","broader":[{"uri":"http://rvk.uni-regensburg.de/nt/WW%208840%20-%20WW%208879"}],"created":"2012-07-05","editorialNote":{},"identifier":["152145:13423"],"inScheme":[{"uri":"http://uri.gbv.de/terminology/rvk/"}],"modified":"2018-12-14","notation":"WW 8840","prefLabel":{"de":"Allgemeines"},"type":["http://www.w3.org/2004/02/skos/core#Concept"],"uri":"http://rvk.uni-regensburg.de/nt/WW%208840"}
]
index.add(concepts)
let results
results = index.search("WW 8840")
console.log(results.length) // only matches the second concept (which mentions "WW 8840" in label)
results = index.search("WW 8840", {
field: "notation"
})
console.log(results.length) // correctly matches two concepts
// with large dataset, also correctly matches the two concepts
Any idea why this is happening? Thanks!
Hello, first of all, thanks for creating a nice new search engine. We are looking to use it instead of Elasticsearch, which is very complex, carries a lot of legacy in its DSL, and makes it difficult to get the desired results. Currently we are interested in whether there are any plans to implement updating multiple documents with a single query. This is necessary, for example, to disable some products when their category is disabled.
Also, to avoid creating another ticket, I would like to know if it is possible to boost search results based on a numeric value stored in the search index itself.
Thanks in advance.
Hey
First thanks for the amazing library!
I would like to know if you can index a number and get the subject name, sub-topic, and paragraph number.
And whether it is possible to find two paragraphs together
For example
book:
[
{
"topic": "topic",
"content": [
{
"title": "title",
"parts": [
"word1, word2, word3, word4, word5",
"word6, word7, word8, word9, word10",
]
}
]
}
]
index.search("word2 word3") // = [{topic: "topic1", title: "title1", part: 0}]
index.search("word5 word6") // = [{topic: "topic1", title: "title1", part: 0}, {topic: "topic1", title: "title1", part: 0}]
Thanks
I see the documentation on indexing different fields in a document has been fleshed out, which is great, I was wondering how that would work.
The readme claims that field searching is a thing in 0.7.0, but the changelog only goes up to 0.6.0 and the version on npm is 0.6.2 – what's the deal there?
Besides wondering where I could find 0.7.0 I have one question: how does boosting work?
I have a document with a title and a body. I want matches in the title to count towards the score 10x more than matches in the body.
Could I achieve that by setting the boost on the title field to 10, and the boost on the body field to 1? Is that how boost works, or have I guessed wrong? What is the default boost for a field?
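The weighting described in the question can be written down as a small scoring function. This only illustrates the desired behaviour (each title match counting ten times as much as a body match); it is not a statement about how FlexSearch's boost option is implemented, and `weightedScore` is a hypothetical helper.

```javascript
// Sum of (boost * number of exact word matches) over all boosted fields.
function weightedScore(doc, term, boosts) {
  const needle = term.toLowerCase();
  let score = 0;
  for (const [field, boost] of Object.entries(boosts)) {
    const words = String(doc[field] || "").toLowerCase().split(/\s+/);
    const matches = words.filter((word) => word === needle).length;
    score += boost * matches;
  }
  return score;
}

const boosts = { title: 10, body: 1 };
const titleMatch = { title: "search basics", body: "an introduction" };
const bodyMatch = { title: "introduction", body: "search search search" };
console.log(weightedScore(titleMatch, "search", boosts)); // --> 10
console.log(weightedScore(bodyMatch, "search", boosts));  // --> 3
```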
We use flexsearch in a react app. Performs pretty well, thanks!
We store the flexsearch settings in a constant outside of a component. We also store documents rather than key-value pairs.
The first initialization of the component works perfectly. All subsequent ones behave incorrectly: the doc property is null. I guess flexsearch accesses the object by reference and somehow replaces the doc property.
Is this behavior expected?
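If the settings object really is modified by reference, as guessed above, passing a fresh copy on every initialization would sidestep the problem. The sketch below uses a stand-in `createIndex` function that clears `options.doc` to imitate the suspected behaviour; both `createIndex` and `cloneSettings` are hypothetical.

```javascript
// Shared settings constant, defined once outside of any component.
const SETTINGS = { doc: { id: "id", field: ["title"] } };

// Deep copy via JSON; sufficient for plain data (no functions or RegExp values).
function cloneSettings(settings) {
  return JSON.parse(JSON.stringify(settings));
}

// Stand-in for a constructor that consumes and clears `options.doc`.
function createIndex(options) {
  const doc = options.doc;
  options.doc = null;
  return { doc };
}

const first = createIndex(cloneSettings(SETTINGS));
const second = createIndex(cloneSettings(SETTINGS));
console.log(second.doc.field); // --> ["title"]
console.log(SETTINGS.doc.id);  // --> "id" (the shared constant is untouched)
```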