Comments (10)
Hello, I know this feature from database queries but sadly this functionality isn't actually supported by FlexSearch.
from flexsearch.
@ts-thomas what is preventing us from supporting this option by adding the matching distinct values to an object and accumulating them during the search iteration? I just want to know how difficult it would be to implement and whether there is a chance that you would accept a pull request for it.
from flexsearch.
Sounds good to me. What would help me is a short example of a small set of documents and the desired result when searching. That would give me a better insight.
from flexsearch.
@ts-thomas I will provide a proposed example when possible. Are you considering splitting the large chunk of code into modules to simplify development? It could be helpful for creating PRs.
from flexsearch.
Yes of course, it is already in my plans. I'm considering using the new ES6 modules functionality because it is also compatible with Closure Compiler. Another option is to port the codebase to TypeScript. It would be nice to know how TypeScript could easily be compiled into other programming languages (I am personally targeting C/C++, Java and Python). Java JNI is also an option for me for this purpose.
from flexsearch.
Here are some proposed examples:
const documents = [
  { id: 1, data: 'text 1', category: 1 },
  { id: 2, data: 'text 2', category: 1 },
  { id: 3, data: 'text 1', category: 2 }
]

// Getting distinct values (proposed option, not supported yet)
const results = index.search({
  query: 'text',
  distinct: ['data', 'category']
})
// results containing:
{
  documents: [
    { id: 1, data: 'text 1', category: 1 },
    { id: 2, data: 'text 2', category: 1 },
    { id: 3, data: 'text 1', category: 2 }
  ],
  distinct: {
    category: [1, 2],
    data: ['text 1', 'text 2']
  }
}

// Getting distinct count (proposed option, not supported yet)
const countResults = index.search({
  query: 'text',
  distinct_count: ['data', 'category']
})
// countResults containing:
{
  documents: [
    { id: 1, data: 'text 1', category: 1 },
    { id: 2, data: 'text 2', category: 1 },
    { id: 3, data: 'text 1', category: 2 }
  ],
  distinct_count: {
    category: 2,
    data: 2
  }
}
Note that the list of distinct values can sometimes be too large, so we should optionally support returning only the distinct count as well. Another point is that the found documents are returned in a separate field, which gives the flexibility to store additional data in the result without complicating the overall API.
About C++: I'm not sure it is a good idea to compile TypeScript to C++. In my experience, I have had very good results implementing the algorithms themselves in a C/C++ library and then wrapping it for use from scripting languages; that is probably a good idea in your case too. For example, you can use SWIG to create adapters for higher-level languages. The one disadvantage is that it would not work in a web browser, but it could significantly reduce the overhead of re-implementing the algorithm for every scripting language.
I think a good starting point would be a robust, well-documented and stable TypeScript implementation.
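As a rough illustration of the accumulation described above, here is a minimal sketch that works on an array of already-found documents. The function name `collectDistinct` and the shape of its return value are assumptions made for this example; nothing here is part of the FlexSearch API.

```javascript
// Hypothetical helper, NOT part of FlexSearch: collects the distinct values
// (and their counts) of the given fields from already-found documents.
function collectDistinct(documents, fields) {
  // One Set per requested field keeps insertion order and drops duplicates.
  const sets = {};
  for (const field of fields) sets[field] = new Set();

  for (const doc of documents) {
    for (const field of fields) {
      if (doc[field] !== undefined) sets[field].add(doc[field]);
    }
  }

  const distinct = {};
  const distinctCount = {};
  for (const field of fields) {
    distinct[field] = [...sets[field]];
    distinctCount[field] = sets[field].size;
  }
  // The documents stay in their own field, next to the extra data.
  return { documents, distinct, distinct_count: distinctCount };
}

const found = [
  { id: 1, data: 'text 1', category: 1 },
  { id: 2, data: 'text 2', category: 1 },
  { id: 3, data: 'text 1', category: 2 }
];

const result = collectDistinct(found, ['data', 'category']);
// result.distinct       -> { data: ['text 1', 'text 2'], category: [1, 2] }
// result.distinct_count -> { data: 2, category: 2 }
```

If only the counts are needed, the arrays in `distinct` could be skipped and just the `Set` sizes returned, which avoids large result payloads.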
from flexsearch.
Thanks a lot for the example. The last thing that is not clear to me is the main purpose of having distinct in the result. I think this would help me understand the requirements. I also added this feature to the milestones: https://github.com/nextapps-de/flexsearch/milestone/25
The separate field for the results is a good point, because it is also needed by the pagination.
The TypeScript port of the core functionality as the ultimate base makes a lot of sense and will surely come.
from flexsearch.
A simple example where distinct is useful:
Let's say that we store a category id within each product document. When we query for products, we also need to know which categories the resulting products belong to. This is useful for filtering after the search query has been executed. Sorry for my poor English.
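A small sketch of that use case, assuming the search has already returned product documents. The `products` array and both helper functions are hypothetical and only illustrate the idea; they are not FlexSearch functions.

```javascript
// Hypothetical helpers, NOT part of FlexSearch.
// `products` stands in for the documents returned by a search.
function categoriesOf(products) {
  // Distinct category ids, in order of first appearance.
  return [...new Set(products.map((p) => p.category))];
}

function filterByCategory(products, category) {
  return products.filter((p) => p.category === category);
}

const products = [
  { id: 1, name: 'red shirt', category: 10 },
  { id: 2, name: 'blue shirt', category: 10 },
  { id: 3, name: 'shirt hanger', category: 20 }
];

const facets = categoriesOf(products);               // [10, 20]
const onlyHangers = filterByCategory(products, 20);  // only the product with id 3
```

The distinct categories can be shown to the user as filter options, and the selected one is then applied to the existing result without running a second search.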
from flexsearch.
Thanks for providing a useful example. This feature would be possible and may be coming soon. There may be some other tasks which need to be done first (like the Plugin API); distinct would be a good example of a plugin.
from flexsearch.
This feature was added to the milestones. It should be built on top of the upcoming Plugin API. This makes it easier for everyone to build features without understanding the whole algorithm.
from flexsearch.