Comments (10)

ts-thomas commented on May 6, 2024

Hello, I know this feature from database queries, but sadly it isn't currently supported by FlexSearch.

from flexsearch.

georgyfarniev commented on May 6, 2024

@ts-thomas what is preventing us from supporting this option by adding the matching distinct values to an object and accumulating them during the search iteration? I just want to know how difficult it is to implement and whether there is a chance that you would accept a pull request for it.

ts-thomas commented on May 6, 2024

Sounds good to me. What would help me is a short example of a small set of documents and the desired result when searching. That would give me a better insight.

georgyfarniev commented on May 6, 2024

@ts-thomas I will provide a proposed example when possible. Are you considering splitting the large chunk of code into modules to simplify development? It could be helpful for creating PRs.

ts-thomas commented on May 6, 2024

Yes, of course, it is already in my plans. I'm considering using the new ES6 modules because they are also compatible with Closure Compiler. Another option is to port the codebase to TypeScript. It would be nice to know how TypeScript could easily be compiled into other programming languages (I am personally targeting C/C++, Java, and Python). Java JNI is also an option for me for this purpose.

georgyfarniev commented on May 6, 2024

Here are some proposed examples:

const documents = [
  { id: 1, data: 'text 1', category: 1 },
  { id: 2, data: 'text 2', category: 1 },
  { id: 3, data: 'text 1', category: 2 }
]


// Getting distinct values
const results = index.search({
  query: 'text',
  distinct: ['data', 'category']
})


// results containing:
{
  documents: [
    { id: 1, data: 'text 1', category: 1 },
    { id: 2, data: 'text 2', category: 1 },
    { id: 3, data: 'text 1', category: 2 }
  ],
  distinct: {
    category: [1, 2],
    data: ['text 1', 'text 2']
  }
}

// Getting distinct count
const results = index.search({
  query: 'text',
  distinct_count: ['data', 'category']
})


// results containing:
{
  documents: [
    { id: 1, data: 'text 1', category: 1 },
    { id: 2, data: 'text 2', category: 1 },
    { id: 3, data: 'text 1', category: 2 }
  ],
  distinct_count: {
    category: 2,
    data: 2
  }
}

Note that sometimes the set of distinct values will be too large, so optionally we should support returning only the distinct count as well. One more point is that the found documents are returned in a separate field, which gives the flexibility to store additional data in the result without complicating the overall API.
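
Until something like this exists in the library, the same information can be derived from the returned documents in user code. The following is only a minimal sketch: `collectDistinct` is a hypothetical helper and not part of the FlexSearch API, and it assumes the search results are already available as an array of plain document objects.

```javascript
// Hypothetical helper, not part of FlexSearch: collects the distinct values
// (and their counts) of the given fields across an array of result documents.
function collectDistinct(documents, fields) {
  // One Set per requested field, so duplicates are dropped automatically.
  const sets = new Map(fields.map((field) => [field, new Set()]));

  for (const doc of documents) {
    for (const field of fields) {
      if (field in doc) {
        sets.get(field).add(doc[field]);
      }
    }
  }

  const distinct = {};
  const distinctCount = {};
  for (const [field, values] of sets) {
    distinct[field] = [...values];      // values in first-seen order
    distinctCount[field] = values.size; // number of unique values
  }
  return { distinct, distinctCount };
}

// The documents from the proposal above, assumed to be the search result.
const documents = [
  { id: 1, data: 'text 1', category: 1 },
  { id: 2, data: 'text 2', category: 1 },
  { id: 3, data: 'text 1', category: 2 }
];

const summary = collectDistinct(documents, ['data', 'category']);
console.log(summary.distinct);      // { data: [ 'text 1', 'text 2' ], category: [ 1, 2 ] }
console.log(summary.distinctCount); // { data: 2, category: 2 }
```

Because the values are held in a Set, the count is available without building the value arrays, which is the case where only `distinct_count` is wanted.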

About C++: I'm not sure that it is a good idea to compile TypeScript to C++. In my experience, I had a very successful case of implementing the algorithms themselves in a C/C++ library and then wrapping it for use by other scripting languages; it is probably a good idea in your case too. For example, you can use SWIG to create adapters for higher-level languages. The one disadvantage is that it would not work in the web browser, but it could significantly reduce the overhead of re-implementing the algorithm for every scripting language.

I think a good starting point here could be the creation of a robust, well-documented, and stable TypeScript implementation.

ts-thomas commented on May 6, 2024

Thanks a lot for the example. The last thing that is not clear to me is the main purpose of having distinct in the result. I think this would help me to understand the requirements. I also added this feature to the milestones: https://github.com/nextapps-de/flexsearch/milestone/25

The separate field for the results is a good point, because it is also needed for pagination.

The TypeScript port of the core functionality as the ultimate base makes a lot of sense and will surely come.

georgyfarniev commented on May 6, 2024

A simple example where distinct is useful:

Let's say that we store the category id within each product document. When we query for products, we also need to know which categories the resulting products are in. This is useful for filtering after the search query has been executed. Sorry for my poor English.
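
The use case could look roughly like the sketch below. The product documents, the `categoryId` field name, and both helper functions are assumptions made up for illustration; none of them come from FlexSearch.

```javascript
// Hypothetical product documents that matched a search for "red".
const products = [
  { id: 1, name: 'red shirt', categoryId: 10 },
  { id: 2, name: 'red shoes', categoryId: 20 },
  { id: 3, name: 'red hat', categoryId: 10 }
];

// Distinct category ids present in the results, e.g. to render filter options.
function categoriesOf(results) {
  return [...new Set(results.map((product) => product.categoryId))];
}

// Narrow the already-fetched results to one category chosen by the user.
function filterByCategory(results, categoryId) {
  return results.filter((product) => product.categoryId === categoryId);
}

const facets = categoriesOf(products);            // [10, 20]
const filtered = filterByCategory(products, 10);  // products with id 1 and 3
console.log(facets, filtered.map((product) => product.id));
```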

ts-thomas commented on May 6, 2024

Thanks for providing a useful example. This feature would be possible and may be coming soon. There may be some other tasks that need to be done first (like the Plugin API); distinct would be a good example of a plugin.

ts-thomas commented on May 6, 2024

This feature was added to the milestones. It should be built on top of the upcoming Plugin API. This makes it easier for everyone to build features without understanding the whole algorithm.