Comments (10)

bennimmo commented on May 25, 2024

I would be able to use this library if it had this functionality. Love the work, though! I would be using the Python library.

from usearch.

ashvardanian commented on May 25, 2024

Hi @plurch! That's available in the C++ layer already. I assume you are using a binding. Is it written in Python? Are you going to check those IDs against some Python data-structure like a set or dictionary?

plurch commented on May 25, 2024

Hi @ashvardanian, I am using the Python bindings to build the index and then the JavaScript bindings to do the search from a web app. So in my case I would really be using search filtering through JS, but Python support would probably be useful eventually as well.

I think some type of interface similar to faiss that accepts a set of int ids to exclude/include might work. This isn't something that is actively blocking me, just curious about feasibility at this point while evaluating some ANN libraries. Thanks!

ashvardanian commented on May 25, 2024

Gotcha, @plurch, thanks for the feedback! I will keep it in mind for future releases 🤗

plurch commented on May 25, 2024

Sounds good 👍

raulcarlomagno commented on May 25, 2024

Metadata filtering would be a game-changing feature.

Are you thinking about adding metadata storage alongside the vector storage?
I mean, for filtering support. That would avoid the Faiss approach, in which you have to filter the IDs to compare in advance, and sometimes those IDs can number in the millions.

ashvardanian commented on May 25, 2024

@raulcarlomagno, in our case, we use predicate functions instead of an ID list. Passing them from C and Rust isn't hard to add, C++ already supports that, but in Python and JavaScript, I am not sure about how we can make it fast...
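Editor's illustration of the difference between a predicate and an ID list: a minimal sketch in plain Python, assuming a brute-force scan over an in-memory dictionary. It is not the usearch API; `filtered_search` and `squared_l2` are hypothetical names introduced here. It shows that an ID set is just one kind of predicate, and that the predicate is called once per candidate, which is the cost that makes a Python or JavaScript callback hard to keep fast.

```python
import heapq
from typing import Callable, Dict, List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def filtered_search(
    vectors: Dict[int, Sequence[float]],
    query: Sequence[float],
    count: int,
    predicate: Callable[[int], bool],
) -> List[Tuple[int, float]]:
    """Hypothetical helper: exact nearest neighbours among keys accepted by `predicate`.

    This is a brute-force scan, not an approximate graph search; a real index
    would evaluate the predicate while traversing its own structure.
    """
    candidates = (
        (squared_l2(vector, query), key)
        for key, vector in vectors.items()
        if predicate(key)  # called once per candidate key
    )
    return [(key, distance) for distance, key in heapq.nsmallest(count, candidates)]


vectors = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [0.0, 2.0], 4: [3.0, 3.0]}
query = [0.0, 0.0]

# An explicit ID list expressed as a predicate: membership in a set.
allowed = {2, 3, 4}
by_id_set = filtered_search(vectors, query, 2, allowed.__contains__)

# An arbitrary rule that never needs the IDs materialised up front.
by_rule = filtered_search(vectors, query, 2, lambda key: key % 2 == 0)
```

With an ID set, the caller has to build the set before searching; with a rule, nothing is materialised, which matters when the matching IDs number in the millions.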

raulcarlomagno commented on May 25, 2024

What about adding optional storage for metadata, like RocksDB? You keep the current vector index, plus another index for the metadata, and the predicate function is evaluated inside C, not Python.
The heavy work is done internally in C, transparent to the Python API wrapper.

Or maybe you don't want to mess with storing metadata... ☺️

bennimmo commented on May 25, 2024

> Hi @plurch! That's available in the C++ layer already. I assume you are using a binding. Is it written in Python? Are you going to check those IDs against some Python data-structure like a set or dictionary?

Would this also apply to clustering? That would be a real game changer.

lukebuehler commented on May 25, 2024

I just built a PoC with usearch. It's amazing! However, the lack of metadata filtering is blocking me from using it in our product. In our case, we were first using the Java bindings, but are now using Python. A predicate solution, like you have in C++, would work. However, setting some metadata fields and then being able to filter on them would be best, basically how it works in Qdrant.

I'm aware that this is a big ask and will require extending your store with some other, non-vector index, but it would make it one of the most attractive in-process vector stores out there.
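Editor's illustration of how field-based filtering can reduce to a predicate over keys: a minimal sketch, assuming metadata is kept in an ordinary Python dictionary beside the vector index. The helper `metadata_predicate` and the field names are hypothetical; this is neither the usearch nor the Qdrant API.

```python
from typing import Any, Callable, Dict


def metadata_predicate(
    metadata: Dict[int, Dict[str, Any]], **required: Any
) -> Callable[[int], bool]:
    """Hypothetical helper: build a key predicate from exact-match field conditions."""

    def predicate(key: int) -> bool:
        fields = metadata.get(key, {})
        # Every required field must be present and equal to the requested value.
        return all(
            name in fields and fields[name] == value
            for name, value in required.items()
        )

    return predicate


# Metadata stored per vector key, separate from the vectors themselves.
metadata = {
    10: {"lang": "en", "year": 2023},
    11: {"lang": "de", "year": 2023},
    12: {"lang": "en", "year": 2021},
}

is_recent_english = metadata_predicate(metadata, lang="en", year=2023)
matching = [key for key in sorted(metadata) if is_recent_english(key)]
```

The resulting callable takes only a key, so it could be handed to any search routine that accepts a key predicate, while the metadata lookup stays outside the vector index.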
