Comments (3)
While it's still being developed, it appears this library https://github.com/asg017/sqlite-vec will support vector search with SQLite in WASM. I hope the usearch SQLite extensions support WASM in the future too!
from usearch.
I think we should update the release.yml to use WASI SDK 21 instead of the currently used 20. Moreover, we should avoid using the SDK directly, and instead pass it as a CMake toolchain, as documented.
If we get this, my app for local RAG would be solved. I am concerned about data persistence, though. In my use case, users upload textbooks and run vector similarity search on individual textbooks. Even if a user uploads, say, 10,000 textbooks, my project currently goes textbook by textbook to avoid overloading memory.
How would (1) data persistence and (2) memory be handled with a WASM + SQLite + usearch based approach?
Conceptually, I would like something like:
Query: who are the authors
Textbook: Calculus textbook selected; the SQLite id is textbook_id_here
SQLite: target the table textbook_id_here and query it using unum search
I've read that OPFS + WASM may be a great solution to this, but in general, will the entire DB be populated in memory? There does not seem to be a single non-memory solution for vector search in the browser. DiskANN works in C and other languages, I'm assuming solely thanks to file system access and the absence of browser restrictions. However, I assume a browser-native JS implementation could work, complete with HNSW and vector search, but without multithreading, as I believe multithreading is not possible in browser JS. Given that OPFS exposes files and supports reading their bytes, it basically gives the freedom to talk to unum indexes directly, avoiding the need for WASM altogether.
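As a rough illustration of the byte-range idea above, here is a minimal TypeScript sketch that scans a file of vectors chunk by chunk instead of loading it whole. It assumes a hypothetical flat layout of raw float32 vectors, which is not the real usearch index format, and it does an exact linear scan rather than HNSW. `BlobLike`, `Hit`, `cosine`, and `scanTopK` are illustrative names; a `File` obtained from an OPFS file handle via `getFile()` should fit the `BlobLike` shape.

```typescript
// Minimal structural type; a browser File/Blob has these members.
interface BlobLike {
  readonly size: number;
  slice(start?: number, end?: number): BlobLike;
  arrayBuffer(): Promise<ArrayBuffer>;
}

interface Hit {
  index: number; // position of the vector inside the file
  score: number; // cosine similarity to the query
}

function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

// Exact top-k scan that reads `chunkVectors` vectors at a time,
// so only one chunk of the file is held in memory at any moment.
// Assumption: the file is a flat array of float32 vectors of query.length dims.
async function scanTopK(
  file: BlobLike,
  query: Float32Array,
  k: number,
  chunkVectors = 1024,
): Promise<Hit[]> {
  const dims = query.length;
  const bytesPerVector = dims * 4; // 4 bytes per float32
  const total = Math.floor(file.size / bytesPerVector);
  const hits: Hit[] = [];
  for (let start = 0; start < total; start += chunkVectors) {
    const end = Math.min(start + chunkVectors, total);
    const buf = await file
      .slice(start * bytesPerVector, end * bytesPerVector)
      .arrayBuffer();
    const chunk = new Float32Array(buf);
    for (let i = 0; i < end - start; i++) {
      const vec = chunk.subarray(i * dims, (i + 1) * dims);
      hits.push({ index: start + i, score: cosine(query, vec) });
    }
    // Keep only the best k seen so far to bound memory use.
    hits.sort((x, y) => y.score - x.score);
    hits.length = Math.min(hits.length, k);
  }
  return hits;
}
```

This trades speed for memory: every query touches the whole file, which may be acceptable for a single textbook but would not scale the way a graph index does.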
Similar projects I've found are here:
https://github.com/askorama/orama
They allow client-side vector search by being JS-native:
https://github.com/askorama/orama/blob/main/packages/docs/open-source/usage/search/vector-search.md#L5L102
And here's how they persist data:
https://github.com/askorama/orama/blob/44836b3f2132061b907015f18bea334f9dd4478b/packages/docs/open-source/plugins/plugin-data-persistence.md#L5L104
However, for the SQLite approach there's also this repo that saves SQLite storage in IndexedDB (which OPFS apparently supersedes in performance, but here's the repo nonetheless):
https://github.com/jlongster/absurd-sql
There's also work by the official SQLite team to address persistent storage:
https://sqlite.org/wasm/doc/trunk/persistence.md
Given this information, perhaps we can just have a native JS, browser-based OPFS solution to query unum index files such as index.unumsearch and calculus.unumsearch, and my app can just call query("my calculus question", opfs.file(namehere)) on one unum file per textbook.
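The per-textbook `query(...)` idea could look roughly like the TypeScript sketch below. Everything in it is hypothetical: `FileLike`, `FileHandleLike`, `DirectoryLike`, `SearchFn`, `indexFileName`, and `queryTextbook` are illustrative names, the question text is assumed to be already embedded into a vector, and the search over the file's contents is injected as a callback rather than implemented, since nothing in this thread confirms that a browser-side reader for usearch index files exists.

```typescript
// Minimal structural types, shaped so that OPFS handles can be passed in.
interface FileLike {
  readonly size: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}
interface FileHandleLike {
  getFile(): Promise<FileLike>;
}
interface DirectoryLike {
  getFileHandle(name: string): Promise<FileHandleLike>;
}

// Hypothetical search callback: given an index file and a query embedding,
// return the keys of the k closest entries.
type SearchFn = (
  indexFile: FileLike,
  query: Float32Array,
  k: number,
) => Promise<number[]>;

// Assumed naming scheme: one index file per textbook, e.g. "calculus.unumsearch".
function indexFileName(textbookId: string): string {
  return `${textbookId}.unumsearch`;
}

// Open only the selected textbook's index file and search it,
// so the other textbooks' indexes are never touched.
async function queryTextbook(
  root: DirectoryLike,
  textbookId: string,
  query: Float32Array,
  k: number,
  search: SearchFn,
): Promise<number[]> {
  const handle = await root.getFileHandle(indexFileName(textbookId));
  const file = await handle.getFile();
  return search(file, query, k);
}
```

In a browser, `root` would presumably be the directory returned by `navigator.storage.getDirectory()`, and `search` could be a JS-native implementation or a WASM-backed one, whichever ends up being available.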
Interested in more discussion here