Comments (18)
Thanks for looking into it.
- 512-1024
- l2sq
- Python
This is the typical retrieval scenario in web search when LLMs are used.
I am passing exact=True for search as you see in my test code.
from usearch.
I can indeed confirm vastly better performance on my setup (by a factor of 20). I can only say: fantastic work!
Question: How does one get the distances?
@vprelovac, oh, I see! So IndexFlatL2 is not actually an index. It just appends vectors and brute-forces on search. On tiny collections with 1000 entries, it makes sense. The kNN benchmarks are generally meaningful from 1M entries onwards. The correct comparison there would be against IndexHNSW from FAISS.
That being said, we can probably automatically switch to an exact search on tiny collections :) I will consider it in the next release.
the kNN benchmarks are generally meaningful from 1M entries onwards.
Except when they are not :) Our production use case is 500-50000 items, and the index has to be recreated every time. Every ms matters.
Is there a configuration where usearch can be faster than faiss on this use case?
@vprelovac, probably. You can pass the additional exact=True for the search queries, which will switch to brute force.
As for construction, I've just added a task to the 1.0 TODO list.
Can you please provide more details about your setup?
- What's the dimensionality of the vectors?
- Which metrics do you use?
- Which interface/platform do you prefer? Python, C++, WASM, ...
We may be able to adjust the implementation accordingly 🤗
Hey, @vprelovac! I have implemented an exact search shortcut, which avoids index construction altogether. The interface is as simple as it gets:
from usearch.index import search, MetricKind
import numpy as np
# Set the seed for reproducibility
np.random.seed(42)
# Generate 500 random vectors of length 1024 with values between 0 and 1
vectors = np.random.rand(500, 1024).astype(np.float32)
vector = np.random.rand(1024).astype(np.float32)
search(vectors, vector, 50, MetricKind.L2sq, exact=True)
I have repeated your benchmark on my MacBook Pro with the M2 CPU, and the speed is higher than FAISS's, even without a BLAS dependency or SimSIMD optimizations for the L2 metric. This is going to be merged into 1.0. More optimizations are coming soon.
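For reference, the brute-force top-k that an exact search performs can be sketched in plain NumPy, which is handy for sanity-checking the results of the snippet above (the variable names mirror it; this is a reference sketch, not USearch's implementation):

```python
import numpy as np

# Plain-NumPy sketch of exact L2sq search over the same data:
# compute squared distances to every vector, then keep the k closest.
np.random.seed(42)
vectors = np.random.rand(500, 1024).astype(np.float32)
vector = np.random.rand(1024).astype(np.float32)

k = 50
distances = ((vectors - vector) ** 2).sum(axis=1)  # squared L2, no sqrt
keys = np.argpartition(distances, k)[:k]           # unordered k nearest
keys = keys[np.argsort(distances[keys])]           # order by distance
```

The `argpartition` call avoids a full sort of all 500 distances; only the k winners are sorted afterwards.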
🎉 This issue has been resolved in version 0.23.0 🎉
The release is available as a GitHub release.
Your semantic-release bot 📦🚀
Those should be in matches.distances. This method also supports batch search, so you can search for many entries at once. It also supports an additional threads argument, to specify the amount of CPU resources to allocate.
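The batch case can likewise be sketched in plain NumPy, using the expansion ||a-b||² = ||a||² - 2a·b + ||b||² so that one matrix product covers all queries at once (a reference sketch, not USearch's implementation):

```python
import numpy as np

# Batch exact search sketch: one distance matrix for all queries.
np.random.seed(0)
dataset = np.random.rand(1000, 256).astype(np.float32)
queries = np.random.rand(10, 256).astype(np.float32)
k = 5

# Pairwise squared L2 via ||a-b||^2 = ||a||^2 - 2ab + ||b||^2
sq_data = (dataset ** 2).sum(axis=1)
sq_query = (queries ** 2).sum(axis=1)
distances = sq_query[:, None] - 2.0 * queries @ dataset.T + sq_data[None, :]

keys = np.argpartition(distances, k, axis=1)[:, :k]  # (10, 5) neighbor ids
top = np.take_along_axis(distances, keys, axis=1)    # their distances
order = np.argsort(top, axis=1)                      # sort each row's winners
keys = np.take_along_axis(keys, order, axis=1)
top = np.take_along_axis(top, order, axis=1)
```

Amortizing the work over a batch is exactly why batch search is usually much faster than issuing queries one at a time.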
I am glad it worked so well, @vprelovac! We are also preparing major releases in UCall, UForm, and UStore, so stay tuned and don't hesitate to share feedback. The team and I will be happy to find new ways to improve our projects 🤗
Thanks, this solves the use case for small-scale, real-time, vector-based search.
Given another use case of, say, 1-5M items, where the index can be built before search - how would you approach that?
@vprelovac, for that I would indeed use the Index class. I'd recommend experimenting with the dtype constructor argument. Try "i8". If recall doesn't drop, use that. Otherwise, use "f16". It will be both space-efficient and fast. I am also experimenting with "i4", which should work in case you use heavily quantized models. Will have more updates on this in the next couple of days. You can also control the balance between speed and accuracy with the connectivity constructor argument. A higher value will probably lead to slower construction, may result in higher accuracy, and will definitely use more memory.
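To see what "i8" quantization roughly trades away, here is a hypothetical symmetric int8 scheme in NumPy (an illustration of the general idea, not necessarily what USearch does internally):

```python
import numpy as np

# Hypothetical symmetric int8 quantization: map f32 values into
# [-127, 127] with a single global scale, then reconstruct.
np.random.seed(1)
vectors = np.random.rand(1000, 768).astype(np.float32)

scale = np.abs(vectors).max() / 127.0
quantized = np.round(vectors / scale).astype(np.int8)
restored = quantized.astype(np.float32) * scale

print(vectors.nbytes // quantized.nbytes)  # 4x smaller
print(np.abs(vectors - restored).max())    # error bounded by scale / 2
```

If recall survives this kind of rounding, the index shrinks 4x relative to f32 and distance kernels get cheaper; "f16" is the gentler 2x option.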
@ashvardanian Sorry for reopening an old thread.
Finally had some more time to do testing with the new version. Here is the replication code.
Findings at the end.
!pip install faiss-cpu git+https://github.com/vioshyvo/mrpt/ usearch
from usearch.index import search, MetricKind
import numpy as np
import mrpt
import faiss
import timeit
import matplotlib.pyplot as plt
def run_mrpt(vector, vectors, k=15):
    index = mrpt.MRPTIndex(vectors)
    res = index.exact_search(vector, k, return_distances=False)
    return res

def run_faiss(vector, vectors, k=15):
    index = faiss.IndexFlatL2(vectors.shape[-1])
    index.add(vectors)
    D, I = index.search(np.array([vector]), k)
    return I

def run_usearch(vector, vectors, k=15):
    matches = search(vectors, vector, k, MetricKind.L2sq, exact=True)
    return matches
sizes = [16, 24, 36, 54, 81, 122, 182, 273, 410, 615, 923, 1384, 2076]
#sizes = [3570, 4500, 5500, 6700, 7700, 8800, 9900, 11000, 15000,37481, 67466, 121440, 218591, 500000]
# Initialize lists to store the execution times
mrpt_times = []
faiss_times = []
usearch_times = []
number=20
# Benchmark the functions for each dataset size
for size in sizes:
    print(size)
    # Generate a random dataset and query vector
    vectors = np.random.rand(size, 768).astype(np.float32)
    vector = np.random.rand(768).astype(np.float32)
    # Time the execution of each function
    mrpt_time = timeit.timeit(lambda: run_mrpt(vector, vectors, k=15), number=number)
    faiss_time = timeit.timeit(lambda: run_faiss(vector, vectors, k=15), number=number)
    usearch_time = timeit.timeit(lambda: run_usearch(vector, vectors, k=15), number=number)
    # Append the execution times to the corresponding lists
    mrpt_times.append(mrpt_time)
    faiss_times.append(faiss_time)
    usearch_times.append(usearch_time)
# Plot the results
plt.figure(figsize=(20, 12))
plt.plot(sizes, mrpt_times, label='MRPT', marker='o')
plt.plot(sizes, faiss_times, label='FAISS', marker='o')
plt.plot(sizes, usearch_times, label='usearch', marker='o')
plt.xlabel('Number of Vectors')
plt.ylabel('Execution Time (s)')
plt.title('Nearest Neighbor Search Performance')
plt.legend()
plt.grid()
plt.show()
What is interesting is that for a small number of items (<2000), FAISS is still the king of the hill, and both MRPT and USearch show great variance.
Going above that, FAISS starts to lag significantly, and MRPT consistently comes out on top.
So the perfect vector search at the moment is a combination of FAISS for <2000 items and MRPT for more than that. Can you make one to rule them all? ;)
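A hypothetical "one to rule them all" router following this observation could simply dispatch on collection size. The two backends below are NumPy brute-force stand-ins, not the real FAISS/MRPT calls, and the 2000-item threshold is just the number quoted above:

```python
import numpy as np

# Stand-in backend for small collections: exact brute force in NumPy.
def search_small(vectors, query, k):
    distances = ((vectors - query) ** 2).sum(axis=1)
    return np.argsort(distances)[:k]

# Stand-in backend for large collections; a real router would call an
# ANN index (e.g. a prebuilt HNSW) here instead of brute force.
def search_large(vectors, query, k):
    return search_small(vectors, query, k)

# Dispatch on dataset size, per the observation above.
def search_any(vectors, query, k, threshold=2000):
    backend = search_small if len(vectors) < threshold else search_large
    return backend(vectors, query, k)
```

In practice the threshold would be tuned empirically per machine, since the crossover point depends on dimensionality and hardware.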
For sure, @vprelovac! Thanks for sharing updates! Which hardware are you running on? Can you please check my SimSIMD library? I have upgraded it recently but haven't merged it into USearch yet.
This is just the CPU on Colab. Not sure what SimSIMD is, but if it affects the speed of USearch on the CPU, shouldn't it be installed and used by default? Or an example using my code would be great.
@vprelovac, I will double-check in the next couple of days. In the meantime, here is what I mean by SimSIMD - it now has its own lightweight Python bindings. Here is a set of benchmarks for SimSIMD vs NumPy vs SciPy on a few different machines.
Also, if you are running on recent hardware, you may prefer to use half-precision floating point numbers.
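A quick NumPy illustration of the memory side of that trade-off (the speed benefit additionally depends on native f16 support in the hardware):

```python
import numpy as np

# Half-precision stores each dimension in 2 bytes instead of 4.
np.random.seed(0)
vectors_f32 = np.random.rand(10_000, 768).astype(np.float32)
vectors_f16 = vectors_f32.astype(np.float16)

print(vectors_f32.nbytes // vectors_f16.nbytes)  # 2x smaller
# For embeddings in [0, 1), the conversion error stays tiny:
print(np.abs(vectors_f32 - vectors_f16.astype(np.float32)).max())
```

For normalized embeddings the f16 rounding error is far below typical inter-vector distances, which is why recall usually survives the switch.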
@vprelovac, can you please check the newest release and try half-precision as well 🤗
@ashvardanian Sure - can you let me know if you tried it with the above code, and do I need to make any changes to it?
Here are the results I get running your script, @vprelovac:
If I understand correctly, MRPTIndex is just a wrapper around the C++ Eigen library. It's a pretty good library for numeric computations. It might be more accurate to compare it to faiss.IndexFlatIP and usearch.MetricKind.IP.
All three systems support batch queries. Both FAISS and USearch support half-precision. With half-precision, FAISS becomes much slower, but we retain good performance. Here are the charts I'm getting for batch size 10, enabling half-precision in USearch, leading to 2x lower memory consumption.
Here is the refreshed script:
!pip install faiss-cpu git+https://github.com/vioshyvo/mrpt/ usearch
from usearch.index import search, MetricKind, ScalarKind
import numpy as np
import mrpt
import faiss
import timeit
import matplotlib.pyplot as plt
def run_mrpt(vector, vectors, k=15):
    index = mrpt.MRPTIndex(vectors)
    res = index.exact_search(vector, k)
    return res

def run_faiss(vector, vectors, k=15):
    index = faiss.IndexFlatIP(vectors.shape[-1])
    index.add(vectors)
    D, I = index.search(vector, k)
    return I

def run_usearch(vector, vectors, k=15):
    matches = search(vectors, vector, k, MetricKind.IP, exact=True)
    return matches
from usearch.compiled import hardware_acceleration
print(
    "USearch hardware acceleration:",
    hardware_acceleration(dtype=ScalarKind.F16, ndim=768, metric_kind=MetricKind.IP),
    hardware_acceleration(dtype=ScalarKind.F32, ndim=768, metric_kind=MetricKind.IP),
)
count_dataset = [16, 24, 36, 54, 81, 122, 182, 273, 410, 615, 923, 1384, 2076]
# Initialize lists to store the execution times
mrpt_times = []
faiss_times = []
usearch_times = []
k = 10
count_repetitions = 100
count_queries = 10
# Benchmark the functions for each dataset size
for size in count_dataset:
    print(size)
    # Generate random datasets and query batches in f32 and f16
    dataset = np.random.rand(size, 768).astype(np.float32)
    queries = np.random.rand(count_queries, 768).astype(np.float32)
    dataset_f16 = np.random.rand(size, 768).astype(np.float16)
    queries_f16 = np.random.rand(count_queries, 768).astype(np.float16)
    # Time the execution of each function
    mrpt_time = timeit.timeit(lambda: run_mrpt(queries, dataset, k=k), number=count_repetitions)
    faiss_time = timeit.timeit(lambda: run_faiss(queries, dataset, k=k), number=count_repetitions)
    usearch_time = timeit.timeit(lambda: run_usearch(queries_f16, dataset_f16, k=k), number=count_repetitions)
    # Append the execution times to the corresponding lists
    mrpt_times.append(mrpt_time)
    faiss_times.append(faiss_time)
    usearch_times.append(usearch_time)
# Plot the results
plt.figure(figsize=(20, 12))
plt.plot(count_dataset, mrpt_times, label='MRPT', marker='o')
plt.plot(count_dataset, faiss_times, label='FAISS', marker='o')
plt.plot(count_dataset, usearch_times, label='usearch', marker='o')
plt.xlabel('Number of Vectors')
plt.ylabel('Execution Time (s)')
plt.title('Nearest Neighbor Search Performance')
plt.legend()
plt.grid()
plt.show()
Related Issues (20)
- Bug: Download failed: usearch_sqlite_macos_arm64_2.10.0.dylib could not be found HOT 10
- Feature: SQLite WASM builds HOT 3
- Bug: module 'usearch' has no attribute 'sqlite_path' HOT 3
- Bug: [email protected] nodejs installation fails HOT 78
- Bug: error: linking with `cc` failed: exit status: 1 in rust crate HOT 20
- Bug: crash when hardware concurrency is exceeded HOT 5
- Bug: index.search returns invalid keys when k > index size HOT 5
- Bug: Deadlock in concurrent update()s HOT 5
- Bug: Replacing initial entry affects visibility of other entries HOT 2
- Feature: Cross compilation of sqlite extension for ios and android for react native apps HOT 2
- Bug: Issues index dtype=i8 with Inner Product Metrics HOT 27
- Feature parity between GoLang and C HOT 1
- Feature: Java search API extension to batch search and ANN.
- Bug: Segfault when dimensions of added vector don't add up (Rust) HOT 8
- Bug: Failed to run c++ examples. HOT 2
- Bug: Arm64 versions starting at v10.0 and up give the error Fatal Python error: Illegal instruction HOT 3
- Low index performance after `clear()` HOT 2
- Bug: Syntax Error with Jest in ESM HOT 3
- Bug: Rust build does not use simsimd (`index.hardware_acceleration()` reports `serial`) HOT 2
- Bug: cannot open old database (created with 2.9.2) with new version (2.12.0) HOT 8