
vecsim-demo's Introduction

Visual and Semantic Similarity with Redis

This demo goes along with the Announcement of a New Redis Vector Similarity Search

You will experiment with two key applications of Vector Similarity Search using a realistic dataset:

  • Semantic Search: Given a sentence, find products whose keywords are semantically similar to it
  • Visual Search: Given a query image, find the Top K most "visually" similar in the catalogue

About the Amazon Product dataset

The CSV product data used in this demo was derived from the "Amazon Berkeley Objects Dataset".

Each row in the CSV file corresponds to a product in the original dataset.

Before you start

Clone the Repo

git clone https://github.com/RedisAI/vecsim-demo.git

Fire Up the Docker containers

Use docker compose to start two containers:

  • vecsim: A Redis container with Vector Similarity Search (VSS) on port 6379
  • jupyter: A Python notebook server on port 8888, pre-loaded with 4 notebooks
    • 2 notebooks illustrating how to perform visual similarity with Redis VSS
    • 2 notebooks illustrating how to perform semantic similarity with Redis VSS

cd vecsim-demo
docker compose up

NOTE: The first time you run the above command, it will take 5-10 minutes (depending on your network). The jupyter container downloads a 3.25GB tar file with product images from the "Amazon Berkeley Objects Dataset".

Launch the Jupyter Notebooks

Monitor the logs and look for the link to launch Jupyter on your local machine, then copy the URL. Alternatively, run the following:

jupyter notebook

Open a local browser to this link

Step 1: Semantic Similarity - Part I

Open this notebook http://127.0.0.1:8888/notebooks/SemanticSearch1k.ipynb

Run All Cells and check the outputs

You will generate embeddings for 1,000 products and perform semantic similarity using two indexing methods (HNSW and brute-force).
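
To make the brute-force method concrete, here is a minimal pure-Python sketch of an exact top-K search by cosine distance. It is an illustration only, not the notebook's code or Redis's implementation; the function names and the toy 2-dimensional vectors are hypothetical. HNSW answers the same question approximately, by walking a graph of neighbours instead of scanning every stored vector.

```python
import heapq
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def brute_force_top_k(query, vectors, k):
    """Scan every stored vector and return the k closest as (distance, key) pairs."""
    scored = ((cosine_distance(query, vec), key) for key, vec in vectors.items())
    return heapq.nsmallest(k, scored)

# Hypothetical 2-dimensional "embeddings" keyed by product id.
catalogue = {
    "p1": [1.0, 0.0],
    "p2": [0.9, 0.1],
    "p3": [0.0, 1.0],
}
top = brute_force_top_k([1.0, 0.0], catalogue, k=2)
```

The scan touches every vector, so its cost grows linearly with the catalogue size; that is the trade-off an approximate index such as HNSW is meant to avoid.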

Step 2: Semantic Similarity - Part II

Open this notebook http://127.0.0.1:8888/notebooks/SemanticSearch100k.ipynb

Run All Cells and check the outputs

You will load ~100k previously generated embeddings for the first 100,000 products in the dataset. You'll perform semantic similarity on a larger dataset.

Step 3: Visual Similarity - Part I

Open this notebook http://127.0.0.1:8888/notebooks/VisualSearch1k.ipynb

Run All Cells and check the outputs

You will generate embeddings for 1,000 product images and perform visual similarity using two indexing methods

Step 4: Visual Similarity - Part II

Open this notebook http://127.0.0.1:8888/notebooks/VisualSearch100k.ipynb

You'll perform visual similarity on a larger dataset using two indexing methods (HNSW and brute-force)

Stop the Docker containers

docker compose down

About the Amazon Product data

The dataset used in this demo was derived from the "Amazon Berkeley Objects Dataset"

In particular, each long text field in the product_data.csv was extracted from the original JSON encoded object representing each product.

Thanks to Amazon.com for sharing the original dataset. This includes all product data, images and 3D models under the Creative Commons Attribution-NonCommercial 4.0 International Public License (CC BY-NC 4.0)

Credit to the creators of the dataset:

  • Matthieu Guillaumin (Amazon.com)
  • Thomas Dideriksen (Amazon.com)
  • Kenan Deng (Amazon.com)
  • Himanshu Arora (Amazon.com)
  • Arnab Dhua (Amazon.com)
  • Xi (Brian) Zhang (Amazon.com)
  • Tomas Yago-Vicente (Amazon.com)
  • Jasmine Collins (UC Berkeley)
  • Shubham Goel (UC Berkeley)
  • Jitendra Malik (UC Berkeley)

vecsim-demo's Issues

getting similar cosine similarity score for totally different queries

I will put an example to make it clearer:

I have a text like (I simplify it): "John Smith is a film maker" --> I create an embedding of this text and store it on Redis

My query is --> "Who is John Smith" --> The similarity score is 0.29

Then, I do another query (let's call it a nonsense query) --> "Who is randomNameorText?" --> The similarity score is 0.30 (higher is worse)

Even in some cases the nonsense query has a better (lower) score than my query.

I do not understand this behaviour. Why do random questions get better or similar scores than a legitimate question?

Furthermore, all nonsense questions get similarity scores close to 0.3. I have not seen any score of 0.4 or higher. I would have expected totally out-of-context questions to get scores of around 0.9.

Technical information:
I am creating the embeddings with OpenAI and storing them in Redis.
I am using a FLAT index with type FLOAT64 and distance_metric cosine.
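
One point that may help when reading these numbers: with the cosine metric, the score Redis returns is a distance, defined as 1 minus the cosine similarity, so 0 means the same direction and the value can reach 2 for opposite vectors. The sketch below uses made-up 3-dimensional vectors (not real OpenAI embeddings) to show the conversion. How far apart a given embedding model places unrelated texts is a property of that model, and this example makes no claim about it.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def cosine_distance(a, b):
    """Distance form of the same quantity: 0 is identical direction, 2 is opposite."""
    return 1.0 - cosine_similarity(a, b)

# Made-up vectors for illustration only.
doc = [0.2, 0.7, 0.1]
related = [0.25, 0.65, 0.05]
opposite = [-0.2, -0.7, -0.1]

d_related = cosine_distance(doc, related)
d_opposite = cosine_distance(doc, opposite)

# Converting a returned distance back to a similarity.
similarity_from_score = 1.0 - d_related
```

Under this definition, a distance of 0.9 would require two vectors that are almost orthogonal (similarity 0.1).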

Kernel restarting in VisualSearch

Running VisualSearch1k.ipynb fails with error

Kernel Restarting
The kernel for VisualSearch1k.ipynb appears to have died. It will restart automatically.

The corresponding line in the Docker log is

[I 2023-06-16 09:39:03.923 ServerApp] AsyncIOLoopKernelRestarter: restarting kernel (1/5), keep random ports

This happens while running function generate_img2vec_dict().

It looks like this is an OOM error - reducing batch_size in the function call from 250 to 125 allows the notebook to complete.

Environment:

  • MacBook Pro M1
  • 16 GB RAM
  • Podman with 4 GB RAM allocated to VM
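
The workaround above amounts to processing the images in smaller chunks so that fewer are held in memory at once. A minimal sketch of that pattern follows; `batched`, `embed_batch`, and `embed_all` are hypothetical stand-ins and not the notebook's `generate_img2vec_dict()` implementation.

```python
def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def embed_batch(paths):
    """Hypothetical stand-in for a model call; returns one fake vector per path."""
    return [[float(len(p))] for p in paths]

def embed_all(paths, batch_size=125):
    """Embed paths batch by batch, so only one batch of inputs is processed at a time."""
    vectors = {}
    for chunk in batched(paths, batch_size):
        for path, vec in zip(chunk, embed_batch(chunk)):
            vectors[path] = vec
    return vectors

image_paths = [f"img_{i}.jpg" for i in range(1000)]
result = embed_all(image_paths, batch_size=125)
```

Halving the batch size doubles the number of model calls but roughly halves the peak number of inputs in flight, which is why it can help on a memory-constrained VM.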

SemanticSearch notebooks: max text length not accurate

Thanks for the tutorial!

Two remarks regarding MAX_TEXT_LENGTH for embeddings:

  • The maximum sequence length of models is not measured in characters but in tokens. If you cut with return val[:MAX_TEXT_LENGTH] you encode less text than you actually can with your model.
  • You do not need the auto_truncate function because sentence-transformer's encode() will do this automatically for you.
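
To illustrate the first remark, the sketch below compares cutting by characters with cutting by tokens. Whitespace splitting is used as a crude stand-in for a real tokenizer (actual models use subword tokenizers, so real counts will differ), and both limits are hypothetical values chosen for the example.

```python
MAX_TEXT_LENGTH = 20   # hypothetical character limit, as in val[:MAX_TEXT_LENGTH]
MAX_TOKENS = 20        # hypothetical token limit of the same numeric size

def truncate_chars(text, limit=MAX_TEXT_LENGTH):
    """Cut after a fixed number of characters."""
    return text[:limit]

def truncate_tokens(text, limit=MAX_TOKENS):
    """Cut after a fixed number of tokens (whitespace split stands in for a tokenizer)."""
    return " ".join(text.split()[:limit])

text = "stainless steel water bottle with insulated double wall and leak proof lid"
by_chars = truncate_chars(text)
by_tokens = truncate_tokens(text)
```

With the same numeric limit, the character cut keeps only the first few words while the token cut keeps the whole sentence, which is the gap the remark points out.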

Demo crashes on Macbook M1

I haven't had the chance to test on a different machine yet, but currently the demo makes Redis crash on my M1 MacBook when firing the RediSearch query:

Attaching to vecsim-demo_vecsim_1
vecsim_1  | 1:C 11 Jan 2022 21:50:48.181 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
vecsim_1  | 1:C 11 Jan 2022 21:50:48.182 # Redis version=6.2.4, bits=64, commit=00000000, modified=0, pid=1, just started
vecsim_1  | 1:C 11 Jan 2022 21:50:48.182 # Configuration loaded
vecsim_1  | 1:M 11 Jan 2022 21:50:48.186 * monotonic clock: POSIX clock_gettime
vecsim_1  | 1:M 11 Jan 2022 21:50:48.196 * Running mode=standalone, port=6379.
vecsim_1  | 1:M 11 Jan 2022 21:50:48.197 # Server initialized
vecsim_1  | 1:M 11 Jan 2022 21:50:48.207 * <search> Redis version found by RedisSearch : 6.2.4 - oss
vecsim_1  | 1:M 11 Jan 2022 21:50:48.207 * <search> RediSearch version 99.99.99 (Git=v1.99.5-353-g089691c)
vecsim_1  | 1:M 11 Jan 2022 21:50:48.208 * <search> Low level api version 1 initialized successfully
vecsim_1  | 1:M 11 Jan 2022 21:50:48.209 * <search> concurrent writes: OFF, gc: ON, prefix min length: 2, prefix max expansions: 200, query timeout (ms): 500, timeout policy: return, cursor read size: 1000, cursor max idle (ms): 300000, max doctable size: 1000000, max number of search results:  1000000, search pool size: 20, index pool size: 8,
vecsim_1  | 1:M 11 Jan 2022 21:50:48.213 * <search> Initialized thread pool!
vecsim_1  | 1:M 11 Jan 2022 21:50:48.219 * Module 'search' loaded from /usr/lib/redis/modules/redisearch.so
vecsim_1  | 1:M 11 Jan 2022 21:50:48.225 * <ReJSON> version: 999999 git sha: 42e6f32 branch: master
vecsim_1  | 1:M 11 Jan 2022 21:50:48.225 * <ReJSON> Exported RedisJSON_V1 API
vecsim_1  | 1:M 11 Jan 2022 21:50:48.226 * <ReJSON> Enabled diskless replication
vecsim_1  | 1:M 11 Jan 2022 21:50:48.226 * <ReJSON> Created new data type 'ReJSON-RL'
vecsim_1  | 1:M 11 Jan 2022 21:50:48.228 * Module 'ReJSON' loaded from /usr/lib/redis/modules/rejson.so
vecsim_1  | 1:M 11 Jan 2022 21:50:48.228 * <search> Acquired RedisJSON_V1 API
vecsim_1  | 1:M 11 Jan 2022 21:50:48.233 * Ready to accept connections
vecsim_1  |
vecsim_1  |
vecsim_1  | === REDIS BUG REPORT START: Cut & paste starting from here ===
vecsim_1  | 1:M 11 Jan 2022 21:51:29.901 # Redis 6.2.4 crashed by signal: 4, si_code: 2
vecsim_1  | 1:M 11 Jan 2022 21:51:29.902 # Crashed running the instruction at: 0x400365908f
vecsim_1  |
vecsim_1  | ------ STACK TRACE ------
vecsim_1  | EIP:
vecsim_1  | /usr/lib/redis/modules/redisearch.so(_Z18L2SqrSIMD16Ext_SSEPKvS0_S0_+0x3f)[0x400365908f]
vecsim_1  |
vecsim_1  | Backtrace:
vecsim_1  | /lib/x86_64-linux-gnu/libpthread.so.0(+0x14140)[0x4002156140]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(_Z18L2SqrSIMD16Ext_SSEPKvS0_S0_+0x3f)[0x400365908f]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(_ZN15BruteForceIndex9topKQueryEPKvmP17VecSimQueryParams+0x277)[0x4003647a37]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(VecSimIndex_TopKQuery+0x26)[0x40036465d6]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(NewVectorIterator+0xdf)[0x400361841f]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(QAST_Iterate+0x43)[0x40035f5c83]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(AREQ_ApplyContext+0x2eb)[0x400359fc6b]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(+0x6ccc4)[0x400359acc4]
vecsim_1  | /usr/lib/redis/modules/redisearch.so(+0x6db0e)[0x400359bb0e]
vecsim_1  | /usr/local/bin/redis-server *:6379(RedisModuleCommandDispatcher+0x53)[0x40000d82d3]
vecsim_1  | /usr/local/bin/redis-server *:6379(call+0xf4)[0x400004e1b4]
vecsim_1  | /usr/local/bin/redis-server *:6379(processCommand+0x5b3)[0x400004fce3]
vecsim_1  | /usr/local/bin/redis-server *:6379(processInputBuffer+0xf8)[0x4000062ba8]
vecsim_1  | /usr/local/bin/redis-server *:6379(+0xf9848)[0x40000f9848]
vecsim_1  | /usr/local/bin/redis-server *:6379(aeProcessEvents+0x292)[0x4000046d82]
vecsim_1  | /usr/local/bin/redis-server *:6379(aeMain+0x1d)[0x4000046fed]
vecsim_1  | /usr/local/bin/redis-server *:6379(main+0x316)[0x4000043286]
vecsim_1  | /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x400218ad0a]
vecsim_1  | /usr/local/bin/redis-server *:6379(_start+0x2a)[0x400004375a]
vecsim_1  |
vecsim_1  | ------ REGISTERS ------
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 #
vecsim_1  | RAX:0000004006bf5c18 RBX:0000000000000000
vecsim_1  | RCX:0000000000000000 RDX:0000004000482420
vecsim_1  | RDI:0000004006bf5058 RSI:00000040e30f4040
vecsim_1  | RBP:000000400046bde8 RSP:0000004001c3d668
vecsim_1  | R8 :0000004000f30e50 R9 :0000004002322be0
vecsim_1  | R10:0000000000061ab0 R11:0000004000f39000
vecsim_1  | R12:0000004000f30e58 R13:00000040e30f4000
vecsim_1  | R14:0000004001c3d6e0 R15:0000004000482420
vecsim_1  | RIP:000000400365908f EFL:0000000000000202
vecsim_1  | CSGSFS:002b000000000033
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d677) -> 000000400046bb60
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d676) -> 000000000000001d
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d675) -> 0000004000f928d8
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d674) -> 0000004000f928d8
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d673) -> 0000004000f30e58
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d672) -> 0000004000469910
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d671) -> 000000400046bb60
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d670) -> 00000040000d685d
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d66f) -> 0000004000469dc0
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d66e) -> 0000004000469db8
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d66d) -> 0000000000000005
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d66c) -> 0000004001c3d6e0
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d66b) -> 0000004000482408
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d66a) -> ff7fffffe2be8133
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d669) -> 00000000000186a0
vecsim_1  | 1:M 11 Jan 2022 21:51:29.911 # (0000004001c3d668) -> 0000004003647a37
vecsim_1  |
vecsim_1  | ------ INFO OUTPUT ------
vecsim_1  | # Server
vecsim_1  | redis_version:6.2.4
vecsim_1  | redis_git_sha1:00000000
vecsim_1  | redis_git_dirty:0
vecsim_1  | redis_build_id:1e8a644f6d08b278
vecsim_1  | redis_mode:standalone
vecsim_1  | os:Linux 5.10.47-linuxkit x86_64
vecsim_1  | arch_bits:64
vecsim_1  | multiplexing_api:epoll
vecsim_1  | atomicvar_api:atomic-builtin
vecsim_1  | gcc_version:10.2.1
vecsim_1  | process_id:1
vecsim_1  | process_supervised:no
vecsim_1  | run_id:e29676b17d5f1ad447cf0079084942371668dd20
vecsim_1  | tcp_port:6379
vecsim_1  | server_time_usec:1641937889881947
vecsim_1  | uptime_in_seconds:41
vecsim_1  | uptime_in_days:0
vecsim_1  | hz:10
vecsim_1  | configured_hz:10
vecsim_1  | lru_clock:14547937
vecsim_1  | executable:/usr/local/bin/redis-server
vecsim_1  | config_file:
vecsim_1  | io_threads_active:0
vecsim_1  |
vecsim_1  | # Clients
vecsim_1  | connected_clients:1
vecsim_1  | cluster_connections:0
vecsim_1  | maxclients:10000
vecsim_1  | client_recent_max_input_buffer:88
vecsim_1  | client_recent_max_output_buffer:0
vecsim_1  | blocked_clients:0
vecsim_1  | tracking_clients:0
vecsim_1  | clients_in_timeout_table:0
vecsim_1  |
vecsim_1  | # Memory
vecsim_1  | used_memory:459385536
vecsim_1  | used_memory_human:438.10M
vecsim_1  | used_memory_rss:0
vecsim_1  | used_memory_rss_human:0B
vecsim_1  | used_memory_peak:459385536
vecsim_1  | used_memory_peak_human:438.10M
vecsim_1  | used_memory_peak_perc:100.00%
vecsim_1  | used_memory_overhead:5973920
vecsim_1  | used_memory_startup:904776
vecsim_1  | used_memory_dataset:453411616
vecsim_1  | used_memory_dataset_perc:98.89%
vecsim_1  | allocator_allocated:459473080
vecsim_1  | allocator_active:459804672
vecsim_1  | allocator_resident:467374080
vecsim_1  | total_system_memory:2085416960
vecsim_1  | total_system_memory_human:1.94G
vecsim_1  | used_memory_lua:37888
vecsim_1  | used_memory_lua_human:37.00K
vecsim_1  | used_memory_scripts:0
vecsim_1  | used_memory_scripts_human:0B
vecsim_1  | number_of_cached_scripts:0
vecsim_1  | maxmemory:0
vecsim_1  | maxmemory_human:0B
vecsim_1  | maxmemory_policy:noeviction
vecsim_1  | allocator_frag_ratio:1.00
vecsim_1  | allocator_frag_bytes:331592
vecsim_1  | allocator_rss_ratio:1.02
vecsim_1  | allocator_rss_bytes:7569408
vecsim_1  | rss_overhead_ratio:0.00
vecsim_1  | rss_overhead_bytes:-467374080
vecsim_1  | mem_fragmentation_ratio:0.00
vecsim_1  | mem_fragmentation_bytes:-459326240
vecsim_1  | mem_not_counted_for_evict:0
vecsim_1  | mem_replication_backlog:0
vecsim_1  | mem_clients_slaves:0
vecsim_1  | mem_clients_normal:20568
vecsim_1  | mem_aof_buffer:0
vecsim_1  | mem_allocator:jemalloc-5.1.0
vecsim_1  | active_defrag_running:0
vecsim_1  | lazyfree_pending_objects:0
vecsim_1  | lazyfreed_objects:0
vecsim_1  |
vecsim_1  | # Persistence
vecsim_1  | loading:0
vecsim_1  | current_cow_size:0
vecsim_1  | current_cow_size_age:0
vecsim_1  | current_fork_perc:0.00
vecsim_1  | current_save_keys_processed:0
vecsim_1  | current_save_keys_total:0
vecsim_1  | rdb_changes_since_last_save:400001
vecsim_1  | rdb_bgsave_in_progress:0
vecsim_1  | rdb_last_save_time:1641937848
vecsim_1  | rdb_last_bgsave_status:ok
vecsim_1  | rdb_last_bgsave_time_sec:-1
vecsim_1  | rdb_current_bgsave_time_sec:-1
vecsim_1  | rdb_last_cow_size:0
vecsim_1  | aof_enabled:0
vecsim_1  | aof_rewrite_in_progress:0
vecsim_1  | aof_rewrite_scheduled:0
vecsim_1  | aof_last_rewrite_time_sec:-1
vecsim_1  | aof_current_rewrite_time_sec:-1
vecsim_1  | aof_last_bgrewrite_status:ok
vecsim_1  | aof_last_write_status:ok
vecsim_1  | aof_last_cow_size:0
vecsim_1  | module_fork_in_progress:0
vecsim_1  | module_fork_last_cow_size:0
vecsim_1  |
vecsim_1  | # Stats
vecsim_1  | total_connections_received:1
vecsim_1  | total_commands_processed:100002
vecsim_1  | instantaneous_ops_per_sec:0
vecsim_1  | total_net_input_bytes:357692168
vecsim_1  | total_net_output_bytes:400005
vecsim_1  | instantaneous_input_kbps:0.00
vecsim_1  | instantaneous_output_kbps:0.00
vecsim_1  | rejected_connections:0
vecsim_1  | sync_full:0
vecsim_1  | sync_partial_ok:0
vecsim_1  | sync_partial_err:0
vecsim_1  | expired_keys:0
vecsim_1  | expired_stale_perc:0.00
vecsim_1  | expired_time_cap_reached_count:0
vecsim_1  | expire_cycle_cpu_milliseconds:2
vecsim_1  | evicted_keys:0
vecsim_1  | keyspace_hits:100000
vecsim_1  | keyspace_misses:0
vecsim_1  | pubsub_channels:0
vecsim_1  | pubsub_patterns:0
vecsim_1  | latest_fork_usec:0
vecsim_1  | total_forks:0
vecsim_1  | migrate_cached_sockets:0
vecsim_1  | slave_expires_tracked_keys:0
vecsim_1  | active_defrag_hits:0
vecsim_1  | active_defrag_misses:0
vecsim_1  | active_defrag_key_hits:0
vecsim_1  | active_defrag_key_misses:0
vecsim_1  | tracking_total_keys:0
vecsim_1  | tracking_total_items:0
vecsim_1  | tracking_total_prefixes:0
vecsim_1  | unexpected_error_replies:0
vecsim_1  | total_error_replies:0
vecsim_1  | dump_payload_sanitizations:0
vecsim_1  | total_reads_processed:21834
vecsim_1  | total_writes_processed:21833
vecsim_1  | io_threaded_reads_processed:0
vecsim_1  | io_threaded_writes_processed:0
vecsim_1  |
vecsim_1  | # Replication
vecsim_1  | role:master
vecsim_1  | connected_slaves:0
vecsim_1  | master_failover_state:no-failover
vecsim_1  | master_replid:4f0f8d9ebdb0eac8fc65411b1f7d123c69a1b344
vecsim_1  | master_replid2:0000000000000000000000000000000000000000
vecsim_1  | master_repl_offset:0
vecsim_1  | second_repl_offset:-1
vecsim_1  | repl_backlog_active:0
vecsim_1  | repl_backlog_size:1048576
vecsim_1  | repl_backlog_first_byte_offset:0
vecsim_1  | repl_backlog_histlen:0
vecsim_1  |
vecsim_1  | # CPU
vecsim_1  | used_cpu_sys:1.223736
vecsim_1  | used_cpu_user:18.429628
vecsim_1  | used_cpu_sys_children:0.004812
vecsim_1  | used_cpu_user_children:0.048110
vecsim_1  | used_cpu_sys_main_thread:1.211601
vecsim_1  | used_cpu_user_main_thread:18.387543
vecsim_1  |
vecsim_1  | # Modules
vecsim_1  | module:name=search,ver=999999,api=1,filters=0,usedby=[],using=[ReJSON],options=[]
vecsim_1  | module:name=ReJSON,ver=999999,api=1,filters=0,usedby=[search],using=[],options=[handle-io-errors]
vecsim_1  |
vecsim_1  | # Commandstats
vecsim_1  | cmdstat_hset:calls=100000,usec=16276751,usec_per_call=162.77,rejected_calls=0,failed_calls=0
vecsim_1  | cmdstat_FT.CREATE:calls=1,usec=8165,usec_per_call=8165.00,rejected_calls=0,failed_calls=0
vecsim_1  | cmdstat_info:calls=1,usec=816,usec_per_call=816.00,rejected_calls=0,failed_calls=0
vecsim_1  |
vecsim_1  | # Errorstats
vecsim_1  |
vecsim_1  | # Cluster
vecsim_1  | cluster_enabled:0
vecsim_1  |
vecsim_1  | # Keyspace
vecsim_1  | db0:keys=100000,expires=0,avg_ttl=0
vecsim_1  |
vecsim_1  | ------ CLIENT LIST OUTPUT ------
vecsim_1  | id=7 addr=172.19.0.1:57208 laddr=172.19.0.2:6379 fd=8 name= age=38 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=3367 qbuf-free=37587 argv-mem=3246 obl=0 oll=0 omem=0 tot-mem=64846 events=r cmd=FT.SEARCH user=default redir=-1
vecsim_1  |
vecsim_1  | ------ CURRENT CLIENT INFO ------
vecsim_1  | id=7 addr=172.19.0.1:57208 laddr=172.19.0.2:6379 fd=8 name= age=38 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=3367 qbuf-free=37587 argv-mem=3246 obl=0 oll=0 omem=0 tot-mem=64846 events=r cmd=FT.SEARCH user=default redir=-1
vecsim_1  | argv[0]: 'FT.SEARCH'
vecsim_1  | argv[1]: 'my_bf_index'
vecsim_1  | argv[2]: '@item_keywords_vector:[$vec_param TOPK 5]'
vecsim_1  | argv[3]: 'RETURN'
vecsim_1  | argv[4]: '3'
vecsim_1  | argv[5]: 'item_keywords_vector_score'
vecsim_1  | argv[6]: 'item_name'
vecsim_1  | argv[7]: 'item_keywords'
vecsim_1  | argv[8]: 'SORTBY'
vecsim_1  | argv[9]: 'item_keywords_vector_score'
vecsim_1  | argv[10]: 'ASC'
vecsim_1  | argv[11]: 'LIMIT'
vecsim_1  | argv[12]: '0'
vecsim_1  | argv[13]: '5'
vecsim_1  | argv[14]: 'PARAMS'
vecsim_1  | argv[15]: '2'
vecsim_1  | argv[16]: 'vec_param'
vecsim_1  | argv[17]: (binary vector payload, not printable)
vecsim_1  |
vecsim_1  | ------ MODULES INFO OUTPUT ------
vecsim_1  | # ReJSON_trace
vecsim_1  | ReJSON_trace:   0: redis_module::base_info_func
vecsim_1  |    1: modulesCollectInfo
vecsim_1  |              at /usr/src/redis/src/module.c:7059:9
vecsim_1  |    2: logModulesInfo
vecsim_1  |              at /usr/src/redis/src/debug.c:1596:22
vecsim_1  |    3: printCrashReport
vecsim_1  |              at /usr/src/redis/src/debug.c:1847:5
vecsim_1  |       sigsegvHandler
vecsim_1  |              at /usr/src/redis/src/debug.c:1829:5
vecsim_1  |    4: <unknown>
vecsim_1  |    5: _Z18L2SqrSIMD16Ext_SSEPKvS0_S0_
vecsim_1  |    6: _ZN15BruteForceIndex9topKQueryEPKvmP17VecSimQueryParams
vecsim_1  |    7: VecSimIndex_TopKQuery
vecsim_1  |    8: NewVectorIterator
vecsim_1  |    9: QAST_Iterate
vecsim_1  |   10: AREQ_ApplyContext
vecsim_1  |   11: buildRequest
vecsim_1  |   12: execCommandCommon
vecsim_1  |   13: RedisModuleCommandDispatcher
vecsim_1  |              at /usr/src/redis/src/module.c:694:5
vecsim_1  |   14: call
vecsim_1  |              at /usr/src/redis/src/server.c:3713:5
vecsim_1  |   15: processCommand
vecsim_1  |              at /usr/src/redis/src/server.c:4238:9
vecsim_1  |   16: processCommandAndResetClient
vecsim_1  |              at /usr/src/redis/src/networking.c:2010:9
vecsim_1  |       processInputBuffer
vecsim_1  |              at /usr/src/redis/src/networking.c:2111:17
vecsim_1  |   17: callHandler
vecsim_1  |              at /usr/src/redis/src/connhelpers.h:79:18
vecsim_1  |       connSocketEventHandler
vecsim_1  |              at /usr/src/redis/src/connection.c:295:14
vecsim_1  |   18: aeProcessEvents
vecsim_1  |              at /usr/src/redis/src/ae.c:427:17
vecsim_1  |   19: aeMain
vecsim_1  |              at /usr/src/redis/src/ae.c:487:9
vecsim_1  |   20: main
vecsim_1  |              at /usr/src/redis/src/server.c:6392:5
vecsim_1  |   21: __libc_start_main
vecsim_1  |   22: _start
vecsim_1  |
vecsim_1  |
vecsim_1  | ------ FAST MEMORY TEST ------
vecsim_1  | 1:M 11 Jan 2022 21:51:29.992 # Bio thread for job type #0 terminated
vecsim_1  | 1:M 11 Jan 2022 21:51:29.992 # Bio thread for job type #1 terminated
vecsim_1  | 1:M 11 Jan 2022 21:51:29.993 # Bio thread for job type #2 terminated
vecsim_1  | *** Preparing to test memory region 4000211000 (14295040 bytes)
vecsim_1  | *** Preparing to test memory region 4001c6a000 (12288 bytes)
vecsim_1  | *** Preparing to test memory region 4001db3000 (8192 bytes)
vecsim_1  | *** Preparing to test memory region 400213e000 (16384 bytes)
vecsim_1  | *** Preparing to test memory region 4002160000 (16384 bytes)
vecsim_1  | *** Preparing to test memory region 4002325000 (40960 bytes)
vecsim_1  | *** Preparing to test memory region 4002600000 (2097152 bytes)
vecsim_1  | *** Preparing to test memory region 400292f000 (2097152 bytes)
vecsim_1  | *** Preparing to test memory region 4003000000 (4194304 bytes)
vecsim_1  | *** Preparing to test memory region 4003804000 (12288 bytes)
vecsim_1  | *** Preparing to test memory region 40039d3000 (12288 bytes)
vecsim_1  | *** Preparing to test memory region 40039f1000 (8388608 bytes)
vecsim_1  | *** Preparing to test memory region 400432d000 (8388608 bytes)
vecsim_1  | *** Preparing to test memory region 4004b2e000 (8388608 bytes)
vecsim_1  | *** Preparing to test memory region 400532f000 (8388608 bytes)
vecsim_1  | *** Preparing to test memory region 4005b30000 (3238813696 bytes)
vecsim_1  | *** Preparing to test memory region 40c6bf6000 (2621440 bytes)
vecsim_1  | *** Preparing to test memory region 40c7600000 (6291456 bytes)
vecsim_1  | *** Preparing to test memory region 40c7c75000 (48758784 bytes)
vecsim_1  | *** Preparing to test memory region 40cab1e000 (44040192 bytes)
vecsim_1  | *** Preparing to test memory region 40cd571000 (109051904 bytes)
vecsim_1  | *** Preparing to test memory region 40d3e18000 (218103808 bytes)
vecsim_1  | *** Preparing to test memory region 40e0e18000 (1384448 bytes)
vecsim_1  | *** Preparing to test memory region 40e0f6a000 (83886080 bytes)
vecsim_1  | *** Preparing to test memory region 40e6200000 (2097152 bytes)
vecsim_1  | *** Preparing to test memory region 40e6569000 (2097152 bytes)
vecsim_1  | *** Preparing to test memory region 40e676a000 (8388608 bytes)
vecsim_1  | *** Preparing to test memory region 40e8000000 (135168 bytes)
vecsim_1  | .O.O.O.O.O.O.O.O.O.O.O.O.O.O.O.
vecsim-demo_vecsim_1 exited with code 137

TypeError: search() got an unexpected keyword argument 'query_params'

Hey,

I've used your notebook as guidance for my use case. I've loaded and indexed the data, which works fine. However, when I run a query I get the error above. Here is the code for executing the query:

query_emb = query_emb.astype(np.float32).tobytes()
q = redisearch.Query(f'@embedding:[$query_emb TOPK 5] => {{$EFRUNTIME : {EF}}}').sort_by('embedding_score').paging(0, 5).return_fields('embedding_score')
results = index.search(q, query_params={'query_emb': query_emb})

The problem is in the last line, meaning, the argument query_params is non-existent. This is also confirmed by the documentation:

Help on method search in module redisearch.client:

search(query) method of redisearch.client.Client instance
    Search the index for a given query, and return a result of documents
    
    ### Parameters
    
    - **query**: the search query. Either a text for simple queries with default parameters, or a Query object for complex queries.
                 See RediSearch's documentation on query format

I've used the redislabs/redisearch:feature-vecsim docker image.

Any help is appreciated.
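
For readers following this issue, the byte-serialization step on the first line of the snippet can be reproduced without NumPy. The sketch below packs a list of floats into little-endian float32 bytes with the standard `struct` module (this should match NumPy's native byte order on common x86 and ARM machines) and unpacks them again. It illustrates only the payload format and does not address the `query_params` error itself; the helper names are hypothetical.

```python
import struct

def to_float32_bytes(vector):
    """Pack a sequence of floats as consecutive little-endian 32-bit floats."""
    return struct.pack(f"<{len(vector)}f", *vector)

def from_float32_bytes(blob):
    """Inverse of to_float32_bytes: 4 bytes per element."""
    count = len(blob) // 4
    return list(struct.unpack(f"<{count}f", blob))

# Values chosen to be exactly representable in float32.
query_emb = [0.5, -1.25, 3.0]
blob = to_float32_bytes(query_emb)
round_trip = from_float32_bytes(blob)
```

The blob length is always 4 times the vector dimension, which is a quick sanity check that the query vector matches the dimension declared for the index.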

Newly added product

Let's say there are a couple of new products to be indexed. Is there a way to add them to the existing index alongside the already indexed products? Or do we have to start indexing from the very beginning?

[Question] Can we search using existing redisai tensors?

I already have millions of tensors on my redis instance (stored via redisAI tensorset).

I was hoping to be able to search for similar items directly, instead of following this notebook, in which RedisAI is basically not used (just RediSearch with byte-serialized embeddings).
