Comments (4)
cc @xieydd
from pgvecto.rs.
We need to evaluate the performance difference of a Cloud RDS, such as supabase, over a local pgvecto.rs using the fdw scenario.
A simple sample is provided below:
#!/usr/bin/python
# ref: https://leileiluoluo.com/posts/postgres-foreign-data-wrappers.html
import psycopg2
import psycopg2.extras
import numpy as np
import time
DIMS = 10
LOCAL_DATABASE = "user=postgres.xxx password=xxx host=aws-0-us-west-1.pooler.supabase.com port=5432 dbname=postgres"
LOCAL_USER_DATABASE = "user=local_user.xxx password=secret host=aws-0-us-west-1.pooler.supabase.com port=5432 dbname=postgres"
FORIGEN_DATABASE = "dbname=tensorchord user=tensorchord host=xxx.modelz.tech port=5432 password=xxx sslmode=require"
repeated_array_str = ','.join(map(str, ["random()" for _ in range(DIMS)]))
random_array_str = ','.join(map(str, [np.random.random() for _ in range(DIMS)]))
local_conn = psycopg2.connect(LOCAL_DATABASE)
local_conn.autocommit = True
local_user_conn = psycopg2.connect(LOCAL_USER_DATABASE)
local_user_conn.autocommit = True
forigen_conn = psycopg2.connect(FORIGEN_DATABASE)
forigen_conn.autocommit = True
FDW_DDL = """
DROP TABLE IF EXISTS test;
CREATE TABLE test (id integer PRIMARY KEY, embedding vector({dims}) NOT NULL);
"""
LOCAL_DDL = """
DROP TABLE IF EXISTS local;
CREATE TABLE local (id integer PRIMARY KEY, name VARCHAR(50) NOT NULL);
"""
INSERT_LOCAL = "INSERT INTO local (id, name) VALUES (1, 'terry'), (2, 'jason'), (3, 'curry')"
INSERT_TEST = "INSERT INTO test SELECT i, ARRAY[{array}]::real[] FROM generate_series(1, 100) i;"
DELETE_TEST = "DELETE FROM test;"
CREATE_INDEX = 'CREATE INDEX ON test USING vectors (embedding vector_l2_ops) WITH (options = "[indexing.flat]");'
DELETE_INDEX = "DELETE FROM test WHERE i = {i};"
QUERY = "SELECT * FROM test ORDER BY embedding <-> '[{array}]' LIMIT 10;"
CREATE_FOREIGN_SERVER="""
CREATE SERVER foreign_server
FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'test-burntu0m01gami31.modelz.tech', port '5432', dbname 'tensorchord', sslmode 'require');
"""
CREATE_USER_MAPPER="""
CREATE USER MAPPING FOR local_user
SERVER foreign_server
OPTIONS (user 'fdw_user', password 'secret');
"""
CREATE_FOREIGN_TABLE="""
CREATE FOREIGN TABLE foreign_test (id integer, embedding extensions.vector({dims}) NOT NULL)
SERVER foreign_server
OPTIONS (schema_name 'public', table_name 'test');
"""
IMPORT_FOREIGN_SCHEMA="""
IMPORT FOREIGN SCHEMA public LIMIT TO (test)
FROM SERVER foreign_server
INTO public;
"""
with forigen_conn.cursor() as cursor:
# Create Table
cursor.execute(FDW_DDL.format(dims = DIMS))
cursor.execute(INSERT_TEST.format(array=repeated_array_str))
# Create User
cursor.execute("CREATE USER fdw_user WITH ENCRYPTED PASSWORD 'secret';")
cursor.execute("GRANT SELECT,INSERT,UPDATE,DELETE ON TABLE test TO fdw_user;")
# Create Index
cursor.execute(CREATE_INDEX)
# Vector Search (Test)
cursor.execute(QUERY.format(array=random_array_str))
results = cursor.fetchall()
print(results[0])
with local_conn.cursor() as cursor:
# Create local table
cursor.execute(LOCAL_DDL)
# Insert Data
cursor.execute(INSERT_LOCAL)
# Create local user and grant permission
cursor.execute("CREATE USER local_user WITH ENCRYPTED PASSWORD 'secret';")
cursor.execute("GRANT SELECT,INSERT,UPDATE,DELETE ON ALL TABLES IN SCHEMA public TO local_user;")
cursor.execute("GRANT ALL ON SCHEMA public TO local_user;")
cursor.execute("GRANT USAGE ON FOREIGN DATA WRAPPER postgres_fdw TO local_user;")
# Create Foreign Server
cursor.execute(CREATE_FOREIGN_SERVER)
cursor.execute("GRANT USAGE ON FOREIGN SERVER foreign_server TO local_user;")
# Crate User Mapping
cursor.execute(CREATE_USER_MAPPER)
# Create Foreign Table
cursor.execute(CREATE_FOREIGN_TABLE.format(dims=DIMS))
cursor.execute("GRANT SELECT,INSERT,UPDATE,DELETE ON ALL TABLES IN SCHEMA extensions TO local_user;")
cursor.execute('SET search_path = "$user", public, extensions;')
# Join query
with local_user_conn.cursor() as local_user_cursor:
local_user_cursor.execute("SELECT l.id, l.name FROM local l LEFT JOIN foreign_test f on l.id = f.id ORDER BY f.embedding <-> '[0.40671515, 0.24202824, 0.37059402, 0.50316447, 0.10779921, 0.80774295, 0.8879849, 0.31292745, 0.05584943, 0.8738258]' LIMIT 10;")
results = local_user_cursor.fetchall()
print(results[0])
from pgvecto.rs.
@xieydd I tried this example but it didn't work. I'm not sure how to make it work when remote has the pgvector extension vector
while local only has pgvecto.rs extension vectors
. Have you encountered this error?
from pgvecto.rs.
Related Issues (20)
- test: 0.3 release test HOT 1
- chore: sync return when index creation
- Need a docker compose example HOT 4
- epic: Support SDKs in different programming languages HOT 4
- bug(cnpg): type "vector" does not exist HOT 1
- The default max_stack_depth parameter in the container image may be too small HOT 12
- How could I use AzureOpenAI to build LlamaIndex? HOT 2
- vector type not found in the database - pgvecto_rs.psycopg register_vector HOT 5
- copy with BINARY FORMAT fails with cannot find a dumper for type vector HOT 3
- sdk: Sparse vector indices type mismatch between sdk and function signature with numpy.ndarray
- How to find index size ? HOT 3
- Execute pg_resetwal in docker Unraid HOT 3
- feat: Support vector aggregation function HOT 1
- feat(fdw): How to be compatible with new pgvector types HOT 1
- feat: ANN benchmark HOT 3
- fix(bench): Fix ZillizBench HOT 1
- feat: Add pgvecto.rs to vector hub HOT 2
- unknown x86 target feature HOT 2
- install patched pgrx failed HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from pgvecto.rs.