prrao87 / db-hub-fastapi
Async bulk data ingestion and querying in various document, graph and vector databases via their Python clients
License: MIT License
Currently, once the data is read into LanceDB via Arrow conversion, the price column is missing. Maybe something happens during type coercion that causes the column to be dropped? In any case, price is an important variable and must be present in the table for downstream filtering, so this needs to be diagnosed.
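One way to start the diagnosis is to check what the price field looks like before the Arrow conversion. The sketch below assumes the records are plain Python dicts at that point; the helper name `field_type_counts` and the sample rows are hypothetical and not taken from the repository. It only tallies the Python types seen for one field, which may help show whether mixed types or missing values are involved.

```python
from collections import Counter
from typing import Any, Iterable


def field_type_counts(records: Iterable[dict[str, Any]], field: str) -> Counter:
    """Count the Python type of `field` across records; 'missing' marks absent keys."""
    counts: Counter = Counter()
    for record in records:
        if field not in record:
            counts["missing"] += 1
        else:
            counts[type(record[field]).__name__] += 1
    return counts


# Made-up sample rows, for illustration only
sample = [
    {"title": "A", "price": 15.0},
    {"title": "B", "price": None},
    {"title": "C", "price": "20"},
    {"title": "D"},
]
print(field_type_counts(sample, "price"))
```

If more than one type shows up for price, normalizing the field to a single type before conversion would be a reasonable next thing to try.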
Following the README instructions, I ran python bulk_import.py and got an error.
Traceback (most recent call last):
  File "/home/paul/development/python/async-db-fastapi/dbs/meilisearch/scripts/bulk_index.py", line 191, in <module>
    files = get_json_files("winemag-data", FILE_PATH)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/paul/development/python/async-db-fastapi/dbs/meilisearch/scripts/bulk_index.py", line 42, in get_json_files
    raise FileNotFoundError(
FileNotFoundError: No .jsonl files with prefix `winemag-data` found in `/home/paul/development/python/async-db-fastapi/data/winemag-data-130k-v2-jsonl`
This happens because the file is zipped and needs to be extracted first.
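A script could guard against this by extracting the archive itself when no .jsonl files are present. The following is a minimal sketch, not the repository's code: the function name `extract_if_zipped` is hypothetical, and it assumes the archive is a .zip file with the same `winemag-data` prefix sitting directly in the data directory, with the .jsonl file at the top level of the archive.

```python
import zipfile
from pathlib import Path


def extract_if_zipped(data_dir: Path, prefix: str = "winemag-data") -> list[Path]:
    """Return .jsonl files with the given prefix, unzipping archives first if none exist."""
    files = sorted(data_dir.glob(f"{prefix}*.jsonl"))
    if not files:
        # Assumption: archives live next to where the .jsonl files are expected
        for archive in sorted(data_dir.glob(f"{prefix}*.zip")):
            with zipfile.ZipFile(archive) as zf:
                zf.extractall(data_dir)
        files = sorted(data_dir.glob(f"{prefix}*.jsonl"))
    if not files:
        raise FileNotFoundError(
            f"No .jsonl files or .zip archives with prefix `{prefix}` found in `{data_dir}`"
        )
    return files
```

Calling it on a directory that already contains extracted files skips the unzip step, so it is safe to run repeatedly.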
Move the FastAPI Pydantic schemas to the api directory, and move the ETL Pydantic schemas to the scripts directory for each database. This keeps things cleaner: the schema requirements for upstream and downstream processes are rather different, so they don't need to live together (as I originally thought).
To improve clarity, it makes sense to reduce the verbosity of the router names:
- wine.py is vague; it's better to rename the file to something more specific about what it does, such as retriever.py
- wine_router can be renamed to simply router, and called using retriever.router from main.py in each FastAPI section
The Pydantic v2 upgrade for Qdrant can only be completed once they update their Python client to 1.3.0, as the Python client itself depends on Pydantic v2.
I saw you updated Neo4j to use Pydantic 2; meilisearch-python-async has also been updated with support for Pydantic 2. I have been meaning to try that out and see what difference it makes here, but just haven't gotten to it yet.