xataio / xata-py
Python SDK for xata.io
Home Page: https://xata.io/docs/sdk/python/overview
License: Apache License 2.0
Implement the same smart default approach as db_name and branch_name. These values are taken from the bootstrapped client and are considered optional for endpoint APIs; if they are not provided, the value from the client is used. This creates leaner, less cumbersome API interfaces.
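A minimal sketch of that fallback, assuming a hypothetical SketchClient and SketchRecords (neither is the SDK's real class) and an illustrative path layout: an argument left as None is filled in from the bootstrapped client.

```python
from typing import Optional


class SketchClient:
    """Hypothetical stand-in for the bootstrapped client (not the real XataClient)."""

    def __init__(self, db_name: str, branch_name: str = "main"):
        self.db_name = db_name
        self.branch_name = branch_name


class SketchRecords:
    """Hypothetical endpoint namespace that falls back to the client's values."""

    def __init__(self, client: SketchClient):
        self.client = client

    def record_path(
        self,
        table: str,
        record_id: str,
        db_name: Optional[str] = None,
        branch_name: Optional[str] = None,
    ) -> str:
        # Prefer the explicit argument, otherwise use the client's default.
        db_name = db_name or self.client.db_name
        branch_name = branch_name or self.client.branch_name
        return f"/db/{db_name}:{branch_name}/tables/{table}/data/{record_id}"


records = SketchRecords(SketchClient(db_name="test"))
default_path = records.record_path("users", "rec_1")
override_path = records.record_path("users", "rec_1", branch_name="dev")
```

Callers that work against a single database and branch never need to repeat either value, while an explicit argument still wins when given.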
With the code generator in place (#16), all endpoints are organized in namespaces. This makes the directly accessible, hand-curated endpoints obsolete. They should be deprecated in favor of the new namespaced endpoints. Use the deprecation package to mark the methods.
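The idea can be sketched with the standard library alone. The decorator below is a stand-in built on warnings, not the deprecation package itself, and the class, method body and message are hypothetical.

```python
import functools
import warnings


def deprecated(replacement: str):
    """Stand-in decorator (stdlib only) that flags a method as deprecated."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__}() is deprecated, use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # point the warning at the caller, not the wrapper
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class SketchClient:
    """Hypothetical client with one hand-curated method."""

    @deprecated(replacement="the namespaced records endpoints")
    def get_by_id(self, table: str, record_id: str) -> dict:
        return {"id": record_id, "table": table}


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = SketchClient().get_by_id("users", "rec_1")
```

The old method keeps returning a dict, so existing callers are not broken; they only see a DeprecationWarning.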
0.3.0
1.0.0
version of the sdk1.0.0
Why not change the methods to be shims for the generated endpoints? That would be a breaking change, as the return type would change from dict to requests.Response.
If no direct replacement exists, we should explore the path of adding them to a helpers package.
client.post()
client.get()
client.put()
client.delete()
client.patch()
client.query()
client.get_first()
client.get_by_id()
client.create()
client.create_or_update()
client.create_or_replace()
client.update()
client.delete_record()
client.search()
client.search_table()
For the initial release we plan a base client that doesn't yet create models via code generation, but still simplifies working with Xata in Python.
pip and published automatically to PyPI
xata.sh
api.xata.io
Currently, all records from a batch are stored in the failed_batches items. Filter out the records that went through and keep only the failed ones.
Add the unprocessable batches to a dead-letter queue (DLQ) in the BulkProcessor (BP).
{
    "timestamp": utcnow(),
    "exception": Exception(e),
    "data": batch,
}
Update: keep the error count in the stats object.
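A minimal sketch of such a dead-letter queue, assuming hypothetical names (to_dlq_entry, process_batch, an error_count key in stats) that are not part of the SDK. The entry mirrors the three fields listed above.

```python
from datetime import datetime, timezone
from queue import Queue

dead_letter_queue: Queue = Queue()
stats = {"total": 0, "error_count": 0}  # "error_count" is a hypothetical key


def to_dlq_entry(batch: list, error: Exception) -> dict:
    """Wrap a failed batch with the time of failure and the exception."""
    return {
        "timestamp": datetime.now(timezone.utc),
        "exception": error,
        "data": batch,
    }


def process_batch(batch: list, send) -> None:
    """Send a batch; on failure park it in the DLQ and count the error."""
    try:
        send(batch)
        stats["total"] += len(batch)
    except Exception as exc:
        dead_letter_queue.put(to_dlq_entry(batch, exc))
        stats["error_count"] += 1


def failing_send(batch: list) -> None:
    raise ValueError("unprocessable batch")


process_batch([{"values": 1}], send=lambda batch: None)
process_batch([{"values": 2}], send=failing_send)
entry = dead_letter_queue.get_nowait()
```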
Remove the methods:
client.get()
client.post()
client.put()
client.patch()
client.delete()
They call into client.request() and provide only limited value.
The flush_queue method in the bulk processor can treat the queue as empty and terminate the flush even though the queue is still populated. This is a race condition: flush_queue does not read the thread-safe queue size, but a value that is only updated after the first batch has been processed.
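One way to avoid this kind of race, sketched here with the standard library rather than the SDK's actual implementation: let the flush wait on queue.Queue.join(), which tracks unfinished work itself, instead of on a separately maintained size counter. SketchProcessor and the batch size of 25 are assumptions for illustration.

```python
import threading
from queue import Queue


class SketchProcessor:
    """Hypothetical bulk processor with a single background worker."""

    def __init__(self):
        self._queue: Queue = Queue()
        self._lock = threading.Lock()
        self.sent = []
        threading.Thread(target=self._work, daemon=True).start()

    def _work(self) -> None:
        while True:
            batch = self._queue.get()
            try:
                with self._lock:
                    self.sent.extend(batch)  # stands in for the HTTP call
            finally:
                # Mark the batch as finished only after it was handled.
                self._queue.task_done()

    def put_records(self, records: list, batch_size: int = 25) -> None:
        for start in range(0, len(records), batch_size):
            self._queue.put(records[start:start + batch_size])

    def flush_queue(self) -> None:
        # Blocks until every queued batch has been marked done.
        self._queue.join()


processor = SketchProcessor()
processor.put_records([{"values": i} for i in range(10000)])
processor.flush_queue()
sent_count = len(processor.sent)
```

Because put() increments the queue's unfinished-task count before the worker can see the item, join() cannot return while a batch is queued or in flight.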
BulkProcessor flush_queue completes without emitting all records to Xata.
In the following repro script it may miss one or two batches.
Occurs with: Python SDK 1.2.0 on Python 3.11.6
Repro script:
from xata.client import XataClient
from xata.helpers import BulkProcessor
TARGET_DB = "https://repro-q867qv.us-east-1.xata.sh/db/test"
BRANCH = "main"
client = XataClient(db_url=f"{TARGET_DB}:{BRANCH}")
bp = BulkProcessor(client)
data = []
for i in range(0, 10000):
    data.append({"values": i})
bp.put_records("test", data)
bp.flush_queue()
print(bp.stats)
print(bp.failed_batches_queue)
Script output:
python3 test.py
{'total': 9925, 'queue': 0, 'failed_batches': 0, 'tables': {'test': 9925}}
[]
Xata contains 9975 records. The last batch with number 9974-9999 is missing from Xata.
The stats output (9925) is different from the Xata content (9975), neither of which matches the number of records given to the bulk processor.
Initially reported on Discord.
There is a mix of snake case and camel case in the codebase. Opt for the Pythonic PEP 8 naming style.
This also applies to the client.get_config() method, whose keys are camel-cased.
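For the config keys, the conversion could look like the sketch below. The helper names and the sample keys are hypothetical; the actual keys returned by client.get_config() are not listed here.

```python
import re


def to_snake_case(name: str) -> str:
    """Insert an underscore before each inner capital letter, then lowercase."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def snake_case_keys(config: dict) -> dict:
    """Return a copy of a flat config dict with snake_case keys."""
    return {to_snake_case(key): value for key, value in config.items()}


# Hypothetical camel-cased config, for illustration only.
converted = snake_case_keys({"workspaceId": "ws_1", "dbName": "test", "region": "us-east-1"})
```

Note that this simple pattern splits runs of capitals letter by letter, so a key containing an acronym would need special handling.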
The list of failed_batches in the BulkProcessor keeps growing as long as records are failing. Introduce a mechanism to fetch the items and free the space again (pop).
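A possible shape for that mechanism, as a sketch with a hypothetical FailedBatches container: fetching the items and clearing the list happen under one lock, so nothing is lost if a worker adds a batch at the same time.

```python
import threading


class FailedBatches:
    """Hypothetical thread-safe container for failed batches."""

    def __init__(self):
        self._items = []
        self._lock = threading.Lock()

    def add(self, batch: list) -> None:
        with self._lock:
            self._items.append(batch)

    def pop_all(self) -> list:
        """Return all failed batches and free the space they occupied."""
        with self._lock:
            items, self._items = self._items, []
        return items


failed = FailedBatches()
failed.add([{"values": 1}])
failed.add([{"values": 2}])
first_fetch = failed.pop_all()
second_fetch = failed.pop_all()
```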
Currently, every exception in the bulk processor is re-raised and terminates the thread. Add a boolean option throw_exception that controls whether the exception is raised or not. Default: False.
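A sketch of how such a flag might behave, assuming a hypothetical SketchWorker: with the default, the error is recorded and processing continues; with throw_exception=True, it is re-raised.

```python
class SketchWorker:
    """Hypothetical worker illustrating the throw_exception option."""

    def __init__(self, send, throw_exception: bool = False):
        self.send = send
        self.throw_exception = throw_exception
        self.errors = []

    def process(self, batch: list) -> bool:
        try:
            self.send(batch)
            return True
        except Exception as exc:
            self.errors.append(exc)
            if self.throw_exception:
                raise  # surface the error to the caller
            return False  # swallow the error and keep the thread alive


def failing_send(batch: list) -> None:
    raise RuntimeError("request failed")


quiet = SketchWorker(failing_send)
quiet_result = quiet.process([{"values": 1}])

strict = SketchWorker(failing_send, throw_exception=True)
try:
    strict.process([{"values": 1}])
    strict_raised = False
except RuntimeError:
    strict_raised = True
```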
Create a helper that returns solely the URL for a transform operation; currently the file itself is also fetched in one go.
https://github.com/xataio/xata-py/blob/main/xata/api/files.py#L241
If a 429 status code is returned, increase the processing timeout to avoid future rate limit hits.
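As a sketch, the adjustment could be a small pure function. The name, the growth factor and the upper bound are assumptions, not values taken from the SDK.

```python
def next_processing_timeout(
    status_code: int,
    current: float,
    factor: float = 2.0,   # assumed growth factor
    maximum: float = 5.0,  # assumed upper bound in seconds
) -> float:
    """Return the timeout to use after a response with the given status."""
    if status_code == 429:
        # Rate limited: slow down, but never beyond the upper bound.
        return min(current * factor, maximum)
    return current


after_ok = next_processing_timeout(200, 0.5)
after_rate_limit = next_processing_timeout(429, 0.5)
capped = next_processing_timeout(429, 4.0)
```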
Feature request: Provide a database backend for Xata in Django (django-xata).
Description: Django officially supports certain SQL databases, while 3rd party providers may release and maintain their own packages (ref).
Feedback boards card: https://feedback.xata.io/feature-requests/p/django-backend
Instrument the SDK
If you, correctly, initialize the SDK directly with a namespace as follows:
records = XataClient(api_key="", ..., region="region").records()
The bootstrapping of the internals does not work: if you set a non-default region, for example, it is ignored and the default one is used.
databases
  getDatabaseMetadata -> getMetadata
  createDatabase -> create
  deleteDatabase -> delete
  updateDatabaseMetadata -> updateMetadata
  listRegions -> getRegions
users
  getUser -> get
  updateUser -> update
  deleteUser -> delete
workspaces
  getWorkspacesList -> getWorkspaces
  createWorkspace -> create
  getWorkspace -> get
  updateWorkspace -> update
  deleteWorkspace -> delete
  getWorkspaceMembersList -> getMembers
  updateWorkspaceMemberRole -> updateMember
  removeWorkspaceMember -> removeMember
branch
  getBranchList -> getBranches
  getBranchDetails -> getDetails
  createBranch -> create
  deleteBranch -> delete
  getBranchMetadata -> getMetadata
  updateBranchMetadata -> updateMetadata
  getBranchStats -> getStats
  resolveBranch -> resolve
migrations
  getBranchMigrationHistory -> getHistory
  getBranchMigrationPlan -> getPlan
  executeBranchMigrationPlan -> executePlan
  getBranchSchemaHistory -> getHistory
  compareBranchSchemas -> compare
  updateBranchSchemas -> update
  previewBranchSchemaEdit -> preview
  applyBranchSchemaEdit -> apply
  pushBranchMigrations -> push
records
  insertRecord -> insert
  getRecord -> get
  insertRecordWithID -> insertWithId
  upsertRecordWithID -> upsertWithId
  deleteRecord -> delete
  updateRecordWithID -> updateWithId
  bulkInsertTableRecords -> bulkInsert
search_and_filter
  queryTable -> query
  vectorSearchTable -> vectorSearch
  askTable -> ask
  summarizeTable -> summarize
  aggregateTable -> aggregate
table
  createTable -> create
  deleteTable -> delete
  updateTable -> update
  getTableSchema -> getSchema
  setTableSchema -> setSchema
  getTableColumns -> getColumns
  addTableColumn -> addColumn
This is the error when I try to deploy a flask app on Render using a Xata database.
OS -> Windows 11
editor -> VS Code
Render
Python -> 3.11.6
Xata -> 1.2.0
Flask -> 2.3.2
The unit- & integration tests currently reside in one file, which makes the setup slow and the files bloated. Break each file into more domain-specific tests.
Multiple methods were deprecated in 0.x (#19); ensure the examples folder is up to date.
Incorporate https://pypi.org/project/pep8-naming/ into the linting process.
Related to #103, implement this first in order to simplify #103.
The package is written in Rust and doesn't offer a none-any wheel, so it fails when trying to run it with Pyodide.
It seems there's an open PR to offer support, opening this issue to keep track of it.
Every API call returns a requests.Response instance. While this is convenient for getting all the information about the HTTP response, it is also clunky to handle. The expectation is that the call responds with the data (dict), shortcutting the .json() method. Nevertheless, the response should still provide access to the status code and headers.
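One way to get both behaviors, sketched with a hypothetical ApiResponse class (the name and the is_success helper are assumptions): subclass dict so the body is directly accessible, and keep the HTTP details as attributes.

```python
class ApiResponse(dict):
    """Hypothetical response type: the JSON body as a dict, plus HTTP details."""

    def __init__(self, status_code: int, headers: dict, body: dict):
        super().__init__(body)
        self.status_code = status_code
        self.headers = dict(headers)

    def is_success(self) -> bool:
        return 200 <= self.status_code < 300


response = ApiResponse(
    status_code=201,
    headers={"content-type": "application/json"},
    body={"id": "rec_1"},
)
record_id = response["id"]  # no .json() call needed
```

Code that only needs the data treats the result as a plain dict, while the status code and headers stay one attribute away.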
The preferred way to bootstrap the SDK is via the workspace URL.
Create a query and search helper that handles the pagination under the hood. A user would provide only a query and get the full result set back, without needing a pagination routine of their own.
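A rough sketch of the query side of such a helper. It assumes each page reports a cursor and a "more" flag under meta.page and that the next page is requested with page.after; query_all and fake_query are hypothetical, and fake_query replaces the real API call so the example runs on its own.

```python
DATA = [{"id": f"rec_{i}"} for i in range(5)]


def fake_query(table: str, payload: dict) -> dict:
    """Stand-in for the query endpoint: returns pages of two records."""
    start = int(payload.get("page", {}).get("after", 0))
    chunk = DATA[start:start + 2]
    end = start + len(chunk)
    return {
        "records": chunk,
        "meta": {"page": {"cursor": str(end), "more": end < len(DATA)}},
    }


def query_all(query_fn, table: str, payload: dict = None) -> list:
    """Follow the cursor until no more pages are reported."""
    payload = dict(payload or {})
    records = []
    while True:
        page = query_fn(table, payload)
        records.extend(page["records"])
        page_meta = page["meta"]["page"]
        if not page_meta["more"]:
            return records
        # Subsequent requests carry only the cursor of the previous page.
        payload = {"page": {"after": page_meta["cursor"]}}


all_records = query_all(fake_query, "test")
```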
The xata.databases().create() method is missing the properties ui and metadata.
put:
  operationId: createDatabase
  summary: Create Database
  description: Create Database with identifier name
  requestBody:
    description: ''
    content:
      application/json:
        schema:
          description: ''
          type: object
          properties:
            branchName:
              type: string
              minLength: 1
            region:
              type: string
              minLength: 1
            ui:
              type: object
              properties:
                color:
                  type: string
            metadata:
              $ref: '#/components/schemas/BranchMetadata'
          example:
            branchName: main
            region: us-east-1
            metadata:
              repository: github.com/my/repository
              branch: github repository
              stage: testing
              labels:
                - development
          required:
            - region
def delete(self,) -> Response:
The trailing comma can be omitted.
def delete(self) -> Response:
https://xata.io/docs/api-reference/db/db_branch_name/transaction#execute-a-transaction-on-a-branch
Add the option failIfMissing to the transaction helper for delete operations, with the default value False.
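A sketch of what the helper could emit for a delete operation. The function name and the snake_case parameter are hypothetical, and the operation layout is an assumption based on the linked transaction reference.

```python
def delete_operation(table: str, record_id: str, fail_if_missing: bool = False) -> dict:
    """Build one delete operation for a transaction payload."""
    return {
        "delete": {
            "table": table,
            "id": record_id,
            "failIfMissing": fail_if_missing,
        }
    }


payload = {
    "operations": [
        delete_operation("users", "rec_1"),
        delete_operation("users", "rec_2", fail_if_missing=True),
    ]
}
```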
Currently, the client allows the initialization of different base URLs for core and workspace. This is not reflected in the namespaced endpoints, as these use a static value. Rework the Namespace class so that a different base_url can be injected for core. The workspace URL will be derived from the db_url param.
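The rework might look roughly like this. SketchNamespace, its scope argument and the default core URL are assumptions for illustration, not the SDK's actual Namespace class.

```python
from urllib.parse import urlsplit

DEFAULT_CORE_URL = "https://api.xata.io"  # assumed default for the core API


class SketchNamespace:
    """Hypothetical namespace with an injectable core base URL."""

    def __init__(self, db_url: str, scope: str, core_base_url: str = DEFAULT_CORE_URL):
        parts = urlsplit(db_url)
        self.scope = scope  # "core" or "workspace"
        self.core_base_url = core_base_url.rstrip("/")
        # The workspace base URL is derived from the database URL.
        self.workspace_base_url = f"{parts.scheme}://{parts.netloc}"

    def base_url(self) -> str:
        if self.scope == "core":
            return self.core_base_url
        return self.workspace_base_url


db_url = "https://repro-q867qv.us-east-1.xata.sh/db/test"
workspace_ns = SketchNamespace(db_url, scope="workspace")
core_ns = SketchNamespace(db_url, scope="core", core_base_url="http://localhost:8080/")
```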