Make `workspace_id` a smart default.

Implement the same smart-default approach as for db_name and branch_name. These values are taken from the bootstrapped client and are optional for endpoint APIs: if a value is not provided, it is taken from the client. This creates leaner, less cumbersome API interfaces.
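
A minimal sketch of the smart-default pattern described above. The `Client` and `Endpoint` classes and the URL layout are hypothetical stand-ins for illustration, not the SDK's actual classes or routes: each parameter defaults to `None` and falls back to the value stored on the client.

```python
from typing import Optional


class Client:
    """Hypothetical stand-in for the bootstrapped client."""

    def __init__(self, workspace_id: str, db_name: str, branch_name: str = "main"):
        self.workspace_id = workspace_id
        self.db_name = db_name
        self.branch_name = branch_name


class Endpoint:
    """Hypothetical endpoint namespace that resolves defaults from the client."""

    def __init__(self, client: Client):
        self.client = client

    def url_path(
        self,
        workspace_id: Optional[str] = None,
        db_name: Optional[str] = None,
        branch_name: Optional[str] = None,
    ) -> str:
        # Fall back to the client's value whenever the caller passes nothing.
        ws = workspace_id if workspace_id is not None else self.client.workspace_id
        db = db_name if db_name is not None else self.client.db_name
        branch = branch_name if branch_name is not None else self.client.branch_name
        # Illustrative path layout only (assumption).
        return f"/workspaces/{ws}/dbs/{db}:{branch}"
```

Callers that rely on the client's configuration can then call `url_path()` without arguments, while explicit values still take precedence.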

Deprecate not namespaced endpoints

With the code generator in place (#16), all endpoints are organized in namespaces. This makes the hand-curated, directly accessible endpoints obsolete. They should be deprecated in favor of the new namespaced endpoints. Use the deprecation package to mark the methods.

Methods will be deprecated with the next pre-release version, 0.3.0.
Methods will be removed with the release of version 1.0.0 of the SDK.

Change the integration tests to use the new endpoints.
Add the deprecation notice to the methods and point to the replacement endpoint.
Remove the methods with 1.0.0.

Why not turn the methods into shims for the generated endpoints? That would be a breaking change, as the return type would change from dict to requests.Response.

If no direct replacement exists, we should explore the path of adding them into a helpers package.
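
The deprecation steps above could be sketched as follows. This example uses the standard library `warnings` module as a stand-in for the deprecation package mentioned in the issue. The `deprecated` decorator, `LegacyClient`, and `get_record` are hypothetical names chosen for illustration.

```python
import functools
import warnings


def deprecated(replacement: str, deprecated_in: str = "0.3.0", removed_in: str = "1.0.0"):
    """Mark a method as deprecated and point to its replacement (hypothetical helper)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated since {deprecated_in} and will be "
                f"removed in {removed_in}; use {replacement} instead.",
                DeprecationWarning,
                stacklevel=2,  # attribute the warning to the caller
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


class LegacyClient:
    """Hypothetical client with one hand-curated, non-namespaced method."""

    @deprecated(replacement="client.records().get()")
    def get_record(self, table: str, record_id: str) -> dict:
        # The return type stays a dict, so existing callers are not broken.
        return {"table": table, "id": record_id}
```

The wrapped method keeps its original return type, which avoids the breaking change that a shim returning requests.Response would introduce.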

Methods to deprecate

[Meta] Python SDK initial release

For the initial release, we plan a base client that doesn't yet create models via code generation, but still simplifies working with Xata in Python.

Filter to only failing records in `failed_batches`

Currently, all records from a batch are stored in the failed_batches items. Filter out the records that went through and keep only the failed ones.

Road to GA

Tasks

Breakup test files into domains #18 (labels: enhancement)
Integration tests for all endpoints
More exhaustive unit tests
Add smart defaults for endpoints, e.g. if the param is workspace_id or db_branch_name which should be known by the client, make the param optional to take the client's internal value. Purpose: convenience
Improve documentation on xata.io and readthedocs.org
Deprecate not namespaced endpoints #19 (labels: documentation enhancement)
Remove unnecessary requests shims #17 (labels: breaking change)
Add NOTICE #22
License header check #23
Rework examples for GA release #50
Make workspace_id a smart default. #54 (labels: breaking change codegen enhancement)
Investigate direct namespace invocation #57 (labels: bug)
Make base URLs configurable #59 (labels: codegen enhancement)
Add XATA_DATABASE_URL environment variable #60
Backport action #89 (labels: enhancement)
Fix read-the-docs #131
add versions for helpers & metrics header #91
revisit API names for redundancy, e.g. shorten xata.database().createDatabase() to xata.database().create() #93 (labels: breaking change codegen)
remove distinguishment between workspace and core in codegen and target directories #108 (labels: breaking change codegen)
rename namespaces package to api #109 (labels: breaking change codegen)
[Helper][Transactions] Add failIfMissing option to delete #102 (labels: backport 0.x enhancement good first issue helpers/transactions)
[FEEDBACK] Return the data immediately as response from the API #101 (labels: breaking change enhancement)
Align naming conventions #103 (4 of 4 tasks; labels: breaking change)
PEP8 naming conventions check #104 (labels: dependencies enhancement)
Python SDK migration guide mdx-docs#33 (labels: awaiting docs review content update sdk/python)
Rename namespace to api request #148 (labels: breaking change codegen tests/integration-tests tests/unit-tests)
Rework Ask API #147 (labels: breaking change codegen tests/integration-tests)

DLQ for BulkProcessor

Add the unprocessable batches to a dead-letter queue (DLQ) in the BulkProcessor (BP).

{
"timestamp": utcnow(),
"exception": Exception(e),
"data": batch,
}

Update: keep error count in stats object.
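
A rough sketch of such a dead-letter queue, using the entry layout from the issue above. The `DeadLetterQueue` class is a hypothetical illustration, not the BulkProcessor's actual implementation. The `failed_batches` stats key mirrors the name that appears in the processor's stats output elsewhere in this listing.

```python
from datetime import datetime, timezone
from threading import Lock


class DeadLetterQueue:
    """Hypothetical container for batches that could not be processed."""

    def __init__(self):
        self._lock = Lock()
        self._items = []
        self.stats = {"failed_batches": 0}

    def add(self, batch: list, error: Exception) -> None:
        entry = {
            "timestamp": datetime.now(timezone.utc),
            "exception": error,
            "data": batch,
        }
        # Worker threads may fail concurrently, so guard both updates.
        with self._lock:
            self._items.append(entry)
            self.stats["failed_batches"] += 1

    def __len__(self) -> int:
        with self._lock:
            return len(self._items)
```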

Remove unnecessary `requests` shims

remove the methods:

client.get()
client.post()
client.put()
client.patch()
client.delete()

They call into client.request() and provide only limited value.

remove distinguishment between workspace and core in codegen and target directories

BUG: Ensure `flush_queue` blocks if timing is flaky

The flush_queue method in the bulk processor can consider the queue empty and terminate the flush even though the queue is still populated. This is a race condition: flush_queue does not read the thread-safe queue size, but a value that is only updated after the first batch has been processed.
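
One way to avoid this kind of race, sketched under the assumption that batches are held in a standard library `queue.Queue`: let the flush block on `Queue.join()`, which only returns once every queued item has been marked done. `TinyProcessor` is a simplified, hypothetical worker and not the SDK's BulkProcessor.

```python
import queue
import threading


class TinyProcessor:
    """Hypothetical, simplified worker used to illustrate a blocking flush."""

    def __init__(self):
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self.processed = 0
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def put(self, batch: list) -> None:
        self._queue.put(batch)

    def _run(self) -> None:
        while True:
            batch = self._queue.get()
            try:
                with self._lock:
                    self.processed += len(batch)
            finally:
                # Mark the batch as done only after it has been handled.
                self._queue.task_done()

    def flush_queue(self) -> None:
        # join() blocks until every item that was put() has been marked done,
        # so it cannot return while batches are still waiting or in flight.
        self._queue.join()
```

Because the wait is tied to the queue's own bookkeeping rather than to a separately maintained size counter, the flush does not depend on when that counter was last updated.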

Bulk processor flush_queue leaks batches

BulkProcessor flush_queue completes without emitting all records to Xata.

In the following repro script it may miss one or two batches.

Occurs with: Python SDK 1.2.0 on Python 3.11.6

Repro script:

from xata.client import XataClient
from xata.helpers import BulkProcessor

TARGET_DB = "https://repro-q867qv.us-east-1.xata.sh/db/test"
BRANCH = "main"

client = XataClient(db_url=f"{TARGET_DB}:{BRANCH}")
bp = BulkProcessor(client)

data = []

for i in range(0, 10000):
    data.append({"values": i})

bp.put_records("test", data)

bp.flush_queue()

print(bp.stats)
print(bp.failed_batches_queue)

Script output:

python3 test.py
{'total': 9925, 'queue': 0, 'failed_batches': 0, 'tables': {'test': 9925}}
[]

Xata contains 9975 records. The last batch with number 9974-9999 is missing from Xata.
The stats output (9925) is different from the Xata content (9975), neither of which matches the number of records given to the bulk processor.

Initially reported on Discord.

Align naming conventions

There is a mix of snake case and camel case in the codebase. Opt for the Pythonic PEP8 naming style.

This also applies to the client.get_config() method, whose keys are camel-cased.
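
As an illustration of the renaming, here is a small sketch that converts camel-cased dictionary keys to snake case. The helper names and the sample keys in the usage note are assumptions, not the actual output of client.get_config().

```python
import re

# Zero-width boundary between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_snake_case(name: str) -> str:
    """Convert a camelCase identifier to snake_case."""
    return _CAMEL_BOUNDARY.sub("_", name).lower()


def snake_case_keys(config: dict) -> dict:
    """Return a copy of config with all keys, including nested ones, in snake_case."""
    return {
        to_snake_case(key): snake_case_keys(value) if isinstance(value, dict) else value
        for key, value in config.items()
    }
```

For example, a hypothetical config `{"workspaceId": "ws", "dbName": "db"}` would come back with the keys `workspace_id` and `db_name`.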

Tasks

First batch of pep8 naming conventions #107 (labels: breaking change codegen)
PEP8 naming conventions check #104 (labels: dependencies enhancement)
Pep8/part 3 #115 (labels: codegen)
Pep8 part3.2 #125 (labels: breaking change codegen tests/integration-tests)

Empty list of items in `failed_batches`

The list of failed_batches in the BulkProcessor continues to grow as long as records are failing. Introduce a mechanism to fetch the items and free the space again (pop).
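
A possible shape for such a mechanism, as a sketch only: the `FailedBatches` class and its `pop_all` method are hypothetical names. The list is swapped out under a lock so that items appended by worker threads during the drain are not lost.

```python
import threading


class FailedBatches:
    """Hypothetical holder for failed batches that can be drained by the caller."""

    def __init__(self):
        self._lock = threading.Lock()
        self._items = []

    def append(self, batch) -> None:
        with self._lock:
            self._items.append(batch)

    def pop_all(self) -> list:
        # Hand the current items to the caller and start over with an empty
        # list, which releases the memory held by the processor.
        with self._lock:
            drained, self._items = self._items, []
        return drained
```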

`throw_exception` option in BulkProcessor

Currently, every exception in the bulk processor is raised and terminates the worker thread. Add a boolean option throw_exception that controls whether exceptions are raised or not. Default: False.
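
The option could behave roughly as in the following sketch. `process_batch`, its `send` callable, and the `failed` list are hypothetical and only illustrate the flag. They are not the BulkProcessor's real internals.

```python
def process_batch(batch, send, throw_exception: bool = False, failed=None) -> bool:
    """Send one batch; raise on failure only if throw_exception is True.

    Hypothetical helper: `send` is any callable that ships a batch and may raise.
    """
    try:
        send(batch)
        return True
    except Exception as exc:
        if throw_exception:
            # Opt-in behavior: propagate and let the caller (or thread) stop.
            raise
        if failed is not None:
            # Default behavior: record the failure and keep the worker alive.
            failed.append({"exception": exc, "data": batch})
        return False
```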

rename namespaces package to api

Create Transform Helper

Create a helper that returns solely the URL for a transform operation. Currently, the file itself is also fetched in one go.

https://github.com/xataio/xata-py/blob/main/xata/api/files.py#L241

Slow bulk processing down on rate limited requests

If a 429 status code is returned, increase the processing timeout to avoid future rate limit hits.
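
A minimal sketch of the idea, assuming the processor sleeps for some interval between batches. The function name, the growth factor, and the cap are illustrative assumptions, not values from the SDK.

```python
def next_interval(current: float, status_code: int, factor: float = 2.0, maximum: float = 30.0) -> float:
    """Return the wait time (in seconds) to use before the next batch.

    Hypothetical helper: on a 429 response the interval grows by `factor`,
    capped at `maximum`; any other status leaves it unchanged.
    """
    if status_code == 429:
        return min(current * factor, maximum)
    return current
```

Capping the interval keeps a long run of rate-limited responses from stalling the processor indefinitely.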

Xata backend for Django

Feature request: Provide a database backend for Xata in Django (django-xata).

Description: Django officially supports certain SQL databases, while 3rd party providers may release and maintain their own packages (ref).

Feedback boards card: https://feedback.xata.io/feature-requests/p/django-backend

Add OpenTelemetry traces to the SDK

Instrument the SDK

Investigate direct namespace invocation

If you, correctly, initialize the SDK directly with a namespace as follows:

records = XataClient(api_key="", ..., region="region").records()

The bootstrapping of the internals does not work: if you, for example, set a non-default region, it is ignored and the default one is used.

[Feedback] Transaction helper

Tasks

The response from trx.run() is a dictionary, so the status code, for example, must be accessed as: if resp["status_code"] == 200: This differs from the response of other methods such as records().insert(), where it is an attribute: if resp.status_code == 200:
No matter the response, the run operation flushes the operations array, so there is no way to retry the transaction after a retryable error such as a 429.
Add the option to pass the branch name to the run() method.

revisit API names for redundancy, e.g. shorten `xata.database().createDatabase()` to `xata.database().create()`

`databases`

getDatabaseMetadata -> getMetadata
createDatabase -> create
deleteDatabase -> delete
updateDatabaseMetadata -> updateMetadata
listRegions -> getRegions

`users`

getUser -> get
updateUser -> update
deleteUser -> delete

`workspaces`

getWorkspacesList -> getWorkspaces
createWorkspace -> create
getWorkspace -> get
updateWorkspace -> update
deleteWorkspace -> delete
getWorkspaceMembersList -> getMembers
updateWorkspaceMemberRole -> updateMember
removeWorkspaceMember -> removeMember

`branch`

getBranchList -> getBranches
getBranchDetails -> getDetails
createBranch -> create
deleteBranch -> delete
getBranchMetadata -> getMetadata
updateBranchMetadata -> updateMetadata
getBranchStats -> getStats
resolveBranch -> resolve

`migrations`

getBranchMigrationHistory -> getHistory
getBranchMigrationPlan -> getPlan
executeBranchMigrationPlan -> executePlan
getBranchSchemaHistory -> getHistory
compareBranchSchemas -> compare
updateBranchSchemas -> update
previewBranchSchemaEdit -> preview
applyBranchSchemaEdit -> apply
pushBranchMigrations -> push

`records`

insertRecord -> insert
getRecord -> get
insertRecordWithID -> insertWithId
upsertRecordWithID -> upsertWithId
deleteRecord -> delete
updateRecordWithID -> updateWithId
bulkInsertTableRecords -> bulkInsert

`search_and_filter`

queryTable -> query
vectorSearchTable -> vectorSearch
askTable -> ask
summarizeTable -> summarize
aggregateTable -> aggregate

`table`

createTable -> create
deleteTable -> delete
updateTable -> update
getTableSchema -> getSchema
setTableSchema -> setSchema
getTableColumns -> getColumns
addTableColumn -> addColumn

ImportError: cannot import name 'Literal' from 'typing'

This is the error I get when I try to deploy a Flask app on Render using a Xata database.

OS -> Windows 11
editor -> VS Code
Render
Python -> 3.11.6
Xata -> 1.2.0
Flask -> 2.3.2

Move Mako dependency into poetry

In order to unblock #41, we had to implement a workaround in #55 to bypass the missing dependency. Move the Mako dependency into the Poetry dependency management file.

Breakup test files into domains

The unit and integration tests currently each reside in a single file, which makes the setup slow and the files bloated. Break each file into more domain-specific test files.

Rework examples for GA release

Multiple methods were deprecated in 0.x (#19); ensure the examples folder is up to date.

Duplicate `branch_name` creates invalid URL

bug reported via: https://discord.com/channels/996791218879086662/1085692924341264475/1085882910399275048

PEP8 naming conventions check

Incorporate https://pypi.org/project/pep8-naming/ into the linting process.

Related to #103, implement this first in order to simplify #103.

Dependency on `orjson` fails running from pyodide

The package is written in Rust and doesn't offer a none-any wheel, so it fails when run with Pyodide.

There seems to be an open PR to add support; this issue is opened to keep track of it.

pyodide/pyodide#4036
pyodide/pyodide#1282

[FEEDBACK] Return the data immediately as response from the API

Every API call returns a requests.Response instance. While this is convenient for getting all the information about the HTTP response, it is also clunky to handle. The expectation is to respond with the data (dict) directly, shortcutting the .json() method. Nevertheless, the response should still provide access to the status code and headers.
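
One way to meet both expectations is a dict subclass that carries the HTTP metadata as attributes. The following is a sketch of that idea. The `ApiResponse` class here and its constructor arguments are assumptions for illustration, not the SDK's actual response type.

```python
class ApiResponse(dict):
    """Hypothetical response wrapper: behaves like the JSON body, keeps HTTP metadata."""

    def __init__(self, status_code: int, headers: dict, body: dict):
        # The parsed body becomes the dict content, so resp["key"] works directly.
        super().__init__(body)
        self.status_code = status_code
        self.headers = headers

    def is_success(self) -> bool:
        return 200 <= self.status_code < 300
```

With this shape, `resp["id"]` reads the data without calling `.json()`, while `resp.status_code` and `resp.headers` remain available.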

Add `XATA_DATABASE_URL` environment variable

The preferred way to bootstrap the SDK is via the workspace URL.

[Feedback][Helper] Pagination `getAll()` like

Create a query and search helper that handles the pagination under the hood. A user would provide only a query and would get the full result set without the need for a pagination routine, which is handled within the helper.
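
Such a helper might look like the sketch below. It assumes a cursor-based response shape with `records` plus `meta.page.cursor` and `meta.page.more`. That shape, the `get_all` name, and the `fetch_page` callable are assumptions made for illustration.

```python
from typing import Callable, Iterator, Optional


def get_all(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every record of a query, following the cursor until no pages remain.

    Hypothetical helper: `fetch_page(cursor)` runs the query for one page and
    returns the parsed response; it receives None for the first page.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page.get("records", [])
        meta = page.get("meta", {}).get("page", {})
        if not meta.get("more"):
            break
        cursor = meta.get("cursor")
```

Writing it as a generator keeps memory use flat, and a caller who wants the full result set at once can wrap the call in `list(...)`.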

Create Database is missing properties

The xata.databases().create() method is missing the properties ui and metadata.

put:
      operationId: createDatabase
      summary: Create Database
      description: Create Database with identifier name
      requestBody:
        description: ''
        content:
          application/json:
            schema:
              description: ''
              type: object
              properties:
                branchName:
                  type: string
                  minLength: 1
                region:
                  type: string
                  minLength: 1
                ui:
                  type: object
                  properties:
                    color:
                      type: string
                metadata:
                  $ref: '#/components/schemas/BranchMetadata'
              example:
                branchName: main
                region: us-east-1
                metadata:
                  repository: github.com/my/repository
                  branch: github repository
                  stage: testing
                  labels:
                    - development
              required:
                - region

Remove `,` after `self` if no params are required in API

def delete(self,) -> Response:

The trailing comma can be omitted.

def delete(self) -> Response:

[Helper][Transactions] Add failIfMissing option to delete

https://xata.io/docs/api-reference/db/db_branch_name/transaction#execute-a-transaction-on-a-branch

Add the option failIfMissing to the transaction helper for delete operations with the default value: False

Make base URLs configurable

Currently, the client allows the initialization of different base URLs for core and workspace. This is not reflected in the namespaced endpoints, as these use a static value. Rework the Namespace class to be able to inject a different base_url for core. The workspace will be derived from the db_url param.
