In addition to controlling read/write access to our shared licensing repository, L0 Storage should be expanded to provide the same functionality for hosted data lakes.
Customers should be able to:
use their publishing ID to request a temporary write token
use the write token to write data to the lake
use their private key to read data from their lake
API IDs are moving to the l0-auth service, so the l0-storage service needs to be updated to use JWT validation (including audience validation) on the token routes.
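The audience check on the token routes could look like the sketch below. Claim names follow the standard JWT spec (RFC 7519); the expected audience value "l0-storage" is an assumption, not the actual configured value, and signature verification (e.g. via a maintained library such as jose) must still happen before any claims are trusted.

```typescript
// Minimal sketch of JWT audience validation for the token routes.
// NOTE: this only inspects claims; it does NOT verify the signature.

interface JwtClaims {
  aud?: string | string[]; // audience, per RFC 7519
  exp?: number;            // expiry, unix seconds
  [key: string]: unknown;
}

function decodeClaims(token: string): JwtClaims {
  const payload = token.split(".")[1];
  if (!payload) throw new Error("malformed JWT");
  const json = Buffer.from(payload, "base64url").toString("utf8");
  return JSON.parse(json) as JwtClaims;
}

function hasValidAudience(claims: JwtClaims, expected: string): boolean {
  const aud = claims.aud;
  if (aud === undefined) return false;
  // aud may be a single string or an array of audiences
  return Array.isArray(aud) ? aud.includes(expected) : aud === expected;
}
```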
The current worker/upload config only supports HTTPS, and only where the bucket name is also the DNS name.
This is incompatible with MinIO in a docker-compose setup.
The worker should support an optional environment variable override, for use in development, that toggles HTTPS to HTTP.
Note: the MinIO compose config may also need to define the bucket name, and the worker may need additional TOML configs. Requires testing to confirm.
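The override could be as small as the sketch below. The variable name `TIKI_DEV_HTTP` and the URL shape are assumptions for illustration, not the worker's actual config keys.

```typescript
// Sketch of a dev-only override that switches the upload endpoint from
// HTTPS to HTTP (for MinIO under docker-compose). Defaults to HTTPS
// with the bucket name as the DNS name, matching current behavior.

function bucketUrl(
  bucket: string,
  key: string,
  env: Record<string, string | undefined> = process.env
): string {
  const scheme = env.TIKI_DEV_HTTP === "true" ? "http" : "https";
  return `${scheme}://${bucket}/${key}`;
}
```

Keeping the override behind an explicit opt-in string means a production deploy that simply omits the variable can never downgrade to HTTP by accident.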
When successfully generating a policy, record the address and customer ID in the database.
We can then use these records to determine the relationship between addresses and customer IDs at any given moment, enabling usage monitoring, optional address pinning, and other features.
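A minimal sketch of the record-keeping, assuming an append-only table: the array below stands in for the real database, and all field names are hypothetical. Keeping every grant (not just the latest) is what lets us answer "which customer held this address at time T", which usage monitoring and address pinning both need.

```typescript
// Sketch: record the address <-> customer ID pairing on every
// successful policy generation, then query the pairing at a moment.

interface PolicyGrant {
  address: string;
  customerId: string;
  grantedAt: number; // unix millis
}

const grants: PolicyGrant[] = []; // stand-in for the database table

function recordPolicyGrant(address: string, customerId: string, at = Date.now()): void {
  grants.push({ address, customerId, grantedAt: at });
}

// Which customer was associated with the address at a given moment?
function customerAt(address: string, at: number): string | undefined {
  let latest: PolicyGrant | undefined;
  for (const g of grants) {
    if (g.address === address && g.grantedAt <= at) {
      if (!latest || g.grantedAt > latest.grantedAt) latest = g;
    }
  }
  return latest?.customerId;
}
```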
Our typical data ingestion model is asynchronous and incremental.
However, data suppliers occasionally have accrued, or are accruing, zero-party data without using TIKI but still want to leverage the power of TIKI's data pooling.
Data will be provided in bulk (likely CSV or Parquet) to a bucket (likely S3), where it should be cataloged by TIKI. If the data is re-serialized to fit TIKI's format, the original should then be discarded to avoid duplicate storage.
We need a network-edge proxy function that can report block write sizes back, so we do not need to repeatedly traverse the entire storage repo to compute usage by account ID.
We want to create our own simplified version of the S3 POST policy, forcing all clients to write to the buckets through TIKI's proxy so we can accurately log utilization.
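A simplified policy could work roughly like S3's: a short-lived, HMAC-signed document binding a key prefix and a size cap, which the proxy verifies before forwarding the write and logging its size. The sketch below is one possible shape under those assumptions; all field names and the signing scheme are hypothetical, not a spec.

```typescript
// Sketch of a simplified, TIKI-signed post policy.
import { createHmac } from "node:crypto";

interface PostPolicy {
  keyPrefix: string; // writes must stay under this prefix
  maxBytes: number;  // utilization cap enforced by the proxy
  expires: number;   // unix seconds
}

function signPolicy(policy: PostPolicy, secret: string): string {
  const body = Buffer.from(JSON.stringify(policy)).toString("base64url");
  const sig = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${sig}`;
}

// Returns the policy if the signature matches and it has not expired.
function verifyPolicy(
  token: string,
  secret: string,
  now = Date.now() / 1000
): PostPolicy | null {
  const [body, sig] = token.split(".");
  const expected = createHmac("sha256", secret).update(body).digest("base64url");
  if (sig !== expected) return null;
  const policy = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as PostPolicy;
  return policy.expires > now ? policy : null;
}
```

Because only the proxy holds the signing secret, clients cannot mint their own write permission, so every write necessarily passes through the proxy where its size can be logged.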
As a developer I want to see error reporting (and ideally performance metrics) on our serverless worker functions.
HOWEVER
Sentry does not offer a pure-JS SDK for serverless; they have a Node.js SDK with wrappers for Lambda/GCP.
But! Cloudflare Workers are special (no cold start) and don't really support a Node runtime (it's in beta and a bit wonky).
One option is to take inspiration from the Node and pure-JS SDKs and create a new JS serverless SDK using their API.
A second possible option is to investigate migrating the worker code to Rust and using the Sentry Rust SDK. It's unclear without more investigation IF this is viable.
Customers should be able to upload immutable docs like terms of service or contract terms and conditions at the application layer (above the scope of a single address).
This will remove duplicative data stored at the transaction layer, both improving performance and making it easier for customers to manage, update, and deploy contract terms.
This is a technical spec/design story, not an implementation story. Implementation stories will be created upon sign-off on the technical spec.