What’s the problem you're trying to solve
As described in #150, we want to improve query performance when users send an API request to a data endpoint and the result is fetched from the data source. To do that, we need to provide a caching (pre-loading) layer backed by DuckDB.
Describe the solution you’d like
This issue focuses only on parsing the cache settings from the YAML, which use the format below.
# API schema YAML
cache:
  # "order" table kept in DuckDB
  - cacheTableName: "order"
    sql: "select * from order"
    # optional
    profile: pg
    # optional
    refreshTime:
      every: "5m"
    # optional; cannot be set together with refreshTime
    refreshExpression:
      expression: "MAX(created_at)"
      every: "5m"
    # optional
    indexes:
      'idx_a': 'col_a'
      'idx_b': 'col_b'
  # "product" table kept in DuckDB
  - cacheTableName: "product"
    sql: "select * from product"
If refreshExpression and refreshTime are both set for the same cache table, throw an error when the user runs vulcan build.
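A minimal sketch of this check, assuming each parsed cache entry is a plain object. The names `CacheEntry` and `checkRefreshSettings` are hypothetical and used only for illustration; they are not part of the existing codebase.

```typescript
// Hypothetical shape of one parsed "cache" entry (illustration only).
interface CacheEntry {
  cacheTableName: string;
  sql: string;
  refreshTime?: Record<string, string>;
  refreshExpression?: Record<string, string>;
}

// Throws when both refresh settings are present on the same entry.
function checkRefreshSettings(entry: CacheEntry): void {
  if (entry.refreshTime !== undefined && entry.refreshExpression !== undefined) {
    throw new Error(
      `Cache table "${entry.cacheTableName}": "refreshTime" and ` +
        `"refreshExpression" cannot both be set.`
    );
  }
}
```

With this sketch, the "order" entry in the YAML above would be rejected as written, while the "product" entry would pass.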
Additional Context
Because the caching (pre-loading) layer with DuckDB will be implemented across multiple issues, we need to define an interface that the solutions to the other issues can interact with.
Change APISchema as shown below: add the cache!: Array<CacheLayerInfo>; field to APISchema and define the CacheLayerInfo class.
// packages/core/src/models/artifact.ts
export class CacheLayerInfo {
  cacheTableName!: string;
  sql!: string;
  refreshTime?: Record<string, string>;
  refreshExpression?: Record<string, string>;
  // the profile used to query the data from the data source
  profile!: string;
  // index key name -> index column
  indexes?: Record<string, string>;
}

export class APISchema {
  // graphql operation name
  operationName!: string;
  // restful url path
  urlPath!: string;
  // template, could be name or path
  templateSource!: string;
  @Type(() => RequestSchema)
  request!: Array<RequestSchema>;
  @Type(() => ErrorInfo)
  errors!: Array<ErrorInfo>;
  @Type(() => ResponseProperty)
  response!: Array<ResponseProperty>;
  description?: string;
  // The pagination strategy used to paginate when querying.
  // If pagination is not set, the API request does not provide the fields for it.
  pagination?: PaginationSchema;
  sample?: Sample;
  profiles!: Array<string>;
  cache!: Array<CacheLayerInfo>;
}
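For illustration, the "product" entry from the YAML above might end up as the following object once parsed. This is only a sketch: the class is copied here so the snippet is self-contained, `productCache` is a hypothetical variable name, and the profile value "pg" assumes that "pg" is the first entry in the schema's profiles.

```typescript
// Copied from the definition above so this snippet stands alone.
class CacheLayerInfo {
  cacheTableName!: string;
  sql!: string;
  refreshTime?: Record<string, string>;
  refreshExpression?: Record<string, string>;
  profile!: string;
  indexes?: Record<string, string>;
}

// Hypothetical parsed result for the "product" cache table.
// Optional fields that the YAML omits simply stay undefined.
const productCache: CacheLayerInfo = Object.assign(new CacheLayerInfo(), {
  cacheTableName: "product",
  sql: "select * from product",
  profile: "pg", // assumed default, taken from the schema's "profiles"
});
```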
In addition, when parsing the cache table name in YAML, check whether the metadata contains the cache.vulcan.com key:
- If the key exists but no cache setting is defined in schema.yaml, throw an error, because it means the SQL uses the cache feature while the YAML does not define it.
- If the key exists, add the schema cache info to RawAPISchema.
- If the user defines the same cacheTableName more than once in cache, the later definition would overwrite the earlier table's data in our cache (DuckDB), so check for duplicates and throw an error.
- Check that cacheTableName has a value when getting the schemas from the parsed YAML result.
- Check the format of the cacheTableName value. Table and column names usually follow a pattern; the suggested rule is: begin with a letter, do not end in an underscore, and contain only alphanumeric characters (reference link; it would be better to look up DuckDB's own naming pattern and check against that).
- Check that "sql" has a value.
- Check that "expression" in the "refreshExpression" field has a value when the user sets "refreshExpression".
- Get the first profile name from the "profiles" field. This makes sure a default profile is used as the query data source when the user sets multiple "profiles".
- If "indexes" is set, check that it has a value and throw an error if it is not in key-value format.
- Check the pattern of each index key name in "indexes", following the same suggested rule: begin with a letter, do not end in an underscore, and contain only alphanumeric characters (reference link; it would be better to look up DuckDB's own naming pattern and check against that).
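The checks above could be sketched roughly as follows. Everything named here (`RawCacheEntry`, `NAME_PATTERN`, `checkCacheMetadata`, `validateCache`) is hypothetical, and the sketch makes several assumptions: metadata is a plain object keyed by name, underscores are allowed inside names (as in idx_a), and the first profile is used only when an entry does not set its own. The regular expression encodes the suggested rule, not DuckDB's actual identifier grammar.

```typescript
// Hypothetical raw entry: any field may be missing before validation.
interface RawCacheEntry {
  cacheTableName?: string;
  sql?: string;
  profile?: string;
  refreshExpression?: Record<string, string>;
  indexes?: unknown;
}

// Suggested rule: starts with a letter, letters/digits/underscores inside,
// does not end in an underscore. An assumption, not DuckDB's real grammar.
const NAME_PATTERN = /^[A-Za-z](?:[A-Za-z0-9_]*[A-Za-z0-9])?$/;

// Assumed metadata shape: a plain object that may hold "cache.vulcan.com".
function checkCacheMetadata(
  metadata: Record<string, unknown> | undefined,
  cache: RawCacheEntry[] | undefined
): void {
  const sqlUsesCache = metadata !== undefined && "cache.vulcan.com" in metadata;
  if (sqlUsesCache && (cache === undefined || cache.length === 0)) {
    throw new Error('The SQL uses the cache feature, but "cache" is not defined in the schema.');
  }
}

function validateCache(entries: RawCacheEntry[], profiles: string[]): RawCacheEntry[] {
  const seen = new Set<string>();
  return entries.map((entry) => {
    const name = entry.cacheTableName;
    if (!name) throw new Error('"cacheTableName" is required.');
    if (!NAME_PATTERN.test(name)) throw new Error(`Invalid cacheTableName "${name}".`);
    if (seen.has(name)) throw new Error(`Duplicate cacheTableName "${name}".`);
    seen.add(name);
    if (!entry.sql) throw new Error(`"sql" is required for cache table "${name}".`);
    if (entry.refreshExpression && !entry.refreshExpression["expression"]) {
      throw new Error(`"refreshExpression.expression" is required for "${name}".`);
    }
    const indexes = entry.indexes;
    if (indexes !== undefined) {
      if (indexes === null || typeof indexes !== "object" || Array.isArray(indexes)) {
        throw new Error(`"indexes" of "${name}" must be in key-value format.`);
      }
      for (const [key, column] of Object.entries(indexes)) {
        if (!NAME_PATTERN.test(key)) throw new Error(`Invalid index name "${key}".`);
        if (typeof column !== "string" || column === "") {
          throw new Error(`Index "${key}" of "${name}" must map to a column name.`);
        }
      }
    }
    // Fall back to the first profile when the entry does not set one.
    return { ...entry, profile: entry.profile ?? profiles[0] };
  });
}
```

Returning new objects instead of mutating the input keeps the parsed YAML untouched, which may make it easier to report the original values in error messages.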