Describe the Feature
MetricFlow currently builds generalized datasets for analytical applications (i.e., metrics by dimensions). Experimentation applications require a specific set of steps to denormalize and aggregate metrics in a way that enables statistical testing. The process currently configurable through MetricFlow's APIs does not support the two-step aggregation required to produce statistical tests.
We should add a plan builder that allows users to use MetricFlow to construct metrics for experimentation applications. Users of MetricFlow should not have to reinvent the wheel every time they want to run a product experiment. This is a logical extension of the metric framework in place today.
Story / Assumptions
- Users of MetricFlow have an experiment assignment or feature flagging system in place that exports assignment/exposure logs to a supported DW.
- Users would like to use their existing metric configurations to build metrics for experimentation applications.
- Users of MetricFlow can call an API to generate SQL and construct metrics for a list of metrics and a specific experiment. This API can either return data to the users' querying interface or write a dataset back to the data warehouse.
Open/New Questions
- How do we capture experiment assignment configuration?
- Should we allow for an entity_expr for tables that have a subject_id that could be multiple entities?
- When do we load data to support CUPED or more complicated statistical tests?
- What do the inputs look like from common experimentation assignment tools? Do they conform or can they be transformed into a conformed input?
- What do people need to be able to do with the API?
There is lots more to do to spec out the plan builder and any new node types.
Experiment Assignment Configuration
Expected input from experiment assignment:
CREATE TABLE events.experiment_assignments (
  subject_id BIGINT
  , subject_entity STRING
  , experiment_name STRING
  , variant_name STRING
  , first_assigned_at TIMESTAMP
);
which would be configured in MetricFlow as follows:
name: experiment_assignments
sql_table: events.experiment_assignments
experiment_assignments: True  # NEW
identifiers:
  - name: subject
    expr: subject_id
    entity_expr: subject_entity  # NEW
    type: foreign
measures:
  - name: assignments
    agg: sum
    expr: 1
dimensions:
  - name: experiment_name
    type: categorical
  - name: variant_name
    type: categorical
  - name: first_assigned_at
    type: time
    type_params:
      is_primary: True
      time_granularity: second
Generated
Consider a metric configuration:
name: transactions
sql_table: fact.transactions
identifiers:
  - name: user
    expr: user_id
    type: foreign
measures:
  - name: transactions
    agg: sum
    expr: 1
    create_metric: True
To support the construction of metrics for the experiment application we would need to generate two steps of aggregation. The query below could be broken into a non-aggregated step with a timestamp for more complicated statistical tests.
First, MetricFlow's bread and butter, construct a metric to the granularity of the entity of the experiment (i.e. subject_id):
-- Step 1
CREATE TABLE metricflow.user_assignments_transactions AS
SELECT
  ua.subject_id
  , ua.experiment_name
  , ua.variant_name
  -- COUNT rather than SUM(1), so subjects with no transactions get 0, not 1
  , COUNT(e.user_id) AS transactions
FROM metricflow.user_assignments ua
LEFT JOIN fact.transactions e
  ON ua.subject_id = e.user_id
  AND e.created_at BETWEEN ua.first_assigned_at AND {{ analysis_date }}
GROUP BY 1, 2, 3
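The exposure-window join in Step 1 can be sketched end to end with an in-memory database. The table names and sample rows below are hypothetical; the point is that only events between a subject's `first_assigned_at` and the analysis date are attributed to that subject, and subjects with no matching events still get a row.

```python
import sqlite3

# In-memory sketch of Step 1's exposure-window join over hypothetical data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user_assignments (
  subject_id INTEGER, experiment_name TEXT,
  variant_name TEXT, first_assigned_at TEXT);
CREATE TABLE transactions (user_id INTEGER, created_at TEXT);
INSERT INTO user_assignments VALUES
  (1, 'button_color', 'control',   '2022-06-01'),
  (2, 'button_color', 'treatment', '2022-06-01');
INSERT INTO transactions VALUES
  (1, '2022-06-02'),  -- inside the window: counted
  (1, '2022-05-30'),  -- before assignment: excluded
  (2, '2022-06-03'),
  (2, '2022-06-04');
""")
rows = conn.execute("""
SELECT
  ua.subject_id
  , ua.experiment_name
  , ua.variant_name
  -- COUNT(e.user_id) rather than SUM(1), so unmatched subjects get 0
  , COUNT(e.user_id) AS transactions
FROM user_assignments ua
LEFT JOIN transactions e
  ON ua.subject_id = e.user_id
  AND e.created_at BETWEEN ua.first_assigned_at AND '2022-06-30'
GROUP BY 1, 2, 3
""").fetchall()
```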
Second, compute the first and second moments at the granularity of experiment and variant name:
-- Step 2
CREATE TABLE metricflow.metric_variant_aggregates AS
SELECT
  experiment_name
  , variant_name
  , 'transactions' AS metric_name
  , COUNT(subject_id) AS assignments
  , SUM(transactions) / COUNT(subject_id) AS metric_mean
  , SQRT(VAR(transactions)) AS metric_std
FROM metricflow.user_assignments_transactions
GROUP BY 1, 2
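In plain Python, Step 2's reduction of per-subject values to per-variant moments looks like the sketch below. The sample values are hypothetical; note also that population variance is used here, so the choice should match whatever `VAR` means in the target warehouse.

```python
import math
from collections import defaultdict

# Hypothetical per-subject output of Step 1: (variant_name, metric_value).
step1_rows = [
    ("control", 2.0), ("control", 4.0), ("control", 6.0),
    ("treatment", 3.0), ("treatment", 5.0), ("treatment", 7.0),
]

# Step 2 collapses per-subject values into the moments the t-test needs:
# assignment count, mean, and standard deviation per variant.
by_variant = defaultdict(list)
for variant, value in step1_rows:
    by_variant[variant].append(value)

moments = {}
for variant, values in by_variant.items():
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # population variance
    moments[variant] = (n, mean, math.sqrt(var))
```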
Finally, run the appropriate statistical test for each variant compared to the control:
-- Step 3
CREATE TABLE metricflow.metric_comparison AS
SELECT
  c.experiment_name
  , c.variant_name AS variant_name_control
  , t.variant_name AS variant_name_treatment
  , c.metric_name
  , t.metric_mean - c.metric_mean AS metric_diff
  , t.metric_mean / c.metric_mean - 1 AS metric_pct
  , UDFS.TTEST_IND_FROM_STATS(
      t.metric_mean, t.metric_std, t.assignments,
      c.metric_mean, c.metric_std, c.assignments
    ) AS pvalue
FROM (
  SELECT *
  FROM metricflow.metric_variant_aggregates
  WHERE variant_name = {{ control_name }}
) c
JOIN (
  SELECT *
  FROM metricflow.metric_variant_aggregates
  WHERE variant_name != {{ control_name }}
) t
  ON c.experiment_name = t.experiment_name
  AND c.metric_name = t.metric_name
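`UDFS.TTEST_IND_FROM_STATS` is assumed to be a warehouse UDF and is not defined here. A minimal sketch of what such a UDF might compute from the Step 2 moments (Welch's t statistic, with a large-sample normal approximation for the two-sided p-value) could be:

```python
import math

def ttest_ind_from_stats(mean_t, std_t, n_t, mean_c, std_c, n_c):
    """Sketch of a TTEST_IND_FROM_STATS-style UDF: Welch's t statistic
    from per-variant moments, with a normal approximation for the
    two-sided p-value (reasonable at experiment-scale sample sizes)."""
    standard_error = math.sqrt(std_t ** 2 / n_t + std_c ** 2 / n_c)
    t_stat = (mean_t - mean_c) / standard_error
    # Two-sided tail probability under the normal approximation.
    pvalue = math.erfc(abs(t_stat) / math.sqrt(2))
    return t_stat, pvalue
```

A production UDF would typically use the t distribution with Welch-Satterthwaite degrees of freedom instead of the normal approximation.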
Outputs
API
The API to generate experimental datasets could look something like this:
mf experiment_query
--metrics x,y,z
--dimensions a,b,c
--experiment_name button_color
--control_name control
--analysis_date
Output Schemas
Load from Step 2:
CREATE TABLE metricflow.experiment_metric_values (
  experiment_name STRING
  , variant_name STRING
  , metric_name STRING
  , dimension_name STRING
  , dimension_value STRING
  , assignments BIGINT
  , metric_mean DOUBLE
  , metric_std DOUBLE
  , ts__day DATE -- analysis_date
)
PARTITIONED BY (experiment_name STRING, ts__day DATE);
Load from Step 3:
CREATE TABLE metricflow.experiment_event_source_summary (
  experiment_name STRING
  , variant_name_control STRING
  , variant_name_treatment STRING
  , metric_name STRING
  , metric_diff DOUBLE
  , metric_pct DOUBLE
  , pvalue DOUBLE
  , ts__day DATE -- analysis_date
)
PARTITIONED BY (experiment_name STRING, ts__day DATE);
Dimensional Analysis
Dimensional analysis could be achieved by injecting a dimension name and value into Steps 1 and 2 and grouping every aggregation by that dimension's name and value. The output table could look something like this:
CREATE TABLE metricflow.experiment_metric_values (
  experiment_name STRING
  , variant_name STRING
  , metric_name STRING
  , dimension_name STRING
  , dimension_value STRING
  , assignments BIGINT
  , metric_mean DOUBLE
  , metric_std DOUBLE
  , ts__day DATE -- analysis_date
)
PARTITIONED BY (experiment_name STRING, ts__day DATE);
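As a sketch of that injection, assuming hypothetical per-subject rows that already carry a dimension name/value alongside the variant, the Step 2 grouping key simply widens:

```python
from collections import defaultdict

# Hypothetical per-subject rows with an injected dimension name/value.
rows = [
    # (variant_name, dimension_name, dimension_value, metric_value)
    ("control",   "platform", "ios",     2.0),
    ("control",   "platform", "android", 4.0),
    ("treatment", "platform", "ios",     3.0),
    ("treatment", "platform", "android", 5.0),
]

# The grouping key widens to include the dimension, producing one
# moments row per dimension slice per variant.
groups = defaultdict(list)
for variant, dim_name, dim_value, value in rows:
    groups[(variant, dim_name, dim_value)].append(value)

slice_stats = {
    key: (len(vals), sum(vals) / len(vals))  # assignments, metric_mean
    for key, vals in groups.items()
}
```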
Would you like to contribute?
Totally. I’m just here to add value.
Anything Else?
I want to thank @danfrankj and @askeys for providing some initial thoughts that helped me sort through how we could do this.