david26694 / cluster-experiments
Simulation-based power analysis library
Home Page: https://david26694.github.io/cluster-experiments/
License: MIT License
Remove data before and after each switch
Currently, we can only run the synthetic control analysis using the class init. In the future, we also want to allow users to run it from a classmethod (using a power config).
Originally posted by @david26694 in #168 (comment)
Implement the different null hypothesis expressions
df_power_users[self.treatment_col] = np.random.choice(
    [0, 1], size=len(df_power_users), p=[0.1, 0.9]
)
The choice should be from "A", "B", not 0, 1.
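A minimal corrected sketch (the 10/90 split probabilities are kept from the snippet above; the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_users = 10  # illustrative size

# Sample the treatment labels directly instead of 0/1 integers
treatment = rng.choice(["A", "B"], size=n_users, p=[0.1, 0.9])
```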
Exact power analysis should have a number of iterations to estimate the standard error. Run simulations to understand how many iterations are good to have, and add it to the GitHub Action.
At least:
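Each simulation run is a Bernoulli draw, so the standard error of the simulated power is sqrt(p * (1 - p) / n). A quick sketch to size the number of iterations (the candidate values are illustrative):

```python
import math


def power_standard_error(power: float, n_simulations: int) -> float:
    """Standard error of a power estimate from n Bernoulli simulation runs."""
    return math.sqrt(power * (1 - power) / n_simulations)


# Worst case is power = 0.5: 2500 runs bring the standard error down to 0.01
for n in (100, 500, 1000, 2500):
    print(n, round(power_standard_error(0.5, n), 4))
```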
Should only work for stratified data; for each stratum we should compute the average of test and control.
At some hour of the day, we may need a longer washover than at another hour. Allow users to implement different washover lengths according to time of day / day of week.
Suggested by @pablobd
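One way to sketch this is a lookup from hour of day to washover length (the hours and lengths below are made-up, and the real washover classes in the library may expose this differently):

```python
import datetime

# Hypothetical schedule: busier hours get a longer washover
WASHOVER_MINUTES = {8: 30, 12: 45, 18: 60}
DEFAULT_MINUTES = 15


def washover_length(ts: datetime.datetime) -> datetime.timedelta:
    """Return the washover length to apply at a given timestamp."""
    return datetime.timedelta(minutes=WASHOVER_MINUTES.get(ts.hour, DEFAULT_MINUTES))
```

The same idea extends to day of week by keying the mapping on `(ts.weekday(), ts.hour)`.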
https://david26694.github.io/cluster-experiments/analysis_with_different_hypothesis.html
This notebook needs more effects, otherwise it is not very representative.
Right now switchback only allows a fixed length, allow for different length for A and B
Fail early if target_col/cluster_cols are different in different components in a power analysis
For each record in our experiment, we have two events' timestamps, login_timestamp and logout_timestamp. We want to apply washover such that if there is a change in treatment between login_timestamp and logout_timestamp, the record is removed.
Calendar:
treatment,time
A,10:00
B,11:00
B,12:00
Record data:
id,start_time,end_time
1,10:50,10:59
2,10:51,11:01
3,11:01,11:05
4,11:01,12:01
5,12:01,12:05
In this case, we want to washover row number 2.
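The rule can be sketched with plain pandas (column names follow the example above; treating the record window as open at the start and closed at the end is an assumption):

```python
import pandas as pd

switch_calendar = pd.DataFrame(
    {
        "treatment": ["A", "B", "B"],
        "time": pd.to_datetime(["10:00", "11:00", "12:00"]),
    }
)
records = pd.DataFrame(
    {
        "id": [1, 2, 3, 4, 5],
        "start_time": pd.to_datetime(["10:50", "10:51", "11:01", "11:01", "12:01"]),
        "end_time": pd.to_datetime(["10:59", "11:01", "11:05", "12:01", "12:05"]),
    }
)

# Keep only switch times where the treatment actually changes (A -> B at 11:00;
# the B -> B switch at 12:00 needs no washover)
is_change = switch_calendar["treatment"].ne(switch_calendar["treatment"].shift())
is_change.iloc[0] = False  # the first entry opens the calendar, it is not a switch
change_times = switch_calendar.loc[is_change, "time"]

# Flag records whose time window spans a treatment change
records["washover"] = records.apply(
    lambda r: any(r["start_time"] < t <= r["end_time"] for t in change_times),
    axis=1,
)
```

On the example data, only row 2 (10:51 to 11:01) spans the A to B change at 11:00 and gets flagged.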
Right now, if the metric is binary and of Boolean type (True, False), the power analysis throws a series of errors that can't easily be traced back to this cause.
We could either perform a conversion in the code, or raise an error specifying that the input metric should be of integer type (0, 1).
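Both options are cheap; a sketch assuming the metric lives in a pandas column named target (check_binary_metric is a made-up helper name):

```python
import pandas as pd

df = pd.DataFrame({"target": [True, False, True]})

# Option 1: silently cast booleans to 0/1 before the analysis
df["target"] = df["target"].astype(int)


# Option 2: fail fast with a message that points at the real cause
def check_binary_metric(df: pd.DataFrame, target_col: str) -> None:
    if df[target_col].dtype == bool:
        raise TypeError(
            f"{target_col} is boolean; binary metrics must be integer-coded (0/1)."
        )
```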
Add a wrapper around a synthetic control implementation that returns p-values. This would let us treat synthetic control as just another analysis method and check whether it has higher power than simpler approaches.
For different effects and analysis / split methods, compare simulated and exact power calculations
fyi @ludovico-lanni
Add plugin for flake
A new PowerAnalysis class needs to be set up, something like class ExactPowerAnalysis, that implements the power_analysis and power_line methods using the linear model formula.
I have this script I used a long time ago:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy.stats import norm

from cluster_experiments import PowerAnalysis


def power_2_tails(df, splitter, alpha, ate, regressor):
    """Exact power of a two-tailed test.

    Parameters
    ----------
    df : pd.DataFrame
        Experiment data
    splitter : Splitter
        Splitter used to assign treatment
    alpha : float
        Significance level
    ate : float
        Average treatment effect
    regressor : str
        Covariate to include in the OLS formula
    """
    df_treated = splitter.assign_treatment_df(df).assign(
        treatment=lambda x: (x.treatment == "B").astype(int),
    )
    fitted_ols = sm.OLS.from_formula(
        f"target ~ treatment + {regressor}", data=df_treated
    ).fit()
    # Standard error of the treatment coefficient
    se = fitted_ols.bse["treatment"]
    z_alpha = norm.ppf(1 - alpha / 2)
    norm_cdf = norm.cdf(z_alpha - ate / se)
    norm_cdf_2 = norm.cdf(-z_alpha - ate / se)
    return 1 - norm_cdf + norm_cdf_2


# Create fake data
N = 2_000
alpha = 0.05
sigma = 1
df = pd.DataFrame(
    {
        "target": np.random.normal(0, sigma, size=2 * N),
        "regressor": np.random.normal(0, sigma, size=2 * N),
        "better_regressor": np.random.normal(0, sigma, size=2 * N),
    }
).assign(target=lambda x: x.target + x.regressor * 2 + x.better_regressor * 10)

config = {
    "analysis": "ols_non_clustered",
    "perturbator": "uniform",
    "splitter": "non_clustered",
    "n_simulations": 1000,
    "covariates": ["regressor"],
    "alpha": alpha,
}
pw = PowerAnalysis.from_dict(config)
pw_better = PowerAnalysis.from_dict({**config, "covariates": ["better_regressor"]})

EFFECTS = [0, 0.1, 0.2, 0.3, 0.4, 0.5]
powers = pw.power_line(df, average_effects=EFFECTS)
powers_better = pw_better.power_line(df, average_effects=EFFECTS)

powers_exact = {}
powers_exact_better = {}
for average_effect in EFFECTS:
    powers_exact[average_effect] = power_2_tails(
        df, pw.splitter, alpha, average_effect, "regressor"
    )
    powers_exact_better[average_effect] = power_2_tails(
        df, pw.splitter, alpha, average_effect, "better_regressor"
    )

plt.plot(EFFECTS, list(powers.values()), label="regressor")
plt.plot(EFFECTS, list(powers_better.values()), label="better regressor")
plt.plot(EFFECTS, list(powers_exact.values()), label="exact regressor")
plt.plot(EFFECTS, list(powers_exact_better.values()), label="exact better regressor")
plt.legend()
plt.xlabel("Average effect")
plt.ylabel("Power")
plt.title("Power of the test")
plt.show()
to compare exact vs simulation.
Implement a washover process that applies several washover methods in sequence
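A sketch of the composition, assuming each washover step can be viewed as a function from DataFrame to DataFrame (the actual washover interface in the library may differ):

```python
from typing import Callable, List

import pandas as pd

# Hypothetical interface: a washover step maps a DataFrame to a filtered DataFrame
WashoverFn = Callable[[pd.DataFrame], pd.DataFrame]


def sequential_washover(steps: List[WashoverFn]) -> WashoverFn:
    """Compose several washover steps into one, applied left to right."""

    def apply(df: pd.DataFrame) -> pd.DataFrame:
        for step in steps:
            df = step(df)
        return df

    return apply
```

For example, a step that trims minutes around each switch could be chained with one that drops records spanning a treatment change.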
To see if docs-deploy is going to work
https://arxiv.org/pdf/2105.14705.pdf. This is a simpler method than clustered standard errors with OLS.
We can follow this.
A possible perturbator, given an ATE
Such that
Raised by @aureliolova.
time_col = "activation_time"
switch_frequency = "6h"
cluster_cols = ["city"]
splitter = SwitchbackSplitter(
    time_col=time_col,
    cluster_cols=cluster_cols,
    switch_frequency=switch_frequency,
)
split_df = splitter.assign_treatment_df(data)
If time_col is not in cluster_cols, we should raise an error. Another alternative is appending it in the init method.
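Both behaviours fit in a small check at init time (a sketch; validate_time_col and append_missing are made-up names):

```python
from typing import List


def validate_time_col(
    time_col: str, cluster_cols: List[str], append_missing: bool = False
) -> List[str]:
    """Fail early if time_col is missing from cluster_cols, or append it."""
    if time_col in cluster_cols:
        return cluster_cols
    if append_missing:
        return cluster_cols + [time_col]
    raise ValueError(
        f"time_col {time_col!r} must be part of cluster_cols {cluster_cols!r}"
    )
```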