Comments (11)
That means random_common_cause_refuter only changes the graph structure by adding a new node as a confounder, and the values of treatment and outcome remain unchanged, right?
Yes.
So how about add_unobserved_common_cause? Do you change the samples of treatment & outcome by adding (coef * common_cause) to the original treatment & outcome (e.g., Treatment'/Outcome' = Treatment/Outcome + α*common_cause)?
Even here, theoretically, only the graph is changed: we add an unobserved confounder that is correlated with treatment and outcome. To implement such a change, the default method actually modifies the treatment and outcome as you describe above. There are other methods (sensitivity analysis methods) under this same function that don't change treatment/outcome and instead follow a different approach.
The key difference between the two refutation methods is that in 1) random: the confounder is not correlated with either the outcome or the treatment; 2) add_unobserved: the missing confounder is assumed to cause both the treatment and the outcome.
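To make the data-level change concrete, here is a minimal sketch of the idea (my own illustration with hypothetical names and coefficients, not dowhy's actual implementation): adding α·U to both treatment and outcome makes a naive estimate drift away from the true effect, which is what the refuter measures.

import numpy as np

rng = np.random.default_rng(0)
n = 10_000

u = rng.normal(size=n)                           # hypothetical unobserved confounder U
treatment = rng.normal(size=n)                   # original treatment
outcome = 2.0 * treatment + rng.normal(size=n)   # true causal effect = 2.0

alpha = 0.5                                      # illustrative confounding strength
treatment_c = treatment + alpha * u              # Treatment' = Treatment + alpha*U
outcome_c = outcome + alpha * u                  # Outcome'  = Outcome  + alpha*U

# The naive regression slope on the confounded data is biased away from
# the true 2.0 (about 1.8 here), since U now partly drives both variables.
naive_slope = np.polyfit(treatment_c, outcome_c, 1)[0]
print(naive_slope)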
Thanks for raising this. We will update the documentation in the next few weeks.
Meanwhile, here's the answer.
- random common cause: adds a randomly generated common cause. The estimated effect should not change.
- add unobserved common cause: adds a common cause that has some correlation with the treatment and outcome. The correlation is a parameter; typically these refutation methods output a plot of how the estimate changes as the correlation is increased. The interpretation of the test is subjective (see the call sketch after this list).
- data_subset_refuter: considers a k% subset of the dataset and reruns the estimator. The estimate should not change.
- bootstrap refuter: a similar refuter. Here we construct a bootstrapped dataset of the same size as the original dataset. The re-estimated effect should not change.
For more information, you can refer to https://arxiv.org/abs/2011.04216.
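As a quick illustration of how the first two refuters are invoked (this follows the dowhy documentation; the add_unobserved_common_cause keyword arguments may differ across versions, and estimate stands for a previously computed effect estimate):

import numpy as np

# Random common cause: the re-estimated effect should be close to the original.
res_random = model.refute_estimate(identified_estimand, estimate,
                                   method_name="random_common_cause")
print(res_random)

# Unobserved common cause: sweep the assumed confounder strength on the
# treatment; with an array-valued strength, dowhy plots how the estimate
# changes as the confounding gets stronger.
res_unobserved = model.refute_estimate(identified_estimand, estimate,
                                       method_name="add_unobserved_common_cause",
                                       confounders_effect_on_treatment="linear",
                                       confounders_effect_on_outcome="linear",
                                       effect_strength_on_treatment=np.arange(0.0, 0.05, 0.01),
                                       effect_strength_on_outcome=0.02)
print(res_unobserved)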
@xwbxxx I've just written a very detailed guide to the refuter methods here https://causalwizard.app/inference/article/bootstrap-refuters-dowhy which might be helpful for you.
Thank you for your reply! That helps a lot!
random common cause: adds a randomly generated common cause. Estimated effect should not change.
That means random_common_cause_refuter only changes the graph structure by adding a new node as a confounder, and the values of treatment and outcome remain unchanged, right?
So how about add_unobserved_common_cause? Do you change the samples of treatment & outcome by adding (coef * common_cause) to the original treatment & outcome (e.g., Treatment'/Outcome' = Treatment/Outcome + α*common_cause)?
This issue is stale because it has been open for 14 days with no activity.
Thanks a lot, now I thoroughly understand their similarities and differences.
However, when I try to interpret the results, I'm confused once again. Here are my results of the bootstrap and subset refuters:
Considering the new effect, both are close to the estimated effect, which supports the robustness of the estimator (at least to some extent). But their p-values are quite different.
So how should I interpret the results based on the p-values?
Yeah, the new effect is almost the same in both cases, so the estimator is okay. I'm not sure why you are getting a p-value of 0 for the bootstrap refuter; that would usually indicate that the estimator failed the test. Can you share some code to reproduce the issue?
Here is my code. My dowhy version is 0.11. The p-value of 0 happened when I tried to refute the backdoor.linear_regression estimator.
from dowhy import CausalModel
import dowhy.datasets
import pandas as pd
import numpy as np

# Config dict to set the logging level
import logging.config

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'loggers': {
        '': {
            'level': 'WARN',
        },
    }
}
logging.config.dictConfig(DEFAULT_LOGGING)

# Value of the coefficient [BETA]
BETA = 10
# Number of Common Causes
NUM_COMMON_CAUSES = 2
# Number of Instruments
NUM_INSTRUMENTS = 1
# Number of Samples
NUM_SAMPLES = 200000
# Treatment is Binary
TREATMENT_IS_BINARY = False

data = dowhy.datasets.linear_dataset(beta=BETA,
                                     num_common_causes=NUM_COMMON_CAUSES,
                                     num_instruments=NUM_INSTRUMENTS,
                                     num_samples=NUM_SAMPLES,
                                     treatment_is_binary=TREATMENT_IS_BINARY)

model = CausalModel(
    data=data['df'],
    treatment=data['treatment_name'],
    outcome=data['outcome_name'],
    graph=data['gml_graph'],
    instruments=data['instrument_names']
)
model.view_model()

identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)
print("----------------------------------------------------")


def backdoor_linear():
    causal_estimate_bd = model.estimate_effect(identified_estimand,
                                               method_name="backdoor.linear_regression",
                                               target_units="ate")
    print("Causal effect of backdoor: ", causal_estimate_bd.value)
    print("-----------------")

    # random_common_cause ================================================================
    random_common_cause = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                                method_name="random_common_cause")
    print(random_common_cause)
    print("-----------------")

    # placebo_treatment ================================================================
    placebo_treatment = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                              method_name="placebo_treatment_refuter")
    print(placebo_treatment)
    print("-----------------")

    # dummy_outcome ================================================================
    dummy_outcome = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                          method_name="dummy_outcome_refuter")
    print(dummy_outcome[0])
    print("-----------------")

    # data_subset ================================================================
    res_subset = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                       method_name="data_subset_refuter",
                                       subset_fraction=0.8)
    print(res_subset)
    print("-----------------")

    # bootstrap ================================================================
    bootstrap = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                      method_name="bootstrap_refuter")
    print(bootstrap)
    print("----------------------------------------------------")


def instrumental_variable():
    causal_estimate_iv = model.estimate_effect(identified_estimand, method_name="iv.instrumental_variable")
    print("Causal effect of instrument variable: ", causal_estimate_iv.value)

    # placebo_treatment ================================================================
    placebo_treatment = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                              placebo_type="permute",
                                              method_name="placebo_treatment_refuter")
    print(placebo_treatment)
    print("-----------------")

    # causal_estimate_iv_2 = model.estimate_effect(identified_estimand,
    #                                              method_name="iv.instrumental_variable",
    #                                              method_params={'iv_instrument_name': 'Z0'})
    # placebo_treatment_2 = model.refute_estimate(identified_estimand, causal_estimate_iv_2,
    #                                             placebo_type="permute",
    #                                             method_name="placebo_treatment_refuter")
    # print(placebo_treatment_2)
    # print("-----------------")

    # random_common_cause ================================================================
    random_common_cause = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                                method_name="random_common_cause")
    print(random_common_cause)
    print("-----------------")

    # dummy_outcome ================================================================
    dummy_outcome = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                          method_name="dummy_outcome_refuter")
    print(dummy_outcome[0])
    print("-----------------")

    # data_subset ================================================================
    res_subset = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                       method_name="data_subset_refuter",
                                       subset_fraction=0.8)
    print(res_subset)
    print("-----------------")

    # bootstrap ==================================================================
    bootstrap = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                      method_name="bootstrap_refuter")
    print(bootstrap)


backdoor_linear()
instrumental_variable()
I think I found the reason for the p-value of 0 even when the new effect and the expected effect are similar.
def perform_bootstrap_test(estimate, simulations: List):
    # This calculates a two-sided percentile p-value
    # See footnotes in https://journals.sagepub.com/doi/full/10.1177/2515245920911881
    half_p_value = np.mean([(x > estimate.value) + 0.5 * (x == estimate.value) for x in simulations])
    return 2 * min(half_p_value, 1 - half_p_value)
The p-value is derived from the function above; it indicates where the estimate (the expected effect) falls within the distribution of the simulations (the new effects from the refuter). If the distribution of the simulations is narrow, or even worse, concentrated on a single value, the estimate won't fall inside the distribution, and the p-value becomes 0.
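A couple of hypothetical numbers make this failure mode concrete. The helper below re-implements the percentile formula quoted above, taking the estimate's numeric value directly:

import numpy as np

def two_sided_percentile_p(value, simulations):
    half = np.mean([(x > value) + 0.5 * (x == value) for x in simulations])
    return 2 * min(half, 1 - half)

# Healthy case: the original estimate sits inside a spread-out bootstrap
# distribution, so the p-value is comfortably large.
print(two_sided_percentile_p(10.0, [9.7, 9.9, 10.1, 10.2, 10.4]))  # 0.8

# Degenerate case: every bootstrap estimate lands (numerically) on one side
# of the original value, so half is 0 or 1 and the p-value is exactly 0,
# even though the estimates differ only in the third decimal place.
print(two_sided_percentile_p(10.0, [10.001, 10.002, 10.003]))       # 0.0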
This issue is stale because it has been open for 14 days with no activity.
This issue was closed because it has been inactive for 7 days since being marked as stale.