Comments (12)
Hey @JonasRSV, thanks for bringing up this example. This kind of indirect-effect graph is more commonly used for estimating causal mediation effects. Since DoWhy does not currently support mediation effects, the code simply assumes that a direct edge exists.
I can answer better if you don't mind providing more details about the goal of your analysis. Can you clarify the effect that you are trying to estimate? From the description, I understand that you want to estimate the effect of error_code on days_on_grace, but the current graph has no observed common causes (confounders), so it reduces to a problem with a cause, an outcome, and no confounders. Is that the correct interpretation?
from dowhy.
Yes, the mediation effect is what I was looking for. This was just an example.
I am looking forward to that feature!
Hey, is this implemented now? Can I do mediation analysis using DoWhy?
To clarify, I have an edge between treatment and outcome as well as a mediator variable, so I am able to draw the graph.
Not yet, @sangyh. Can you share your causal graph and a motivating example of the effect that you want to calculate?
I can work on adding it.
I am also interested in the mediation analysis. In Pearl's book, my understanding is that mediation can be addressed by choosing whether to control for the mediator or not. I have the current DAG. Any thoughts on how to develop the mediation myself for the estimation problem?
@samou1 Are you looking to calculate the effect of LCD on T2D? Here's a way to do it.
- The direct effect of LCD (changing from LCD1 to LCD2) on T2D is given by
(E[T2D | LCD2, BMI, G, A] - E[T2D | LCD1, BMI, G, A]) P(BMI | LCD1, G, A) P(G, A)
where G is gender and A is age, and the above formula is for a specific value of (BMI, G, A). To find the average direct effect, sum the above formula over all values of (BMI, G, A).
- The total effect of LCD on T2D is estimated by conditioning on Gender and Age (backdoor identification TE(LCD)).
- The direct effect of BMI on T2D is estimated by conditioning on Gender, Age and LCD (backdoor identification DE(BMI)).
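When every variable is discrete, the sum described above can be computed directly from cell counts. The following is only a minimal sketch under that assumption: the function name `natural_direct_effect`, the row layout, and the toy data are all hypothetical and not part of DoWhy. It requires every (mediator, confounder) cell seen under the baseline treatment to also appear under the other treatment.

```python
from collections import defaultdict


def natural_direct_effect(rows, t0, t1):
    """Plug-in estimate of the direct-effect sum for discrete data.

    rows: iterable of (treatment, mediator, confounders, outcome) tuples,
          where `confounders` is hashable, e.g. (gender, age_group).
    """
    rows = list(rows)
    n = len(rows)
    y_sum = defaultdict(float)  # (t, m, c) -> sum of outcomes
    y_cnt = defaultdict(int)    # (t, m, c) -> number of rows
    tc_cnt = defaultdict(int)   # (t, c) -> number of rows
    c_cnt = defaultdict(int)    # c -> number of rows
    for t, m, c, y in rows:
        y_sum[(t, m, c)] += y
        y_cnt[(t, m, c)] += 1
        tc_cnt[(t, c)] += 1
        c_cnt[c] += 1

    nde = 0.0
    for (t, m, c), cnt in list(y_cnt.items()):
        if t != t0:
            continue
        if y_cnt.get((t1, m, c), 0) == 0:
            raise ValueError(f"no rows with treatment {t1!r} in cell {(m, c)!r}")
        mean_t0 = y_sum[(t0, m, c)] / cnt                 # E[Y | t0, m, c]
        mean_t1 = y_sum[(t1, m, c)] / y_cnt[(t1, m, c)]   # E[Y | t1, m, c]
        p_m = cnt / tc_cnt[(t0, c)]                       # P(m | t0, c)
        p_c = c_cnt[c] / n                                # P(c)
        nde += (mean_t1 - mean_t0) * p_m * p_c
    return nde


# Toy data (made up): outcome = 2*t + 3*m + c, so the direct effect of t is 2.
rows = []
for t in (0, 1):
    for m in (0, 1):
        for c in (0, 1):
            copies = 1 + t + 2 * m + c  # deliberately unbalanced cells
            rows += [(t, m, c, 2 * t + 3 * m + c)] * copies

print(natural_direct_effect(rows, t0=0, t1=1))  # close to 2.0, up to rounding
```

In the LCD example, `t0` and `t1` would play the role of LCD1 and LCD2, the mediator would be BMI, and the confounder tuple would be (G, A).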
@sangyh @samou1 @JonasRSV Mediation effects are now supported in DoWhy! Do try it out and share your feedback. Here's a full example notebook.
Summary
There are two new estimand types in identify_effect:
- nonparametric-nde: This is the natural direct effect of treatment on outcome (T->Y).
- nonparametric-nie: This is the natural indirect effect, mediated by another variable (T->M->Y).
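For reference, these two quantities are usually defined in counterfactual notation as below. This is the standard textbook form from Pearl's mediation framework, not something taken from the DoWhy source; the symbols t_0 (baseline treatment), t_1 (alternative treatment), M(t) and Y(t, m) are notation introduced here, and which treatment level DoWhy uses as the reference is an assumption.

```latex
\begin{aligned}
\mathrm{NDE} &= E\big[Y(t_1, M(t_0))\big] - E\big[Y(t_0, M(t_0))\big] \\
\mathrm{NIE} &= E\big[Y(t_0, M(t_1))\big] - E\big[Y(t_0, M(t_0))\big]
\end{aligned}
```

The NDE changes the treatment while holding the mediator at the value it would take under the baseline; the NIE holds the treatment at the baseline and changes only the mediator.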
For estimation, the implemented estimator is simple: it is a two-stage linear regression estimator. But the API is general: you can specify a first_stage_model and a second_stage_model. I will be adding a non-linear estimator soon. Here's a code sample.
For the direct effect of treatment on outcome
# Natural direct effect (nde)
import dowhy.causal_estimators.linear_regression_estimator
identified_estimand_nde = model.identify_effect(estimand_type="nonparametric-nde",
                                                proceed_when_unidentifiable=True)
print(identified_estimand_nde)
causal_estimate_nde = model.estimate_effect(
    identified_estimand_nde,
    method_name="mediation.two_stage_regression",
    confidence_intervals=False,
    test_significance=False,
    method_params={
        'first_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator,
        'second_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator
    }
)
print(causal_estimate_nde)
For the indirect effect of treatment on outcome
# Natural indirect effect (nie)
identified_estimand_nie = model.identify_effect(estimand_type="nonparametric-nie",
                                                proceed_when_unidentifiable=True)
print(identified_estimand_nie)
causal_estimate_nie = model.estimate_effect(
    identified_estimand_nie,
    method_name="mediation.two_stage_regression",
    confidence_intervals=False,
    test_significance=False,
    method_params={
        'first_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator,
        'second_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator
    }
)
print(causal_estimate_nie)
The frontdoor criterion is also supported through the same two-stage estimator. To use frontdoor, write:
import dowhy.causal_estimators.linear_regression_estimator
# identified_estimand is assumed to come from an earlier model.identify_effect(...) call
causal_estimate = model.estimate_effect(
    identified_estimand,
    method_name="frontdoor.two_stage_regression",
    confidence_intervals=False,
    test_significance=False,
    method_params={
        'first_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator,
        'second_stage_model': dowhy.causal_estimators.linear_regression_estimator.LinearRegressionEstimator
    }
)
print(causal_estimate)
For a full code example, you can check out the notebook on mediation effects with DoWhy: https://github.com/microsoft/dowhy/blob/master/docs/source/example_notebooks/dowhy_mediation_analysis.ipynb
Hi Amit, thanks for the update and for implementing this. To clarify, this is the Baron and Kenny approach to mediation and not Pearl's approach?
In that case, I presume I would need some tests for linearity.
@sangyh Yes, the estimator implements the Baron and Kenny approach. However, the modeling and identification steps before it are done using Pearl's approach. So given a causal graph with mediation (and other confounders), DoWhy can find the right variables to include in the regression formula.
I also plan to add the non-parametric estimator based on Pearl's identification results. That should be implemented in the coming weeks. The linear case was the simplest to implement, so I started with that.
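For readers who want the linear calculation spelled out, here is a minimal sketch of the product-of-coefficients computation commonly associated with the Baron and Kenny regressions. It is not DoWhy's internal implementation: the helper names `ols` and `baron_kenny` and the toy data are hypothetical, there are no confounders, and it assumes linear effects with no treatment-mediator interaction.

```python
def ols(columns, y):
    """Least-squares coefficients [intercept, b1, b2, ...] for y ~ 1 + columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations [X'X | X'y].
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         + [sum(X[r][i] * y[r] for r in range(n))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(A[r][i]))
        A[i], A[p] = A[p], A[i]
        for r in range(k):
            if r != i:
                f = A[r][i] / A[i][i]
                A[r] = [a - f * b for a, b in zip(A[r], A[i])]
    return [A[i][k] / A[i][i] for i in range(k)]


def baron_kenny(t, m, y):
    """Direct, indirect and total effects from two linear regressions."""
    _, a = ols([t], m)              # first stage:  M ~ T
    _, direct, b = ols([t, m], y)   # second stage: Y ~ T + M
    return {"direct": direct, "indirect": a * b, "total": direct + a * b}


# Toy data (made up): M = 0.5*T + e and Y = 1 + 2*T + 3*M with no noise,
# so the direct effect is 2 and the indirect effect is 0.5 * 3 = 1.5.
t = [0, 0, 0, 0, 1, 1, 1, 1]
e = [-1, 1, -2, 2, -1, 1, -2, 2]
m = [0.5 * ti + ei for ti, ei in zip(t, e)]
y = [1 + 2 * ti + 3 * mi for ti, mi in zip(t, m)]

effects = baron_kenny(t, m, y)
print(effects)
```

With observed confounders, the adjustment variables chosen at the identification step would be added as extra columns in both regressions.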
Thanks Amit. I realized I have a confounder that causes both the mediator and the outcome, so I am afraid the BK approach will not work. I will try implementing Pearl's approach if you haven't already implemented it in DoWhy.
In your comment to @samou1, what is 'each value of (BMI, G, A)' when all 3 are continuous variables?
When all three are continuous variables, the sum over each value of (BMI, G, A) becomes an integral over the same variables, weighted by the density P(BMI | LCD1, G, A) P(G, A) from the formula. If the integration is numerically difficult, you can discretize the variables into reasonable buckets and then sum.
Unfortunately, it may take a few weeks before the Pearlian non-parametric estimator is implemented. Do let me know how your implementation goes for this estimator, @sangyh.
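One simple way to do the bucketing is to cut each continuous variable at its sample quantiles. The sketch below uses only the Python standard library; the function name `quantile_buckets`, the choice of four buckets, and the BMI values are illustrative assumptions, not a recommendation from DoWhy.

```python
import bisect
import statistics


def quantile_buckets(values, n_buckets=4):
    """Map each continuous value to a bucket index in 0..n_buckets-1."""
    # statistics.quantiles returns n_buckets - 1 cut points.
    cuts = statistics.quantiles(values, n=n_buckets)
    return [bisect.bisect_right(cuts, v) for v in values]


# Made-up BMI values, used only to show the shape of the output.
bmi = [18.5, 21.0, 22.4, 24.9, 27.3, 29.8, 31.2, 35.0]
bmi_bucket = quantile_buckets(bmi, n_buckets=4)
print(bmi_bucket)
```

The bucket indices can then be treated as discrete values when summing over (BMI, G, A). More buckets reduce discretization bias but leave fewer rows per cell, so the number of buckets is a trade-off that depends on sample size.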
Hi all!
I am struggling to identify the correct estimand when using multiple mediators:
I am using Gender_Male as a treatment and Hourly_Salary as an outcome. And I am interested in the natural direct vs. natural indirect effects.
When running: model.identify_effect(estimand_type="nonparametric-nde"), I only get the estimand for ONE mediator, which seems to be randomly selected:
Can someone explain this behavior? Can dowhy not handle multiple mediators? Thank you very much in advance!