
Comments (11)

amit-sharma commented on July 23, 2024

That means random_common_cause_refuter only changes the graph structure by adding a new node as a confounder, and the values of treatment and outcome remain unchanged, right?

Yes.

So how about add_unobserved_common_cause? Do you change the sample of treatment & outcome by adding (coef * common_cause) to the original treatment & outcome (e.g. Treatment'/Outcome' = Treatment/Outcome + α*common_cause)?

Even here, theoretically, only the graph is changed: we add an unobserved confounder that is correlated with both treatment and outcome. To implement such a change, the default method actually modifies the treatment and outcome values as you describe above. There are other methods (sensitivity-analysis methods) under this same function that don't change treatment/outcome and instead follow a different approach.

The key difference between the two refutation methods is that in 1) random: the added confounder is correlated with neither the treatment nor the outcome; 2) add_unobserved: the missing confounder is assumed to cause both the treatment and the outcome.
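A minimal sketch of the default simulation idea described above. The column names (v0 for treatment, y for outcome) and the effect strengths are hypothetical stand-ins for illustration, not dowhy's actual internals:

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({"v0": rng.normal(size=n), "y": rng.normal(size=n)})

# Simulate an unobserved confounder U and inject it into both columns,
# i.e. Treatment' = Treatment + alpha*U and Outcome' = Outcome + beta*U.
U = rng.normal(size=n)
alpha, beta = 0.5, 0.8  # hypothetical effect strengths
df_confounded = df.assign(v0=df["v0"] + alpha * U, y=df["y"] + beta * U)

# Re-estimating the effect on df_confounded shows how sensitive the
# original estimate is to a confounder of this strength.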


amit-sharma commented on July 23, 2024

Thanks for raising this. We will update the documentation in the next few weeks.
Meanwhile, here's the answer.

  • random common cause: adds a randomly generated common cause. The estimated effect should not change.

  • add unobserved common cause: adds a common cause that has some correlation with the treatment and outcome. The correlation is a parameter; typically these refutation methods output a plot of how the estimate changes as the correlation is increased, so the interpretation of the test is subjective (see the sketch after this list).

  • data_subset_refuter: considers a k% subset of the dataset and reruns the estimator. The estimate should not change.

  • bootstrap refuter: a similar refuter, but here we construct a bootstrapped sample of the same size as the dataset and re-estimate. The re-estimated effect should not change.
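As a sketch of what the add unobserved common cause refuter looks like in practice, assuming a model, identified_estimand, and estimate like the ones constructed later in this thread (the keyword arguments follow dowhy's add_unobserved_common_cause options):

import numpy as np

refute = model.refute_estimate(
    identified_estimand, estimate,
    method_name="add_unobserved_common_cause",
    confounders_effect_on_treatment="linear",  # how the simulated confounder enters the treatment
    confounders_effect_on_outcome="linear",    # how it enters the outcome
    # Passing ranges instead of single values makes the refuter report how the
    # estimate moves as the confounder's strength grows (typically as a plot).
    effect_strength_on_treatment=np.arange(0.0, 0.05, 0.01),
    effect_strength_on_outcome=np.arange(0.0, 0.05, 0.01),
)
print(refute)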

For more info, you can refer to https://arxiv.org/abs/2011.04216


drawlinson commented on July 23, 2024

@xwbxxx I've just written a very detailed guide to the refuter methods here https://causalwizard.app/inference/article/bootstrap-refuters-dowhy which might be helpful for you.


xwbxxx commented on July 23, 2024

Thank you for your reply! That helps a lot!

random common cause: adds a randomly generated common cause. The estimated effect should not change.

That means random_common_cause_refuter only changes the graph structure by adding a new node as a confounder, and the values of treatment and outcome remain unchanged, right?

So how about add_unobserved_common_cause? Do you change the sample of treatment & outcome by adding (coef * common_cause) to the original treatment & outcome (e.g. Treatment'/Outcome' = Treatment/Outcome + α*common_cause)?


github-actions commented on July 23, 2024

This issue is stale because it has been open for 14 days with no activity.


xwbxxx commented on July 23, 2024

Thanks a lot, now I thoroughly understand their similarities and differences.

However, when I try to interpret the results, once again I'm confused. Here are my results of the bootstrap and data-subset refuters:
[screenshot: bootstrap and data-subset refuter outputs]
Considering the new effect, both are close to the estimated effect, which supports the robustness of the estimator (at least to some extent). But their p-values are quite different.

So how should I interpret the results based on the p-values?


amit-sharma commented on July 23, 2024

Yeah, the new effect is almost the same in both cases, so the estimator is okay. I'm not sure why you are getting a p-value of 0 for the bootstrap refuter; that would usually indicate that the estimator failed the test. Can you share some code to reproduce the issue?


xwbxxx commented on July 23, 2024

Here is my code. My dowhy version is 0.11. The p-value of 0 happened when I tried to refute the backdoor.linear_regression estimator.

from dowhy import CausalModel
import dowhy.datasets
import pandas as pd
import numpy as np

# Config dict to set the logging level
import logging.config

DEFAULT_LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'loggers': {
        '': {
            'level': 'WARN',
        },
    }
}

logging.config.dictConfig(DEFAULT_LOGGING)

# Value of the coefficient [BETA]
BETA = 10
# Number of Common Causes
NUM_COMMON_CAUSES = 2
# Number of Instruments
NUM_INSTRUMENTS = 1
# Number of Samples
NUM_SAMPLES = 200000
# Treatment is Binary
TREATMENT_IS_BINARY = False
data = dowhy.datasets.linear_dataset(beta=BETA,
                                     num_common_causes=NUM_COMMON_CAUSES,
                                     num_instruments=NUM_INSTRUMENTS,
                                     num_samples=NUM_SAMPLES,
                                     treatment_is_binary=TREATMENT_IS_BINARY)

model = CausalModel(
    data=data['df'],
    treatment=data['treatment_name'],
    outcome=data['outcome_name'],
    graph=data['gml_graph'],
    instruments=data['instrument_names']
)

model.view_model()
identified_estimand = model.identify_effect(proceed_when_unidentifiable=True)
print(identified_estimand)
print("----------------------------------------------------")


def backdoor_linear():
    causal_estimate_bd = model.estimate_effect(identified_estimand,
                                               method_name="backdoor.linear_regression",
                                               target_units="ate")

    print("Causal effect of backdoor: ", causal_estimate_bd.value)
    print("-----------------")

    # random_common_cause ================================================================
    random_common_cause = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                                method_name="random_common_cause")
    print(random_common_cause)
    print("-----------------")

    # placebo_treatment ================================================================
    placebo_treatment = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                              method_name="placebo_treatment_refuter")
    print(placebo_treatment)
    print("-----------------")

    # dummy_outcome ================================================================
    dummy_outcome = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                          method_name="dummy_outcome_refuter")
    print(dummy_outcome[0])
    print("-----------------")

    # data_subset ================================================================
    res_subset = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                       method_name="data_subset_refuter",
                                       subset_fraction=0.8)
    print(res_subset)
    print("-----------------")

    # bootstrap ================================================================
    bootstrap = model.refute_estimate(identified_estimand, causal_estimate_bd,
                                      method_name="bootstrap_refuter")
    print(bootstrap)
    print("----------------------------------------------------")


def instrumental_variable():
    causal_estimate_iv = model.estimate_effect(identified_estimand, method_name="iv.instrumental_variable", )
    print("Causal effect of instrument variable: ", causal_estimate_iv.value)

    # placebo_treatment ================================================================
    placebo_treatment = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                              placebo_type="permute",
                                              method_name="placebo_treatment_refuter")
    print(placebo_treatment)
    print("-----------------")

    # causal_estimate_iv_2 = model.estimate_effect(identified_estimand,
    #                                              method_name="iv.instrumental_variable",
    #                                              method_params={'iv_instrument_name': 'Z0'})
    # placebo_treatment_2 = model.refute_estimate(identified_estimand, causal_estimate_iv_2,
    #                                             placebo_type="permute",
    #                                             method_name="placebo_treatment_refuter")
    # print(placebo_treatment_2)
    # print("-----------------")
    # random_common_cause ================================================================
    random_common_cause = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                                method_name="random_common_cause")
    print(random_common_cause)
    print("-----------------")

    # dummy_outcome ================================================================
    dummy_outcome = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                          method_name="dummy_outcome_refuter")
    print(dummy_outcome[0])
    print("-----------------")

    # data_subset ================================================================
    res_subset = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                       method_name="data_subset_refuter",
                                       subset_fraction=0.8)
    print(res_subset)
    print("-----------------")

    # bootstrap ==================================================================
    bootstrap = model.refute_estimate(identified_estimand, causal_estimate_iv,
                                      method_name="bootstrap_refuter")
    print(bootstrap)


backdoor_linear()
instrumental_variable()


xwbxxx commented on July 23, 2024

I think I found the reason for the p-value of 0 in a case where the new effect and the expected effect are similar.

from typing import List

import numpy as np


def perform_bootstrap_test(estimate, simulations: List):
    # This calculates a two-sided percentile p-value
    # See footnotes in https://journals.sagepub.com/doi/full/10.1177/2515245920911881
    half_p_value = np.mean([(x > estimate.value) + 0.5 * (x == estimate.value) for x in simulations])
    return 2 * min(half_p_value, 1 - half_p_value)

The p-value is derived from the function above: it measures where the original estimate (the expected effect) falls within the distribution of the simulations (the new effects from the refuter).
If the distribution of simulations is very narrow, or worse, collapses onto a single value, the estimate falls entirely to one side of it, and the p-value becomes 0.
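A quick numeric illustration of that failure mode, using SimpleNamespace as a stand-in for the CausalEstimate object that dowhy would normally pass in:

from types import SimpleNamespace

import numpy as np


def perform_bootstrap_test(estimate, simulations):
    half_p_value = np.mean([(x > estimate.value) + 0.5 * (x == estimate.value) for x in simulations])
    return 2 * min(half_p_value, 1 - half_p_value)


estimate = SimpleNamespace(value=10.0)  # stand-in for a CausalEstimate

# Simulations spread around the estimate: the estimate sits well inside
# the distribution, so the p-value is large.
wide = list(10.0 + np.random.default_rng(0).normal(0, 0.5, size=100))
print(perform_bootstrap_test(estimate, wide))

# Degenerate simulations that all take (almost) the same value: the estimate
# falls entirely to one side of the distribution, so the p-value is 0.
degenerate = [10.001] * 100
print(perform_bootstrap_test(estimate, degenerate))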


github-actions commented on July 23, 2024

This issue is stale because it has been open for 14 days with no activity.


github-actions commented on July 23, 2024

This issue was closed because it has been inactive for 7 days since being marked as stale.

