235k views
2 votes
Simulate data from this DAG: X → Y → Z. Now fit a model that predicts Y using both X and Z. What kind of confound arises, in terms of inferring the causal influence of X on Y?

User Gagaro
by
8.0k points

1 Answer

3 votes

Final answer:

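Fitting a model that predicts Y from both X and Z under the DAG X → Y → Z conditions on a descendant of the outcome. Z is not a collider here; it is a consequence of Y. Because Z carries information about Y, including it absorbs part of the variation in Y that X would otherwise explain, so the coefficient on X is biased, typically toward zero. This is sometimes called case-control bias: conditioning on a consequence of the outcome.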

Step-by-step explanation:

In the directed acyclic graph (DAG) X → Y → Z, the only path between X and Y is the direct arrow X → Y. There is no backdoor path and no collider: a collider needs two arrows pointing into it, and Z has only one parent, Y. So the problem with adding Z as a predictor is not collider bias. The problem is that Z is a descendant of Y. Conditioning on Z is like selecting on a noisy copy of the outcome: within any stratum of Z, Y varies much less than it does overall, which leaves less variation for X to explain. The estimated coefficient on X therefore shrinks toward zero, while Z looks like a strong predictor of Y even though it has no causal effect on Y.

The size of the distortion depends on how tightly Z tracks Y: the less noise there is in Z given Y, the more the coefficient on X is attenuated. A model with both predictors may still predict Y well, but its coefficient on X no longer estimates the causal influence of X on Y. To estimate that effect in this DAG structure, regress Y on X alone and leave Z out.
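
Since the question asks for a simulation, here is a minimal sketch in Python with NumPy. The linear Gaussian form, the slopes of 1.0, the unit noise variances, the sample size, and the helper name `ols` are all assumptions chosen for illustration; the question does not specify any of them.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed structural equations for X -> Y -> Z (illustrative values).
b_xy = 1.0  # true causal effect of X on Y
b_yz = 1.0  # effect of Y on Z

X = rng.normal(size=n)
Y = b_xy * X + rng.normal(size=n)
Z = b_yz * Y + rng.normal(size=n)


def ols(y, *predictors):
    """Least-squares coefficients: intercept first, then one per predictor."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


coef_x_only = ols(Y, X)        # Y ~ X
coef_x_and_z = ols(Y, X, Z)    # Y ~ X + Z

print("Y ~ X     : coefficient on X =", round(coef_x_only[1], 3))
print("Y ~ X + Z : coefficient on X =", round(coef_x_and_z[1], 3))
print("Y ~ X + Z : coefficient on Z =", round(coef_x_and_z[2], 3))
```

Under these assumed settings, the model with X alone should recover a coefficient close to the true value of 1.0, while the model that also includes Z should give a coefficient on X close to 0.5, with Z picking up a coefficient of similar size. Making Z a less noisy function of Y (a smaller noise variance in the last structural equation) pushes the coefficient on X further toward zero.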

User Bruce Christensen
by
8.4k points