6.3 Statistical Control Done Right

The central problem of observational data is confounding:

  • the presence of a common cause that lurks behind the potential cause of interest and the outcome of interest.

A confounding influence can introduce what is often called a spurious correlation, which ought not to be confused with a causal effect.

  • The extraordinary influence of randomized experiments in testing causal inferences rests on a simple fact: if the independent variable is randomly assigned (for example, by the flip of a coin), then by design it cannot share a common cause with the outcome.
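The contrast between observational and randomized data can be made concrete with a small simulation. This is only a sketch under stated assumptions: the variable names, the linear data-generating process, and the sample size are hypothetical choices for illustration, and the independent variable is given no causal effect on the outcome at all.

```python
import random

def pearson(xs, ys):
    """Pearson correlation computed from scratch (standard library only)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(0)  # fixed seed so the run is reproducible
n = 20_000

# Hypothetical common cause z. In this simulation x has NO causal effect on y.
z = [rng.gauss(0, 1) for _ in range(n)]
y = [zi + rng.gauss(0, 1) for zi in z]

# Observational setting: x depends on the common cause z.
x_observed = [zi + rng.gauss(0, 1) for zi in z]

# Experimental setting: x is assigned by a coin flip, independent of z.
x_randomized = [rng.choice([0, 1]) for _ in range(n)]

r_observed = pearson(x_observed, y)
r_randomized = pearson(x_randomized, y)
print(f"observational r = {r_observed:.2f}")
print(f"randomized    r = {r_randomized:.2f}")
```

Under these assumptions the observational correlation should be clearly nonzero even though no causal effect exists, whereas the correlation under random assignment should be close to zero.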

How can a DAG be used to figure out how to remove all such noncausal associations so that only the true causal effect remains?

6.3.1 Building a DAG

To derive a valid causal conclusion, one must ensure the DAG includes everything that is relevant to the causal effect of interest. What is missing?

If we want to derive a valid causal conclusion, we need to build a causal DAG that is complete because it includes all common causes of all pairs of variables that are already included in the DAG (Spirtes, Glymour, & Scheines, 2000).

That is, any additional variable that either directly or indirectly causally affects at least two variables already included in the DAG should be included.

6.3.2 Building a DAG: Back-Door Paths

After a DAG is built, back-door paths can be discerned.

Back-door paths are all paths that start with an arrow pointing to the independent variable and end with an arrow pointing to the dependent variable.

If we plan to investigate the causal relationship between child maltreatment and externalizing, what are the back-door paths in our example DAG?

Back-Door Paths

  • child maltreatment ← support ← income → externalizing
  • child maltreatment ← income → externalizing
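The two paths listed above can also be found mechanically. The following sketch enumerates back-door paths using the definition given in this section (the first arrow points into the independent variable and the last arrow points into the dependent variable). The edge list is an assumption: it contains only the arrows implied by the listed paths plus the causal effect of interest, and the full example DAG may contain more. The function names are hypothetical.

```python
# Arrows implied by the listed paths (assumption: the full DAG may have more).
EDGES = [
    ("income", "support"),
    ("support", "child maltreatment"),
    ("income", "child maltreatment"),
    ("income", "externalizing"),
    ("child maltreatment", "externalizing"),
]

def backdoor_paths(edges, cause, outcome):
    """Simple paths from cause to outcome that start with an arrow into
    the cause and end with an arrow into the outcome."""
    directed = set(edges)
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    found = []

    def walk(path):
        node = path[-1]
        if node == outcome:
            starts_into_cause = (path[1], path[0]) in directed
            ends_into_outcome = (path[-2], path[-1]) in directed
            if starts_into_cause and ends_into_outcome:
                found.append(path)
            return
        for nxt in sorted(neighbors[node]):
            if nxt not in path:  # keep paths simple (no repeated nodes)
                walk(path + [nxt])

    walk([cause])
    return found

def render(path, edges):
    """Write a path with arrows showing the direction of each edge."""
    directed = set(edges)
    text = path[0]
    for a, b in zip(path, path[1:]):
        arrow = "→" if (a, b) in directed else "←"
        text += f" {arrow} {b}"
    return text

rendered = [render(p, EDGES)
            for p in backdoor_paths(EDGES, "child maltreatment", "externalizing")]
for line in rendered:
    print(line)
```

The direct path child maltreatment → externalizing is excluded because it starts with an arrow pointing away from the independent variable.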

Back-Door Problems

  • Back-door paths are problematic whenever they transmit an association.
  • In this case, both back-door paths consist only of chains and forks and contain no colliders. Both paths are therefore open and can transmit a spurious association.

Back-Door Solutions

  • The zero-order correlation between child maltreatment and externalizing is a mix of the true causal effect of interest (child maltreatment → externalizing) and any noncausal association transmitted by the two back-door paths.
  • To remove the undesirable noncausal association, we must block the two back-door paths.

Blocking Back-Door Paths

The purpose of third-variable control is to block open back-door paths.

  • If all back-door paths between the independent and dependent variables can be blocked, then the causal effect connecting the independent and dependent variables can be identified.

  • Such a causal effect would be considered identifiable, always under the assumption that the DAG captures the true underlying causal web.

  • A back-door path can be blocked by “cutting” the transmission of association at any point in the path by statistically controlling for a node.
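What statistical control does to a back-door path can be illustrated with simulated data. The sketch below assumes a linear model with hypothetical coefficients (including a true effect of 0.3) and uses only the arrows implied by the back-door paths listed earlier. Control is carried out by residualization: the linear influence of income is removed from both variables before they are related to each other, which gives the same slope as including income as a covariate in a linear regression.

```python
import random

def slope(xs, ys):
    """Ordinary least-squares slope of ys on xs (single predictor)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

def residuals(xs, ys):
    """What is left of ys after removing its linear dependence on xs."""
    b = slope(xs, ys)
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return [(y - my) - b * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(1)  # fixed seed so the run is reproducible
n = 20_000
TRUE_EFFECT = 0.3  # hypothetical value chosen for the simulation

# All coefficients below are illustrative assumptions, not estimates.
income = [rng.gauss(0, 1) for _ in range(n)]
support = [0.7 * inc + rng.gauss(0, 1) for inc in income]
maltreatment = [-0.5 * s - 0.5 * inc + rng.gauss(0, 1)
                for s, inc in zip(support, income)]
externalizing = [TRUE_EFFECT * m - 0.8 * inc + rng.gauss(0, 1)
                 for m, inc in zip(maltreatment, income)]

# Zero-order association: causal effect plus back-door association.
naive = slope(maltreatment, externalizing)

# Controlling for income: relate the parts of both variables
# that income does not explain.
adjusted = slope(residuals(income, maltreatment),
                 residuals(income, externalizing))

print(f"unadjusted slope = {naive:.2f}")
print(f"adjusted slope   = {adjusted:.2f}")
```

Under these assumptions the unadjusted slope overstates the effect, while the adjusted slope should land close to the simulated value of 0.3. The result depends on the simulated model matching the DAG, which mirrors the caveat that identification always assumes the DAG is correct.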

What variables would we want to control for to identify the causal effect of child maltreatment on externalizing?
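One way to explore this question is to check every candidate control set against the two back-door paths. The sketch below is a simplified path-blocking check and not a full implementation of any particular library's algorithm. The edge set is again an assumption restricted to the arrows implied by the listed paths, and the helper names are hypothetical.

```python
from itertools import combinations

# Arrows implied by the two back-door paths (assumption: the full example
# DAG may contain further nodes and arrows).
EDGES = {
    ("income", "support"),
    ("support", "child maltreatment"),
    ("income", "child maltreatment"),
    ("income", "externalizing"),
    ("child maltreatment", "externalizing"),
}

BACKDOOR_PATHS = [
    ["child maltreatment", "support", "income", "externalizing"],
    ["child maltreatment", "income", "externalizing"],
]

def descendants(node):
    """All nodes reachable from node by following arrows forward."""
    seen, stack = set(), [node]
    while stack:
        current = stack.pop()
        for a, b in EDGES:
            if a == current and b not in seen:
                seen.add(b)
                stack.append(b)
    return seen

def is_blocked(path, controlled):
    """True if the path transmits no association given the controlled set."""
    for left, mid, right in zip(path, path[1:], path[2:]):
        is_collider = (left, mid) in EDGES and (right, mid) in EDGES
        if is_collider:
            # A collider blocks the path unless it, or one of its
            # descendants, is controlled.
            if mid not in controlled and not (descendants(mid) & controlled):
                return True
        elif mid in controlled:
            # Controlling the middle node of a chain or fork cuts the path.
            return True
    return False

# Candidates exclude the cause, the outcome, and descendants of the cause.
candidates = ["income", "support"]
sufficient = []
for size in range(len(candidates) + 1):
    for subset in combinations(candidates, size):
        if all(is_blocked(p, set(subset)) for p in BACKDOOR_PATHS):
            sufficient.append(sorted(subset))

print(sufficient)
```

Under the assumed edges, the sets that block both back-door paths are those that contain income. Controlling for support alone leaves the path through income open.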