Lecture 01
Goals of the course:
- Quantify causal questions using the mathematical language of potential outcomes (one framework)
- Design studies to estimate causal effects
- Analyze data from these studies to estimate causal effects
- Assess robustness of analysis to violations of underlying modeling assumptions.
Potential Outcomes Model for Defining Effects Caused by a Treatment
Definitions
These are due to Neyman (1923) and Rubin (1974). More background can be found in Holland (1986).
Let T be a set of treatments. The simplest example is perhaps T = {0, 1}, where 0 corresponds to a control and 1 corresponds to a treatment.
A "unit" refers to a sample to which a treatment can be applied, e.g. a patient, or perhaps a model run. We use the notation:
- Y_i(1) denotes the potential outcome for unit i if treatment is applied.
- Y_i(0) denotes the potential outcome for unit i if control is applied.
The causal effect of treatment compared to control for unit i can be expressed as Y_i(1) - Y_i(0).
Let's put together a climate example: in a weather-scale model experiment, the treatment (1) is deep atmospheric dynamics, the control (0) is shallow dynamics, and the outcome Y records whether a tropical cyclone forms (Y = 1) or not (Y = 0).
- Y_i(1) - Y_i(0) = 1 would show that deep atmosphere causes a tropical cyclone for unit i.
- Y_i(1) - Y_i(0) = 0 means deep atmosphere does not cause a tropical cyclone for unit i.
- Y_i(1) - Y_i(0) = -1 means deep atmosphere inhibits a tropical cyclone from forming for unit i.
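As a toy illustration of these three cases (the numbers are hypothetical, not from the lecture), the unit-level effects can be tabulated directly when both potential outcomes are postulated:

```python
# Hypothetical potential outcomes for four model-run units:
# y1[i] = 1 if a tropical cyclone forms under deep dynamics (treatment),
# y0[i] = 1 if it forms under shallow dynamics (control).
y1 = [1, 1, 0, 1]
y0 = [0, 1, 1, 1]

# Unit-level causal effect Y_i(1) - Y_i(0).
tau = [a - b for a, b in zip(y1, y0)]
print(tau)  # [1, 0, -1, 0]: causes, no effect, inhibits, no effect
```

Of course, in practice only one entry of each (y0[i], y1[i]) pair would ever be observed; this is the fundamental problem discussed below.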
Some notes on causal effects:
- The causal effect of a treatment can only be defined in reference to another treatment (e.g. a control). Do these treatments have to be mutually exclusive?
- This framework focuses on effects which result from causes (effect of deep/shallow atmosphere on tropical cyclones) rather than causes of effects (why did tropical cyclone form?)
In the real world (why did Judy get lung cancer?) you have the problem of infinite regress (she has lung cancer because she smoked, because her parents smoked, because her parents hated each other...).
In the computational world this might not be true?
- The potential outcomes framework gives actionable information on how to live our lives, and can do so even in purely observational situations.
- Cause-effect relationships have to have a temporal ordering.
- Can't have effect before a cause.
- Can't have causal simultaneity --> impossible to distinguish directionality.
- Relationship to Pearl's do-calculus.
Before-after study: temporal stability and causal transience
Fundamental problem of causal inference:
- We cannot observe both Y_i(1) and Y_i(0) in the real world, and therefore we cannot observe the causal effect Y_i(1) - Y_i(0) of the active treatment.
Temporal stability
- Temporal stability assumption: the value of Y_i(0) does not depend on when we apply the control to unit i and then measure.
- If this holds, we can take a sequence of measurements over time: we can measure Y_i(0) by a sequence of experiments.
Causal Transience
- The value of Y_i(1) is not impacted by first applying control to unit i and then measuring Y_i(0).
- Together with temporal stability, this gives us the ability to measure both Y_i(1) and Y_i(0) for the same unit i via a sequence of experiments (apply control and measure, then apply treatment and measure) under limited assumptions.
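The before-after logic above can be sketched in a few lines (the response values are hypothetical): a unit whose response is time-invariant (temporal stability) and unaffected by earlier exposures (causal transience) reveals both potential outcomes when measured sequentially.

```python
# Sketch of a before-after study under temporal stability and causal
# transience. All response values are hypothetical.
class Unit:
    def measure(self, treatment):
        # The response depends only on the treatment applied now, not on
        # when it is applied or on what was applied to the unit before.
        return 10.0 if treatment == 1 else 7.0

u = Unit()
y0 = u.measure(0)  # first experiment: apply control, measure Y(0)
y1 = u.measure(1)  # second experiment: apply treatment, measure Y(1)
print(y1 - y0)     # 3.0 -- the unit-level causal effect
```

When either assumption fails (e.g. the illness example below), the second measurement no longer recovers the potential outcome of interest.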
Example of when this is dubious: measuring the impact of a treatment for an illness, since patients tend to get better over time regardless of treatment.
Lab controlled experiments and Unit Homogeneity
This is the assumption that different units respond identically to treatment, e.g. Y_i(z) = Y_j(z) for all units i, j and each treatment z.
E.g. knockout experiments on mice: engineer nearly genetically identical mice and vary a single gene. Potential outcomes should have the same distribution across units.
Statistical approaches to causality
In a statistical approach to causal inference, we seek to infer some analogue of the difference between potential outcomes across a population.
A frequent estimand is the Average Treatment Effect (ATE), E[Y(1) - Y(0)], which is linear in a way that allows us to approximate the treatment effect using only the marginal expectations: E[Y(1) - Y(0)] = E[Y(1)] - E[Y(0)].
There are estimands which do not immediately have this property, and more work must be done. For example:
- The median of Y(1) - Y(0), which in general is not the difference of the marginal medians.
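A small numerical check (hypothetical values) makes the contrast concrete: the mean of the unit-level effects equals the difference of marginal means, while the median of the unit-level effects need not equal the difference of marginal medians.

```python
import statistics

# Hypothetical potential outcomes for three units.
y1 = [1, 2, 9]
y0 = [0, 8, 4]
diff = [a - b for a, b in zip(y1, y0)]  # unit-level effects: [1, -6, 5]

# The mean is linear: mean of unit-level effects == difference of means.
print(statistics.mean(diff))                          # 0
print(statistics.mean(y1) - statistics.mean(y0))      # 4 - 4 = 0

# The median is not: the median effect differs from the difference of medians.
print(statistics.median(diff))                        # 1
print(statistics.median(y1) - statistics.median(y0))  # 2 - 4 = -2
```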
A first look at SUTVA
If we are interested in E[Y(1)] - E[Y(0)], why is this still challenging if we focus on marginals instead of the joint distribution?
Suppose Z_i in {0, 1} is the chosen treatment that unit i receives. SUTVA (which will be introduced later) gives us Y_i = Y_i(Z_i), and thus the observed data is (Z_i, Y_i(Z_i)).
The population which is treated (and for which we can observe the treated outcome) is disjoint from the population for which we get to observe the control. To put this mathematically, the observed data identify E[Y(1) | Z = 1] and E[Y(0) | Z = 0]. But in general E[Y(1) | Z = 1] need not equal E[Y(1)]. Especially in observational settings, the way treatments are assigned (or self-selected) can bias these populations.
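A small simulation illustrates this bias. The data-generating process below is hypothetical: treatment adds +2 to every unit (so the true ATE is 2), but units with high latent "severity" are far more likely to self-select into treatment, so the naive treated-vs-control comparison is inflated.

```python
import random

random.seed(0)

# Hypothetical self-selection process: treatment adds +2 for every unit
# (true ATE = 2), but high-severity units are more likely to take it.
n = 100_000
treated_obs, control_obs, unit_effects = [], [], []
for _ in range(n):
    severity = random.gauss(0, 1)
    y0 = severity          # potential outcome under control
    y1 = severity + 2.0    # potential outcome under treatment
    p_treat = 0.8 if severity > 0 else 0.2
    if random.random() < p_treat:
        treated_obs.append(y1)   # we only observe Y(1) for treated units
    else:
        control_obs.append(y0)   # we only observe Y(0) for control units
    unit_effects.append(y1 - y0)

true_ate = sum(unit_effects) / n
naive = sum(treated_obs) / len(treated_obs) - sum(control_obs) / len(control_obs)
print(f"true ATE = {true_ate:.2f}")          # 2.00 by construction
print(f"naive difference in means = {naive:.2f}")  # biased upward
```

Here E[Y(1) | Z = 1] > E[Y(1)] because the treated group is selected to have higher severity, exactly the mechanism described above.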
Modeling causality as a missing-data problem
The crux of Rubin's causal model is considering causality as a missing-data problem (a framing Pearl takes significant issue with).
Fundamentally, the "science table" tends to look like this (? marks an unobserved entry):

Y_i(0) | Y_i(1) | Y_i(1) - Y_i(0)
   ?   |   2    |   ?
   6   |   ?    |   ?
   ?   |   8    |   ?
   ?   |  10    |   ?

Are these entries missing completely at random (MCAR)? We don't usually know.
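The science table can be represented directly as data with missing entries (values taken from the table above); under SUTVA each unit reveals exactly one of its two potential outcomes:

```python
# The "science table": each row is (Y_i(0), Y_i(1)); None marks the
# unobserved potential outcome. Values from the table above.
science = [
    (None, 2),   # unit observed under treatment: Y(1) = 2
    (6, None),   # unit observed under control:   Y(0) = 6
    (None, 8),
    (None, 10),
]

# Each unit reveals exactly one of its two potential outcomes.
observed = [y0 if y1 is None else y1 for (y0, y1) in science]
print(observed)  # [2, 6, 8, 10]

# No unit-level effect Y(1) - Y(0) is computable from this table alone.
effects = [None if y0 is None or y1 is None else y1 - y0
           for (y0, y1) in science]
print(effects)   # [None, None, None, None]
```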
Randomized experiments vs Observational studies
One sufficient condition which gives E[Y(z) | Z = z] = E[Y(z)] is independence: (Y(0), Y(1)) is independent of Z. One can get this in a randomized experiment.
Randomized experiments
If we assign individuals into treatment groups, then we can enforce this independence by design.
A simple starting example is a completely randomized experiment: N_1 individuals are given treatment, N_0 are given control, and N = N_0 + N_1. The assignment proportions may be imbalanced, e.g. if the treatment is very expensive.
The set of allowable treatment assignments is {z in {0, 1}^N : z_1 + ... + z_N = N_1}, and each allowable assignment is equally likely. This is complete randomization.
- Assignment is independent of the potential outcomes, both in an informal and a formal sense.
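A minimal sketch of complete randomization (with hypothetical potential outcomes and a constant unit-level effect of +1): choose which N_1 of the N units to treat uniformly at random, then estimate the ATE by a difference in means.

```python
import random

random.seed(1)

# Hypothetical potential outcomes with a constant unit-level effect of +1.
N, N1 = 1000, 400
y0 = [random.gauss(0, 1) for _ in range(N)]
y1 = [y + 1.0 for y in y0]

# Complete randomization: every subset of N1 treated units is equally likely.
treated = set(random.sample(range(N), N1))

mean_t = sum(y1[i] for i in treated) / N1
mean_c = sum(y0[i] for i in range(N) if i not in treated) / (N - N1)
diff_in_means = mean_t - mean_c
print(f"difference-in-means estimate: {diff_in_means:.2f}")
```

Because assignment is independent of the potential outcomes by design, the estimate lands near the true ATE of 1, in contrast to the self-selected setting above.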
Observational studies:
An observational study must be done when it is not feasible (e.g. for ethical reasons) to do a controlled experiment. Self-selection is possible, so the above independence condition does not hold.
- Observational studies MUST be done in certain situations.
- Poorly designed observational studies can be complete garbage.