Question to author:
What do we know about counterfactuals
in linear models?
Author's reply:
Glad you asked.
Here is a neat result concerning the testability
of counterfactuals in linear systems.
We know that counterfactual queries of the form P(Y_x = y | e) may or may not be
empirically identifiable, even in experimental studies.
For example, the probability of causation,
P(Y_x = y | x', y'), is in general not identifiable from experimental data
(Causality, p. 290, Corollary 9.2.12) when X and Y are binary.^1
(Footnote-1: A complete graphical criterion for distinguishing
testable from nontestable counterfactuals is given
in Shpitser and Pearl (2007, upcoming).)
This note shows that things are much friendlier in linear analysis:
Claim A. Any counterfactual query of the form E(Y_x | e) is empirically identifiable in linear causal models, for arbitrary evidence e.

Claim B. E(Y_x | e) is given by

    E(Y_x | e) = E(Y | e) + T [x - E(X | e)]        (1)

where T is the total effect coefficient of X on Y, i.e., T = dE[Y | do(x)]/dx.
Claim A is not surprising. It has been established in full generality by Balke and Pearl (1994b), where expressions involving the covariance matrix were used for the various terms in (1).
Claim B offers an intuitively compelling interpretation of (1), which reads as follows: given evidence e, to calculate E(Y_x | e) (i.e., the expectation of Y under the hypothetical assumption that X were x, rather than its current value), first calculate the best estimate of Y conditioned on the evidence e, E(Y | e); then add to it whatever change is expected in Y when X undergoes a forced transition from its current best estimate, E(X | e), to its hypothetical value x. That last addition is none other than the effect coefficient T times the expected change in X, i.e., T[x - E(X | e)].
Note: Eq. (1) can also be written in do(x) notation as

    E(Y_x | e) = E(Y | e) + E[Y | do(x)] - E[Y | do(X = E(X | e))]        (2)

since, in a linear model, E[Y | do(x)] - E[Y | do(x')] = T(x - x').
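To make Claims A and B concrete, here is a minimal Monte Carlo sketch (not from the note itself): it assumes an illustrative zero-mean linear-Gaussian model with a confounder Z, evaluates E(Y_x | e) by abduction-action-prediction, and compares the result with Eq. (1). All coefficients and evidence values below are invented for illustration.

```python
# Toy linear SCM (illustrative, not from the note):
#   Z = U_Z,  X = a*Z + U_X,  Y = b*X + c*Z + U_Y
# The only directed path from X to Y is X -> Y, so the total effect is T = b.
import numpy as np

rng = np.random.default_rng(0)
a, b, c = 0.8, 1.5, -0.7
n = 4_000_000
U_Z, U_X, U_Y = rng.normal(size=(3, n))
Z = U_Z
X = a * Z + U_X
Y = b * X + c * Z + U_Y

# Evidence e: Z ~ z' and Y ~ y' (tolerance bands stand in for conditioning).
z_prime, y_prime, x_new = 0.5, 1.0, 2.0
sel = (np.abs(Z - z_prime) < 0.05) & (np.abs(Y - y_prime) < 0.05)

# Abduction-action-prediction: keep each selected unit's exogenous terms,
# force X = x_new, and recompute Y structurally.
Y_cf = b * x_new + c * Z[sel] + U_Y[sel]

T = b
print(f"simulated E(Y_x|e): {Y_cf.mean():.3f}")
print(f"Eq. (1):            {Y[sel].mean() + T * (x_new - X[sel].mean()):.3f}")
```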
Proof:
(with help from Ilya Shpitser)
Assume, without loss of generality, that we are dealing with a zero-mean model. Since the model is linear, we can write the relation between X and Y as:

    Y = TX + U        (3)

where T is the total effect of X on Y and U represents all remaining factors, none of which is affected by an intervention on X.
It is always possible to bring the function determining Y into the form (3) by recursively substituting the functions for each rhs variable that has X as an ancestor, and grouping all the X terms together to form TX. Clearly, T is the Wright-rule sum, over all directed paths from X to Y, of the products of the path coefficients along each path (Wright, 1921).
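As an aside, the Wright-rule computation of T reduces to a one-line matrix identity: for a recursive model with path-coefficient matrix B, (I - B)^{-1} = I + B + B^2 + ... accumulates the coefficient products along every directed path. The sketch below uses an invented three-variable model; it is not code from the note.

```python
import numpy as np

# B[i, j] = coefficient of variable j in the structural equation of variable i;
# variable order: 0 = X, 1 = Z, 2 = Y  (edges X -> Z, Z -> Y, and X -> Y).
B = np.array([
    [0.0, 0.0, 0.0],   # X is exogenous
    [0.5, 0.0, 0.0],   # Z = 0.5*X + U_Z
    [0.3, 0.9, 0.0],   # Y = 0.3*X + 0.9*Z + U_Y
])

# For an acyclic model B is nilpotent, so (I - B)^{-1} = I + B + B^2 + ...
# sums the coefficient products along every directed path (Wright's rule).
total = np.linalg.inv(np.eye(3) - B)
T = total[2, 0]
print(T)   # direct path 0.3 plus indirect path 0.5 * 0.9 = 0.75
```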
From (3) we can write:

    Y_x = Tx + U        (4)

since U is not affected by the intervention do(X = x). Taking expectations conditional on e gives:

    E(Y_x | e) = Tx + E(U | e)        (5)
The last term in (5) can be evaluated by taking expectations on both sides of (3), conditional on e, giving:

    E(U | e) = E(Y | e) - T E(X | e)        (6)

Substituting (6) into (5) yields (1). Since E(Y | e), E(X | e), and T are all estimable from data, this establishes both claims. QED.
Some Familiar Problems Cast in Linear Outfits
Three special cases of e are worth noting:
Example-1. e: X = x', Y = y'
(the linear equivalent of the probability of causation)

From (1) we obtain directly

    E(Y_x | x', y') = y' + T(x - x')        (7)

This is intuitively compelling. The hypothetical expectation of Y is simply the observed value of Y, y', plus the anticipated change in Y due to the change x - x' in X.
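A noteworthy feature of Example 1: once X and Y are both observed, the residual U = Y - TX in (3) is pinned down exactly, so Y_x is determined unit by unit, not merely in expectation. The sketch below checks this on the same style of invented toy model as above.

```python
import numpy as np

rng = np.random.default_rng(1)
a, b, c = 0.8, 1.5, -0.7            # illustrative coefficients; T = b
n = 4_000_000
U_Z, U_X, U_Y = rng.normal(size=(3, n))
Z = U_Z
X = a * Z + U_X
Y = b * X + c * Z + U_Y

x_prime, y_prime, x_new = 0.5, 1.0, 2.0
sel = (np.abs(X - x_prime) < 0.02) & (np.abs(Y - y_prime) < 0.02)

Y_cf = b * x_new + c * Z[sel] + U_Y[sel]   # abduction + action + prediction
print(Y_cf.mean(), y_prime + b * (x_new - x_prime))  # both ~ y' + T(x - x')
print(Y_cf.std())   # ~0 (up to the selection tolerance): Y_x is point-determined
```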
Example-2. e: X = x' (the effect of treatment on the treated)

Here E(Y | e) = E(Y | x') and E(X | e) = x', so (1) gives

    E(Y_x | x') = E(Y | x') + T(x - x') = r x' + T(x - x')

where r is the regression coefficient of Y on X.
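The next sketch illustrates Example 2 on an invented confounded model, where the regression coefficient r (estimated as Cov(X, Y)/Var(X)) differs from the total effect T; the simulated E(Y_x | x') nevertheless matches r x' + T(x - x').

```python
import numpy as np

rng = np.random.default_rng(2)
a, b, c = 0.8, 1.5, -0.7           # total effect T = b; Z confounds X and Y
n = 4_000_000
Z = rng.normal(size=n)
U_X, U_Y = rng.normal(size=(2, n))
X = a * Z + U_X
Y = b * X + c * Z + U_Y

r = np.cov(X, Y)[0, 1] / X.var()   # regression coefficient of Y on X (r != T)
x_prime, x_new = 0.5, 2.0
sel = np.abs(X - x_prime) < 0.02   # e: X ~ x'

Y_cf = b * x_new + c * Z[sel] + U_Y[sel]    # counterfactual for the "treated"
print(Y_cf.mean())                          # simulated E(Y_x | x')
print(r * x_prime + b * (x_new - x_prime))  # r*x' + T*(x - x')
```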
Example-3. e: Y = y'
(Gee, my temperature is Y = y'; what if I had taken
x tablets of aspirin? How many did you take? Don't remember.)

Since E(Y | e) = y', Eq. (1) gives

    E(Y_x | y') = y' + T[x - E(X | y')] = y' + T(x - r' y')

where r' is the regression coefficient of X on Y.
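A parallel sketch for Example 3, where only Y is observed (same invented model; r' is estimated as Cov(X, Y)/Var(Y)).

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, c = 0.8, 1.5, -0.7
n = 4_000_000
Z = rng.normal(size=n)
U_X, U_Y = rng.normal(size=(2, n))
X = a * Z + U_X
Y = b * X + c * Z + U_Y

r_prime = np.cov(X, Y)[0, 1] / Y.var()   # regression coefficient of X on Y
y_prime, x_new = 1.0, 2.0
sel = np.abs(Y - y_prime) < 0.02         # e: Y ~ y'; X was never recorded

Y_cf = b * x_new + c * Z[sel] + U_Y[sel]
print(Y_cf.mean())                                # simulated E(Y_x | y')
print(y_prime + b * (x_new - r_prime * y_prime))  # y' + T(x - r'*y')
```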
Our counterfactual problem (page 216) reads: given that the current price is P = p0, what would be the expected value of the demand Q if we were to control the price at P = p1? Making the correspondence P = X, Q = Y, e = {P = p0, i, w}, we see that this problem is identical to Example 2 above (the effect of treatment on the treated), subject to conditioning on i and w. Hence, since T = b1, we can immediately write

    E(Q_{P = p1} | p0, i, w) = E(Q | p0, i, w) + b1 (p1 - p0)        (8)

Eq. (8) replaces Eq. (7.17) on page 217. Note that the parameters of the price equation do not enter into Eq. (8).
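The following sketch checks Eq. (8) numerically, assuming the two-equation price-demand model of Section 7.2.1 (demand: q = b1 p + d1 i + u1; price: p = b2 q + d2 w + u2); all numerical values below are illustrative, not from the text.

```python
import numpy as np

rng = np.random.default_rng(4)
b1, b2, d1, d2 = -0.8, 0.6, 0.5, 0.3   # illustrative structural coefficients
n = 4_000_000
i0, w0 = 1.0, 2.0                      # observed income and wage rate
u1, u2 = rng.normal(size=(2, n))

# Reduced form: solve the simultaneous demand and price equations.
p = (b2 * d1 * i0 + d2 * w0 + b2 * u1 + u2) / (1 - b1 * b2)
q = b1 * p + d1 * i0 + u1

p0, p1 = 0.5, 1.5
sel = np.abs(p - p0) < 0.01            # e: current price ~ p0 (given i0, w0)

# Under do(P = p1) the price equation is suspended; demand responds directly.
q_cf = b1 * p1 + d1 * i0 + u1[sel]
print(q_cf.mean())                     # simulated E(Q_{P=p1} | e)
print(q[sel].mean() + b1 * (p1 - p0))  # Eq. (8)
```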
Remark 1:
Example 1 is not really surprising; we know that
the probability of causation is empirically identifiable
under the assumption of monotonicity (Causality, p. 293).
But Examples 2 and 3 trigger the following conjecture:
Conjecture
Any counterfactual query of the form P(Y_x = y | e) is empirically identifiable when Y is monotonic relative to X.