CAUSALITY - Discussion (Yudkowsky) Date: April 24, 2006
From: Eliezer S. Yudkowsky, Research Fellow, Singularity Institute for Artificial Intelligence, Santa Barbara, CA
Subject: The validity of G-estimation

Question to author:
The following paragraph appears on p. 103, shortly after eq. 3.63 in my copy of Causality:

"To place this result in the context of our analysis in this chapter, we note that the class of semi-Markovian models satisfying assumption (3.62) corresponds to complete DAGs in which all arrowheads pointing to Xk originate from observed variables."

It looks to me like this is a sufficient, but not necessary, condition to satisfy 3.62. It appears to me that the necessary condition is that no confounder exist between any Xi and Lj with i < j and that no confounder exist between any Xi and the outcome variable Y. However, a confounding arc between any Xi and Xj, or a confounding arc between Li and Xj with i <= j, should not render the causal effect non-identifiable. For example, even if a confounding arc exists between X2 and X3 (but no other confounding arcs exist in the model), the causal effect on Y of setting X2=x2 and X3=x3 should be the same as the distribution on Y if we observe x2 and x3.

It is also not necessary that the DAG be complete.


Author reply:
You are right that the DAG need not be complete, and that the condition cited on p. 103 is sufficient but not necessary for either (3.62) or the G-estimation formula (3.63) to hold. Corrections to the wording of page 103 were posted on this website.

Your suggestion to allow confounding arcs between Xi and Xj is valid. However, allowing a confounding arc between Li and Xj (with i < j) is too permissive, as can be seen from the non-identified models of Figure 3.9 (b), (c), (d) and (g) in Causality.
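The valid case can be checked numerically. The following is a minimal sketch, not taken from Causality: a hypothetical binary model with made-up probabilities in which a latent U creates a confounding arc X1 <--> X2, an observed L lies between the two actions, and no confounder touches L or Y. The function names and the u_to_y switch are illustrative assumptions. It compares the two-stage g-formula, the sum over l of P(y | x1, l, x2) P(l | x1) computed from the observed joint, against the truncated factorization computed with U known.

```python
from itertools import product

# Hypothetical toy model; every number below is invented for illustration.
# Structure: U -> X1, X1 -> L, {U, X1, L} -> X2, {X1, L, X2} -> Y.
# U is latent, so it acts as a confounding arc X1 <--> X2.
# With u_to_y=True an extra arrow U -> Y is added (confounds X with Y).
P_U1 = 0.3  # P(U = 1)


def bern(p, v):
    """Probability that a Bernoulli(p) variable takes the value v."""
    return p if v == 1 else 1.0 - p


def p_y1(u, x1, l, x2, u_to_y):
    """Structural P(Y = 1 | parents of Y)."""
    return 0.1 + 0.3 * x1 + 0.2 * l + 0.2 * x2 + (0.15 * u if u_to_y else 0.0)


def joint(u, x1, l, x2, y, u_to_y):
    """Full joint over latent and observed variables."""
    return (bern(P_U1, u)
            * bern(0.2 + 0.6 * u, x1)
            * bern(0.4 + 0.3 * x1, l)
            * bern(0.1 + 0.5 * u + 0.2 * x1 + 0.1 * l, x2)
            * bern(p_y1(u, x1, l, x2, u_to_y), y))


def observed(x1, l, x2, y, u_to_y):
    """Observed joint: the latent U is summed out."""
    return sum(joint(u, x1, l, x2, y, u_to_y) for u in (0, 1))


def g_formula(x1, x2, u_to_y):
    """sum_l P(Y=1 | x1, l, x2) P(l | x1), from observed quantities only."""
    p_x1 = sum(observed(x1, a, b, y, u_to_y)
               for a, b, y in product((0, 1), repeat=3))
    total = 0.0
    for l in (0, 1):
        p_x1_l_x2 = sum(observed(x1, l, x2, y, u_to_y) for y in (0, 1))
        p_x1_l = sum(observed(x1, l, b, y, u_to_y)
                     for b, y in product((0, 1), repeat=2))
        total += (observed(x1, l, x2, 1, u_to_y) / p_x1_l_x2) * (p_x1_l / p_x1)
    return total


def truth(x1, x2, u_to_y):
    """P(Y=1 | do(x1, x2)) by truncated factorization, using the latent U."""
    return sum(bern(P_U1, u) * bern(0.4 + 0.3 * x1, l)
               * p_y1(u, x1, l, x2, u_to_y)
               for u, l in product((0, 1), repeat=2))


if __name__ == "__main__":
    for x1, x2 in product((0, 1), repeat=2):
        print(x1, x2,
              g_formula(x1, x2, False), truth(x1, x2, False),
              g_formula(x1, x2, True), truth(x1, x2, True))
```

With u_to_y=False the two quantities agree for every (x1, x2) up to rounding. Turning on the U --> Y arrow, which confounds the actions with Y, makes them differ in this model.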

In general, condition (3.62) is both over-restrictive and lacking in intuitive basis. A more general and intuitive condition leading to (3.63) is formulated in (4.5) (Causality, p. 122), which reads as follows:

(3.62*) General condition for g-estimation
P(y|g = x) is identifiable and is given by (3.63) if every action-avoiding back-door path from Xk to Y is blocked by some subset Lk of non-descendants of Xk. (By "action-avoiding" we mean a path containing no arrows entering an X variable later than Xk.)
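For a single action X, where "action-avoiding" imposes no extra restriction, the blocking test in (3.62*) can be sketched in a few lines. This is only an illustration under stated assumptions: the five-node graph below is hypothetical (it is not one of the figures in this discussion), latent variables are written as ordinary nodes, and the helper names are invented here. A path is treated as blocked by S if some non-collider on it is in S, or some collider on it has neither itself nor any descendant in S.

```python
# Hypothetical DAG, chosen for illustration: U1 -> Z <- U2, U1 -> X, U2 -> Y, X -> Y.
EDGES = {("U1", "Z"), ("U2", "Z"), ("U1", "X"), ("U2", "Y"), ("X", "Y")}
NODES = {n for e in EDGES for n in e}
NBRS = {n: {b if a == n else a for a, b in EDGES if n in (a, b)} for n in NODES}


def descendants(node):
    """All proper descendants of node in the DAG."""
    found, stack = set(), [node]
    while stack:
        cur = stack.pop()
        for a, b in EDGES:
            if a == cur and b not in found:
                found.add(b)
                stack.append(b)
    return found


def simple_paths(node, goal, seen):
    """Yield every simple path from node to goal in the undirected skeleton."""
    if node == goal:
        yield list(seen)
        return
    for nxt in NBRS[node]:
        if nxt not in seen:
            yield from simple_paths(nxt, goal, seen + [nxt])


def is_blocked(path, s):
    """d-separation test for one path given the conditioning set s."""
    for prev, mid, nxt in zip(path, path[1:], path[2:]):
        collider = (prev, mid) in EDGES and (nxt, mid) in EDGES
        if collider:
            if not (({mid} | descendants(mid)) & s):
                return True   # unconditioned collider blocks the path
        elif mid in s:
            return True       # conditioned non-collider blocks the path
    return False


def backdoor_blocked(x, y, s):
    """True if s has no descendants of x and blocks every back-door path x ... y."""
    if s & descendants(x):
        return False
    backdoor = (p for p in simple_paths(x, y, [x]) if (p[1], x) in EDGES)
    return all(is_blocked(p, s) for p in backdoor)


if __name__ == "__main__":
    print(backdoor_blocked("X", "Y", set()))   # empty adjustment set
    print(backdoor_blocked("X", "Y", {"Z"}))   # conditioning on the collider Z
```

In this hypothetical graph the only back-door path runs through the collider Z, so the empty set blocks it while conditioning on Z opens it. It is one instance of a subset of the past satisfying the condition when the whole past does not.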

Comment 1: The new definition leads to improvements over (3.62); namely, there are cases where the g-formula (3.63) is valid with a subset Lk of the past but not with the entire past.
Example 1:

Assuming U1 and U2 are unobserved, and temporal order: U1,Z, X1, U2,Y we see that (3.62*), hence (3.63), are satisfied with L1 = 0, while taking the whole past L1 = Z would violate both.

(3.62) is also satisfied with the choice L1=0, but not with L1=Z.

Comment 2: Defining Lk as the set of "nondescendants" of Xk (as opposed to temporal predecessors of Xk) also broadens (3.62).

Example 2:

with temporal order: U1,X1, S,Y

Both (3.62) and (3.62*) are satisfied with L1 = S, but not with L1 = 0.

Comment 3: There are cases where (3.62) will not be satisfied even with the new interpretation of Lk, but the graphical condition (3.62*) is.

Example 3: (constructed by Ilya Shpitser)

It is easy to see that (3.62*) is satisfied; all back-door action-avoiding paths from X1 to Y are blocked by X0, Z, Z'.

At the same time, it is possible to show, though by a rather intricate method (see the Twin Network Method, page 213) that Y{x1, x2} is not independent of X1, given Z, Z' and X0.

(In the twin network model there is a d-connected path from X1 to Yx, as follows: X1 <--> Z <--> Z* --> Z'* --> Y*. Therefore, (3.62) is not satisfied for Y{x1,x2} and X1.)

This example demonstrates one weakness of the Potential Response approach initially taken by Robins in deriving (3.63). The counterfactual condition (3.62) that legitimizes the use of the g-estimation formula is void of intuitive support; hence, epidemiologists who apply this formula do so without the guidance of substantive medical knowledge. Fortunately, graphical methods are slowly making their way into epidemiological practice, and more and more people are beginning to understand the assumptions behind g-estimation.

(Warning: Those who currently reign over causal analysis in statistics are incurably graph-o-phobic and ruthlessly resist attempts to enlighten their students, readers and co-workers with graphical methods. This slows down progress in statistical research, but will eventually be overrun by common sense.)

