CAUSALITY - Discussion (Sjolander)

Date: March 10, 2006
From: Arvid Sjolander, Dept. of Medical Epidemiology and Biostatistics, Karolinska Institutet
Subject: d-separation of counterfactuals

Question to author:
At the bottom of page 214 you mention that "any variable that is d-separated from Z* would also be d-separated from U_Z" (and vice versa I guess?). You conclude that "if U_Z obeys a certain independence relationship then Z_x (more generally, Z_paZ) must obey that relationship as well" (and vice versa?). I guess that what you mean is

where V is any variable defined by v = f(pa_V, u_V), Q and P are any variables except V and U_V, and ⊥ denotes independence (hope the symbols look the same on your computer as on mine).

On the top of page 215 you use the twin network to show that

and then refer to the previous argument to state that

I don't see why (3) follows from (1) and (2). How can (1) justify the replacement of U_Y with Y_z and U_Z with Z_x in (2)? Certainly Y_z is a function of U_Y and Z_x a function of U_Z, which motivates one to write (2) as

but how can you remove U_Y and U_Z from (4)?

If I were to examine whether (3) is true, I would naively construct the following triple network:

The left part corresponds to the world in which no intervention is imposed, the middle part to the world in which do(X=x) is imposed, and the right part to the world in which do(Z=z) is imposed. In this network (3) does not hold since the path is open by conditioning on Y. Is there anything wrong with this use of the (generalized) twin network method?

Author's first reply (with Ilya Shpitser)
You are, indeed, correct.
Y_x is not independent of X given Y_z, Z_x, Y.

However, for example, Y_x is independent of X given Z_x.

Your use of d-separation in the twin network to understand independencies between underlying counterfactual quantities is correct, and so is your generalization of the 'twin' network to more than two possible worlds. We called this the 'parallel worlds model' in the path-specific effects paper.

I spoke to Judea about (3) last spring. Our conclusion was that (3) is not true in general, but is true if the functions from the U variables to their children are one-to-one. I believe this is the case Judea had in mind when he wrote that section of the book.
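The two d-separation statements in this reply can be checked mechanically. The sketch below is only an illustration under stated assumptions: the triple network figure is not reproduced on this page, so the graph is assumed to be the chain X -> Z -> Y with one exogenous term per variable, and the node names (`Z_x`, `Y_x`, `Y_z`) and the helper `d_separated` are hypothetical. The check uses the standard moralized ancestral graph criterion rather than any particular library.

```python
from itertools import combinations

# Assumed parallel-worlds graph for the chain X -> Z -> Y (child -> parents).
# Intervened variables are constants and are omitted; the U terms are shared
# by the actual world, the do(X=x) world, and the do(Z=z) world.
PARENTS = {
    "U_X": [], "U_Z": [], "U_Y": [],
    "X": ["U_X"],
    "Z": ["X", "U_Z"],
    "Y": ["Z", "U_Y"],
    "Z_x": ["U_Z"],          # Z in the world where do(X=x) is imposed
    "Y_x": ["Z_x", "U_Y"],   # Y in the world where do(X=x) is imposed
    "Y_z": ["U_Y"],          # Y in the world where do(Z=z) is imposed
}

def ancestors(nodes, parents):
    """Return the given nodes together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(parents[n])
    return seen

def d_separated(a, b, given, parents=PARENTS):
    """True if a and b are d-separated by `given` in the DAG `parents`."""
    keep = ancestors({a, b} | set(given), parents)
    adj = {n: set() for n in keep}
    for child in keep:
        ps = parents[child]
        for p in ps:                      # undirected parent-child edges
            adj[child].add(p)
            adj[p].add(child)
        for p, q in combinations(ps, 2):  # "marry" parents of a common child
            adj[p].add(q)
            adj[q].add(p)
    blocked = set(given)
    stack, seen = [a], {a}
    while stack:                          # search that avoids conditioned nodes
        n = stack.pop()
        if n == b:
            return False
        for m in adj[n]:
            if m not in seen and m not in blocked:
                seen.add(m)
                stack.append(m)
    return True

print(d_separated("Y_x", "X", {"Z_x"}))
print(d_separated("Y_x", "X", {"Y_z", "Z_x", "Y"}))
```

Under the assumed graph, the first query reports separation and the second does not: conditioning on Y opens the collider on the path X -> Z -> Y <- U_Y -> Y_x.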

Arvid's followup:
1) I see that a one-to-one correspondence between U_V and V_{pa_V} is a sufficient criterion for letting one of them serve as a proxy for the other. If pa_V={}, i.e. V doesn't have any parents except U_V, the criterion is automatically satisfied, since we can always choose a "non-redundant coding" of U_V such that each value of V corresponds to exactly one value of U_V. If not, however, it seems to me that this criterion implies quite strong restrictions on the structural relationships in a causal model. For example, it completely rules out the following simple relationship between dichotomous {z,u_y} and y=f(z,u_y):
y=1 if z=1 and u_y=1
y=0 otherwise

Is there ever a reason for assuming the criterion would hold true?
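To see concretely why the binary example above fails the one-to-one requirement, one can enumerate the map from u_y to Y_z for each fixed z. This is a minimal sketch; the function names `f` and `is_one_to_one` are illustrative and not taken from the discussion.

```python
def f(z, u_y):
    """The structural function from the example: y=1 only if z=1 and u_y=1."""
    return 1 if (z == 1 and u_y == 1) else 0

def is_one_to_one(z):
    """Check whether u_y -> Y_z = f(z, u_y) is injective for this fixed z."""
    outputs = [f(z, u) for u in (0, 1)]
    return len(set(outputs)) == len(outputs)

# Table of potential outcomes Y_z for each value of u_y.
potential_outcomes = {z: {u: f(z, u) for u in (0, 1)} for z in (0, 1)}

for z in (0, 1):
    print(z, potential_outcomes[z], is_one_to_one(z))
```

For z=0 both values of u_y give y=0, so Y_{z=0} carries no information about U_Y and cannot stand in for it; for z=1 the map is one-to-one.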

2) As a bonus, your answer helped me with a similar question on the paper "Direct and Indirect Effects" (Pearl 2001), namely "why does (7) hold in the graph in Figure 1(a)?". If one draws a triple graph it is obvious that the path is open. If, however, U₂ is completely determined by W, we block the path by conditioning on W, as in (7). But then again, do we have any reason to believe in this one-to-one correspondence in this particular setting?

Reply to 1:
I think many important classes of causal models exhibit one-to-one relationships. For example, structural equation models assume all functions are linear, and so one-to-one. Frequently we also have causal models that contain deterministic relationships between nodes, which result in similar changes to the way d-separation works. I agree, though, that the original discussion in Judea's book should make clear that it is talking about the one-to-one case.
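As a small illustration of the linear case, consider a hypothetical structural equation y = BETA * z + u_y (the coefficient value and the function names are assumptions made only for this sketch). For any fixed z, the potential outcome Y_z determines u_y exactly, which is what allows one to serve as a proxy for the other.

```python
BETA = 2.0  # hypothetical path coefficient, chosen only for illustration

def f(z, u_y):
    """Linear structural equation for Y."""
    return BETA * z + u_y

def recover_u(z, y_z):
    """Invert the equation: given z and the potential outcome Y_z, return u_y."""
    return y_z - BETA * z

z = 1.0
u_values = [-1.5, 0.0, 0.75]
recovered = [recover_u(z, f(z, u)) for u in u_values]
print(recovered)
```

The inversion works because the disturbance enters additively with a non-zero coefficient; a function that discards information about u_y for some value of z, as in the binary example in the followup, has no such inverse.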

Reply to 2:

(7) actually holds in the general case in Fig 1 (a).

This is because Y_xz is independent of Z_x* | W_x*.

But we know by rule 3 of do-calculus that W_x* = W.
