**Date: December 20-22, 2000
From: Bill Shipley, Universite de Sherbrooke, (Quebec) CANADA
Subject: Is the **

**Bill Shipley asked:**

In most experiments, the external
manipulation consists of adding some amount to (or subtracting it from) *X*
without removing pre-existing causes of *X*. For example, adding 5
*kg*/*h* of fertilizer to a field, adding 5 *mg/l* of insulin to subjects,
etc. Here, the pre-existing causes of the manipulated variable still
exert effects, but a new variable (*M*) is added.

... The problem that I see with the *do*(*x*) operator as a general
operator of external manipulation is that it requires two things:
(1) removing any pre-existing causes of *x* and (2) setting
*x* to some value. This corresponds
to some types of external manipulations, but not all (or even most)
external manipulations. I would introduce an *add*(*x=n*)
operator, meaning "*add*, external to the pre-existing causal process,
an amount '*n*' of *x*".
Graphically, this consists of augmenting the pre-existing causal graph
with a new edge, namely *M-n*-->*X*. Algebraically, this would consist
of adding a new term -*n*- as a cause of *X*.
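The contrast between the two operators can be sketched in a few lines. The example below assumes a hypothetical linear model *Z* → *X* → *Y*; the coefficients `A` and `B` and the function `simulate` are illustrative and do not come from the discussion. `do_x` replaces the equation of *X*, whereas `add_n` leaves it in place and adds an amount *n*.

```python
# Hypothetical linear model Z -> X -> Y; coefficients are illustrative only.
A = 2.0   # assumed effect of Z on X
B = 3.0   # assumed effect of X on Y

def simulate(z, do_x=None, add_n=0.0):
    """Return (x, y) for a given value z of the pre-existing cause of X.

    do_x  : if not None, the equation of X is replaced by X = do_x.
    add_n : amount added to X on top of its pre-existing causes.
    """
    if do_x is not None:
        x = do_x               # pre-existing causes of X no longer act
    else:
        x = A * z + add_n      # pre-existing causes still act; n is added
    y = B * x
    return x, y

for z in (0.0, 1.0):
    print(z, simulate(z), simulate(z, do_x=5.0), simulate(z, add_n=5.0))
```

Under `do_x=5.0` the value of *X* no longer responds to *Z*; under `add_n=5.0` it still does, so the two manipulations agree only where the pre-existing causes contribute nothing (here, at *z* = 0).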

**Author's answer:**

In many cases, your "additive intervention" indeed represents
the only way we can intervene on a variable *X*.
In fact, the general notion of intervention
(*Causality*, page 113) involves replacing the equation of
*X* by any other equation that fits the circumstances, not
necessarily a constant *X* = *x*.

What you are proposing corresponds to replacing the
old equation of *X*, *x* = *f*(*pa*_{X}),
by a new equation: *x* = *f*(*pa*_{X}) + 1.
This replacement is usually treated under the heading
"instrumental variables", since it is equivalent to
writing *x* = *f*(*pa*_{X}) + *I* (where *I* is an instrument)
and varying *I* from 0 to 1.

There are three points to notice:
1. The additive manipulation CAN be represented
in the *do*( ) framework -- we merely apply the *do*( ) operator
to the instrument *I*, and not to *X* itself. This is
a different kind of manipulation that needs to be
distinguished from *do*(*x*) because,
as you noticed, the effect on *Y* would be different.

2. Scientists working with instrumental variables
(e.g., epidemiologists) are not satisfied with estimating
the effect of the instrument on *Y*, but are trying
hard to estimate the effect of *X* itself. The former is known
as "the effect of intention to treat", the latter as "the effect of
treatment" (see *Causality*, page 261).

3. Consider the loopy example where LISREL fails:
*y* = *bx* + *e*_{1}, *x* = *ay* + *e*_{2} + *I*.
If we interpret "total effects" as the response of *Y*
to a unit change of the instrument *I*, then LISREL's
formula obtains: the effect of *I* on *Y* is *b*/(1-*ab*).
However, if we adhere to the notion of
"per unit change in *X*", as opposed to "per unit change in an
instrument of *X*", we get back the *do*-formula.
The effect of *X* on *Y* is *b*, not *b*/(1-*ab*),
even though the manipulation is done through an instrument.
In other words, we change *I* from 0 to 1 and observe the
changes in *X* and in *Y* (1/(1-*ab*) and *b*/(1-*ab*), respectively);
if we divide the change in *Y*
by the change in *X*, we get *b*, not *b*/(1-*ab*).
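The arithmetic in point 3 can be checked with a small sketch. It assumes that the instrument *I* enters the equation for *x* with unit coefficient (that is, *y* = *bx* + *e*_{1}, *x* = *ay* + *e*_{2} + *I*) and that *ab* ≠ 1; the numerical values of `a`, `b`, `e1`, `e2` and the helper `equilibrium` are illustrative choices, not part of the original discussion.

```python
from fractions import Fraction

def equilibrium(a, b, e1, e2, i):
    """Solve y = b*x + e1, x = a*y + e2 + i for (x, y); requires a*b != 1."""
    x = (a * e1 + e2 + i) / (1 - a * b)
    y = b * x + e1
    return x, y

a, b = Fraction(1, 2), Fraction(1, 3)   # illustrative coefficients
e1, e2 = Fraction(2), Fraction(-1)      # arbitrary fixed disturbances

x0, y0 = equilibrium(a, b, e1, e2, 0)   # instrument at I = 0
x1, y1 = equilibrium(a, b, e1, e2, 1)   # instrument at I = 1

effect_of_instrument = y1 - y0          # change in Y per unit change in I
effect_of_x = (y1 - y0) / (x1 - x0)     # change in Y per unit change in X

print(effect_of_instrument == b / (1 - a * b))
print(effect_of_x == b)
```

Exact rational arithmetic is used so that the two comparisons are not blurred by floating-point rounding.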

To summarize: Yes, additive manipulation is sometimes
useful to model; normally it is done through instrumental variables,
and we still need to distinguish between the effect of the instrument
and the effect of *X*. The former is not stable (*Causality*, page 261);
the latter is. LISREL's formula corresponds to the effect of an
instrument, not to the effect of *X*.

**Bill Shipley further asked:**

Thanks for the clarification. It seems to me that the simplest and
most straightforward way of modeling and representing manipulations
of a causal system is simply to (1) modify the causal graph of the
unmanipulated system to represent the proposed manipulation, (2)
translate this new graph into structural equations, and (3) derive
predictions (including conditional predictions) from the resulting
equations; this is how I have treated the notion in my book. Why
worry about *do*(*x*) at all? In particular, one can model quite
sophisticated manipulations this way. For
instance, one might well ask what would happen if one added an
amount * z* to some variable * x* in the causal graph, in which * z* is
dependent on some other variable in the graph.

**Author's Reply:**

If the manipulation is sophisticated,
then we need to go back to the equations and specify
precisely what is being changed, how, with what instrument,
conditioned on what information, etc., and then impose the
appropriate modification on the model to account for these nuances
(see, e.g., the example of "process control", page 74 of my book).
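As a rough sketch of what "going back to the equations" can look like, the example below replaces one structural equation by a modified one in which the amount added to *X* depends on another variable. The model (*W* → *X* → *Y*, with *W* → *Y*), its coefficients, the rule *z* = 10 - *W*, and the helper `solve` are all hypothetical and chosen only for illustration.

```python
# Hypothetical model W -> X -> Y with W -> Y; all numbers are illustrative.
def solve(equations, order, exogenous):
    """Evaluate the structural equations in causal order; return all values."""
    values = dict(exogenous)
    for name in order:
        values[name] = equations[name](values)
    return values

original = {
    "X": lambda v: 2.0 * v["W"],
    "Y": lambda v: 3.0 * v["X"] + v["W"],
}

# Manipulation: add to X an amount z that itself depends on W (z = 10 - W).
# Only the equation of X is replaced; the equation of Y is left untouched.
manipulated = dict(original)
manipulated["X"] = lambda v: original["X"](v) + (10.0 - v["W"])

before = solve(original, ["X", "Y"], {"W": 4.0})
after = solve(manipulated, ["X", "Y"], {"W": 4.0})
print(before["Y"], after["Y"])
```

The manipulation is specified entirely by which equation is replaced and by what, which is the information the paragraph above says must be made explicit.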

However, science thrives on standards, because standards serve (at least) two purposes: communication and theoretical focus.

Mathematicians, for example, have decided that the
derivative operator "*dy*/*dx*"
is a nice standard for communicating information about change,
so that is what we teach in calculus, although other operators
might also serve the purpose, for example, *x* *dy*/*dx* or
(*dy*/*dx*)/*y*, etc.

1. Communication: If we were to eliminate the term
"treatment effect" from epidemiology and replace
it with detailed descriptions of how the effect
was measured, we would practically choke all
communication among epidemiologists.
A standard was therefore established: what we measure in
a controlled randomized experiment will be called
"treatment effect"; the rest will be considered
variations on the theme. The "*do*-operator" represents
this standard faithfully.

The same goes for SEM. Sewall Wright talked about "effect
coefficients" and established them as the standard
of "direct effect" in path analysis
(before it got molested with regressional jargon
and LISREL formulas). Again, the "*do*-operator" conforms directly to this standard.

2. Theoretical focus:
Many (if not all) of the variants of manipulations can be
reduced to "*do*", or to several applications of "*do*".
Theoretical results established for "*do*" are then
applicable to those variants.
For example, Les Hayduk's "poke and release" manipulation is expressible
as "*do*" in the temporal unfolding of a structural model.
As another example, questions of identification
for expressions involving "*do*" are applicable
to questions of identification of more sophisticated
effects. On page 113 of *Causality*, I show that if
the total effect *P*(*y*|*do*(*x*)) is identifiable, then
so is the effect of conditional actions *P*(*y*|*do*(*x* if
*Z*=*z*)). The same goes for many other
theoretical results in the book; they were developed
for the "*do*" operator, they borrow from each other,
and they are applicable to many variants.

Finally, the *do* operator is the appropriate operator for
interpreting the conditional part of counterfactual sentences
(see page 204), and counterfactuals are abundant in scientific
discourse (see pages 217-219).
I have yet to see a competing candidate with comparable
versatility, generality, formal power and (not the least)
conceptual appeal.
(Correction: I have yet to see ANY competing candidate.)