CAUSALITY - Discussion (Singpurwalla)
Date: December 26, 2000
From: Nozer D. Singpurwalla, The George Washington University
Subject: Has causality been defined?

Professor Nozer Singpurwalla, from The George Washington University, writes:

My basic point is that since causality has not been defined, the causal calculus is a technology which could use a foundation. However, the calculus does give useful insights and is thus valuable. Finally, according to my understanding of the causal calculus, I am inclined to state that the calculus of probability is the calculus of causality, notwithstanding Dennis' [Lindley] concerns about Suppes' probabilistic causality.

I am surprised that you think "causality has not been defined," when Chapter 1 defines causality in terms of two well-defined mathematical objects: a probability function and a directed acyclic graph, and Chapter 7 further extends the definition using a set of functions. I assume that what you meant by "has not been defined" was that it has not been defined in terms of a probability function alone. In the paragraphs below I will argue that: (1) causality has been well defined, (2) causality cannot possibly be defined in terms of probabilities alone, and (3) the criterion that a concept can be well-defined only if it is defined in terms of probabilities, and probabilities alone, is not a valid criterion.

(The following are excerpts from my comments on Dennis Lindley's review of my book, forthcoming, Journal of Statistical Planning and Inference.)

Some readers have expressed the opinion that causality is still an undefined concept and that, although the do calculus can be an effective mathematical tool in certain tasks, it does not bring us any closer to the deep and ultimate understanding of causation, one that is based solely on classical probability theory.

Unfortunately, aspirations for reducing causality to probability are both untenable and unwarranted. Philosophers gave up such aspirations twenty years ago, and were forced to admit extra-probabilistic concepts (such as "counterfactuals" or "causal relevance") into the analysis of causation (see Causality, Section 7.5). The reason is quite simple: probability theory deals with beliefs about an uncertain, yet static world, while causality deals with changes that occur in the world itself. Causality deals with how probability functions change in response to new conditions and interventions that originate from outside the probability space, while probability theory, even when given a fully specified joint density function on all variables in the space, cannot tell us how that function would change under external interventions. Thus, "doing" is not reducible to "seeing", and there is no point trying to fuse the two together. Drawing an analogy to visual perception, the information contained in a probability function is analogous to a precise description of a three-dimensional object; it is sufficient for predicting how that object will be viewed from any angle outside the object, but it is insufficient for predicting how the object will be viewed if manipulated and squeezed by external forces. The additional information needed for making such predictions is analogous to the causal information that the do calculus extracts from a directed acyclic graph (DAG).
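The gap between "seeing" and "doing" can be illustrated with a minimal sketch. It assumes two binary variables X and Y, a made-up joint distribution, and two candidate DAGs (X -> Y and Y -> X) that are both compatible with that joint. The function names and the numbers are illustrative assumptions, not taken from the book.

```python
# One joint distribution over binary (X, Y); the numbers are illustrative assumptions.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}


def marginal_y(joint, y):
    """P(Y = y), read directly off the joint."""
    return sum(p for (_, y_), p in joint.items() if y_ == y)


def conditional_y_given_x(joint, y, x):
    """P(Y = y | X = x): 'seeing' X = x."""
    p_x = sum(p for (x_, _), p in joint.items() if x_ == x)
    return joint[(x, y)] / p_x


def interventional_y(joint, y, x, dag):
    """P(Y = y | do(X = x)) under an assumed two-node DAG.

    "X->Y": setting X leaves the mechanism P(y | x) intact, so doing equals seeing.
    "Y->X": setting X cuts the arrow into X and leaves Y at its marginal.
    """
    if dag == "X->Y":
        return conditional_y_given_x(joint, y, x)
    if dag == "Y->X":
        return marginal_y(joint, y)
    raise ValueError(f"unknown DAG: {dag}")


# Same joint, same observational answer, different interventional answers.
seen = conditional_y_given_x(joint, 1, 1)
done_if_x_causes_y = interventional_y(joint, 1, 1, "X->Y")
done_if_y_causes_x = interventional_y(joint, 1, 1, "Y->X")
```

Both DAGs reproduce the same joint and therefore the same P(Y=1 | X=1), yet they disagree on P(Y=1 | do(X=1)). Under these assumptions, the joint alone cannot settle the interventional question.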

From a mathematical perspective, it is a mistake to say that causality is still undefined. The do calculus, for example, is based on two well-defined mathematical objects: a probability function P and a DAG D; the first is standard in statistical analysis while the second is a newcomer that tells us (in a qualitative, yet formal language) which mechanisms would remain invariant to a given intervention. Given these two mathematical objects, the definition of "cause" is clear and crisp; variable X is a probabilistic cause of variable Y if P(y|do(x)) ≠ P(y) for some values x and y. Since each of P(y|do(x)) and P(y) is well-defined in terms of the pair (P, D), the relation "probabilistic cause" is, likewise, well-defined. Similar definitions can be constructed for other nuances of causal discourse, for example, "causal effect", "direct cause", "indirect cause", "event-to-event cause", "necessary cause", "sufficient cause", "likely cause" and "actual cause" (see Causality, pages 222-3, 286-7, 319; some of these definitions invoke functional models).
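As a hedged illustration of how the definition can be evaluated from a pair (P, D), the sketch below assumes a three-node DAG Z -> X, Z -> Y, X -> Y over binary variables and computes P(y|do(x)) by the truncated factorization sum_z P(z) P(y|x,z). The conditional tables, helper names, and tolerance are hypothetical choices made for this example.

```python
from itertools import product

# Assumed DAG: Z -> X, Z -> Y, X -> Y (all variables binary).
p_z = {0: 0.5, 1: 0.5}
p_x_given_z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.2, 1: 0.8}}


def table(p_y1):
    """Build P(y | x, z) from a function returning P(Y=1 | x, z)."""
    return {(x, z): {1: p_y1(x, z), 0: 1 - p_y1(x, z)}
            for x, z in product((0, 1), repeat=2)}


def p_y_do_x(p_z, p_y_given_xz, y, x):
    """P(y | do(x)) by truncated factorization: the factor P(x | z) is dropped."""
    return sum(p_z[z] * p_y_given_xz[(x, z)][y] for z in p_z)


def p_y(p_z, p_x_given_z, p_y_given_xz, y):
    """Observational marginal P(y) from the full factorization."""
    return sum(p_z[z] * p_x_given_z[z][x] * p_y_given_xz[(x, z)][y]
               for z in p_z for x in (0, 1))


def is_probabilistic_cause(p_z, p_x_given_z, p_y_given_xz, tol=1e-12):
    """True if P(y | do(x)) differs from P(y) for some values x and y."""
    return any(
        abs(p_y_do_x(p_z, p_y_given_xz, y, x)
            - p_y(p_z, p_x_given_z, p_y_given_xz, y)) > tol
        for x, y in product((0, 1), repeat=2)
    )


# Mechanism in which Y responds to X, and one in which Y responds to Z only.
with_effect = table(lambda x, z: 0.1 + 0.3 * x + 0.4 * z)
without_effect = table(lambda x, z: 0.2 + 0.6 * z)

x_causes_y = is_probabilistic_cause(p_z, p_x_given_z, with_effect)
x_merely_correlated = is_probabilistic_cause(p_z, p_x_given_z, without_effect)
```

In the second mechanism X and Y remain statistically dependent through Z, yet the check returns False because the mechanism for Y ignores X. The verdict depends on the DAG as well as on P, which is the point of defining the relation on the pair (P, D).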

Not all statisticians are satisfied with these mathematical definitions. Some suspect definitions that are based on unfamiliar non-algebraic objects (i.e., the DAG) and some mistrust abstract definitions that are based on unverifiable models. Indeed, no mathematical machinery can ever verify whether a given DAG really represents the causal mechanisms that generate the data -- such verification is left either to human judgment or to experimental studies that invoke interventions. I submit, however, that neither suspicion nor mistrust is justified in the case at hand; DAGs are no less formal than mathematical equations, and questions of model verification need to be kept apart from those of conceptual definition. Consider, for example, the concept of a distribution mean. We certainly perceive this notion to be well-defined, for it can be computed from any given (non-pathological) distribution function, even before ensuring that we can estimate that distribution from the data. We would certainly not declare the mean "ill-defined" if, for any reason, we find it hard to estimate the distribution from the available data. Quite the contrary; by defining the mean in the abstract, as a functional of any hypothetical distribution, we can often prove that the defining distribution need not be estimated at all, and that the mean can be estimated (consistently) directly from the data. An analogous logic applies to causation. Causal quantities are first defined in the abstract, using the pair (P, D), and the abstract definition then provides a theoretical framework for deciding, given the type of data available, what aspects of the DAG are necessary for establishing the desired causal quantity.

The separation between concept definition and model verification is even more pronounced in the Bayesian framework, where purely judgmental concepts, such as the prior distribution of the mean, are perfectly acceptable, as long as they can be assessed reliably from one's experience or knowledge. Professor Lindley's observation that "causal mechanisms may be easier to come by than one might initially think" further implies that, from a Bayesian perspective, the newcomer concept of a DAG is not an alien after all. If a Bayesian is free to assess p(y|see(x)) and p(y|do(x)) in any way, as separate evaluations, the Bayesian should also be permitted to assess and assert his/her beliefs in the validity of the mechanisms portrayed in the DAG. And there is no need to cast these beliefs in the language of probabilities to render the analysis legitimate. Adding a probabilistic veneer to these beliefs may make the do calculus appear more traditional, but would not change the fact that the objects of assessment are still causal mechanisms, and that these objects have their own special way of generating predictions about interventions. Professor Lindley's observation reminds us that it is not the language in which we cast judgments that legitimizes the analysis, but whether those judgments can reliably be assessed from our store of knowledge and from the peculiar form in which this knowledge is organized.

If it were not for loss of reliability (of judgment), one could easily translate the information conveyed in a DAG into purely probabilistic formulae, using hypothetical variables. (Translation rules are provided in Section 7.3 of Causality, p. 232.) Indeed, this is how the potential-outcome approach of Neyman and Rubin has achieved statistical legitimacy: judgments about causal relationships among observables are expressed as statements about probability functions that involve mixtures of observable and counterfactual variables. The difficulty with this approach, and the main reason for its slow acceptance in statistics, is that judgments about counterfactuals are much harder to assess than judgments about causal mechanisms. For instance, to communicate the simple assumption that symptoms do not cause diseases, we would have to use a rather unnatural expression and say that the probability of the counterfactual event "disease had symptoms been absent" is equal to the probability of "disease had symptoms been present". Judgments of conditional independencies among such counterfactual events are even harder for researchers to comprehend or to evaluate.
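For concreteness, the counterfactual statement in the example above might be written as follows. The notation is an assumption made here for illustration: D_s stands for the value the disease variable D would take had the symptom variable S been set to s, with s = 0 for "absent" and s = 1 for "present".

```latex
% Assumed notation: D_{s} is the counterfactual "disease had symptoms been s".
P(D_{s=0} = d) \;=\; P(D_{s=1} = d) \qquad \text{for every value } d .
```

In a DAG, the same assumption amounts to leaving out any arrow from S to D.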

In summary, I suggest that it is through friendly conceptual semantics and powerful mathematical machinery that causal analysis will regain its proper place in statistics. I also submit that the theoretical foundations of causality are sharper and stronger when viewed as a supplement to, not as part of, probability theory.