Date: July 30, 2001
From: Two Anonymous Statisticians
Subject: Why isn't confounding a statistical concept?

In June 2001, I received two anonymous reviews of my paper, "Causal Inference in the Health Sciences" [pdf] (contributed to Health Services and Outcomes Research Methodology (HSORM), Special Issue on Causal Inference). The questions raised by the reviewers astounded me, for they reminded me of the way most statisticians still think about causality and of the immense educational effort that still lies ahead. In the hope of contributing to this effort, I have decided to post my reply on this web page.

Excerpts from Reviewers' comments:
Reviewer 1.

"The contrast between statistical and causal concepts is overdrawn. Randomization, instrumental variables, and so forth have clear statistical definitions. Some of the definitions, of course, may not be adequate to sustain causal inferences -- a more subtle point that would need more careful development." ... "[the paper urges] "that any systematic approach to causal analysis must require new mathematical notation". This is false: there is a long tradition of informal -- but systematic and successful -- causal inference in the medical sciences."

Reviewer 2.

"The paper makes many sweeping comments which rely on distinguishing "statistical" and "causal" concepts,... Also, included in the list of causal (and therefore, according to the paper, non-statistical) concepts is, for example, confounding, which is solidly founded in standard, frequentist statistics. Statisticians are inclined to say things like "U is a potential confounder for examining the effect of treatment X on outcome Y when both U and X and U and Y are not independent. So why isn't confounding a statistical concept?... If the author wants me to believe this, he's going to have to show at least one example of how the usual analyses fail."

Author's Response to Referee's Reports (mailed July 30, 2001):
The comments of Reviewer #1 are marred by personal antagonism and ideologically inspired objections. This reviewer seems to advocate an informal approach to causation (whatever that means; it reminds me of informal statistics before Bernoulli), and as such, he/she comes to this forum with set ideas against the basic aims and principles of my paper. Given this background, I do not expect to convince this reviewer of the wisdom of the mathematical approach described in my paper, at least not in this forum. Let history decide between us.

This paper is aimed at readers of Reviewer #2's persuasion, who attempt to reconcile the claims of this paper (occasionally sweeping, I admit) with traditional statistical wisdom. To this reviewer I have the following comments.

You question the usefulness of my proposed demarcation line between statistical and causal concepts. Let me try to demonstrate this usefulness by considering the example that you bring up: Confounding. You write that "confounding is solidly founded in standard, frequentist statistics." and that statisticians are inclined to say things like "U is a potential confounder for examining the effect of treatment X on outcome Y when both U and X and U and Y are not independent. So why isn't confounding a statistical concept?"

Chapter 6 of my book goes to great lengths to explain why this definition fails on both sufficiency and necessity tests, and why all variants of this definition must fail on first principles. I will give just a couple of examples to demonstrate the point. Consider a variable U that is affected by both X and Y, say one that turns 1 whenever both X and Y reach high levels. U satisfies your criterion and, yet, U is not a confounder for examining the effect of treatment X on outcome Y -- in fact, U can safely be ignored in our analysis. (The same goes for any variable that is a direct cause of X and not a direct cause of Y, like Z in Fig. 2 of my paper.) As a second example, consider a variable U that resides "on the causal pathway" from X to Y. This variable, too, satisfies your criterion, yet it is not a confounder -- generations of statisticians have been warned (see Cox 1958) not to adjust for such variables. One might argue that your definition is merely a necessary, but not sufficient, condition for confounding. But this, too, fails. Chapter 6 (pages 185-186) describes an example in which no variable satisfies your definition and, still, the effect of treatment X on outcome Y is confounded.
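A reader who prefers to see numbers can verify the first counterexample with a simple simulation. The linear model, coefficients, and variable names below are mine, chosen only for illustration: U is made a common effect of X and Y, so it passes the proposed dependence criterion, yet adjusting for it distorts the effect estimate while ignoring it does not.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Assumed structural model: X causes Y, and U is a common EFFECT of both
# (it "turns 1 whenever both X and Y reach high levels").
x = rng.normal(size=n)
y = 2.0 * x + rng.normal(size=n)        # true causal effect of X on Y: 2.0
u = ((x > 0) & (y > 0)).astype(float)   # U depends on both X and Y

# U passes the proposed criterion: it is dependent on X and on Y.
print(np.corrcoef(u, x)[0, 1], np.corrcoef(u, y)[0, 1])  # both clearly nonzero

# Ignoring U recovers the true effect; "adjusting" for U does not.
beta_unadj = np.polyfit(x, y, 1)[0]
design = np.column_stack([x, u, np.ones(n)])
beta_adj = np.linalg.lstsq(design, y, rcond=None)[0][0]
print(beta_unadj)  # close to 2.0
print(beta_adj)    # noticeably displaced from 2.0
```

Of course, a simulation only illustrates the point; the demonstration in Chapter 6 is general.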

One can also construct an example (Fig. 6.5) where U is a confounder (i.e., it must be adjusted for to remove bias in the effect estimate) and, still, U is not associated with either X or Y.

I am not the first to discover discrepancies between confounding and its various statistical "definitions". Miettinen and Cook, and Robins and Greenland, have been arguing this point in epidemiology since the mid-1980s -- to no avail. Investigators continue to equate collapsibility with no-confounding and continue to adjust for the wrong variables. Moreover, the general conception that any important concept (e.g., confounding, instrumental variables) MUST have a statistical definition is so deeply entrenched in the health sciences that even today, 15 months after the publication of my book, people with the highest qualifications and the purest of intentions continue to ask: "So why isn't confounding a statistical concept?"

Taking your advice, I have revised the paper and made my claims less sweeping. But, off the record, I believe that any attempt to correct this tradition would necessarily sound sweeping, and nothing but a sweeping campaign can ever eradicate these misconceptions about confounding, statistics, and causation. Statistical education is firmly in the hands of people of Reviewer #1's persuasion, and people with your quest for understanding are rarely given a public forum to ask "So why isn't confounding a statistical concept?".

Now let us examine the logic of my argument. Before seeing any of the counterexamples, and using only the power of the statistical-causal demarcation line, one should be able to detect that your proposed definition (of confounding) must be flawed. How?

  1. Having a correct criterion for identifying confounders would enable us to correctly assess the effect of treatment on outcome.
  2. Any such assessment amounts to drawing causal conclusions from data, hence it MUST be predicated on some causal assumptions.
  3. Your definition is purely statistical, since it can be verified from the joint distribution of X, Y and U. Hence it is void of causal assumptions, hence (from 2), it must be flawed.
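The three steps above can be made concrete: two causally different models -- one in which U is a genuine confounder, one in which U is a common effect -- can both satisfy the purely statistical criterion, so no examination of the joint distribution of X, Y and U can tell the analyst whether to adjust. A minimal simulation sketch, with models and coefficients of my own choosing:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Model A: U is a genuine confounder (U -> X, U -> Y). True effect of X on Y: 1.
u_a = rng.normal(size=n)
x_a = u_a + rng.normal(size=n)
y_a = x_a + u_a + rng.normal(size=n)

# Model B: U is a common effect (X -> U <- Y). True effect of X on Y: 1.
x_b = rng.normal(size=n)
y_b = x_b + rng.normal(size=n)
u_b = x_b + y_b + rng.normal(size=n)

def slope(y, *covs):
    """OLS coefficient of the first covariate (with intercept)."""
    X = np.column_stack(covs + (np.ones(len(y)),))
    return np.linalg.lstsq(X, y, rcond=None)[0][0]

# In BOTH models, U is dependent on X and on Y, so the statistical criterion
# issues the same verdict.  Yet the correct analyses are opposite:
print(slope(y_a, x_a))        # biased (about 1.5): adjustment is needed
print(slope(y_a, x_a, u_a))   # about 1.0: adjusting for U removes the bias
print(slope(y_b, x_b))        # about 1.0: no adjustment needed
print(slope(y_b, x_b, u_b))   # badly biased: adjusting for U creates bias
```

In Model A adjustment is mandatory; in Model B it is fatal; the criterion, being verifiable from the joint distribution alone, cannot distinguish the two.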

The same argument applies to the concepts of "randomization" and "instrumental variables" (ironically, Reviewer #1 states, in a typical authoritative posture: "Randomization, instrumental variables, and so forth have clear statistical definitions"). He who recognizes that these concepts must contain causal information will be able to isolate and explicate the causal assumptions underlying studies based on these concepts. In contrast, he who thrives on blurring the causal-statistical distinction (e.g., Reviewer #1) will seek no such explication, and will continue to await the miracle of disambiguation ("a more subtle point that would need more careful development").

I hope that my revised manuscript manages to demonstrate, as you have requested, "how the usual analyses fail" and how the proposed demarcation line can help protect investigators from this and other blunders (and there are many others).

Next Discussion (Hautaniemi: Zadeh's `CAUSALITY IS UNDEFINABLE')