Dichotomization

Datamethods Discussion Forum [Unofficial] June 7, 2026

The practice of responder analysis seems to have become embedded in regulatory agency guidances in the late 1990s and early 2000s. Subsequent publications introducing the concept of “Minimally Clinically Important Difference” (MCID) probably caused it to become entrenched even further.

271674233_Responder_analyses-_A_PhRMA_position_paper

https://pmc.ncbi.nlm.nih.gov/articles/PMC2164942

Many papers critical of responder analysis (and the outcome dichotomization that is performed in its service) cite the loss of power due to dichotomization and the arbitrary nature of definitions of treatment “response” as the practice’s main flaws. But it seems like what’s needed, most fundamentally, in order to abolish responder analysis, is an explanation, understandable to both statisticians and non-statisticians, of the fact that the entire practice hinges on causally invalid assumptions. Once this point is widely understood and internalized, all the other arguments against outcome dichotomization, centred on statistical inefficiency, will become moot (?)

I realize that some prominent experts have tried (valiantly and repeatedly) to convey this message, but it clearly hasn’t been understood by those with decision-making power.

This article by Guyatt et al. was published in BMJ in 1998 [1]:

https://pmc.ncbi.nlm.nih.gov/articles/PMC1112685/

Some excerpts:

“…Clinicians and investigators tend to assume that if the mean difference between a treatment and a control is appreciably less than the smallest change that is important, then the treatment has a trivial effect. This may not be so…”

“…Consider a situation in which 25% of the treated patients improved by a magnitude of 1.0, while the other 75% did not improve at all (mean change of 0). This would mean that the 25% of those treated obtained a moderate benefit from the intervention…”

“…We have developed a method for estimating the proportion of patients who benefit from a treatment when the outcome is a continuous variable. We outline this method using two examples, one a crossover trial and the other a parallel group design…”

“…We reasoned that the number of patients who had obtained important benefit from treatment would be the number with a difference of 0.5 or more favouring the treatment period, minus the number with a difference of 0.5 or more favouring the control period. This measure is analogous to the conventional risk difference, with 1 divided by the difference in risk being the number needed to treat…”

“…Once investigators have excluded chance as an explanation for differences between groups they can examine the proportions of patients who have deteriorated, remained the same, or improved as an aid in interpreting the importance of the results…”

“…This approach emphasises the need to establish ranges of health related quality of life, symptoms, and functional status questionnaire changes that represent trivial, small but important, moderate, and large changes. When they understand these ranges, investigators reporting clinical trials should present not only mean differences but also the difference in the proportion of patients who experience important improvement, and the associated number needed to treat…”
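To make the arithmetic in these excerpts concrete, here is a minimal sketch of the crossover calculation as I read it: count the patients whose treatment-minus-control difference is at least 0.5, subtract those whose difference is at least 0.5 in the other direction, divide by the number of patients, and take the reciprocal as the number needed to treat. The function names and the ten example differences are hypothetical and are not taken from the paper.

```python
def proportion_benefiting(differences, threshold=0.5):
    """Net proportion with an 'important' difference favouring treatment.

    differences: per-patient (treatment period - control period) scores,
    where positive values favour treatment.
    """
    n = len(differences)
    favour_treatment = sum(d >= threshold for d in differences)
    favour_control = sum(d <= -threshold for d in differences)
    return (favour_treatment - favour_control) / n


def number_needed_to_treat(net_proportion):
    """NNT as the reciprocal of the net proportion (undefined at zero)."""
    if net_proportion == 0:
        raise ValueError("NNT is undefined when the net proportion is zero")
    return 1 / net_proportion


# Hypothetical crossover differences for 10 patients (not data from the paper).
example_differences = [1.0, 0.75, 0.5, 0.25, 0.0, 0.0, -0.25, 0.5, -0.5, 1.25]

net = proportion_benefiting(example_differences)
nnt = number_needed_to_treat(net)
print(f"net proportion = {net:.2f}, NNT = {nnt:.1f}")
```

Note that the calculation only counts observed differences. It says nothing about whether any individual difference was caused by the treatment, which is the point Senn takes up below.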

The article triggered a blunt letter to the editor from Stephen Senn in response [2]:

https://pmc.ncbi.nlm.nih.gov/articles/PMC1113763/

“Guyatt et al’s proposal for analysing randomised trials is misguided, flies in the face of elementary statistical theory, and should be resisted…”

“…Guyatt et al have implicitly assumed that which of two treatments is better for a patient can be determined by comparing one period of treatment on each…”

The above link also shows a rebuttal from the authors. Some excerpts:

“…Contrary to Senn’s interpretation, we do not propose deciding on which individual patients in the trial benefit but rather the overall proportion who obtained a particular magnitude of benefit.”

“…Senn’s logic fails when he argues that nothing from the two clinical trials is inconsistent with the theory that all patients benefitted equally. Quite the contrary, the key is that randomising patients to treatment and control and aggregating results across patients permits independent estimates of the main effect of treatment and the other three sources of variance…”

“…In our previous work on n of 1 clinical trials we recommend multiple periods of treatment and control in order to establish the efficacy for individual patients. The principle is the same—multiple observations, whether from a single patient or multiple patients, permit separate estimation of the main treatment effect and other sources of variation…”

As far as I can tell, Senn is objecting, forcefully, to the birth of the practice of “responder analysis.” But the language in the dialogue is centred on the partitioning of variances and is beyond my understanding as a physician. Guyatt et al acknowledge that assessing therapeutic effect in an individual patient (i.e., assessing whether a drug has “caused” a particular change in his clinical status) requires that we observe multiple crossover periods (using an “n-of-1” design). But they simultaneously assert that it’s possible to determine the proportion of patients within a treatment group who “benefit,” even without a crossover design (?) The word “benefit” is unambiguously causal and their meaning here is clear: “benefit” = “response” = “causality.” Senn disagrees with Guyatt’s assertion. He seems to be saying that the barriers to causal inference, when assessing the individuals who make up a group exposed to a single period of treatment, are the same as the ones we would face when assessing causality for a single patient exposed to a single period of treatment.

Proportions of a treatment arm are made up of individuals. And without observing multiple periods of exposure, we can’t (with certain exceptions; see “Is this patient a ‘responder’? Essential clinical considerations”) infer that any particular trial patient “benefitted” from treatment. As noted previously in this thread, “change from baseline” in a patient following a single period of treatment exposure does NOT necessarily reflect a “benefit” from treatment, since many factors can contribute to change from baseline in diseases with fluctuating natural histories. And if we can’t infer that a given patient’s change from baseline, within a treatment arm, represents a causal effect of the exposure, it follows that we also can’t infer that a change from baseline of any particular magnitude represents a causal effect of the exposure. It then follows that we can’t validly infer what “proportion” of patients within an RCT treatment arm achieved various degrees of benefit (?)
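One way to see the problem is a small simulation. This is a sketch under assumptions of my own choosing, not anything taken from Guyatt or Senn: every simulated patient gets exactly the same true treatment effect (0.3, below the 0.5 “importance” threshold), and the only other thing varying is within-patient noise from one period to the next. The parameter values, the normal noise model, and the function name are all hypothetical.

```python
import random


def simulate_crossover(n_patients=10_000, true_effect=0.3,
                       noise_sd=0.5, threshold=0.5, seed=0):
    """Net 'proportion benefiting' when every patient has the SAME true effect.

    Each patient is observed for one treated and one control period.
    The patient's stable baseline level cancels in the difference,
    so only the common effect and period-to-period noise are simulated.
    """
    rng = random.Random(seed)
    favour_treatment = 0
    favour_control = 0
    for _ in range(n_patients):
        treated = true_effect + rng.gauss(0, noise_sd)  # identical effect + noise
        control = rng.gauss(0, noise_sd)                # noise only
        diff = treated - control
        if diff >= threshold:
            favour_treatment += 1
        elif diff <= -threshold:
            favour_control += 1
    return (favour_treatment - favour_control) / n_patients


threshold = 0.5
true_effect = 0.3
estimated = simulate_crossover(true_effect=true_effect, threshold=threshold)

# By construction, no patient's causal effect reaches the threshold.
true_proportion = 1.0 if true_effect >= threshold else 0.0

print(f"estimated net proportion 'benefiting': {estimated:.2f}")
print(f"true proportion with effect >= {threshold}: {true_proportion:.2f}")
```

Under these assumed parameters the net proportion comes out at roughly one quarter, even though by construction nobody has a causal effect of 0.5 or more and everybody benefits by the same 0.3. A single period on each treatment cannot distinguish “a quarter of patients benefit a lot” from “everyone benefits a little, plus noise,” which is how I understand Senn’s objection.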

There must be some way to express these ideas in the language of causal inference epidemiology (?) If so, could such a presentation be made understandable to a lay audience (?)

1. Guyatt GH, Juniper EF, Walter SD, Griffith LE, Goldstein RS. Interpreting treatment effects in randomised trials. BMJ. 1998 Feb 28;316(7132):690-3. doi: 10.1136/bmj.316.7132.690. PMID: 9522799; PMCID: PMC1112685.
2. Senn S. Applying results of randomised trials to patients. N of 1 trials are needed. BMJ. 1998 Aug 22;317(7157):537-8. PMID: 9712612; PMCID: PMC1113763.