Collider in RCT Subgroup Analysis
Datamethods Discussion Forum [Unofficial]
May 31, 2026
This thread raises a problem for researchers working on predictive versus prognostic biomarkers. Choosing the statistical model requires researchers to commit to a mechanistic position before touching the data. The field largely doesn't do this: it fits models, finds significance, and then constructs the biological narrative post hoc. In practice, this means the choice between an additive model and one with a product (interaction) term is often made on statistical grounds, by whichever fits better, when it should be made on ontological grounds before the analysis begins. In short, a prognostic biomarker must be orthogonal to the treatment mechanism, while a predictive biomarker must be entangled with it. That mechanistic determination alone should guide the analytical decision, because neither tests of interaction nor replication are reliable guides. The interaction test cannot confirm genuine pathway entanglement, and replication confirms consistency, not mechanism. Moreover, predictive biomarkers are always also prognostic, so an additive signal is always present and the analytic result is ontologically inert with respect to the distinction. The choice of model, additive versus product term, therefore cannot be derived from the data itself. The data will fit both to varying degrees and won't tell us which is correct.
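To make the additive vs. product term contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the simulated trial, the coefficient values, and the helper names (`simulate_trial`, `fit_ols`, `compare_models`) are hypothetical and do not come from any real study. It fits both specifications to the same simulated data by ordinary least squares and reports the residual sum of squares for each.

```python
import numpy as np


def simulate_trial(n=2000, seed=0, interaction=0.0):
    """Simulate a two-arm RCT with one continuous biomarker (hypothetical values)."""
    rng = np.random.default_rng(seed)
    treat = rng.integers(0, 2, size=n).astype(float)  # randomized arm, 0 or 1
    marker = rng.normal(size=n)                       # biomarker, standardized
    noise = rng.normal(size=n)
    # Assumed data-generating model: main effects plus an optional product term.
    outcome = 1.0 + 0.5 * treat + 0.8 * marker + interaction * treat * marker + noise
    return treat, marker, outcome


def fit_ols(X, y):
    """Return least-squares coefficients and the residual sum of squares."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)


def compare_models(treat, marker, outcome):
    """Fit the additive and the product-term specification to the same data."""
    ones = np.ones_like(treat)
    X_additive = np.column_stack([ones, treat, marker])
    X_product = np.column_stack([ones, treat, marker, treat * marker])
    return {
        "additive": fit_ols(X_additive, outcome),
        "product": fit_ols(X_product, outcome),
    }


if __name__ == "__main__":
    for gamma in (0.0, 0.4):
        fits = compare_models(*simulate_trial(interaction=gamma))
        coef_prod, rss_prod = fits["product"]
        _, rss_add = fits["additive"]
        print(f"true product coefficient = {gamma}")
        print(f"  additive RSS           = {rss_add:.1f}")
        print(f"  product-term RSS       = {rss_prod:.1f}")
        print(f"  estimated product coef = {coef_prod[3]:.3f}")
```

Because the additive model is nested in the product-term model, the product-term fit can never have a larger in-sample residual sum of squares, whatever generated the data. That is one way to see why "whichever fits better" says nothing about mechanism: the sketch only shows that both models can be fitted and compared numerically, not that either comparison settles the prognostic vs. predictive question.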