Collider in RCT Subgroup Analysis

Datamethods Discussion Forum [Unofficial] May 18, 2026
As a clinician/researcher for 40 years, one of the most important things I have learned is to distinguish real biological data-generating processes (BDGPs) from synthetic data-generating processes (SDGPs). An SDGP is a gate-generated analytical construct that does not correspond to a single coherent biological causal system. Your paper is provocative and deeply insightful because, read within the SDGP framework, it appears to describe the creation of a post-treatment SDGP. Traditional cause-agnostic syndrome RCTs create SDGPs upstream through consensus enrollment gates (e.g., sepsis, ARDS), whereas under your framework treatment-responsive subgrouping appears to create SDGPs downstream by conditioning on treatment-associated pathway states or biomarker responses. In both cases, the analytical population is generated by conditioning on a gate rather than by identifying a coherent biological causal system. The resulting estimands become structurally unstable, composition-dependent quantities that may vary or even reverse sign despite unchanged underlying biology. Thus syndrome disease-mixing and post-treatment response grouping may represent parallel manifestations of synthetic gate conditioning occurring at different temporal locations within the causal structure. An important implication is that pre-specification may not rescue a subgroup analysis if the subgroup itself constitutes an SDGP: pre-specification does not eliminate the structural instability arising from synthetic conditioning.
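To make the composition-dependence point concrete, here is a minimal arithmetic sketch of the upstream (enrollment gate) case only. Everything in it is an assumption for illustration: the two etiologies "A" and "B", their risk differences, and the function name `pooled_effect` are hypothetical and are not taken from the paper or from any trial. It shows a pooled estimand changing sign when only the case mix admitted by the gate changes, while each etiology's effect stays fixed.

```python
# Hypothetical per-etiology risk differences (treated minus control).
# Negative means benefit, positive means harm. Invented numbers, illustration only.
EFFECT_BY_ETIOLOGY = {"A": -0.10, "B": 0.05}


def pooled_effect(share_a: float) -> float:
    """Risk difference in a gated population where a fraction share_a is etiology A.

    The per-etiology effects never change; only the mixture admitted by the gate does.
    """
    if not 0.0 <= share_a <= 1.0:
        raise ValueError("share_a must be between 0 and 1")
    share_b = 1.0 - share_a
    return share_a * EFFECT_BY_ETIOLOGY["A"] + share_b * EFFECT_BY_ETIOLOGY["B"]


if __name__ == "__main__":
    # Same biology, two different case mixes passing the same enrollment gate.
    for share_a in (0.2, 0.6):
        effect = pooled_effect(share_a)
        print(f"share of A = {share_a:.1f}: pooled risk difference = {effect:+.3f}")
```

With these assumed numbers, the pooled risk difference is positive (net harm) when 20% of enrolled patients are etiology A and negative (net benefit) when 60% are, with the sign change at a share of one third. The downstream, post-treatment case would need a model of how treatment moves patients into the response group, which this sketch does not attempt.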
