{
  "$type": "site.standard.document",
  "bskyPostRef": {
    "cid": "bafyreifps4wgvdvpocixlitj4vw7m4goalpcpgjsn2rk3ewbybpn3todam",
    "uri": "at://did:plc:wwyqal4cnqhuwyacdj7rqq3n/app.bsky.feed.post/3mib56aaon3i2"
  },
  "path": "/t/thinking-clearly-about-association-studies-risk-factors-and-causal-salad-included/28679#post_5",
  "publishedAt": "2026-03-29T16:39:13.000Z",
  "site": "https://discourse.datamethods.org",
  "tags": [
    "@Pavlos_Msaouel"
  ],
  "textContent": "f2harrell:\n\n> An open question is whether association studies help or hurt the last two steps, i.e., whether empirical association analysis should inform the development of the causal diagram.\n\nThe circular logic required to answer this question always does my head in. If everyone agrees that DAG-free studies touting “associations” are insufficient for causal inference, how can we then justify _using_ these same studies as “evidence” to inform construction of a DAG for a _subsequent_ observational study on the same topic?\n\nPublished associational studies (often derived from administrative databases) are frequently DAG-free and weakly supported by biological plausibility. Potential _non_-causal explanations for the findings can be easy to identify. There is a real risk of harm to patients from this type of research. Often, authors gaslight readers by saying that their study isn’t capable of proving causation, while simultaneously either implying, or outright advising, that readers act on the results.\n\n@Pavlos_Msaouel has been using DAGs elegantly to _inform design of his oncology RCTs_. The DAGs in his papers seem, clinically speaking, eminently sensible. I’d be interested to hear how he develops them.\n\nI’ve asked myself why Pavlos’ concise DAGs, applied for the purpose of _improving RCT design_, seem so much more credible to me (as a physician) than the few DAGs that I’ve seen constructed for the purpose of deriving causal inferences from observational studies (which look, to me, like hopelessly subjective birds’ nests). I thought of a few possible explanations. First, I suspect that the evidence underpinning _his_ DAGs is of a much higher volume and rigour, as necessitated by the high-stakes nature of the drug development gauntlet. Second, it occurred to me that _his_ incentives, and the incentives of _drug sponsors_, are very different from those of academic observational researchers. 
As non-scientific as this might sound, it’s human nature for us to factor in the researchers’ incentive structure when gauging the credibility of published research. And finally, the _consequences_ of him getting the DAG “wrong” will work _against_ him, not in his _favour_ (as can potentially be the case for sloppily constructed DAGs in the _observational_ research realm).\n\nDifferent types of researchers can be differentially incentivized to construct DAGs in the design phase of their studies. In using DAGs to inform oncology RCT design, it seems like Pavlos is trying to optimize _assay sensitivity_ (the ability of his trials to detect signals of intrinsic therapeutic efficacy). If _his_ DAGs are unreliable/incorrect, the consequence is not likely to be “false” signals of therapeutic efficacy (which drug sponsors could then milk for profit), but rather failure to detect signals of therapeutic efficacy that are actually present (which would cause patients to lose out on potentially life-lengthening therapeutic advances). His incentive (and that of the drug sponsor) to both _use_ DAGs and to “get them right” is very strong.\n\nThe incentive structure around DAG construction seems more complicated for _observational_ researchers than for clinical trialists. An observational researcher who is motivated primarily by the prospect of contributing one strand to an evidentiary web for an important causal clinical question will, ideally, first survey subject matter experts to hone the clinical question and then construct a DAG. Only THEN should he examine his data source to see if he has the right data to address the question. If he is honest, he will likely OFTEN find that the administrative data at his disposal are insufficiently comprehensive/granular to inform the clinical question, and he will abandon his plan. 
In _this_ situation, construction of a solid DAG has effectively _penalized_ him _personally_ (as he won’t be able to publish and advance his career, and therefore feed his family). And even if he DOES make an effort to construct a DAG, there’s a good chance that the _quality_ of the DAG will be _inversely_ related to the likelihood of the research getting published (if the DAG contributes to a less eye-catching result). This incentive misalignment does not reflect personal failure on the part of the researcher, but rather a corrupted research ecosystem.\n\nThe “publish or perish” culture in academic research needs to be addressed before good research practices will take root. The onus is on funders not to fund badly designed research and on universities not to make promotion contingent on publication metrics.",
  "title": "Thinking Clearly about Association Studies (Risk Factors and Causal Salad included)"
}