
Clustering in the denominator: non-independence of starts in racing fatality studies

Datamethods Discussion Forum [Unofficial] March 31, 2026

trumanfrancis:

Any pointers to relevant literature or analogous problems in other fields would be very welcome.

James, your outlook on these problems often seems to be that of a reviewer of a paper, which I think puts you at a disadvantage, much like someone trying to untie the Gordian Knot. Better simply to cut through the thing!

The whole problem set-up here reminds me of the efficient-market hypothesis. There are (I am guessing) many well-informed actors, such as trainers, vets, and jockeys, making decisions about when, whether, and how to race a horse. These decisions also draw on private information, including, I would suppose, even a trainer’s ‘gut feel’. So the risks a statistician might hope to detect have already been thoroughly ‘priced in’.

One way to cut the knot would be to posit a rational-actors model, and attempt to relate the risk of fatal MSI to the (time-dependent) economic value of the horse to its owners, who would be viewed as solving some kind of stochastic optimization problem.
