
Relaxing Assumptions and Targeted Estimands with MOST

Datamethods Discussion Forum [Unofficial] May 26, 2026

Thanks for the follow-up, Johannes.

I think this is exactly my point (although, to be fair, my point is still a question). I don’t yet have a criterion in mind for MOST being “faithful enough” to Y, but I’m sure this can be debated on its own. What I had in mind is your last concern: should we set MOST aside if we are only interested in something local (e.g., time spent in a specific state)? My gut feeling is no, even though your example challenges that. I still believe that in general there is value in having an overall model (the one Frank would like to be flexible enough, yet not so flexible as to overfit) even if your question revolves around something local. I can see two reasons for wanting a global model:

1. We are never interested in only one local quantity; we are interested in several things that could all be derived from a general MOST model. This does not directly answer the question, but it puts things in perspective.
2. Even if we looked at one question only, I believe the structure that MOST imposes through the model is in general beneficial in terms of the bias-variance trade-off. Though, again, your example challenges this.

To put a bit of “meat” on my thinking, though this is debatable of course, I was thinking about having one parent MOST model (fine-tuned from a global perspective) that would also be (hyper)parametrized by one or several quantities I could then tune differently for the different local questions I would be asking. Off the top of my head, something in the spirit of a fused Lasso for the partial PO with respect to outcome levels (a penalty encouraging adjacent coefficients to be equal). Perhaps optimizing the hyperparameter(s) would then lead to different child MOST models, each more tailored to a local question (e.g., for time in state k, I will likely end up with PO for states around that one and then no PO for states further away). But, to complicate things and have fun, I would also constrain the amount of fine-tuning relative to the parent model (because of overfitting concerns, and multiplicity considerations if several child models end up being quite similar). Note that this is very similar to Frank’s second suggestion in his last message; I’m just being stubborn by sticking to a ‘parent’ model rather than, as he suggested, using two models and choosing in a principled manner the one that is most appropriate.
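
To make the parent/child idea a little more concrete, here is one possible way of writing it down. This is only a sketch in a simplified, non-longitudinal form (no previous state or time terms), and all of the notation is an assumption introduced for illustration: levels 0 to K-1 for Y, treatment indicator T, covariates X, level-specific treatment effects \tau_j, weights w_j^{(k)}, and tuning parameters \lambda and \gamma.

```latex
% Partial PO: treatment effect \tau_j may differ by outcome level j
\operatorname{logit} P(Y \ge j \mid X, T) = \alpha_j + X^\top \beta + T\,\tau_j,
  \qquad j = 1, \dots, K-1

% Parent model: fused-lasso penalty on adjacent \tau_j
% (full PO corresponds to \tau_1 = \dots = \tau_{K-1})
\hat\theta^{\text{parent}} = \arg\min_{\theta}\;
  -\ell(\theta) + \lambda \sum_{j=1}^{K-2} \lvert \tau_{j+1} - \tau_j \rvert

% Child model for local question k: question-specific weights w_j^{(k)},
% plus a penalty \gamma limiting departure from the parent fit
\hat\theta^{(k)} = \arg\min_{\theta}\;
  -\ell(\theta)
  + \lambda \sum_{j=1}^{K-2} w_j^{(k)} \lvert \tau_{j+1} - \tau_j \rvert
  + \gamma \sum_{j=1}^{K-1} \lvert \tau_j - \hat\tau_j^{\text{parent}} \rvert
```

Here \ell is the log-likelihood and \theta collects all parameters. How the weights w_j^{(k)} should depend on the distance between level j and the target state k is left open; that choice is exactly the part that would need to be tuned per question.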

Now, I’m intrigued by your example. Could you specify what is partial in the PPO model in the third column?

Note: to simplify a bit, my suggestion can first be put in a non-longitudinal setting. If we have an ordinal outcome, how can we approach several local questions (each about one specific outcome level) while still recognizing similarity in the treatment effect across outcome levels whenever that is appropriate?
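
For the non-longitudinal version, the ingredients can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a fitted model: the function names (`fused_penalty`, `cell_probs`, `penalized_nll`), the coding of Y as 0 to K-1, and the single binary treatment with no covariates are all hypothetical choices made for the example.

```python
import numpy as np

def fused_penalty(tau, lam):
    """Fused-lasso penalty: lam * sum_j |tau[j+1] - tau[j]|.
    It is zero exactly when all level-specific effects are equal (full PO)."""
    tau = np.asarray(tau, dtype=float)
    return lam * np.sum(np.abs(np.diff(tau)))

def cell_probs(alpha, tau, treat):
    """P(Y = k) for k = 0..K-1 under the cumulative logit model
    logit P(Y >= j) = alpha[j-1] + tau[j-1] * treat, j = 1..K-1.
    Assumes the parameters give non-increasing exceedance probabilities."""
    alpha = np.asarray(alpha, dtype=float)
    tau = np.asarray(tau, dtype=float)
    exceed = 1.0 / (1.0 + np.exp(-(alpha + tau * treat)))  # P(Y >= j)
    upper = np.concatenate(([1.0], exceed))  # P(Y >= 0), ..., P(Y >= K-1)
    lower = np.concatenate((exceed, [0.0]))  # P(Y >= 1), ..., P(Y >= K)
    return upper - lower

def penalized_nll(alpha, tau, y, treat, lam):
    """Negative log-likelihood plus the fused penalty on tau."""
    nll = 0.0
    for yi, ti in zip(y, treat):
        p = cell_probs(alpha, tau, ti)[yi]
        nll -= np.log(max(p, 1e-12))  # guard against log(0)
    return nll + fused_penalty(tau, lam)
```

Minimizing `penalized_nll` over `alpha` and `tau` for a given `lam` would give one candidate model; a child model for a local question would then reweight the terms inside `fused_penalty` by outcome level, which is not shown here.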
