Good reading on when "doubly robust" is worth the effort?
Not exactly
They try to estimate performance metrics of a prediction model under covariate shift and when the outcome is not available, kind of reminds me de-groot work:
PubMed
Adjusting for differential-verification bias in diagnostic-accuracy studies:...
In studies of diagnostic accuracy, the performance of an index test is assessed by verifying its results against those of a reference standard. If verification of index-test results by the preferred reference standard can be performed only in a...
I’m not sure if it is really likely to have a misspecified conditional loss or a misspecified weights for the shift, it seems more likely to me to have unstable performance metrics because of the weights.
Low propensity → high ipw → unstable performance metric. Ideally I would like to see all 3 variations of the correction (conditional loss, weighting, and doubly robust) and judge for my self.
Sounds reasonable?
Discussion in the ATmosphere