
Best model reduction approach for Fine-Gray prediction model

Datamethods Discussion Forum [Unofficial] March 27, 2026

Yes, as described here, data reduction (unsupervised learning) involves reducing the number of variables to model in a way that is completely masked to Y. That way you don’t create model uncertainty or overfit, and interpretation can be enhanced by not trying to separate collinear variables. My favorite default method is sparse PCA, for which there are two examples in those RMS notes. The one near the end of the ebook involves nonlinear sparse PCA.
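To make "masked to Y" concrete, here is a minimal Python sketch of the idea. It is not the code from the RMS notes: it swaps in a simplified soft-thresholded power iteration for the first sparse component, run on simulated predictors. The function name `sparse_pc1`, the `threshold` value, and the simulated data are all assumptions made for illustration. Note that no outcome variable appears anywhere in it.

```python
import numpy as np

def sparse_pc1(X, threshold=0.3, n_iter=100):
    """First sparse principal component via soft-thresholded power iteration.

    Simplified illustration only. X is an (n, p) predictor matrix; the
    outcome Y is never passed in. `threshold` is a hypothetical tuning
    value: larger values force more loadings to exactly zero.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize predictors
    R = np.corrcoef(Z, rowvar=False)           # predictor correlation matrix
    v = np.ones(R.shape[0]) / np.sqrt(R.shape[0])
    for _ in range(n_iter):
        w = R @ v
        # Soft-threshold: shrink every loading toward zero, zeroing small ones
        w = np.sign(w) * np.maximum(np.abs(w) - threshold, 0.0)
        norm = np.linalg.norm(w)
        if norm == 0.0:
            raise ValueError("threshold too large: all loadings are zero")
        v = w / norm
    return v, Z @ v                            # loadings, component scores

# Simulated predictors (assumed data): x1-x3 share one latent factor and are
# collinear; x4-x6 are unrelated noise.
rng = np.random.default_rng(0)
n = 2000
latent = rng.normal(size=n)
block = latent[:, None] + 0.5 * rng.normal(size=(n, 3))
noise = rng.normal(size=(n, 3))
X = np.hstack([block, noise])

loadings, scores = sparse_pc1(X)
print(np.round(loadings, 3))
```

In this sketch the three collinear predictors collapse into a single score and the unrelated ones get zero loadings. The score, not the original variables, would then be a candidate predictor in the outcome model, so the outcome never influences which variables were combined.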

Variable selection methods (other than ones based solely on subject matter knowledge), even when smartly using shrinkage as in the lasso, are unlikely to be stable and will miss important variables.
