Is GVIF meaningful for a reduced interaction block created from manually coded factor × treatment terms?
I’m fitting a survival model with a multi-level factor (HistologyClass) and binary treatment variables (Radiotherapy, Chemotherapy). I do not want the full HistologyClass * Treatment interaction, because I only want a selected subset of clinically relevant histology-specific treatment interactions.
So instead of writing the full interaction in the formula, I manually create terms such as:
RT_Medulloblastoma = I(Radiotherapy == "Yes") * I(HistologyClass == "Medulloblastoma")
CT_Embryonal = I(Chemotherapy == "Yes") * I(HistologyClass == "Embryonal")
and then fit a model like:
Y ~ HistologyClass + Radiotherapy + Chemotherapy +
RT_Medulloblastoma + CT_Embryonal + ...
My question is specifically about GVIF, not significance testing or whether to keep/drop terms.
Each manually coded interaction is just a separate 1 d.f. column, but collectively they seem to form a reduced multi-d.f. interaction block. Is it mathematically valid to compute a GVIF for that reduced block? If so, how does one compute it?
I’m using rms::orm. There isn’t a function for calculating GVIF in rms, and the car package doesn’t accept orm models, so I am planning to write it out myself.
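For concreteness, here is a minimal sketch of what I have in mind, based on the Fox and Monette determinant formula GVIF = det(R11) * det(R22) / det(R), where R is the correlation matrix of the coefficient estimates, R11 is the sub-block for the terms of interest, and R22 is the sub-block for everything else. The helper name gvif_block is hypothetical, and the toy data and lm fit below are stand-ins only. For an orm fit I am assuming the same function could be applied to the coefficient covariance matrix once the intercept rows and columns are dropped.

```r
# Hypothetical helper: GVIF for an arbitrary block of coefficient columns.
# V     : covariance matrix of the non-intercept coefficients (with dimnames)
# block : names of the coefficients that form the block
gvif_block <- function(V, block) {
  stopifnot(is.matrix(V), !is.null(colnames(V)), all(block %in% colnames(V)))
  R   <- cov2cor(V)                  # correlation matrix of the estimates
  idx <- colnames(R) %in% block
  stopifnot(any(!idx))               # need at least one coefficient outside the block
  # log-determinants, to avoid under/overflow with many columns
  ld <- function(M) as.numeric(determinant(M, logarithm = TRUE)$modulus)
  gvif <- exp(ld(R[idx, idx, drop = FALSE]) +
              ld(R[!idx, !idx, drop = FALSE]) - ld(R))
  df <- sum(idx)
  c(GVIF = gvif, Df = df, "GVIF^(1/(2*Df))" = gvif^(1 / (2 * df)))
}

# Toy illustration with simulated data and lm (NOT my real data or model)
set.seed(1)
n <- 400
d <- data.frame(
  hist = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  rt   = rbinom(n, 1, 0.5),
  ct   = rbinom(n, 1, 0.5)
)
d$rt_B <- d$rt * (d$hist == "B")     # manually coded interaction columns
d$ct_C <- d$ct * (d$hist == "C")
d$y    <- rnorm(n)

fit <- lm(y ~ hist + rt + ct + rt_B + ct_C, data = d)
V   <- vcov(fit)[-1, -1]             # drop the intercept row and column

# GVIF for the reduced 2 d.f. interaction block
res <- gvif_block(V, c("rt_B", "ct_C"))
print(res)
```

For a single column this reduces to the ordinary VIF, which is at least a sanity check on the arithmetic. What I am unsure about is whether the block version is interpretable when the block is only a subset of the full interaction.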