I'm developing a logistic regression model for prediction. Based on previous literature, I have pre-selected 15 candidate predictors (a number my ~200 events can support).
Now I want a reduced, more parsimonious model for ease of use, and I want to obtain it by model approximation (as suggested in RMS by Harrell). I will need to present a multivariable summary table of the predictors retained by the approximation. When reporting P-values for each coefficient in the reduced model, what should I use? For example, should I re-estimate the reduced model against the outcome?
Sample R code:
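library(rms)
# Linear predictor (log odds) from the full 15-predictor model
lp <- predict(fullmodel)
# Approximate lp by OLS on all candidate predictors (names in ivars)
af <- as.formula(paste0("lp ~ ", paste0(ivars, collapse = "+")))
# sigma = 1 is set because the fit to lp is exact
a <- fit.mult.impute(af, ols, sigma = 1, xtrans = derivation, x = TRUE, y = TRUE)
# Very large aics so that backward step-down removes every predictor in turn
s <- fastbw(a, aics = 10000)
betas <- s$Coefficients          # one row of coefficients per deletion step
X <- cbind(1, fullmodel$x)       # needs fullmodel fitted with x = TRUE
ap <- X %*% t(betas)             # approximate lp, one column per step
m <- ncol(ap) - 1                # skip the last (intercept-only) step
r2 <- frac <- numeric(m)
fullchisq <- fullmodel$stats['Model L.R.']
Labels <- attr(s$result, "dimnames")[[1]]   # deleted variables, in order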
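for (i in 1:m) {
  lpa <- ap[, i]
  # R^2 of the approximation against the full-model linear predictor
  r2[i] <- cor(lpa, lp)^2
  # Refit the outcome on the approximate lp alone
  fapprox <- fit.mult.impute(futility ~ lpa, lrm, xtrans = derivation, x = TRUE, y = TRUE)
  # Fraction of the full model's LR chi-square that is retained
  frac[i] <- fapprox$stats['Model L.R.'] / fullchisq
}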
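
For readers who want to run the approximation step without my data, here is a minimal sketch of the same idea on simulated data. It is only an illustration under stated assumptions: it uses base R's glm and lm in place of the rms fitters, skips multiple imputation and the fastbw step-down, and the variable names (x1 to x4, y) are hypothetical.

```r
# Simulated stand-in for the derivation data (hypothetical variables)
set.seed(1)
n <- 500
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n), x4 = rnorm(n))
d$y <- rbinom(n, 1, plogis(-1 + 1.0 * d$x1 + 0.7 * d$x2 + 0.1 * d$x3))

# Full logistic model and its linear predictor (log-odds scale)
full <- glm(y ~ x1 + x2 + x3 + x4, family = binomial, data = d)
d$lp <- predict(full, type = "link")

# OLS of lp on every predictor reproduces lp exactly
approx_all <- lm(lp ~ x1 + x2 + x3 + x4, data = d)
r2_all <- cor(fitted(approx_all), d$lp)^2

# Reduced approximation: OLS of lp on a subset of the predictors
approx_red <- lm(lp ~ x1 + x2, data = d)
d$lpa <- fitted(approx_red)
r2_red <- cor(d$lpa, d$lp)^2   # how well the subset tracks the full lp

# Fraction of the full model's LR chi-square kept by the approximate lp
lr_chisq <- function(fit) fit$null.deviance - fit$deviance
refit <- glm(y ~ lpa, family = binomial, data = d)
frac <- lr_chisq(refit) / lr_chisq(full)
```

Here r2_red and frac play the same roles as r2[i] and frac[i] in the loop above, for a single hand-picked subset rather than a full step-down path.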