I've been reading some posts about data imputation using multiple imputation, specifically the MICE R package. I get the main idea of creating multiple datasets with imputed data. The part that is not clear to me is the linear regressions + pooling results. At the end I'd do something like this:
# data is assumed to be the mids object returned by mice()
modelFit1 <- with(data, lm(var3 ~ var1 + var2))
pool(modelFit1)
summary(pool(modelFit1))
The summary shows some coefficient estimates. Should I use those to predict the final values for the missing data (var3)? If that's correct, what should I do if I also want to impute data for var2? Can I use var3 and var1 as predictors (which seems kind of circular)?
with(data, lm(var1 ~ 1)) and then pool (the same for var2 and var3). Another possibility is to use the function pool.scalar. The help file of pool.scalar has an example of how to estimate the mean with the imputed datasets.
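Here is a minimal sketch of both routes. It assumes the nhanes example data that ships with mice stands in for your data, and its bmi column for var1. The object names (imp, fit, completed, Q, U, ps) are only illustrative.

```r
library(mice)

# Assumption: nhanes (bundled with mice) stands in for your data, bmi for var1.
imp <- mice(nhanes, m = 5, seed = 1, printFlag = FALSE)

# Route 1: fit an intercept-only regression on each imputed dataset, then pool.
fit <- with(imp, lm(bmi ~ 1))
pooled_lm <- summary(pool(fit))

# Route 2: compute the mean and its sampling variance in each completed
# dataset, then combine them with pool.scalar.
completed <- lapply(seq_len(imp$m), function(i) complete(imp, i))
Q <- sapply(completed, function(d) mean(d$bmi))           # per-dataset means
U <- sapply(completed, function(d) var(d$bmi) / nrow(d))  # per-dataset variances of the mean
ps <- pool.scalar(Q, U, n = nrow(nhanes))

pooled_lm$estimate  # pooled mean from route 1
ps$qbar             # pooled mean from route 2
sqrt(ps$t)          # pooled standard error from route 2
```

Both routes average the per-dataset means, so the two pooled estimates should agree. The pooled coefficients summarise the analysis across the imputed datasets. They are not used to fill in the missing values, because mice() has already done that in each completed dataset.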