Multilevel multiple imputation in practice using R [closed]

Asked 2 years, 3 months ago

Modified 2 years, 3 months ago

Viewed 173 times

Closed. This question needs to be more focused. It is not currently accepting answers.

Closed 9 months ago.

I'm working on a project in which I want to handle missing data with multiple imputation (MI). The data are longitudinal healthcare data with 16 time points and two levels of nesting: each row is one therapy at one time point (so each therapy ID appears 16 times unless the therapy was censored), and therapies are nested within persons (a person can contribute several therapies to the data set). The analysis itself is fully planned, but I'm struggling to set up an imputation that respects this structure, and I'm only somewhat familiar with MI theory. Briefly, about 10 variables need imputation. I want to impute each of them using fixed effects for $t$, random intercepts for my two ID variables, and about 20 other variables as predictors. Treatment and outcome need no imputation; only confounders do. However, I run into problems with each of the R packages I have tried:

- MI using MICE never worked. All my imputation models using the multilevel methods failed to converge, even with a single predictor. I'm unsure why, and I've given up on it for now, even though I would have preferred FCS MI.
- JOMO takes an eternity to run. I tried a small test run with a reasonable imputation model on my ~180,000 observations, but it had not even finished 100 burn-in iterations after more than an hour.
- Imputation with panImpute from the mitml package is reasonably fast, and the models converge. What worries me is that pan imputes variables by drawing from a multivariate normal model, whereas most of my variables are either heavily skewed (e.g., they have floor effects that prevent me from transforming them appropriately) or binary/categorical. I've seen some studies stating that imputation performance can still be reasonable under these conditions, but I'm wary of just going with it and hoping that the imputation model ends up being good.
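To make the pan concern concrete, here is a minimal sketch of the kind of check that could be run after `panImpute`: it counts how often the normal-model imputations fall below a known floor. Everything in it is an assumption for illustration, not the actual data or model: the data are simulated, the names `therapy_id`, `time`, `x`, and `y` are hypothetical, time enters linearly, and only a single clustering variable is used.

```r
library(mitml)

set.seed(1)

# Simulated stand-in data (all names hypothetical): 50 therapies x 16 time points
n_id <- 50
n_t  <- 16
dat <- data.frame(
  therapy_id = rep(seq_len(n_id), each = n_t),
  time       = rep(seq_len(n_t), times = n_id)
)
u <- rnorm(n_id)              # therapy-level random intercepts
dat$x <- rnorm(nrow(dat))     # fully observed covariate

# Skewed variable with a floor at 0
dat$y <- pmax(0, 0.5 + 0.5 * dat$x + u[dat$therapy_id] + rnorm(nrow(dat)))

# Set 150 values to missing (rows are already sorted by therapy_id,
# so these row indices still line up after imputation)
miss <- sample(nrow(dat), 150)
dat$y[miss] <- NA

# Imputation model: y on x and time, with a random intercept per therapy
fml <- y ~ 1 + x + time + (1 | therapy_id)
imp <- panImpute(dat, formula = fml, n.burn = 500, n.iter = 50, m = 5,
                 seed = 123, silent = TRUE)

# List of the m completed data sets
implist <- mitmlComplete(imp, print = "all")

# Share of imputed y values below the floor of 0, per completed data set
below_floor <- sapply(implist, function(d) mean(d$y[miss] < 0))
print(round(below_floor, 2))
```

A large share of imputed values below the floor would indicate that the normal model is producing implausible values for that variable; a small share says nothing by itself about whether the downstream estimates are biased.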

What I'd like to ask is:

1. Has anyone else encountered the same problems when using MICE for multilevel imputations?
2. Would you trust MI datasets imputed using pan when barely any of the imputed variables follow a normal distribution?
3. Are there other packages I should take a look at? I know about Amelia, but I'm not sure if it has any advantages over pan.

I'd appreciate help with any of these questions. Thanks!

asked Aug 24, 2023 at 15:47

Malik

Please edit the question to limit it to a specific problem with enough detail to identify an adequate answer.
– Community Bot, commented Aug 24, 2023 at 15:51
