Questions tagged [multiple-imputation]
Use this tag for questions involving multiple imputation, which refers to a set of stochastic imputation routines aimed at preserving the multivariate features of the data.
557 questions
1
vote
1
answer
66
views
The impact of the difference between the image of a random variable and clinical practice in a clinical trial
In clinical trial design, we often make assumptions about clinical endpoints. For example, we may assume that blood pressure follows a normal distribution, such as $ X \sim N(\theta, \sigma^2)$, $...
2
votes
1
answer
386
views
Importing a multiply imputed data set from SPSS into Mplus
I'm developing a scale for my dissertation project. I need help with how to handle missing data before I start the factor analysis procedure.
Recently I ran a multiple imputation in SPSS (20 ...
3
votes
1
answer
99
views
What is the compatibility of imputer and analyst models?
In a paper by Atem et al. (2018; DOI: 10.1002/bimj.201800275), the authors claim the following in section 3 regarding the so-called "imputer/imputation model" - i.e. the model used to impute ...
1
vote
1
answer
304
views
How to handle missing data in my situation?
I have two variables that I intend to use for creating prediction models for which I'm unsure how to handle missing values. The reason is that both are separated into multiple columns.
All ...
1
vote
0
answers
49
views
Different distributions - original and imputed values
I have performed chained multiple imputation in Stata and performed 2 robustness checks to compare the distributions of the original and imputed values:
Visual exploration
Kolmogorov-Smirnov test of ...
1
vote
0
answers
218
views
Pooling SHAP values from multiple imputed data
I have multiple imputed data and will be conducting an identical lightGBM model with the same input features in each of the imputed datasets. My aim is to calculate SHAP values (SHapley Additive ...
1
vote
0
answers
78
views
How does one deal with missing data in unbalanced longitudinal data? And what about the multivariate case?
For balanced longitudinal data, the missing data can be handled by multiple imputation in wide format where I assume MAR.
However, for unbalanced longitudinal data (i.e. the number of measurements varying ...
0
votes
0
answers
308
views
Questions regarding work with multiple imputation data in SPSS
I used the multiple imputation function integrated in SPSS (method: auto [meaning Markov Chain Monte Carlo or in case of monotonicity SPSS reverts to Monotone]; 5 imputations) and now I'm running into ...
0
votes
0
answers
119
views
How to choose the best converging imputations for my CFA model and pool them
I have 70 imputations of my original data set. I want to choose 50 of them which converged after less than 120 iterations on my CFA model:
...
0
votes
0
answers
466
views
Jump to Reference (J2R) Imputation with only Baseline for R
I have a data set with baseline and t1 values in an RCT. Is a Jump to Reference (J2R) imputation possible here? The implementations I have found so far all require t2 values next to t1, which I do not ...
0
votes
0
answers
40
views
Find consumption values based on previous data in R
I have the value of production waste (m3/day) and also the consumption (kwh/day) of some locations. You can see in the output ...
1
vote
1
answer
85
views
Is it possible to use an existing non-linear relationship (GAM) to back-estimate one of the independent variables?
Hi, I have used a set of data to estimate the relationship between v0 (response variable) and v1-v6 (independent variables) through a non-linear model using GAM. The model is fitted with data between ...
3
votes
2
answers
831
views
Draw survival curves of 2 groups after multiple imputation on covariates
I wonder how to draw survival curves (Kaplan-Meier) when there is no missing information on the survival variables but on the stratification covariate.
For example, we know for all patients the follow-...
2
votes
1
answer
831
views
Multiple imputation when doing a Cox model
I'm struggling when trying to understand some aspects of multiple imputation when intending to do Cox regressions with my data.
First of all, my dataset is not adapted for survival analysis yet. I ...
3
votes
1
answer
414
views
Data points for some control variables missing in regression - still feasible?
I'm currently working on an event study to examine abnormal returns.
In the first step, I've calculated abnormal returns in regards to a certain type of company event, consisting of roughly 13,000 ...
2
votes
1
answer
659
views
Does it make sense to compare different imputation techniques?
My supervisor suggests that I impute the missing data in my dataset through various methods (namely complete cases, k-nearest neighbors, last observation carried forward and multiple imputation with ...
0
votes
1
answer
2k
views
Iterations in Multiple Imputation
Despite reading this other StatsExchange post, I am still struggling to understand what iterations do in multiple imputation, i.e. the parameter "maxit" in the mice() function.
My ...
1
vote
1
answer
75
views
Can I use alternative pooling technique after multiple imputation?
Research problem: Comparison of means (T-test, ANOVA) between 90 countries.
Analytical problem: I have a large data set of over 120 000 observations. Each observation was measured on 8 variables: 2 ...
1
vote
1
answer
616
views
Confirming that a cubic spline was fitted on imputed datasets (imputed by the mice package) and that the estimate is pooled based on Rubin's rules
I am fitting a restricted cubic spline (Cox proportional hazards model) after imputing 10 datasets using the mice package.
My variables are as follows:
Outcome: DM
Exposure: BMI
time to events: time
Covariates:...
0
votes
0
answers
252
views
Change in Correlation after Imputation of Missing Outcome Data for Machine Learning Prediction Task
I am attempting to generate a predictive model for a continuous outcome using machine learning models. However, some observations in the original dataset have missing outcome data (and missing ...
1
vote
1
answer
1k
views
How do I get the final mean value of imputed data?
I have run multiple imputations on my data and now need to export a final dataset from which I can calculate the mean of each imputed variable. How would one do this?
I can't pool the data straight ...
2
votes
0
answers
191
views
How to implement Rubin's Rules to assess model fit on imputed test data with continuous outcome? (e.g. RMSE and 95% CI)
I'm working on a project now which involves the use of multiple imputation while developing machine learning models (using a training/test split, ~7000 observations total) for a continuous outcome. I ...
3
votes
1
answer
99
views
Missingness of data due to network issues
I have a time-series dataset that has 120 missing rows due to consecutive network issues and I am trying to impute these values using MICE in Python. As the source of missingness is a total ...
0
votes
0
answers
88
views
Expectation-Maximization high missing rates and multiple variables
I know MICE can be used for imputation of multiple variables simultaneously. The expectation maximization approach (EM) can be used to impute missing data. Typically, one should only be using ...
1
vote
2
answers
534
views
I cannot understand the formula for between-imputation variance in multiple imputation
Multiple Imputation (MI) for estimating a desired statistic with missing data
Following Shafer (page 4) and Austin et al. (section "Analyses in the M imputed data sets"), ...
0
votes
1
answer
836
views
Choosing MICE multiple datasets
I have read and watched several tutorials about MICE. My confusion is about step 1: creating several copies of the original dataset and imputing different values in each copy. In some tutorials, I ...
1
vote
0
answers
59
views
Single/ multiple imputation in post-selection/-regularization context
Context of problem:
In some situations researchers face high-dimensional problems with $p > n$, where $p$ is the number of covariates to be considered in a regression model and $n$ is the sample ...
0
votes
1
answer
1k
views
Multiple Imputation for Predictors Only, Excluding Missing Outcome Data
I am working with a dataset containing ~300 predictors and ~3000 observations and building a predictive model using elastic net (and hoping to generalize to an external validation set). While the ...
1
vote
0
answers
926
views
Categorizing continuous variable after multiple imputation in mice
I am trying to create new variables after multiple imputation.
I have the following variables:
data: mydata
total number of observations=500
HDL: continuous (no missing values)
Physical activity: ...
1
vote
0
answers
60
views
How to perform multiple imputation in time series to ensure sensible values per subject id
I have a time series of binary outcomes (case yes/no) and several covariates. The goal is to estimate incidence and prevalence of the outcome. I would like to perform multiple imputation for both ...
1
vote
0
answers
480
views
Residual standard error for multiple imputed regression
How can I calculate the standard deviation of the residuals for a pooled regression? (The idea, formula and/or R code are all welcome.)
As far as I understand it is the standard deviation of the ...
2
votes
1
answer
275
views
Pooling Survreg Results Across Multiply Imputed Datasets - Warning: log(1 - 2 * pnorm(width/2)) : NaNs produced
I am trying to run an interval regression using the survival R package (as described here https://stats.oarc.ucla.edu/r/dae/interval-regression/), but I am running into difficulties when trying to ...
0
votes
0
answers
72
views
Developing a flexible estimation strategy for longitudinal data with heavy clustering, covariates with missings and interval censoring
This is a general question for R users who are familiar with interval censoring in survival analyses. I have clinical registry data at hand and aim to compare the one-year incidence for two groups (...
1
vote
1
answer
1k
views
How do we check MCAR, MAR in general?
Most imputation methods require either MAR or MCAR. How do we check the MAR or MCAR assumption in general?
In "How to check missing data is missing at random or not?", Turgeon said $H_0:MCAR$ ...
1
vote
0
answers
80
views
Pooling Profile Penalised LTRs in multiple imputation
I am analyzing data from a clinical trial.
I used multiple imputation to impute the (binary) outcome variable, which is the only variable with missing data.
All of the covariates are categorical and ...
0
votes
1
answer
753
views
How to perform a restricted cubic spline analysis (Cox adjusted) after multiple imputation with mice?
Hello, I would like to perform a restricted cubic spline analysis (Cox adjusted) after multiple imputation with mice. I use the rms package, but after imputation, when I use the function ...
1
vote
0
answers
991
views
Missing values after multiple imputation
Update: It looks like I misspecified my prediction matrix. By setting the columns of covariates I did not want imputed to 0, the problem was solved. I had set other columns such as IDs to 0, but I was ...
1
vote
1
answer
1k
views
Multiple imputation and inverse probability weighting for multiple treatment?
I am analyzing observational study data. My predictor variable is tg4 with 4 categories (0,1,2,3) and my response variable is dm ...
1
vote
0
answers
303
views
Imputation with MICE biased under MCAR in extreme case of missingness
I am currently using the mice package to impute missing values for a simulation project. I use different performance metrics (e.g. bias, precision, etc.) to compare different methods (e.g. complete cases, ...
0
votes
0
answers
3k
views
Can I do multiple imputation for categorical data using mice() in R?
Okay, so please bear in mind I am very new to R and statistics in general! My professor has drilled into us that multiple imputation is the best way to deal with missing data (haven't been shown how ...
1
vote
1
answer
2k
views
Rubin's Rule of pooled confidence interval
Rubin's Rule for multiple imputation states that
you are to construct a single interval after pooling into a single set of estimates and standard errors:
$$
\bar{\theta} \pm t_{df,1-\frac{\alpha}{2}}*...
0
votes
1
answer
278
views
Prop.test with Multiple Imputation
I have a large dataset with a large amount of missing data, so I used the mice package to create multiple imputations to fill in the missing values.
Now I am trying ...
0
votes
1
answer
518
views
Apply multiple imputation to non-random missing data
I am working on a multi-center project (3 centers; recruit ~20 participants in each center with a total of 60 participants). One of my independent variables (depression scale; continuous variable) was ...
1
vote
1
answer
2k
views
Missing data in survival analysis (time-to-event; event)
I am performing survival analysis (Kaplan-Meier, Cox-regression) for 1500 patients. I noticed that I have missing values for 15 patients in regards to their time-to-event and their dichotomous outcome-...
2
votes
1
answer
292
views
Partial eta squared calculation with multiple imputation data
I have 10 multiple imputation datasets ($N = 97$, two groups) and am running ANCOVA (controlling for pre-test values) to look at post-test group differences. Working in SPSS and can't really invest ...
3
votes
1
answer
1k
views
Model Diagnostics in Multiple Imputation
When using multiple imputation, what is the best way to run model diagnostics? In a related post here (Multiple Imputation and Regression Model Diagnostics), one option in the accepted answer was ...
1
vote
0
answers
244
views
Correcting for Heteroscedasticity in multiple imputed datasets
I have a question regarding the homogeneity of variance in three regression models fitted to different datasets belonging to the same multiply imputed data.
As I used multiple imputation I have to check ...
0
votes
1
answer
706
views
Is it possible to create survfit and survdiff objects using imputed data?
I have a dataset in which 45% of participants have missing data. Given the high proportion of missingness, I planned to conduct survival analyses on both an imputed and non-imputed dataset.
I am ...
0
votes
1
answer
208
views
Will MICE imputation accuracy be harmed by removing all duplicates?
I'm working with a very large amount of data, using the PMM method via the R mice package. The data has a healthy number of continuous variables. I'm removing all the duplicate entries before starting the ...
2
votes
2
answers
114
views
Separate imputation models for separate substantive models on the same dataset?
I am working with a dataset for which I have generated three hypotheses, i.e. I built three substantive models (two logistic regressions and one Cox regression with a different dependent variable). ...