Questions tagged [multiple-imputation]
Use this tag for questions involving multiple imputation, which refers to a set of stochastic imputation routines aimed at preserving the multivariate features of the data.
557 questions
2
votes
0
answers
38
views
Restrict training data to only rows with values for most important variable? [closed]
My training data is mostly missing values for the feature that I know will be the most important variable. This missingness is semi-random. For example, I know the value is missing for this feature ...
1
vote
0
answers
38
views
I am using multiple imputation by chained equations on a logistic regression model; can I use Rubin's Rules to get CIs for the marginal effect?
I've got a binary outcome, 10-20 predictors (some numeric, some binary). There is one focal predictor. I would like to present the effect as a marginal effect (i.e., an adjusted prevalence difference),...
1
vote
0
answers
34
views
Estimating the total fuel use for a fleet of delivery trucks that have completed multiple trips, where some individual trip fuel usages are missing
I am working on a project to estimate the total fuel use for a fleet of delivery trucks that have each completed multiple trips. For many trips, the exact fuel usage is recorded, but for a significant ...
2
votes
0
answers
53
views
Is a MICE approach to multivariable imputation well controlled for survival data?
When imputing data by an algorithm such as "mice", it occurs to me that the algorithm takes no account of the structure or representation of a survival outcome which is stored as an event ...
0
votes
0
answers
55
views
LASSO and cross validation when dealing with missing data
I want to simulate data with missing values and use them to compare the predictive performance of several machine learning algorithms, including LASSO. All analyses will be performed in R, using the ...
2
votes
1
answer
124
views
What are we trying to accomplish with multiple imputation [duplicate]
Suppose there are two groups- a treatment and control. There are two covariates, say age and treatment. 100 participants per group. Age is observed for all 200 participants. However, the response ...
1
vote
0
answers
51
views
Multiple imputation and modelling using penalised splines [closed]
I ran multiple imputation in R using mice. Only one categorical variable had missingness and I specified the imputation model to impute it using ...
5
votes
2
answers
321
views
Adjusting confidence intervals for multiple comparisons within a model conducted on a multiply imputed dataset
I've used multiple imputation on a dataset and run a logistic regression model (using mice in R).
This is my output
...
3
votes
1
answer
180
views
Is it appropriate to use *missForest* for imputing missing data and critical/recommended threshold?
I'm currently working with a dataset from a molecular epidemiology study involving controls and cases for a cardiovascular event. The dataset includes several categorical health and lifestyle-...
4
votes
1
answer
118
views
Negative F-value from micombine.F in miceadds after pooling positive F-values
I'm using the mice and miceadds packages in R to perform multiple imputation and then analyze the results.
Here's what I did:
I performed multiple imputation on my dataset using the mice package.
For ...
1
vote
0
answers
60
views
Pooled Point Prediction Intervals for MICE imputed data
I am performing binary prediction on a dataset which contains missingness, and so I am leveraging Multiple Imputation (MI).
For example, creating a train-test split, I perform MI on the training data ...
0
votes
0
answers
83
views
Multiple Imputation in RCTs
I want to impute data in an RCT using the mice package in R and have some questions regarding the imputation of missing outcomes. Outcomes were assessed at 5 assessment points, T1-T5.
Scale-level or ...
0
votes
1
answer
127
views
Do we handle missing demographic data the same way we handle missing data for other sort of variables?
I am missing data on demographic variables such as age, gender, ethnicity. I have used stochastic regression to impute the missing data on all other variables of interest, such as psychological ...
8
votes
2
answers
734
views
Does the temporal relationship between variables matter when imputing missing values?
I am in the situation where I have multiple variables, containing missing values, measured at time $t_0$, and some others measured at time $t_1$, which can be several years later.
I need to impute ...
0
votes
0
answers
78
views
Calculating Partial Eta Squared for Pooled Results After Multiple Imputation in R
I’ve been struggling with this question for a while, so any help is much appreciated!
I’m trying to calculate an effect size (partial eta squared or $\eta^{2}_p$) for an ANCOVA model using pooled data ...
0
votes
4
answers
262
views
Is it valid to drop a variable in complete case analysis but include it in multiple imputation analysis?
I am analyzing a dataset with variables such as Age, Sex, and Education, where some variables have missing values. One of the variables (Education) has over 60% missing data. For my analyses, I am ...
2
votes
2
answers
273
views
Intuition behind generative methods for imputing missing data
I’m learning different approaches to impute a tabular dataset of mixed continuous and categorical variables, and with data assumed to be missing completely at random. I converted the categorical data ...
0
votes
0
answers
127
views
Estimation of pooled prevalence (with 95%CI) in imputed and weighted data in R
I am trying to estimate the prevalence of a binary variable "x" and its confidence interval after multiple imputations (using mice) and applying weights in R. I use Rubin's rules for the ...
0
votes
0
answers
46
views
Role of missingness patterns in imputation by joint modeling
I'm interested in multiple imputation by joint modeling when all variables are incomplete. van Buuren describes the algorithm as follows:
In the critical step 10, I am confused because for any ...
1
vote
0
answers
38
views
Risk score development from admin health data: when the test for the risk factor and outcomes are confounded by indication
I have health records of immunodepressed patients who may have event histories like
[high risk demographics] -> [low lymphocyte count] -> [high viral load] -> [clinical events]
From those data ...
0
votes
0
answers
69
views
Efficient Imputation method for big longitudinal dataset in R
I have a very big dataset of around 3 million rows and 50 variables of different types. The dataset is longitudinal in long format (around 350 000 unique individuals). I want to impute missing data ...
3
votes
1
answer
176
views
Repeated Measures ANOVA/Linear Mixed Effects Model and Missing Data
I'm conducting a study measuring happiness across 4 time points, aiming to determine if there's an increase in overall happiness. The required sample size is 24 for four time points and 28 for three. ...
1
vote
0
answers
52
views
principal component analysis with missing data [duplicate]
What would be the best approach to deal with missing data in a dataset when we want to run a PCA and then use the participant component scores extracted from the PCA as predictors in a mediation model?...
2
votes
1
answer
615
views
Missing data imputation in longitudinal data in R
I have a very big dataset (around 10 million rows) with repeated measures of around 500 000 individuals, irregularly spaced through time. My final goal is to do IPTW and fit a weighted Cox regression ...
3
votes
1
answer
101
views
Comparing nested models for fixed main effect when there are interactions
I have a linear mixed model, which uses a multiply imputed dataset. I saw that LRT could be used to assess Fixed effect significance in linear mixed model. I used ...
12
votes
5
answers
2k
views
How much missing data is too much? part 2: statistical power to impute?
A question is how many missing values are too many to be handled. It has been asked in the context of applying specific software and method (MICE).
I am interested in understanding a bit better what ...
0
votes
1
answer
150
views
Strong potential of MNAR issue - can we still use MI i.e., mice() in R?
Study goal: estimate the proportion of patients who experience outcome Y (1=Yes, 0=No) within max 5 years of follow-up.
Missing data issue: Outcome Y is missing for a large proportion of people (96% ...
1
vote
2
answers
83
views
Mixed model vs. imputation for questionnaire scale score?
I would like to fit a statistical model where the dependent (response) variable is a validated scale score from a questionnaire. For each subject, this dependent variable is calculated from the values ...
0
votes
0
answers
133
views
Missing Data on Survival Analysis
I want to run a survival analysis (say, Cox model) with time of origin at birth and a disease (say, cancer) as the event. Covariates are around 5 demographic variables (age, sex, etc.).
The problem ...
1
vote
1
answer
239
views
How to pool results of type II ANCOVA from multiply imputed data set?
I'm using MICE to impute a small data set. I am going to use ANCOVA of type II through Anova function of R package car. However, ...
4
votes
2
answers
677
views
Assessing model fit in logistic regression with multiple imputation
I'm wondering if there is any established method for assessing model fit in logistic regression conducted with multiple imputed datasets. To the best of my knowledge, there are two primary approaches ...
1
vote
1
answer
163
views
Multiple imputation with some missing baseline data and with some missing longitudinal data
How shall I impute the data in the following situation: I have some baseline covariates collected and longitudinal data. Both baseline covariates and longitudinal data have some missing data.
Shall I ...
2
votes
2
answers
111
views
Imputation Missing data
I have a longitudinal data set with 2 dependent variables (couple) - a husband and a wife. There were 2 waves for the husbands and 3 waves for the wives. Since there is a lot of missing data, I ...
3
votes
1
answer
238
views
How to conduct statistical analysis on multiply imputed data?
I have a data.frame named mydata with 6 columns: status, times, t1, t2, t3, t4. However, t1, t2, t3, and t4 contain missing values in this dataset. I intend to impute these missing values using the ...
1
vote
0
answers
58
views
Multiple Imputation for Missing Outcome Data
I have spent an extensive amount of time trying to understand the possible role of MICE in helping to "fill in" missing outcome data. I am relatively new to both multiple imputation and ...
3
votes
1
answer
220
views
Bootstrap standard errors after multiple imputation
Following Rubin's rules for multiple imputation, I've calculated pooled estimates, group means in this case, with pooled standard errors.
I checked this with a bootstrap and, assuming pooled standard ...
2
votes
1
answer
115
views
Multiple Imputation method in RCT
We decided to use the multiple imputation method in an RCT to solve the problem of some follow-up missing data (for completely random reasons). I was planning on using the Multiple Imputation method ...
1
vote
0
answers
64
views
How to access weighted group means and standard errors after using mice and WeightIt? [closed]
Some background: I imputed and weighted data from two groups of people, one group in a certain organization and one outside of it. Ultimately, I want to compare how they develop psychological trait X ...
2
votes
2
answers
433
views
Multiple Imputation for missing outcomes in Cox regression
Imagine an RCT with a time-to-event outcome which is analyzed using a Cox regression. There are four assessments (T1=before randomization, T2=3 weeks, T3=6 weeks, T4=12 weeks). Under the censoring at ...
2
votes
1
answer
87
views
Multiple imputation process
Whether it is a method for dealing with monotonic or arbitrary missing data (FCS or MICE), there is a process I do not understand.
Let's take the example of linear regression for continuous variables:
...
0
votes
0
answers
100
views
Multiple imputation of longitudinal data in SPSS
I'm attempting to analyse a longitudinal, retrospective dataset with measurements at various time-points. The data-set has a significant amount of missing data, up to 30% for the main outcome variable,...
0
votes
0
answers
84
views
Does survey R package allow me to do beta regression?
I have a complex survey dataset with a response (dependent variable) bounded between 0 and 1, where I have applied multiple imputation to the dataset to account for missing data. The response formally ...
7
votes
3
answers
2k
views
How to analyze a dichotomous outcome with 50% missing data?
I am researching predictors of dropout from a training program. I want to see if personality traits add incremental variance above well-established predictors like age, fitness, and education. So, ...
2
votes
2
answers
315
views
How to compare 2 multiply imputed nested Cox proportional hazards models?
I've got 2 nested Cox models, which I fit to 10 imputed datasets. Pooling the regression coefficient estimates and associated p-values I've done already.
I'm trying to work out if adding one extra ...
2
votes
1
answer
132
views
NA results after pooling estimates and coeff of mixed effects cox model from MICE imputation dataset
I need your help with my problem. After imputing the missing data with the MICE method, I got multiply imputed datasets. Then, I pooled the estimates and coefficients with mixed effect cox ...
6
votes
4
answers
1k
views
Why is multiple imputation not used more widely in Data Science? [closed]
I posted this question a few days ago on datascience.SE because I thought it was more relevant there:
Why is multiple imputation not used more widely in Data Science?
I have a background in ...
1
vote
0
answers
72
views
Multiple imputation: deleting cases before imputation
Note: The question has been edited to make it more focused, and the title has been changed to make it clearer.
I have read questions/answers about how to select variables for imputation. This question ...
2
votes
1
answer
298
views
Visit and order sequence for multiple imputation in mice r
I want to use the R package MICE for Multiple Imputation and I have a question concerning the order of my dataset - regarding the order of my variables on the one hand and the order of my cases on the ...
1
vote
0
answers
174
views
After multiple imputation (MICE package in R), some variables still have missing values. How to deal with it?
I have a relatively large data set with around 12000 samples and 550 variables. Originally, I had around 800 variables; I used a rule that if the missing rate in a variable is larger than 80% I will ...
2
votes
2
answers
1k
views
Pooling p-values from hypothesis tests after multiple imputation
I'm working on a project that is using some more advanced statistical methods and coding than I'm normally used to and would appreciate some help. The project required me to do multiple imputation, ...