Newest 'multiple-imputation' Questions

2 votes

0 answers

38 views

Restrict training data to only rows with values for most important variable? [closed]

My training data is mostly missing values for the feature that I know will be the most important variable. This missingness is semi-random. For example, I know the value is missing for this feature ...

mdrishan

237

asked Nov 18 at 17:40

1 vote

0 answers

38 views

I am using multiple imputation by chained equations on a logistic regression model; can I use Rubin's Rules to get CIs for the marginal effect?

I've got a binary outcome, 10-20 predictors (some numeric, some binary). There is one focal predictor. I would like to present the effect as a marginal effect (i.e., an adjusted prevalence difference),...

WhatIsTuningInGLASSO

21

asked Nov 5 at 19:29

1 vote

0 answers

34 views

Estimating the total fuel use for a fleet of delivery trucks that have completed multiple trips, where some individual trip fuel usages are missing

I am working on a project to estimate the total fuel use for a fleet of delivery trucks that have each completed multiple trips. For many trips, the exact fuel usage is recorded, but for a significant ...

user1403856

11

asked Nov 3 at 17:41

2 votes

0 answers

53 views

Is a MICE approach to multivariable imputation well controlled for survival data?

When imputing data by an algorithm such as "mice", it occurs to me that the algorithm takes no account of the structure or representation of a survival outcome which is stored as an event ...

AdamO

67.5k

asked Jul 24 at 4:55

0 votes

0 answers

55 views

LASSO and cross validation when dealing with missing data

I want to simulate data with missing values and use them to compare the predictive performance of several machine learning algorithms, including LASSO. All analyses will be performed in R, using the ...

Benykō-Zamurai

563

asked Jul 23 at 12:38

2 votes

1 answer

124 views

What are we trying to accomplish with multiple imputation [duplicate]

Suppose there are two groups: a treatment and a control. There are two covariates, say age and treatment. 100 participants per group. Age is observed for all 200 participants. However, the response ...

John L

2,786

asked Jun 29 at 20:58

1 vote

0 answers

51 views

Multiple imputation and modelling using penalised splines [closed]

I ran multiple imputation in R using mice. Only one categorical variable had missingness and I specified the imputation model to impute it using ...

cheddar97

11

asked Jun 9 at 16:27

5 votes

2 answers

321 views

Adjusting confidence intervals for multiple comparisons within a model conducted on a multiply imputed dataset

I've used multiple imputation on a dataset and run a logistic regression model (using mice in R). This is my output ...

llewmills

2,345

asked May 27 at 8:25

3 votes

1 answer

180 views

Is it appropriate to use missForest for imputing missing data and critical/recommended threshold?

I'm currently working with a dataset from a molecular epidemiology study involving controls and cases for a cardiovascular event. The dataset includes several categorical health and lifestyle-...

Javier Hernando

713

asked May 16 at 7:20

4 votes

1 answer

118 views

Negative F-value from micombine.F in miceadds after pooling positive F-values

I'm using the mice and miceadds packages in R to perform multiple imputation and then analyze the results. Here's what I did: I performed multiple imputation on my dataset using the mice package. For ...

Danilo Calero Sequeira

53

asked May 9 at 9:46

1 vote

0 answers

60 views

Pooled Point Prediction Intervals for MICE imputed data

I am performing binary prediction on a dataset which contains missingness, and so I am leveraging Multiple Imputation (MI). For example, creating a train-test split, I perform MI on the training data ...

benedictjones

11

asked Apr 28 at 12:42

0 votes

0 answers

83 views

Multiple Imputation in RCTs

I want to impute data in an RCT using the mice package in R and have some questions regarding the imputation of missing outcomes. Outcomes were assessed at 5 assessment points, T1-T5. Scale-level or ...

Sebastian

133

asked Feb 19 at 13:57

0 votes

1 answer

127 views

Do we handle missing demographic data the same way we handle missing data for other sort of variables?

I am missing data on demographic variables such as age, gender, ethnicity. I have used stochastic regression to impute the missing data on all other variables of interest, such as psychological ...

Lee Zhiyuan

331

asked Feb 6 at 8:01

8 votes

2 answers

734 views

Does the temporal relationship between variables matter when imputing missing values?

I am in the situation where I have multiple variables, containing missing values, measured at time $t_0$, and some others measured at time $t_1$, which can be several years later. I need to impute ...

wrong_path

926

asked Jan 31 at 8:07

0 votes

0 answers

78 views

Calculating Partial Eta Squared for Pooled Results After Multiple Imputation in R

I’ve been struggling with this question for a while, so any help is much appreciated! I’m trying to calculate an effect size (partial eta squared or $\eta^{2}_p$) for an ANCOVA model using pooled data ...

Andy

1

asked Jan 16 at 12:31

0 votes

4 answers

262 views

Is it valid to drop a variable in complete case analysis but include it in multiple imputation analysis?

I am analyzing a dataset with variables such as Age, Sex, and Education, where some variables have missing values. One of the variables (Education) has over 60% missing data. For my analyses, I am ...

eshuns

15

asked Jan 13 at 13:28

2 votes

2 answers

273 views

Intuition behind generative methods for imputing missing data

I’m learning different approaches to impute a tabular dataset of mixed continuous and categorical variables, and with data assumed to be missing completely at random. I converted the categorical data ...

hiu

77

asked Dec 22, 2024 at 1:27

0 votes

0 answers

127 views

Estimation of pooled prevalence (with 95%CI) in imputed and weighted data in R

I am trying to estimate the prevalence of a binary variable "x" and its confidence interval after multiple imputations (using mice) and applying weights in R. I use Rubin's rules for the ...

Elodie L

1

asked Dec 12, 2024 at 10:36

0 votes

0 answers

46 views

Role of missingness patterns in imputation by joint modeling

I'm interested in multiple imputation by joint modeling when all variables are incomplete. van Buuren describes the algorithm as follows: In the critical step 10, I am confused because for any ...

half-pass

3,850

asked Dec 6, 2024 at 23:56

1 vote

0 answers

38 views

Risk score development from admin health data: when the test for the risk factor and outcomes are confounded by indication

I have health records of immunodepressed patients who may have event histories like [high risk demographics] -> [low lymphocyte count] -> [high viral load] -> [clinical events] From those data ...

Helene Hoegsbro Thygesen

481

asked Dec 5, 2024 at 12:36

0 votes

0 answers

69 views

Efficient Imputation method for big longitudinal dataset in R

I have a very big dataset of around 3 million rows and 50 variables of different types. The dataset is longitudinal in long format (around 350 000 unique individuals). I want to impute missing data ...

Tasosmav

53

asked Nov 26, 2024 at 9:22

3 votes

1 answer

176 views

Repeated Measures ANOVA/Linear Mixed Effects Model and Missing Data

I'm conducting a study measuring happiness across 4 time points, aiming to determine if there's an increase in overall happiness. The required sample size is 24 for four time points and 28 for three. ...

anna eyre

141

asked Nov 12, 2024 at 23:06

1 vote

0 answers

52 views

principal component analysis with missing data [duplicate]

What would be the best approach to deal with missing data in a dataset when we want to run a PCA and then use the participant component scores extracted from the PCA as predictors in a mediation model?...

CatM

526

asked Oct 29, 2024 at 20:26

2 votes

1 answer

615 views

Missing data imputation in longitudinal data in R

I have a very big dataset (around 10 million rows) with repeated measures of around 500 000 individuals, irregularly spaced through time. My final goal is to do IPTW and fit a weighted Cox regression ...

Tasosmav

53

asked Sep 17, 2024 at 13:16

3 votes

1 answer

101 views

Comparing nested models for fixed main effect when there are interactions

I have a linear mixed model, which uses a multiply imputed dataset. I saw that an LRT could be used to assess fixed-effect significance in a linear mixed model. I used ...

Alexandra Chapdelaine

41

asked Sep 7, 2024 at 18:41

12 votes

5 answers

2k views

How much missing data is too much? part 2: statistical power to impute?

A question is how many missing values are too many to be handled. It has been asked in the context of applying specific software and method (MICE). I am interested in understanding a bit better what ...

Johan

346

asked Aug 27, 2024 at 11:17

0 votes

1 answer

150 views

Strong potential of MNAR issue - can we still use MI i.e., mice() in R?

Study goal: estimate the proportion of patients who experience outcome Y (1=Yes, 0=No) within max 5 years of follow-up. Missing data issue: Outcome Y is missing for a large proportion of people (96% ...

mmaliniak

1

asked Aug 20, 2024 at 19:23

1 vote

2 answers

83 views

Mixed model vs. imputation for questionnaire scale score?

I would like to fit a statistical model where the dependent (response) variable is a validated scale score from a questionnaire. For each subject, this dependent variable is calculated from the values ...

user167591

1,173

asked Aug 20, 2024 at 13:47

0 votes

0 answers

133 views

Missing Data on Survival Analysis

I want to run a survival analysis (say, Cox model) with time of origin at birth and a disease (say, cancer) as the event. Covariates are around 5 demographic variables (age, sex, etc.). The problem ...

processing_statistician

317

asked Aug 13, 2024 at 19:20

1 vote

1 answer

239 views

How to pool results of type II ANCOVA from multiply imputed data set?

I'm using MICE to impute a small data set. I am going to use ANCOVA of type II through Anova function of R package car. However, ...

wdg

335

asked Aug 7, 2024 at 1:41

4 votes

2 answers

677 views

Assessing model fit in logistic regression with multiple imputation

I'm wondering if there is any established method for assessing model fit in logistic regression conducted with multiple imputed datasets. To the best of my knowledge, there are two primary approaches ...

JuBe96

43

asked Aug 5, 2024 at 18:22

1 vote

1 answer

163 views

Multiple imputation with some missing baseline data and with some missing longitudinal data

How shall I impute the data in the following situation: I have some baseline covariates collected and longitudinal data. Both baseline covariates and longitudinal data have some missing data. Shall I ...

Kate

347

asked Jul 15, 2024 at 13:36

2 votes

2 answers

111 views

Imputation Missing data

I have a longitudinal data set with 2 dependent variables (couple) - a husband and a wife. There were 2 waves for the husbands and 3 waves for the wives. Since there is a lot of missing data, I ...

eagersquirrel

41

asked Jul 1, 2024 at 10:36

3 votes

1 answer

238 views

How to conduct statistical analysis on multiply imputed data?

I have a data.frame named mydata with 6 columns: status, times, t1, t2, t3, t4. However, t1, t2, t3, and t4 contain missing values in this dataset. I intend to impute these missing values using the ...

dbcoffee

219

asked Jun 26, 2024 at 16:07

1 vote

0 answers

58 views

Multiple Imputation for Missing Outcome Data

I have spent an extensive amount of time trying to understand the possible role of MICE in helping to "fill in" missing outcome data. I am relatively new to both multiple imputation and ...

R Har

11

asked May 31, 2024 at 23:57

3 votes

1 answer

220 views

Bootstrap standard errors after multiple imputation

Following Rubin's rules for multiple imputation, I've calculated pooled estimates, group means in this case, with pooled standard errors. I checked this with a bootstrap and, assuming pooled standard ...

jay.sf

1,049

asked May 21, 2024 at 6:28

2 votes

1 answer

115 views

Multiple Imputation method in RCT

We decided to use the multiple imputation method in an RCT to solve the problem of some missing follow-up data (for completely random reasons). I was planning on using the Multiple Imputation method ...

Mai

21

asked May 17, 2024 at 12:19

1 vote

0 answers

64 views

How to access weighted group means and standard errors after using mice and WeightIt? [closed]

Some background: I imputed and weighted data from two groups of people, one group in a certain organization and one outside of it. Ultimately, I want to compare how they develop psychological trait X ...

MHx01

33

asked May 5, 2024 at 13:50

2 votes

2 answers

433 views

Multiple Imputation for missing outcomes in Cox regression

Imagine an RCT with a time-to-event outcome which is analyzed using a Cox regression. There are four assessments (T1=before randomization, T2=3 weeks, T3=6 weeks, T4=12 weeks). Under the censoring at ...

Survival

149

asked Apr 30, 2024 at 10:30

2 votes

1 answer

87 views

Multiple imputation process

Whether it is a method for dealing with monotonic or arbitrary missing data (FCS or MICE), there is a process I do not understand. Let's take the example of linear regression for continuous variables: ...

Guillaume

83

asked Apr 16, 2024 at 8:57

0 votes

0 answers

100 views

Multiple imputation of longitudinal data in SPSS

I'm attempting to analyse a longitudinal, retrospective dataset with measurements at various time-points. The data-set has a significant amount of missing data, up to 30% for the main outcome variable,...

R.A. Been

1

asked Apr 15, 2024 at 13:00

0 votes

0 answers

84 views

Does survey R package allow me to do beta regression?

I have a complex survey dataset with a response (dependent variable) bounded between 0 and 1, where I have applied multiple imputation to the dataset to account for missing data. The response formally ...

user45765

1,465

asked Mar 28, 2024 at 1:36

7 votes

3 answers

2k views

How to analyze a dichotomous outcome with 50% missing data?

I am researching predictors of dropout from a training program. I want to see if personality traits add incremental variance above well-established predictors like age, fitness, and education. So, ...

E_H

351

asked Mar 23, 2024 at 9:49

2 votes

2 answers

315 views

How to compare 2 multiply imputed nested Cox proportional hazards models?

I've got 2 nested Cox models, which I fit to 10 imputed datasets. Pooling the regression coefficient estimates and associated p-values I've done already. I'm trying to work out if adding one extra ...

Isaac Allen

51

asked Mar 20, 2024 at 14:11

2 votes

1 answer

132 views

NA results after pooling estimates and coefficients of mixed-effects Cox model from MICE-imputed dataset

I need your help with my problem. After imputing the missing data with the MICE method, I obtained multiple imputed datasets. Then I pooled the estimates and coefficients with a mixed-effects Cox ...

Hoang-Giang Pham

21

asked Feb 26, 2024 at 10:12

6 votes

4 answers

1k views

Why is multiple imputation not used more widely in Data Science? [closed]

I posted this question a few days ago on datascience.SE because I thought it was more relevant there: Why is multiple imputation not used more widely in Data Science? I have a background in ...

Joe King

4,192

asked Jan 18, 2024 at 10:32

1 vote

0 answers

72 views

Multiple imputation: deleting cases before imputation

Note: The question has been edited to make it more focused, and the title has been changed to make it clearer. I have read questions/answers about how to select variables for imputation. This question ...

Verity

11

asked Jan 17, 2024 at 9:59

2 votes

1 answer

298 views

Visit and order sequence for multiple imputation in mice r

I want to use the R package MICE for Multiple Imputation and I have a question concerning the order of my dataset - regarding the order of my variables on the one hand and the order of my cases on the ...

rNewbie

23

asked Jan 5, 2024 at 19:34

1 vote

0 answers

174 views

After multiple imputation (MICE package in R), some variables still have missing values. How should I deal with this?

I have a relatively large data set with around 12000 samples and 550 variables. Originally, I had around 800 variables; I used a rule that if the missing rate in a variable is larger than 80% I will ...

Steven Xu

71

asked Jan 5, 2024 at 14:36

2 votes

2 answers

1k views

Pooling p-values from hypothesis tests after multiple imputation

I'm working on a project that is using some more advanced statistical methods and coding than I'm normally used to and would appreciate some help. The project required me to do multiple imputation, ...

smirza

31

asked Dec 14, 2023 at 19:31
