Newest 'model-selection' Questions

0 votes

0 answers

44 views

How to plot AIC, BIC of all possible models?

Suppose I was given a data set, say, golf, in the form of an MLR model. Given that best subset selection is choosing the top 5 best models of each size, how would ...

DavyJonessss

1

asked Nov 16 at 0:57
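
As a rough sketch of the quantities this question asks to plot, AIC and BIC can be computed from each candidate model's maximized log-likelihood. The function names, log-likelihoods, parameter counts, and sample size below are illustrative assumptions, not values from the question.

```python
import math

def aic(log_lik: float, k: int) -> float:
    """Akaike information criterion: 2k - 2*logL."""
    return 2 * k - 2 * log_lik

def bic(log_lik: float, k: int, n: int) -> float:
    """Bayesian information criterion: k*ln(n) - 2*logL."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical (made-up) fits: (label, maximized log-likelihood, parameter count)
fits = [("size 1", -120.0, 2), ("size 2", -112.5, 3), ("size 3", -112.0, 4)]
n_obs = 50  # assumed sample size

# One (label, AIC, BIC) row per candidate; these rows are what one would plot
scores = [(label, aic(ll, k), bic(ll, k, n_obs)) for label, ll, k in fits]
for label, a, b in scores:
    print(f"{label}: AIC={a:.2f}  BIC={b:.2f}")
```

Plotting each criterion against model size, with one point per candidate model, would then show where the criteria stop improving.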

1 vote

0 answers

22 views

3-way holdout for performance evaluation but 2-way for model selection

The paper https://arxiv.org/pdf/1811.12808 by Sebastian Raschka explains how to perform the 3-way holdout method, and also how to compute the final model (used in production). During computation of the ...

Ayrat

43

asked Nov 15 at 11:44

2 votes

0 answers

61 views

Do k-folds risk sampling bias and, if so, how do we avoid it?

In cross-validation, $k$-folds are a common way to train, compare and validate models. Often we want to find an optimal set of hyperparameters for our models. There are many ways to probe the ...

Markus Klyver

311

asked Oct 18 at 16:51
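
For readers unfamiliar with the setup, here is a minimal sketch of shuffled $k$-fold index splitting in plain Python. The function name `kfold_indices` and the seed handling are assumptions made for illustration; they do not come from the question or from any particular library.

```python
import random

def kfold_indices(n, k, seed=0):
    """Yield (train, test) index lists for k shuffled folds over n samples."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded shuffle so the partition is reproducible
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering every index
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test

# Every sample lands in exactly one test fold
for train, test in kfold_indices(n=10, k=3, seed=42):
    print(sorted(test))
```

Repeating the split with several seeds gives a rough sense of how much a result depends on one particular partition.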

0 votes

0 answers

52 views

Dealing with high concurvity and variable selection in GAMMs with imbalanced data (mgcv::bam)

I am using GAMMs to model the probability of occurrence of a species, applying logistic regressions with mgcv::bam() to presence-pseudoabsence data. The dataset ...

airC

41

asked Oct 15 at 15:12

0 votes

0 answers

41 views

How do I conduct backward selection on my OLS regression with Newey-West standard errors?

I have run an OLS regression and detected that it contains autocorrelation and heteroskedasticity. To deal with this I intend to use Newey-West standard errors. But I am not sure what is the proper ...

Mateo Bergman

1

asked Oct 4 at 10:20

0 votes

0 answers

55 views

LASSO and cross validation when dealing with missing data

I want to simulate data with missing values and use them to compare the predictive performance of several machine learning algorithms, including LASSO. All analyses will be performed in R, using the ...

Benykō-Zamurai

563

asked Jul 23 at 12:38

0 votes

1 answer

76 views

How to model feeder choice in bees while ignoring unbalanced feeding events per bout?

I'm analyzing an experiment I ran with bumblebees, and really struggling with choosing the appropriate model. In the experiment, each bee made feeder choices across two temperature conditions: ...

bee-researcher

1

asked Jul 19 at 17:35

1 vote

0 answers

64 views

How to justify the number of background points in MaxEnt species distribution modeling?

I'm building a species distribution model using MaxEnt with 260 presence points, collected opportunistically within a relatively small study area (a single administrative department in France). I'm ...

Martin Eden

11

asked Jul 8 at 10:08

0 votes

0 answers

41 views

How to interpret AIC model selection and uninformative parameters

I have a model set with 36 candidate models and 4 models with an AIC less than or equal to 2.0. I do not want to model average because I don't think my candidate set really fits in with the caveats ...

Amanda Goldberg

283

asked Jul 4 at 23:50
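
A small sketch of the bookkeeping behind this kind of comparison, assuming the "2.0" in the excerpt refers to the AIC difference from the best model ($\Delta$AIC). It computes $\Delta_i = \mathrm{AIC}_i - \min_j \mathrm{AIC}_j$ and the Akaike weights $w_i = e^{-\Delta_i/2} / \sum_j e^{-\Delta_j/2}$. The AIC values are made up for illustration and the function name is hypothetical.

```python
import math

def akaike_weights(aic_values):
    """Return (delta_aic, weights) for a list of AIC values."""
    best = min(aic_values)
    deltas = [a - best for a in aic_values]
    rel_lik = [math.exp(-0.5 * d) for d in deltas]  # relative likelihood of each model
    total = sum(rel_lik)
    return deltas, [r / total for r in rel_lik]

# Hypothetical AIC values for four candidate models (not from the question)
aics = [100.0, 100.8, 101.5, 102.0]
deltas, weights = akaike_weights(aics)
for d, w in zip(deltas, weights):
    print(f"delta={d:.1f}  weight={w:.3f}")
```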

1 vote

1 answer

43 views

DCC-GARCH: Valid to have different GARCH models for each series?

Most DCC-GARCH tutorials and guides I found online often use "replicate" in creating their DCC specification, i.e. ...

Matt

43

asked Jun 14 at 4:13

0 votes

1 answer

93 views

DCC-GARCH: Correct way of choosing between the normal distribution and t-distribution

DCC-GARCH is comprised of two stages: (1) estimating the univariate GARCH and (2) estimating the correlations through DCC. My time series (bond yields) is not normally distributed, as they rejected ...

Matt

43

asked Jun 11 at 14:36

1 vote

1 answer

65 views

DCC GARCH - Is there any merit in setting omega to zero?

I estimated the univariate GARCH models for each series, and all coefficients are statistically significant. However, upon putting them into one DCC-GARCH model with a DCC(1,1) spec, the individual ...

Matt

43

asked Jun 9 at 2:47

1 vote

1 answer

79 views

Can Goodness-of-Fit Test be used for Model Selection?

I would like to know whether Goodness of Fit Tests (like Pearson's Chi-squared test or Kolmogorov-Smirnov Test) can be used to select which probabilistic distribution model certain empirical observation ...

Luthfi Ahmad

11

asked Jun 4 at 3:12

0 votes

1 answer

52 views

Why do overfitted models in finite mixture regression sometimes have the smallest BIC despite the true number of components being selected frequently?

Learning about EM algorithms and finite mixture models and I've run into a particularly unintuitive problem. I'm trying to fit a finite mixture regression model on simulated data, where the true ...

dancing_monkeys

35

asked May 22 at 20:56

0 votes

0 answers

76 views

Linear regression after multiple imputation: Should assumptions be checked before or after AIC-based model selection?

I’m currently working on multiple regression analyses with a small sample (n = 36), using multiple imputation via the mice package in R (5 imputed datasets). The ...

statsInPractice

1

asked May 22 at 8:01

1 vote

0 answers

42 views

Parsing maritime location ranges

I'm attempting to train a model to parse maritime location ranges. These are strings that can be resolved into a geographical area or a list of shipping ports. An example could be ...

Stromgren

119

asked May 20 at 9:29

6 votes

1 answer

280 views

Automatic ARIMA model selection

There are many resources explaining why automatic variable selection is bad (e.g. here). Regarding the selection of $p$, $d$, $q$ parameters in ARIMA models, the Hyndman-Khandakar algorithm combines ...

Thomas

600

asked May 13 at 8:24

0 votes

0 answers

46 views

Beta-binomial mixed model: spline of time as fixed effect, keep random slope for time if variance is very small but LRT significant?

I’m modeling longitudinal substance use (number of days consumed over 30 days) for ~930 patients with repeated measures. The outcome is modeled with a beta-binomial distribution (logit link, glmmTMB ...

W. IC.

23

asked Apr 27 at 13:54

0 votes

0 answers

51 views

Variable selection methods

I am currently trying to build a model to link water quality metrics (e.g. biochemical oxygen demand, chemical oxygen demand) with regional characteristics data (e.g. population, GDP) through multiple ...

Osuke Miyamaru

35

asked Apr 21 at 9:20

6 votes

2 answers

521 views

Variable selection strategy for descriptive modeling

From Shmueli's paper "To Explain or to Predict?", which also has a section about descriptive modeling (section 1.3): (see also this page) [Descriptive modeling] is aimed at summarizing or ...

Thomas

600

asked Apr 11 at 10:46

1 vote

1 answer

133 views

Why is the step size $\hat \gamma$ in Least Angle Regression (LARS) smaller than $\bar \gamma=\frac{\hat C}{A}$?

I'm currently studying the Least Angle Regression algorithm by Efron et al. (https://arxiv.org/abs/math/0406456). After equation (2.22) in Efron et al., the authors claim the following: It is easy to ...

flushel

155

asked Apr 7 at 15:14

2 votes

0 answers

80 views

Number of features selection using AUC

Can AUC be used for model selection, and how can the excessive number of features/parameters be penalized in this case? In the frequentist framework we have various model selection criteria, like AIC, BIC, ...

Roger V.

5,091

asked Mar 19 at 9:24

5 votes

0 answers

144 views

How do I handle this very non-normal response variable?

In R, I want to use a repeated measures analysis with a mixed regression model to analyze how the mean of my response variable (mean bee pollination score) varies based on 1) week, 2) number of bee ...

Emily

51

asked Mar 5 at 20:59

0 votes

0 answers

48 views

Model selection for fixed effect and crossed random effect structure in glmer

I'm new to (generalized) linear mixed effects models. Any help would be appreciated! Below is my study design with dummy data. I'm exploring the effects of the parameters I manipulated in game 1 on ...

fox_jane

1

asked Feb 27 at 2:19

2 votes

1 answer

117 views

Interaction Effect on the dependent variable?

I would like to run a model in R with two binary dependent variables. I know how to model an interaction on the independent variable, but is it possible to do this on the dependent variable too? If my ...

Milli

21

asked Feb 17 at 13:43

0 votes

0 answers

31 views

Linear regression [duplicate]

If I have a single model, say y = ax^2 + bx + c, can I use 3 linear regression algorithms y=ax^2, y=ax and y=a to learn the original function if I use the same data set? Please help me out here.

Neelesh Samptur

1

asked Feb 15 at 0:10

0 votes

0 answers

53 views

Choosing ARIMA order from ACF PACF plot

I'm doing a project using ARIMA and I face a problem where I cannot choose the order for the ARIMA model. I know that I have to choose the order by identifying the significant lag, but the PACF plot is showing ...

Milda KS

1

asked Feb 9 at 5:09

4 votes

1 answer

114 views

Smooth AIC selection

Suppose I have a family of $N$ models for the same data, indexed by $n\in\{1,\dots,N\}$. And suppose that model $n\in\{1,\dots,N\}$ has log-likelihood given by: $$L(X_n \theta_n),$$ where $L:\mathbb{R}...

cfp

565

asked Feb 5 at 16:57

1 vote

1 answer

71 views

Is it okay to select any of the surrogate models in nested cv?

Let's say I pick any of the winning surrogate models in my nested cv (in theory, if you do k outer folds you could have k surrogate models). To simplify things, let's say I pick the first model and just ...

iYOA

185

asked Feb 4 at 15:34

2 votes

0 answers

87 views

Why is a holdout test set an unbiased estimator of the selected model’s generalization error?

Let $\mathcal{D}_{\text{train}}$ be a training dataset, and let $D_{\text{test}} = \{(x_{\text{test}}, y_{\text{test}})\}$ be a single holdout test point drawn independently from the same distribution ...

iYOA

185

asked Feb 3 at 15:17

0 votes

0 answers

78 views

Interpreting Nested CV Results When Selected Model Didn't Win All Outer Folds

In nested cross validation, I'm seeing an interesting scenario that I'd like to understand better: Using 4-fold outer CV, my model selection process chose Model A overall (it performed best on average ...

iYOA

185

asked Jan 31 at 23:09

1 vote

1 answer

141 views

Feature selection and outlier detection in panel regression with fixed effects

I am trying to fit the following panel regression with fixed entity effects $$Y_{it} = \alpha_i + \sum_j \beta_jX^{(j)}_{it} + \epsilon_{it},$$ where the index $j$ labels the different features. Some ...

Mark Dubin

11

asked Jan 11 at 17:54

0 votes

0 answers

45 views

ARMA estimation before GJR-GARCH. How to proceed with multiple time-series?

I want to study the Conditional Variance of various crypto-currencies returns series (13, of which 5 meme, 8 "serious"). Since my main focus is the asymmetric response of the variance ...

Hodmezor

1

asked Dec 21, 2024 at 14:37

1 vote

1 answer

107 views

How to calculate the BIC for each mixture component

I want to fit a mixture of Gaussians to simulated data. Then, I need to calculate the Bayesian information criterion for each mixture component. My point is that, after the model convergence, I ...

Dr. Statistics

304

asked Dec 18, 2024 at 13:02

6 votes

1 answer

476 views

Preventing data leakage in time-series data splitting

I am working on a fault detection problem for a mechanical system where the goal is to determine the fault type. I use a dataset that for each type of fault (target label) has three sizes and each ...

S.H.W

99

asked Dec 17, 2024 at 15:38

1 vote

1 answer

104 views

Lasso and cross validation: model selection

Apologies for cross-posting. I am starting to use Lasso and cross validation for model selection to explain a dependent variable using linear models, but I cannot understand why all p-values ...

Rodrigo Badilla

13

asked Dec 16, 2024 at 11:55

1 vote

1 answer

93 views

Two questions about the VC theory (on the generalization error bound)

In Andrew Ng's machine learning notes (https://cs229.stanford.edu/main_notes.pdf), he introduced the following bound for the difference between generalization error and training error (see the ...

ExcitedSnail

3,090

asked Dec 13, 2024 at 14:41

0 votes

0 answers

90 views

How to tune hyperparameters for low calibration error under small dataset

I'm studying which variant of variational autoencoders (VAE) gives better expected calibration error (ECE) (see also this doc) with a small dataset. According to Google's tuning playbook, to compare ...

Kaiwen

307

asked Dec 10, 2024 at 10:36

5 votes

1 answer

320 views

Why is AIC useful for comparing GAMs? Only for prediction?

I have a follow-up question to this OP. I hope to understand the difference between comparing 2 models with AIC, and interpreting the summary output of the full model - specifically for GAMs. Gavin ...

Nate

2,537

asked Dec 9, 2024 at 17:46

1 vote

1 answer

92 views

Interaction terms in logistic regression model of patient mortality

Admittedly, I am a bit inexperienced in the world of statistics and data modeling but am trying my best to learn on the job. As a first time user, I apologize if there are any formatting errors here! ...

LonelyBadger12

13

asked Dec 5, 2024 at 18:29

23 votes

4 answers

3k views

Is it (always) better to build a model prior to viewing the data?

When it comes to data exploration, aside from checking for outliers (human error), correlated covariates, and missing values, is there a downside to viewing relationships between a response variable ...

Nate

2,537

asked Dec 2, 2024 at 1:50

16 votes

2 answers

841 views

Advantages of information criteria over cross-validation

I understand AIC is asymptotically equivalent to leave-one-out cross-validation and that BIC has a similar asymptotic equivalence to leave-k-out cross-validation. My question is, other than ...

Louis F-H

271

asked Nov 24, 2024 at 17:27

0 votes

1 answer

111 views

Problems with using ACF and PACF for ARMA modelling

This is the ACF and PACF for the first difference of my variable, $\Delta y_t$. I used the ADF test, the PP test, the Schmidt-Phillips test and the DF-GLS test, and got the same result that my ...

alyosha

1

asked Nov 23, 2024 at 19:32

3 votes

2 answers

309 views

Why use nested validation when doing both hyper-parameter tuning and model selection?

The monograph Cross Validation contains a section on nested cross-validation for hyper-parameter optimisation (page 6). The author refers to this paper for a reason why it is better to decouple hp-...

Ayrat

43

asked Nov 22, 2024 at 15:49

2 votes

0 answers

59 views

weird results from Bai-Ng PCs selection criteria implementation of "dfms" on R

I am trying to select the number of Principal Components of this data following the optimality criteria of Bai and Ng (2002), in R. The function ICr from the ...

oibaFox

21

asked Nov 21, 2024 at 0:10

1 vote

0 answers

56 views

Statistical Tests for Model Selection in Nested Cross Validation?

I’m using nested cross-validation to evaluate multiple models and hyperparameter configurations. After running trials with different random seeds (outer: 3-fold with 10 seeds, inner: 5-fold with 50 ...

iYOA

185

asked Nov 19, 2024 at 17:31

0 votes

0 answers

32 views

Variable selection for checking causal relationship of regression model: should or should not? [duplicate]

I am looking for documents and online sources to understand whether or not I should exclude variables from my model through model selection (variable selection). I also tried to use methods of Least ...

Student coding

137

asked Nov 17, 2024 at 13:31

6 votes

2 answers

330 views

Is hierarchical regression with aggressive p-deletion really much 'better' than stepwise?

In many medical science fields "hierarchical regression" is a popular method. The approach is to break variables into categories, add one category of variables at a time and then remove ...

purple-blade

313

asked Nov 7, 2024 at 15:43

2 votes

1 answer

153 views

linear mixed effects models (lme): model comparison via AIC() or anova() function

I have a quick question concerning model selection for linear mixed effects models: When directly comparing AICs of two models (either including or excluding an additional fixed effect) versus ...

Julia

41

asked Oct 24, 2024 at 11:54

3 votes

1 answer

135 views

Why is stepwise selection of variables still taught in university statistics classes? [closed]

I have on more than one occasion come across both recently-published textbooks and classes that teach the use of stepwise methods for model construction. Why is this still done, given the problems ...

Community wiki

Bryan
