Newest 'overfitting' Questions

1 vote

0 answers

47 views

Potential CNN Overfitting Due to Limited Training Data

Neural network beginner here. I am currently implementing a CNN in PyTorch for recognizing Japanese handwritten letters, with 46 output classes. I found a dataset on Kaggle https://www.kaggle....

Krish Thyagarajan

11

asked Sep 7 at 16:33

0 votes

0 answers

51 views

Generalization Error PCA (with closed formula) versus Ridge

There is something I have an intuition on but my numerical toy examples do not confirm, and I really want to understand where my mistake is. I suppose that I have a random vector $X = (X_1, \cdots, ...

arthur_elbrdn

83

asked Jul 28 at 14:21

3 votes

3 answers

294 views

How might softmax cause overfit in a neural model, even treated from a Bayesian perspective?

The title is perhaps purposely provocative, but still reflects my ignorance. I am trying to understand carefully why, despite a very nice Bayesian interpretation, softmax might overfit, since I've ...

Chris

322

asked Jul 2 at 21:34

1 vote

0 answers

28 views

Inference validity of an ordered logit model with only 50 observations

How accurate are the estimates of an ordered logit model with only 50 observations? Here is my Stata output from the model:

Oindrila Roy

11

asked Jun 25 at 19:32

0 votes

1 answer

52 views

Why do overfitted models in finite mixture regression sometimes have the smallest BIC despite the true number of components being selected frequently?

I'm learning about EM algorithms and finite mixture models, and I've run into a particularly unintuitive problem. I'm trying to fit a finite mixture regression model on simulated data, where the true ...

dancing_monkeys

35

asked May 22 at 20:56

1 vote

0 answers

60 views

Overfitting problem in classification CNN

So I have a school project which is to train a CNN with our own architecture to be able to classify marine mammals with a minimum accuracy of 0.82. I have been trying a lot of things and different way ...

erodrigu

11

asked Mar 26 at 21:11

2 votes

0 answers

80 views

Number of features selection using AUC

Can AUC be used for model selection, and how can the excessive number of features/parameters be penalized in this case? In the frequentist framework we have various model selection criteria, like AIC, BIC,...

Roger V.

5,091

asked Mar 19 at 9:24

1 vote

1 answer

78 views

Gridsearch results vs learning curve

I am using GridSearchCV to optimize some hyperparameters of an XGBoost model. However, although the logloss (the metric I am optimizing for) seems alright according to domain knowledge, the learning ...

user54565

89

asked Mar 14 at 18:27

1 vote

1 answer

118 views

How to reduce overfitting for a randomforest model even when cross validation is implemented?

I'm working on fitting a random forest model using the caret library in R with a repeated cross-validation design to select hyperparameters. I've also experimented with adjusting the number of trees (...

Mdhale

133

asked Mar 12 at 15:59

1 vote

0 answers

54 views

Is there a one-to-one relationship between high bias and underfitting, and between high variance and overfitting?

Assume you have training data $(x_1,y_1), \ldots, (x_n,y_n)$ and a relationship $y_i=f(x_i)+\epsilon_i$, where $\epsilon$ is a random variable. Assume you approximate $f$ with $\hat{f}$ using the ...

user394334

296

asked Feb 24 at 10:47

2 votes

1 answer

203 views

How to identify problems with mgcv::gam(y ~ s(x) + s(x, fac, bs="sz"))? [closed]

This is sort-of a follow-up from my last question, except purely based on curiosity. I found different versions of similar bs="sz" models in ...

Nate

2,537

asked Feb 17 at 3:26

1 vote

0 answers

57 views

The use of cross-validation and a hold-out set

I've been thinking about the use of cross-validation and hold-out sets and I don't really see the use of a randomly selected hold-out test set. I have to say, though, that when the hold-out is not ...

adriavc00

11

asked Feb 10 at 21:29

4 votes

1 answer

114 views

Smooth AIC selection

Suppose I have a family of $N$ models for the same data, indexed by $n\in\{1,\dots,N\}$. And suppose that model $n\in\{1,\dots,N\}$ has log-likelihood given by: $$L(X_n \theta_n),$$ where $L:\mathbb{R}...

cfp

565

asked Feb 5 at 16:57

0 votes

0 answers

90 views

Reducing MLP overfitting for feature importance

I am training an MLP on a dataset with the number of features >> number of samples. For certain reasons, MLPs with at least one hidden layer are the only architecture I am considering. ...

dkolobok

21

asked Jan 6 at 3:13

1 vote

0 answers

50 views

Model Performance Varying Greatly

I have built an XGBoost model that performs rather weirdly across months... I trained the model on a heavily imbalanced dataset (1:40 000), which I undersampled to (1:500). The model performance (...

user24758287

111

asked Dec 30, 2024 at 2:57

3 votes

1 answer

372 views

What should the objective be when tuning hyperparameters to minimize overfitting?

I'm working on a classification problem with ~90k data rows and 12 features. I'm trying to tune the hyperparameters of an XGBoost model to minimize the overfitting. I use ROC_AUC as the metric to ...

WatermelonBunny

60

asked Dec 18, 2024 at 9:28

26 votes

2 answers

4k views

Why doesn't ML suffer from curse of dimensionality?

Disclaimer: I asked this question on Data Science Stack Exchange 3 days ago, and got no response so far. Maybe it is not the right site. I am hoping for more positive engagement here. This is a ...

Landon Carter

1,845

asked Nov 16, 2024 at 0:56

6 votes

1 answer

643 views

Model performs well on train and cross-validation sets but inaccurate in the test set. How to solve? [duplicate]

I've been working on a CNN binary classification model, and the model performs pretty well on both the training set and the cross-validation set (both practically 1.0 acc). However, I also ...

Efe FRK

71

asked Oct 20, 2024 at 21:09

1 vote

0 answers

70 views

Is my XGBoost Model Still Overfitting (Binary Classification)?

I am trying to build a binary classification model with XGBoost. I made sure to split my data into the training, validation and test sets. I performed feature selection, early stopping and ...

Shak Jivraj

11

asked Oct 17, 2024 at 2:49

0 votes

0 answers

60 views

Advice on fine-tuning an email classifier for a Pharma company

I'm an intern working on implementing a binary email classifier for a client (Pharmaceutical company) and I need some advice on fine-tuning the model. The model I'm using is Longformer (because it has ...

Bhashwar Sengupta

1

asked Sep 24, 2024 at 16:03

1 vote

0 answers

57 views

Overfitting Time Series

I have only one time series $(y_0, t_0), (y_1,t_1), \ldots, (y_n, t_n)$, with $y_i \in \mathbb{R}$ and $t_0 < \cdots < t_n$. The belief is that these are points on a function $f(t; \mu)$ with $\...

温泽海

808

asked Sep 8, 2024 at 1:13

5 votes

1 answer

176 views

What's the statistical historical precedent for generalisation beyond overfitting?

A recent work shows generalisation beyond overfitting for overparametrized systems [*]. Is there any precedent in the statistics literature, or is this a new phenomenon for deep learning? [*] Grokking: ...

patagonicus

2,789

asked Sep 7, 2024 at 23:15

0 votes

0 answers

86 views

Training accuracy increases up to 99% but validation accuracy stops much earlier

I am attempting to perform classification on the CIFAR-100 dataset using a ResNet model that I implemented. I have been trying multiple different hyperparameter configurations, changing learning ...

codinator

123

asked Sep 4, 2024 at 9:00

1 vote

0 answers

76 views

What are the appropriate data splitting techniques for time-dependent sequential datasets, such as breakdown records over time?

I am working with a time-dependent sequential dataset, specifically a record of machine breakdowns over a period of time. My dataset includes data from the sensors of several machines until they fail ...

user386164

111

asked Aug 26, 2024 at 11:06

5 votes

1 answer

687 views

Are epochs the same as data duplication?

Epochs, the number of times training is repeated on the original data, are absolutely necessary for neural networks where there are often many more parameters than original instances. What is the ...

Mitch

2,099

asked Aug 19, 2024 at 14:12

1 vote

0 answers

49 views

Train model with labels generated by similar model: overfit?

I train models to predict some linear features from aerial imagery. Because the reference data are just lines, I made a simple buffer so that labels resemble very approximately the width of the target ...

Pythonisa

11

asked Aug 8, 2024 at 7:58

0 votes

0 answers

86 views

Augmenting data for LSTM

The problem: I have a dataset with monthly economic indicators alongside monthly stock price, containing 434 total observations. I have attempted to fit an LSTM onto the data, but it seems to ...

altayir1

1

asked Jul 18, 2024 at 10:51

1 vote

0 answers

136 views

AUC > 0.5 under null model following feature selection

I've been going over the output of a Monte Carlo model that simulates disease risk as a function of genotype. Under a null model of no disease risk, we have 1000 case and 1000 control individuals. ...

Max

145

asked Jun 27, 2024 at 22:15

0 votes

1 answer

119 views

Manual selection of parameters and features and bad results by gridsearch

For a very small dataset that I have, when I set the parameters with the help of gridsearch, the test and training results are not acceptable at all and have a huge difference. I have to manually ...

Erfan Mollai

43

asked Jun 11, 2024 at 5:06

1 vote

0 answers

69 views

Significant performance drop between train and validation set

I am trying both LightGBM and RandomForest for a classification task, and I observe the same problem. I am using various metaparams to prevent overfitting, such as max_depth, num_trees (keeping it small for ...

Baron Yugovich

509

asked May 30, 2024 at 13:34

0 votes

0 answers

116 views

Path analysis with perfect fit

I'm trying to determine if I can display two regression models and the covariance between the dependent variables in one unified model using path analysis with lavaan in R. In the following (scaled) ...

BlueMarlin

43

asked May 22, 2024 at 12:42

2 votes

0 answers

151 views

Regression with small sample size - LASSO or remove variables?

I'm trying to run a regression, but I only have 14 observations, each being a different city in the US. My dependent variable is the total number of trips per capita, and my explanatory variables are ...

BeyondConfused

21

asked May 6, 2024 at 21:48

11 votes

1 answer

3k views

Getting 99-100% accuracy on my training/validation data but performs bad on completely new data

I have a large dataset of the ASL (American Sign Language). I split this data into 70:15:15 for train, validation, test. I then trained a CNN model on it, where I trained using the 70%, and evaluated ...

codinator

123

asked May 6, 2024 at 10:59

2 votes

1 answer

111 views

Estimate number of covariates in Cox regression model

My doubt about overfitting is almost general, but in this particular case is all about survival models. I am working in a case-cohort study, estimating the HR in a cohort where heart attack correspond ...

Javier Hernando

713

asked Apr 25, 2024 at 10:19

1 vote

0 answers

33 views

Image classification metrics

I have been working on an image classification task using CNNs and getting some puzzling results. My training, validation and test loss keep going down with epochs and are comparable. So this might ...

Nithin

11

asked Apr 17, 2024 at 15:55

0 votes

1 answer

70 views

Does the intuitive sense of overfitting in this mechanism design context exemplify bias-variance tradeoff?

Suppose the (we can say unanimous) preference of each individual in a society is to select roads for travel by placing 95% weight on the objective of minimizing travel time, and the remaining 5% ...

user10478

133

asked Apr 16, 2024 at 17:29

1 vote

1 answer

84 views

Accuracy "overfits" but loss doesn't?

I'm perplexed as to why my loss doesn't go up when the accuracy goes down (after about 40 epochs). Isn't it possible to tell overfitting from the loss curve alone? (I'm of course referring to the ...

Tfovid

815

asked Mar 31, 2024 at 8:54

1 vote

1 answer

242 views

Is my model overfitting or is my training process wrong?

I'm predicting multiclass probabilities using CatBoost Classifier. I have a balanced dataset with roughly 4000 rows, 13 features, 4 target class labels. Dataset has some outliers which I decided not ...

primadonna

43

asked Mar 26, 2024 at 22:30

0 votes

1 answer

246 views

Learning Curve to Know Underfitting or Overfitting

I want to know whether the model I am using tends to overfit or underfit. I am using SVM and Random Forest algorithms. How can I figure it out?

Anna

3

asked Mar 23, 2024 at 14:43

4 votes

2 answers

185 views

Scaling laws for neural network memorization

I would like to ask a generalization of this question: How to perfectly overfit neural network to memorize input? Are there any scaling laws for neural network memorization? In other words, if I have ...

zfj3ub94rf576hc4eegm

91

asked Mar 5, 2024 at 2:57

0 votes

1 answer

216 views

Random Forest Regressor gives negative test score in GridSearchCV

I built a random forest regressor and used GridSearchCV to tune hyperparameters. ...

Nino640

11

asked Mar 4, 2024 at 0:40

2 votes

2 answers

591 views

Can I skip test set and train on 100% of data?

Is it a viable solution to train on the whole dataset without splitting the data into 'train' and 'test' sets? In other words, is it okay to skip offline evaluation and only perform online evaluation (...

asparagus

21

asked Feb 27, 2024 at 12:11

1 vote

0 answers

153 views

Predicted R squared - when is it good enough?

In order to assess whether I am overfitting a multilinear model, I have calculated the predicted $R^2$, based on the info found here. My question is, when is a predicted $R^2$ "good enough", ...

Bettina

11

asked Feb 20, 2024 at 15:00

0 votes

0 answers

68 views

Ensemble Random Forest Overfitting

I am running an ensemble random forest model (a newer method published in 2020). The model works by using a double bootstrapping step to balance imbalanced training data. Then you grow multiple ...

Greatwhite4

11

asked Feb 16, 2024 at 18:07

1 vote

0 answers

89 views

BERT eval loss increase while performance metrics also increase

I want to fine-tune BERT for Named Entity Recognition (NER). However, when fine-tuning over several epochs on different datasets I get a weird behaviour where the training loss decreases, eval loss ...

CodingSquirrel

11

asked Feb 9, 2024 at 22:23

4 votes

1 answer

323 views

Implications of keeping a "low" basis dimension in GAMM

Some of the smooths in my generalized additive mixed model (GAMM) indicate in mgcv::k.check(m) they want to be more wiggly, but I don't think I have enough data to ...

Nate

2,537

asked Feb 1, 2024 at 19:31

4 votes

1 answer

237 views

Overfitting GBM by simultaneously adding trees and lowering learning rate?

I understand that you can overfit a Gradient Boosting Machine (GBM) by using too many trees (unlike random forest), and also that you can overfit a GBM by using too high of a learning rate. My ...

David

1,276

asked Jan 20, 2024 at 2:11

1 vote

0 answers

105 views

Fitting a Gaussian function to Poisson noisy data

Let $A$, $\mu$, $\sigma$ be some positive, a priori unknown parameters. Define a Gaussian function $f$ as $$f(x) = A \mathrm{exp}\left(-\frac{1}{2} \left( \frac{x-\mu}{\sigma}\right)^2\right).$$ One ...

mathslover

141

asked Dec 29, 2023 at 16:29

3 votes

1 answer

165 views

Is my regularized logistic regression model overfit?

I have a dataset with the following characteristics: moderate sample size (~300 samples); moderate class imbalance (~20% positives); high-dimensional (the number of independent variables, again ~300, ...

ladislaw94

41

asked Nov 23, 2023 at 21:14

0 votes

1 answer

466 views

CFA: chi-square value is 0 but with degrees of freedom [closed]

I want to do a SEM analysis with an actor-partner interdependence model in Mplus. I managed to calculate it and everything seems right if I look at the means, SD's, ...

Axenox

1

asked Nov 21, 2023 at 19:22
