
Questions tagged [overfitting]

Modeling error (especially sampling error) instead of replicable and informative relationships among variables improves model fit statistics, but reduces parsimony, and worsens explanatory and predictive validity.
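As a rough illustration of the tag description above, the sketch below compares a simple least-squares line with a 1-nearest-neighbour regressor that memorises the training sample. Everything in it is an assumption made for illustration: the synthetic data (a linear signal plus Gaussian noise), the sample sizes, and the helper names `fit_line`, `fit_1nn`, `mse`, and `make_data`. None of it comes from a question on this page.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x, using the closed form."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def fit_1nn(xs, ys):
    """1-nearest-neighbour regressor: it stores every training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def mse(model, xs, ys):
    """Mean squared error of a model on the given sample."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(n, rng):
    """Hypothetical data: y = 2x + standard normal noise, x uniform on [0, 1)."""
    xs = [rng.random() for _ in range(n)]
    ys = [2.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


rng = random.Random(0)  # fixed seed so the run is repeatable
x_train, y_train = make_data(30, rng)
x_test, y_test = make_data(500, rng)

line = fit_line(x_train, y_train)
knn = fit_1nn(x_train, y_train)

# (train MSE, test MSE) for each model
results = {
    "line": (mse(line, x_train, y_train), mse(line, x_test, y_test)),
    "1-NN": (mse(knn, x_train, y_train), mse(knn, x_test, y_test)),
}

for name, (train_err, test_err) in results.items():
    print(f"{name:>5}: train MSE = {train_err:.3f}, test MSE = {test_err:.3f}")
```

The 1-NN model reproduces every training target, so its training error is zero by construction: it has fitted the sampling noise along with the signal. Whether its test error ends up above the line's depends on the data, which is why the comparison is made on a held-out sample and not on the training fit.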

1 vote
1 answer
312 views

My data has 1530 samples and about 50 features. Not all features are used, some are removed after a feature selection process. Now I'm facing overfitting, and one solution to overfitting is ...
CORy's user avatar
  • 563
3 votes
2 answers
569 views

Imagine we have speed data from a car and we would like to detect whether the car speeds up or slows down more than it should. Do I want to just overfit my model, so the outlier (higher or lower speed) would lead me ...
Mr. Panda's user avatar
  • 325
1 vote
0 answers
93 views

I've recently started to work in machine learning and this is my first post here. Excuse me in advance for duplicates and/or slang mistakes. My question is about transfer learning (although in this ...
leapofFaith's user avatar
1 vote
0 answers
80 views

In two datasets that are composed of the same sample number, about 1500, but with different features. The first dataset has 15 predictive features and the second has 40. Now for someone who is ...
Programming Noob's user avatar
2 votes
1 answer
1k views

I am trying to build a classification xgboost model at work, and I'm facing overfitting issue that I have never seen before. My training sample size is 320,000 X 718 and testing sample is 80,000 X 78 ...
Piyush's user avatar
  • 215
2 votes
1 answer
334 views

I have a computer science background but I am trying to learn how to apply ML by solving small problems. I have been working on this problem for the last couple of days and I cannot find a solution. I ...
pingu87's user avatar
  • 21
0 votes
0 answers
159 views

The dataset we are using consists of ~3000 images split at 60/40 partition for training/testing. We have used sklearn's GridSearchCV and ...
Colton Seegmiller's user avatar
0 votes
0 answers
50 views

The input are a timeseries of 1x41x41 geospatial images (so, 5x1x41x41 for example). I managed to achieve an MSE of 0.53 with PCA and Random Forest. But I thought to use ConvLSTM since my input size ...
Doomski's user avatar
0 votes
0 answers
57 views

I have a convolutional neural network that uses ResNet (18, 34, or 50; it doesn't matter) as the backbone and pretrained weights from ImageNet. When I try training it with a single image for 50 or so epochs, ...
K dai's user avatar
  • 1
0 votes
1 answer
235 views

I've recently gotten into ML and I'm a bit confused about rep k fold cross validation, train, test split and overfitting. I have already read some of the posts in this forum, but none of them could ...
Domi's user avatar
  • 1
2 votes
1 answer
1k views

The curse of dimensionality refers to what happens when a model tries to fit the data in a very high dimensional space (and there is not enough training data). In my mind, I believe that this curse ...
lalaland's user avatar
  • 247
3 votes
0 answers
89 views

My team is conducting a counterfactual difference-in-differences (DiD) healthcare analysis to estimate the benefits of home nursing visits compared with a control group. We've "pre-selected" ...
RobertF's user avatar
  • 6,644
2 votes
1 answer
255 views

I'm training a model on a small dataset of images. The following are the curves of accuracy, F1 score, and AUC score. It's clear that the model is overfitting; however, I don't understand why after some time ...
Ines's user avatar
  • 31
2 votes
0 answers
211 views

I was trying out different regression models to fit a time series. Models include a multiple linear regression model, ReLU regression models (with varying numbers of ReLU functions) and sigmoid ...
Jack's user avatar
  • 71
2 votes
1 answer
275 views

I know that KNN with k=1 leads to overfitting, because it follows the noisy data of the training sample and does not generalize well to new input samples. But I am confused about how this happens, I ...
DYLAN NICO AMBROSI's user avatar
1 vote
2 answers
336 views

I am using SVM regressor models to fit some chemical data related to spectroscopy (I cannot say exactly what data because it is an ongoing research in my group). To combat overfitting, I have used 5-...
S R Maiti's user avatar
  • 163
1 vote
0 answers
48 views

I have a classifier (one/zero labels) that was trained and hypertuned by the book. When the model was ready, I applied it to the production data: real-time and unlabeled. After a short period (a few ...
Amit S's user avatar
  • 77
1 vote
0 answers
79 views

Consider the two pairs of learning curves below. The red and green lines are the training and validation curves of some model 1, and the gray and orange lines are the training and validation curves of ...
Tfovid's user avatar
  • 815
1 vote
1 answer
289 views

I have a binary classification problem with a dataset of N=430 and 146 predictors. Both validation and training accuracies, along with the losses, fluctuate. What could be the reason, and what solution would you suggest?
Asif Munir's user avatar
2 votes
1 answer
187 views

I have a dataframe which is a made of many datasets combined together (many datasets with the same predictive features but with different samples combined together). This dataframe, called ...
Programming Noob's user avatar
3 votes
2 answers
3k views

I fit a regression model on a data set and get some in-sample RMSE. I wanted to know, how likely is that I get this good RMSE (or even better) under assumptions that there are no patterns in the data. ...
Roman's user avatar
  • 774
6 votes
1 answer
786 views

Someone built an XGBoost classification model using each pixel in an image (256*256) as a separate feature, plus a few other features. However they only have 500 data points. The target classes were ...
Alex's user avatar
  • 185
0 votes
2 answers
285 views

I am running a random forest classifier in R and during 10-fold cross-validation, I discovered that the model is overfitting. I am using a grid search to find the best hyperparameters and used the ...
cassandra star's user avatar
1 vote
3 answers
166 views

What exactly is overfitting while building models ?
SR1's user avatar
  • 31
1 vote
0 answers
93 views

I'm trying to reconstruct a function, $A(x)$ from the results of some detectors. Essentially, I have a set of $n$ points which are $ V_{i} = \int_{-\infty}^{\infty} A(x) e^{-(x - v_{i})^{2}} dx $ ...
user1150512's user avatar
4 votes
1 answer
3k views

Usually it is called over-fitting when the test error is higher than the training error. Does that imply that it is called under-fitting when the training error is higher than the test error? Also ...
Just a stat student's user avatar
0 votes
0 answers
18 views

I'm working on a binary classification model over the RAVDESS dataset with a CNN model. These are the performances on the train and validation set and these are the performance on the test set for ...
Damiano Imola's user avatar
2 votes
0 answers
96 views

I gather RNA-seq transcriptomic data from multiple cancer datasets. The datasets are about a treatment of cancer, we check Response vs NoResponse samples. The RNA-seq data I gather is before the ...
Programming Noob's user avatar
1 vote
1 answer
358 views

In https://arxiv.org/pdf/1908.05355.pdf, double descent is mentioned, in which the training loss decreases, increases, and then decreases again. And the important point ...
Mark's user avatar
  • 171
3 votes
2 answers
658 views

I read here that there are statistical approaches to assess whether a tractable machine learning model (e.g., a linear regression model) overfits a dataset: Simpler models that have originated in ...
Rafs's user avatar
  • 453
0 votes
0 answers
292 views

I am training CNNs for image segmentation on a limited dataset and apply some on-the-fly data augmentation. I measure mean intersection over union (mean IoU) to evaluate the training and select models....
Manuel Popp's user avatar
1 vote
1 answer
148 views

I searched for this kind of question for a while and found that many discussions involve counting the number of parameters of a Convolutional Neural Network, but not the inputs. Using the Fashion MNIST ...
rodericktung's user avatar
2 votes
1 answer
180 views

I read the following: "To prevent overfitting we would like to work with as few components as possible". How does the number of mixture components affect the fit of the model? Is that because ...
Maryam's user avatar
  • 1,720
1 vote
0 answers
91 views

I have a dataset for energy consumer customers and binary target variables with which I want to predict the churn for the customers. Counts of target values Not Churn 0: 14153 Churn 1: 1520 I have ...
Paul's user avatar
  • 31
1 vote
0 answers
49 views

For regularizing neural networks, I'm familiar with drop-out and l2/l1 regularization, which were the biggest players in the late 2010's. Have any significant/strong competitors risen up since then?
chausies's user avatar
  • 561
1 vote
0 answers
129 views

I am using a Logistic Regression Classifier on the Airline Cancellation dataset. Please note that the training set was undersampled (in order to balance classes) while the test set was left as it was. ...
vincenzoconv99's user avatar
2 votes
1 answer
177 views

Let's say you have a set of potential explanatory variables (e.g. p = 8) that you think are important to explain your response variable ($Y$) but your sample is too small to include them all in the ...
Fanfoué's user avatar
  • 661
2 votes
0 answers
355 views

Deep double descent is an empirically observed phenomenon that happens with contemporary neural networks. Its essence is that often, increasing the model complexity first leads to the test loss ...
CrabMan's user avatar
  • 172
2 votes
0 answers
86 views

I'm training a logistic classifier to classify 5 classes using scikit-learn. The data isn't extremely imbalanced (class 1: 27.7%, class 2: 19.4%, class 3: 17%, class 4: 19.6%, class 5: 16.2%). I'm ...
Zoe's user avatar
  • 21
1 vote
1 answer
120 views

I wish to train the ANN and use regularizers to avoid overfitting. I need some suggestions: is it mandatory to normalize the data before using L1 or L2 regularizers? I would highly appreciate if you can ...
SiH's user avatar
  • 141
3 votes
1 answer
399 views

I recently came across the Matrix Factorization algorithm for a recommendation system. One of the tutorials I followed can be found here. According to it, given the initial matrix $R$ and the ...
RookieCookie's user avatar
0 votes
1 answer
561 views

I want to do a multi-class classification with 6 classes. The whole dataset has 12750 samples and 56 features, so every class has 2125 samples. Before prediction I reduced the number of outliers by ...
jared's user avatar
  • 31
0 votes
0 answers
135 views

I am training a CNN to predict age, mass, and tone from images. The structure of my dataset is as follows ...
Sparsh Garg's user avatar
13 votes
3 answers
3k views

I've seen that early stopping is a form of regularization that limits the movement of the parameters $\theta$ in a similar way that L2 Regularization penalizes the movement of $\theta$ to be closer to ...
wd violet's user avatar
  • 787
1 vote
2 answers
194 views

I have a data set with one real-valued feature and a real-valued target. Someone has used this data set to fit a model (a regression). I get a results of this fit, which is a single function mapping ...
Roman's user avatar
  • 774
0 votes
0 answers
18 views

I'm working on a multiple regression problem where I have reasons to believe some (if not all) of the regressors have been cherry picked/data mined to a varying degree. My hypotheses are that there's ...
stevew's user avatar
  • 841
8 votes
2 answers
792 views

I would like some clarification as to how principal component analysis mitigates the Curse of Dimensionality problem. My particular interest is in curbing overfitting in my modelling, or more ...
Andrew Beaven's user avatar
2 votes
2 answers
1k views

For a non-linearly separable problem, when there are enough features, we can make the data linearly separable. It seems to me that for logistic regression, the reason for overfitting is always ...
Santi Du's user avatar
4 votes
1 answer
505 views

Motivation Everyone knows that fitting high variance models requires more data. A "yes" answer to the question would suggest that more data is also needed to evaluate these models. ...
chicxulub's user avatar
  • 1,645
1 vote
1 answer
674 views

I have been thinking about this for a few so I would like to hear some opinions. It could be complicated to explain so I will update the question if there is something that is not clear. Imagine I ...
Sergiodiaz53's user avatar
