Questions tagged [overfitting]
Modeling error (especially sampling error) instead of replicable and informative relationships among variables improves model fit statistics, but reduces parsimony, and worsens explanatory and predictive validity.
1,002 questions
1
vote
1
answer
312
views
Is XGBoost too much to apply on my data?
My data has 1530 samples and about 50 features. Not all features are used, some are removed after a feature selection process. Now I'm facing overfitting, and one solution to overfitting is ...
3
votes
2
answers
569
views
Do I want to overfit, when doing outlier detection based on regression?
Imagine we have speed data for a car and we would like to detect whether the car speeds up or slows down more than it should.
Do I want to just overfit my model, so the outlier (higher or lower speed) would lead me ...
1
vote
0
answers
93
views
Transfer Learning "from scratch"
I've recently started to work in machine learning and this is my first post here. Excuse me in advance for duplicates and/or slang mistakes. My question is about transfer learning (although in this ...
1
vote
0
answers
80
views
What are the benefits of using gradient boosting machines in terms of variance and bias?
I have two datasets with the same number of samples, about 1500, but with different features. The first dataset has 15 predictive features and the second has 40.
Now for someone who is ...
2
votes
1
answer
1k
views
How to fix overfitting in xgboost?
I am trying to build an xgboost classification model at work, and I'm facing an overfitting issue that I have never seen before.
My training sample size is 320,000 X 718 and testing sample is 80,000 X 78 ...
2
votes
1
answer
334
views
Random forest with small number of samples (10)
I have a computer science background but I am trying to learn how to apply ML by solving small problems.
I have been working on this problem for the last couple of days and I cannot find a solution. I ...
0
votes
0
answers
159
views
Cross validation methods in scikit-learn using an SVC classifier
The dataset we are using consists of ~3000 images split at 60/40 partition for training/testing. We have used sklearn's GridSearchCV and ...
0
votes
0
answers
50
views
Does this look like overfitting or something else?
The input is a time series of 1x41x41 geospatial images (so, 5x1x41x41 for example). I managed to achieve an MSE of 0.53 with PCA and Random Forest. But I thought to use ConvLSTM since my input size ...
0
votes
0
answers
57
views
Why does my network not learn a single image perfectly?
I have a convolutional neural network that uses ResNet (18, 34, or 50; it doesn't matter) as the backbone and pretrained weights from ImageNet. When I try training it with a single image for 50 or so epochs, ...
0
votes
1
answer
235
views
rep k fold cross validation, train test split and overfitting
I've recently gotten into ML and I'm a bit confused about rep k fold cross validation, train, test split and overfitting. I have already read some of the posts in this forum, but none of them could ...
2
votes
1
answer
1k
views
Curse of dimensionality using trees
The curse of dimensionality refers to what happens when a model tries to fit data in a very high-dimensional space (and there is not enough training data). In my mind, I believe that this curse ...
3
votes
0
answers
89
views
Does selecting confounder variables for a model with multiple correlation tests risk biasing results (similar to forward selection)?
My team is conducting a counterfactual difference-in-differences (DiD) healthcare analysis to estimate the benefits of home nursing visits compared with a control group.
We've "pre-selected" ...
2
votes
1
answer
255
views
Why does validation accuracy start to increase after overfitting?
I'm training a model on a small dataset of images. The following are the curves of accuracy, F1 score, and AUC score. It's clear that the model is overfitting; however, I don't understand why after some time ...
2
votes
0
answers
211
views
Why does the test loss decrease even when the training loss and the validation loss increase
I was trying out different regression models to fit a time series. Models include a multiple linear regression model, ReLU regression models (with varying numbers of ReLU functions) and sigmoid ...
2
votes
1
answer
275
views
Example of KNN overfitting with k=1
I know that KNN with k=1 leads to overfitting, because it follows the noisy data of the training sample and does not generalize well to new input samples. But I am confused about how this happens, I ...
1
vote
2
answers
336
views
Can SVM overfit even with cross-validation?
I am using SVM regressor models to fit some chemical data related to spectroscopy (I cannot say exactly what data because it is ongoing research in my group). To combat overfitting, I have used 5-...
1
vote
0
answers
48
views
Feature selection based on production data
I have a classifier (one/zero labels) that was trained and hypertuned by the book. When the model was ready, I applied it to the production data: real-time and unlabeled.
After a short period (a few ...
1
vote
0
answers
79
views
Overfitting is reduced but loss is worsened?
Consider the two pairs of learning curves below.
The red and green lines are the training and validation curves of some model 1, and
the gray and orange lines are the training and validation curves of ...
1
vote
1
answer
289
views
Fluctuations in both the accuracies and losses in training and validation of Deep learning MLP
I have a binary classification problem with a dataset of N = 430 and 146 predictors. Both the validation and training accuracies, along with the losses, fluctuate. What could be the reason, and can you suggest a solution, please?
2
votes
1
answer
187
views
Overfitting using Random Forest - classification [duplicate]
I have a dataframe which is made of many datasets combined together (many datasets with the same predictive features but with different samples combined together). This dataframe, called ...
3
votes
2
answers
3k
views
Can a regularization harm more than help in the situation of a huge over-fit?
I fit a regression model on a data set and get some in-sample RMSE. I wanted to know how likely it is that I get an RMSE this good (or even better) under the assumption that there are no patterns in the data.
...
6
votes
1
answer
786
views
XGBoost when P>>N
Someone built an XGBoost classification model using each pixel in an image (256*256) as a separate feature, plus a few other features. However they only have 500 data points. The target classes were ...
0
votes
2
answers
285
views
Overfitting of random forest in R
I am running a random forest classifier in R and during 10-fold cross-validation, I discovered that the model is overfitting. I am using a grid search to find the best hyperparameters and used the ...
1
vote
3
answers
166
views
What is overfitting while building model?
What exactly is overfitting while building models?
1
vote
0
answers
93
views
Overfitting with Non Negative Least Squares
I'm trying to reconstruct a function, $A(x)$ from the results of some detectors. Essentially, I have a set of $n$ points which are
$ V_{i} = \int_{-\infty}^{\infty} A(x) e^{-(x - v_{i})^{2}} dx $
...
4
votes
1
answer
3k
views
Is it possible to have a higher train error than a test error in machine learning?
Usually it is called over-fitting when the test error is higher than the training error. Does that imply that it is called under-fitting when the training error is higher than the test error? Also ...
0
votes
0
answers
18
views
Emotion classifier: overfitting the training dataset [duplicate]
I'm working on a binary classification model over the RAVDESS dataset with a CNN model.
This is the performance on the train and validation sets
and this is the performance on the test set
for ...
2
votes
0
answers
96
views
An aggressive overfitting situation
I gather RNA-seq transcriptomic data from multiple cancer datasets. The datasets are about a treatment of cancer, we check Response vs NoResponse samples.
The RNA-seq data I gather is before the ...
1
vote
1
answer
358
views
How to judge the neural network training stage with double descent?
The paper at https://arxiv.org/pdf/1908.05355.pdf mentions double descent, where the training loss decreases, increases, and then decreases again. And the important point ...
3
votes
2
answers
658
views
Statistical approaches to detect overfitting in simple models
I read here that there are statistical approaches to assess whether a tractable machine learning model (e.g., a linear regression model) overfits a dataset:
Simpler models that have originated in ...
0
votes
0
answers
292
views
What do flattening learning curves indicate and when to stop training of a ML model in that case?
I am training CNNs for image segmentation on a limited dataset and apply some on-the-fly data augmentation. I measure mean intersection over union (mean IoU) to evaluate the training and select models....
1
vote
1
answer
148
views
How to calculate the total number of inputs in CNN?
I have searched for this kind of question for a while, and I find that many discussions involve counting the number of parameters of a Convolutional Neural Network, but not the inputs. Using the Fashion MNIST ...
2
votes
1
answer
180
views
Is there a relationship between the number of mixture components and the overfitting of the model?
I read the following:
"To prevent overfitting we would like to work with as few components as possible".
How does the number of mixture components affect the fit of the model? Is that because ...
1
vote
0
answers
91
views
Why isn't RandomSearchCV returning the optimum parameters for the XGBoost Model, and how can I avoid Overfitting?
I have a dataset for energy consumer customers and binary target variables with which I want to predict the churn for the customers.
Counts of target values
Not Churn 0: 14153
Churn 1: 1520
I have ...
1
vote
0
answers
49
views
State-of-the-art techniques for regularizing Neural Networks?
For regularizing neural networks, I'm familiar with dropout and L1/L2 regularization, which were the biggest players in the late 2010s.
Have any significant/strong competitors risen up since then?
1
vote
0
answers
129
views
Underfitting and Overfitting at the same time?
I am using a Logistic Regression Classifier on the Airline Cancellation dataset.
Please note that the training set was undersampled (in order to balance classes) while the test set was left as it was.
...
2
votes
1
answer
177
views
Can a slightly overfitted model be useful for exploratory (i.e. hypotheses generating) modelling?
Let's say you have a set of potential explanatory variables (e.g. p = 8) that you think are important to explain your response variable ($Y$) but your sample is too small to include them all in the ...
2
votes
0
answers
355
views
Is deep double descent important in practical contemporary CNNs?
Deep double descent is an empirically observed phenomenon that happens with contemporary neural networks. Its essence is that often, increasing the model complexity first leads to the test loss ...
2
votes
0
answers
86
views
How to tell model (Multiclass Classification using Logistic Regression) is overfitting?
I'm training a logistic classifier to classify 5 classes using scikit-learn. The data isn't extremely imbalanced (class 1: 27.7%, class 2: 19.4%, class 3: 17%, class 4: 19.6%, class 5: 16.2%). I'm ...
1
vote
1
answer
120
views
Do I need to normalize data before applying L1, L2 norm in ANN
I wish to train the ANN and use regularizers to avoid overfitting. I need some suggestions: is it mandatory to normalize the data before using L1 and L2 regularizers? I would highly appreciate it if you can ...
3
votes
1
answer
399
views
Matrix Factorization and Overfitting
I recently came across the Matrix Factorization algorithm for a recommendation system.
One of the tutorials I followed can be found here.
According to it, given the initial matrix $R$ and the ...
0
votes
1
answer
561
views
Low classification accuracy
I want to do a multi-class classification with 6 classes. The whole dataset has 12750 samples and 56 features, so every class has 2125 samples. Before prediction I reduce the number of outliers by ...
0
votes
0
answers
135
views
Effect of duplicate/redundant labels on performance of model
I am training a CNN to predict age, mass, and tone from images.
The structure of my dataset is as follows
...
13
votes
3
answers
3k
views
If I use a regularization (e.g. L2) can I not apply early stopping?
I've seen that early stopping is a form of regularization that limits the movement of the parameters $\theta$ in a similar way that L2 Regularization penalizes the movement of $\theta$ to be closer to ...
1
vote
2
answers
194
views
Is it possible to evaluate a given model without having access to its fit method?
I have a data set with one real-valued feature and a real-valued target. Someone has used this data set to fit a model (a regression). I get the result of this fit, which is a single function mapping ...
0
votes
0
answers
18
views
Regressor-based L2 penalty [duplicate]
I'm working on a multiple regression problem where I have reasons to believe some (if not all) of the regressors have been cherry picked/data mined to a varying degree. My hypotheses are that there's ...
8
votes
2
answers
792
views
PCA as a Cure for the Curse of Dimensionality
I would like some clarification as to how principal component analysis mitigates the Curse of Dimensionality problem. My particular interest is in curbing overfitting in my modelling, or more ...
2
votes
2
answers
1k
views
Why use regularization instead of feature selection for logistic regression? [duplicate]
For a non-linearly separable problem, when there are enough features, we can make the data linearly separable. It seems to me that for logistic regression, the reason for overfitting is always ...
4
votes
1
answer
505
views
Does higher variance in predictions result in higher variance error estimation?
Motivation
Everyone knows that fitting high variance models requires more data. A "yes" answer to the question would suggest that more data is also needed to evaluate these models.
...
1
vote
1
answer
674
views
Almost duplicate samples between train/test: overfitting?
I have been thinking about this for a while, so I would like to hear some opinions. It could be complicated to explain, so I will update the question if there is something that is not clear.
Imagine I ...