Questions tagged [ensemble-learning]
In machine learning, ensemble methods combine multiple algorithms to make a prediction. Bagging, boosting and stacking are some examples.
477 questions
1 vote · 1 answer · 94 views
A correct approach to validate/correct readings from similar sensors?
I am looking to apply a calibration/correction approach to a set of sensors, and I just wanted to know whether the approach I am going to use is statistically correct and acceptable.
I am using a set of ...
1 vote · 0 answers · 60 views
Pooled Point Prediction Intervals for MICE imputed data
I am performing binary prediction on a dataset which contains missingness, and so I am leveraging Multiple Imputation (MI).
For example, creating a train-test split, I perform MI on the training data ...
0 votes · 0 answers · 46 views
Ensemble Neural Network - Stacking ensemble neural network accuracy is similar to or lower than the base models
Context
I'm trying to create an ensemble survival neural network with a custom loss function, which consists of 3 base models: Random Survival Forest (RSF), Gradient Boosting Survival Model (GBSM) and a ...
1 vote · 1 answer · 66 views
In an ensemble, is the resulting model nonlinear even if the base models are linear?
I have a doubt: in the case of an ensemble of linear base models, I am convinced (but I do not know the exact mathematical explanation for this) that the resulting ensemble works only for a linear ...
1 vote · 0 answers · 74 views
Stacking Vs Voting Vs Blending
I am working on an experiment with a dataset, where I compared the performance of stacking, blending, and voting using base models and logistic regression as my meta model. Although voting seems to ...
0 votes · 0 answers · 129 views
Optimize precision and recall for specific classes in an imbalanced classification problem
I have three classes {-1, 0, 1}. The data is in the ratio 1:20:1 on average for the corresponding classes. I want to achieve
High precision (>70%) and average recall (30%–40%) on classes -1 and 1.
...
1 vote · 1 answer · 116 views
How can different models based on different sets of predictors be combined to significantly improve the model performance?
I have two machine learning models for predicting some continuous variable $y$, say $y=f_1(X_1, \theta_1)$ and $y=f_2(X_2, \theta_2)$, and these models are of the same type (ANN). $X_1$ and $X_2$ ...
2 votes · 1 answer · 207 views
How do I calculate estimated variance for an ensemble forecast?
I have several (n) different forecasts of comparable quality for a variable, based on the same data but using wildly different statistical models. For each, I have generated an estimate for m periods ...
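A worked equation often clarifies this setting: for an equally weighted average of $n$ forecasts with individual variances $\sigma_i^2$ and pairwise covariances $\sigma_{ij}$ (which have to be estimated, e.g. from historical forecast errors), the variance of the combined forecast is
$$\operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n}\hat{y}_i\right) = \frac{1}{n^{2}}\left(\sum_{i=1}^{n}\sigma_i^{2} + \sum_{i\neq j}\sigma_{ij}\right),$$
and dropping the covariance term understates the uncertainty when the forecasts are built from the same data and are therefore positively correlated.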
4 votes · 1 answer · 902 views
Gamma regression with XGBoost
I'll try to be brief. I have two questions about what exactly happens when I train a gradient boosted ensemble of trees using, say, XGBoost in order to perform a Gamma regression. I apologize in ...
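For readers who want to reproduce the setup, a minimal sketch (synthetic data, hypothetical hyperparameters) of Gamma regression with XGBoost's reg:gamma objective:

```python
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
# The Gamma deviance objective requires a strictly positive target
y = rng.gamma(shape=2.0, scale=np.exp(X[:, 0]))

model = xgb.XGBRegressor(objective="reg:gamma", n_estimators=200,
                         max_depth=3, learning_rate=0.1)
model.fit(X, y)
pred = model.predict(X)  # predictions are returned on the original (mean) scale
```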
0 votes · 1 answer · 164 views
Quantifying prediction uncertainty using deep ensembles: How to combine Laplace distributions?
For a regression problem, I want to train an ensemble of deep neural networks to predict the labeled output as well as the uncertainty, similar to the approach presented in the paper Simple and ...
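One common way to pool the per-network distributions (a sketch, not necessarily what the paper prescribes for Laplace heads) is to treat the ensemble as an equally weighted mixture and report its mean and variance; for Laplace$(\mu_i, b_i)$ components the variance is $2b_i^2$:

```python
import numpy as np

# Hypothetical per-network Laplace parameters for a single test input
mus = np.array([0.9, 1.1, 1.0, 1.3])      # predicted locations
bs  = np.array([0.20, 0.30, 0.25, 0.20])  # predicted scales

var_i = 2.0 * bs**2                              # variance of each Laplace(mu_i, b_i)
mu_mix = mus.mean()                              # mixture mean (equal weights)
var_mix = (var_i + mus**2).mean() - mu_mix**2    # law of total variance
```

Note that the equal-weight mixture is itself not a Laplace distribution; if a single Laplace summary is needed, moment matching is one option.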
1 vote · 0 answers · 251 views
Prediction vs confidence intervals using random forest / an ensemble of estimators
Given a random forest (or any other ensemble) where each of the $i=1..n$ trees/base estimators is trained by minimizing the mean squared error, then each tree/base estimator prediction $\hat{Y}_i(x) =...
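The distinction is easier to see by inspecting the per-tree predictions directly; a minimal scikit-learn sketch on synthetic data:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# The spread of the per-tree predictions reflects variability of the base
# estimators, which is not the same thing as a prediction interval for a
# new observation (that also needs the irreducible noise).
per_tree = np.stack([tree.predict(X[:5]) for tree in rf.estimators_])
ensemble_mean = per_tree.mean(axis=0)
tree_spread = per_tree.std(axis=0)
```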
1 vote · 0 answers · 73 views
ML Modelling advice where a feature is partially missing but highly informative when present
I am building a model to predict a customer purchase event on a website. Specifically for those customers who, overnight when the model is run, have not yet purchased. Prediction is important, but ...
1 vote · 0 answers · 74 views
XGB predict_proba estimates don't match sum of leaves [closed]
When using an XGB model in the context of binary classification, I observed that the test estimates given by predict_proba were close but not equal to the results I ...
1 vote · 3 answers · 200 views
Combining regression models based on missing data patterns
I have a dataset that contains a few patterns of missingness. For this dataset, I have a training set that is complete and contains all input features. My test set has complete observations for the ...
3 votes · 1 answer · 388 views
Ensemble Methods for Probabilities
I am currently trying to build a stacked algorithm in order to determine how many people in each region of a country will be likely to buy a product versus its competitors. I have some data from an ...
0 votes · 0 answers · 68 views
Ensemble Random Forest Overfitting
I am running an ensemble random forest model (a newer method published in 2020). The model works by using a double bootstrapping step to balance imbalanced training data. Then you grow multiple ...
0 votes · 1 answer · 144 views
Bagging Ensemble Math
You are working on a binary classification problem with 3 input features and have chosen to apply a bagging algorithm (Algorithm X) on this data. You have set max_features = 2 and n_estimators = 3. ...
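The setup in the question can be reproduced with scikit-learn's BaggingClassifier (decision trees are assumed here as the unnamed "Algorithm X"; scikit-learn ≥ 1.2 uses estimator=, older versions call it base_estimator=). Inspecting estimators_features_ shows which 2 of the 3 features each of the 3 members saw:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=200, n_features=3, n_informative=3,
                           n_redundant=0, random_state=0)

# 3 base learners, each fit on a bootstrap sample and a random 2-feature subset
bag = BaggingClassifier(estimator=DecisionTreeClassifier(),
                        n_estimators=3, max_features=2,
                        random_state=0).fit(X, y)

for est, feats in zip(bag.estimators_, bag.estimators_features_):
    print(feats, est.predict(X[:1, feats]))  # features used and one member prediction
```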
1 vote · 0 answers · 90 views
Cross validation + model stacking with hyperparameter tuning while sharing data?
Let's say we want to stack 2 base models: an XGBoost regressor and a deep neural network by linearly combining their predictions as ...
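A minimal sketch of the out-of-fold scheme this usually implies, with scikit-learn stand-ins (GradientBoostingRegressor and MLPRegressor in place of the XGBoost regressor and deep network, both assumed already tuned):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
cv = KFold(n_splits=5, shuffle=True, random_state=0)

base_models = [GradientBoostingRegressor(random_state=0),
               MLPRegressor(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)]

# Out-of-fold predictions: the meta-learner never sees a base model's
# predictions on that model's own training folds
oof = np.column_stack([cross_val_predict(m, X, y, cv=cv) for m in base_models])
meta = LinearRegression().fit(oof, y)        # learns the linear combination weights

fitted_bases = [m.fit(X, y) for m in base_models]  # refit on all data for deployment
```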
1 vote · 0 answers · 51 views
Is there a known way of producing forecasts with reasonable fit and residuals that are at least independent, & ideally negatively correlated? [closed]
I am trying to do some forecasts. I have produced multiple forecasts by a variety of methods. All of the forecasts I have generated so far have residuals that are strongly positively correlated. I ...
0 votes · 0 answers · 81 views
Should I create an ensemble by averaging deep models' weights and biases?
When I train deep models with cosine annealing learning-rate scheduling and warm restarts, I get models that achieve completely different scores on my validation set after each training cycle.
There ...
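If one does try weight averaging, it is usually only sensible for snapshots from the same training trajectory (as in stochastic weight averaging) rather than for independently trained networks. A minimal PyTorch sketch, where `snapshots` is a hypothetical list of identically structured models saved after each cycle:

```python
import copy
import torch

def average_state_dicts(models):
    """Uniformly average the parameters of identically structured models."""
    avg = copy.deepcopy(models[0].state_dict())
    for key in avg:
        if avg[key].is_floating_point():  # skip integer buffers such as counters
            avg[key] = torch.stack([m.state_dict()[key] for m in models]).mean(dim=0)
    return avg

# base_model.load_state_dict(average_state_dicts(snapshots))
# BatchNorm running statistics should be re-estimated on training data afterwards.
```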
1 vote · 0 answers · 55 views
Estimation under model uncertainty that cannot be adjudicated empirically
Many well-known methods address specific forms of model uncertainty that can be adjudicated empirically. For example, if we are fitting a predictive model and there is uncertainty about the set of ...
4 votes · 0 answers · 150 views
Why are approaches that approximate a random forest with a single decision tree not more popular?
I understand that random forests yield better performance than standard decision trees, but are less interpretable, because they do not generate a single tree. In this question, several users provided ...
1 vote · 0 answers · 149 views
Weighted bootstrap sampling vs. uniform bootstrap sampling with later weighting
Assume I have a fancy procedure $w: X \to \mathbb{R}$ to come up with weights for examples $x \in X$. Think of it as similar to the weights used in e.g. some boosting procedures.
Now, I want to build ...
2 votes · 1 answer · 104 views
Should I apply normalization to predicted probabilities from 7 different models before computing correlation among them?
I'd like to check if there are correlation among predicted probabilities of models in a voting classifier. According to the table below, one of models, Model5, has mean 40.9% and standard deviation 46....
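Worth noting when deciding this: Pearson correlation is invariant to linear rescaling (so z-score normalization does not change it), while a rank correlation sidesteps scale issues altogether. A small sketch with hypothetical probabilities:

```python
import numpy as np
import pandas as pd

# Predicted probabilities from the 7 models for the same samples (hypothetical data)
probs = pd.DataFrame(np.random.default_rng(0).random((500, 7)),
                     columns=[f"Model{i}" for i in range(1, 8)])

pearson = probs.corr(method="pearson")    # unchanged by any linear normalization
spearman = probs.corr(method="spearman")  # rank-based, robust to monotone transforms
```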
3 votes · 1 answer · 1k views
Are Bagged Ensembles of Neural Networks Actually Helpful?
I've been looking into ways to estimate uncertainty for regression tasks on neural networks. One of the obvious options is ensemble modeling. Consider an ensemble of neural networks that all have ...
2 votes · 1 answer · 106 views
Ensemble learning with models of different quality. Develop a voting method that takes accuracy, F1, recall, calibration of each model into account
Let's assume I have 24 random forest models. Each of the 24 random forest models produces a class prediction. I am currently using simple majority voting to select the final prediction. Can someone please ...
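A minimal sketch of one such scheme, weighting each model's vote by a validation-set quality score (for example its F1); the array names are hypothetical:

```python
import numpy as np

def weighted_vote(pred_labels, weights, n_classes):
    """pred_labels: (n_models, n_samples) predicted class indices.
    weights: (n_models,) per-model quality scores, e.g. validation F1."""
    n_models, n_samples = pred_labels.shape
    scores = np.zeros((n_samples, n_classes))
    for m in range(n_models):
        scores[np.arange(n_samples), pred_labels[m]] += weights[m]
    return scores.argmax(axis=1)

# final = weighted_vote(preds_24_models, f1_scores, n_classes=3)
```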
6 votes · 1 answer · 847 views
Fitting a simple model first, then training a neural network on the error
Can someone tell me what the name is for the following process?
I have some data with inputs $x_i$ and outputs $y_i$, and I fit a simple model (e.g. linear regression) to them. Then, I compute the ...
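This procedure is usually described as fitting a second model to the residuals (a single boosting-style stage). A minimal sketch on synthetic data, with scikit-learn models standing in for the simple model and the neural network:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))
y = 3 * X[:, 0] + np.sin(3 * X[:, 1]) + rng.normal(scale=0.1, size=1000)

base = LinearRegression().fit(X, y)              # simple model first
residuals = y - base.predict(X)                  # what the simple model missed
corrector = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000,
                         random_state=0).fit(X, residuals)

y_hat = base.predict(X) + corrector.predict(X)   # combined prediction
```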
1 vote · 1 answer · 677 views
Forecasting a Time Series Model for 1000s of Time Series
I'm currently immersed in a challenging forecasting project centred around predicting the required work hours to complete various tasks within a team setting. My dataset comprises crucial attributes, ...
1 vote · 0 answers · 38 views
Least squares with multiple outputs but one coefficient per example
According to Elements of Statistical Learning Ch 8.8, we can apply least squares at the population level to show that for a regression ensemble $f_1(x), f_2(x), \ldots , f_M(x)$ where $f_j: \mathbb{R}^...
2 votes · 0 answers · 209 views
Why would a model combining two pre-trained models not even achieve the performance of the best sub-model?
I have two different CNNs trained on the same dataset. One performs a bit better than the other but I believe each can provide different and useful information.
I use ...
1 vote · 1 answer · 107 views
Why should a valid diversity measure be independent of the target variable?
Consider an ensemble of weak learners (i.e. regressors or classifiers) whose predictions are aggregated (e.g. via averaging or majority vote) into an ensemble estimate.
This gives rise to the question ...
2 votes · 0 answers · 104 views
Is bagging less useful in 'big data' settings?
In 'big data' settings where the number of samples $n$ may be very large (for fixed number of features), is bagging less or more effective at reducing variance? I heard the claim that it is less ...
4 votes · 1 answer · 2k views
SHAP values of Ensemble Model
I predict a continuous variable by taking the average of $N$ model predictions. The models are different in terms of their functional form, i.e. a tree model, a neural net, etc.
Is the average SHAP ...
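For what it is worth, the Shapley linearity axiom gives a clean statement of the relevant property: if the ensemble is $f(x) = \frac{1}{N}\sum_{i=1}^{N} f_i(x)$, then for every feature $j$
$$\phi_j(f, x) = \frac{1}{N}\sum_{i=1}^{N} \phi_j(f_i, x),$$
so averaging per-model SHAP values is exact for exact Shapley values; in practice the per-model values are approximations computed by different explainers, which is where discrepancies can enter.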
1 vote · 0 answers · 40 views
Approaching multiple records for one observation; radiomics of 2D slices of a 3D object
Background
I am trying to create a model that can predict Type 2 diabetes in a patient based on MRI scans of their thigh muscle. Previous literature has shown that fat deposition in the muscle of ...
1 vote · 1 answer · 165 views
Weights Update - Ensemble Models
I must identify if a data point is an outlier or not in a dataset (we don't have labels). I have different unsupervised models to identify the outlier. Then, I normalize the outlier score and I ...
0 votes · 0 answers · 98 views
Is Endogeneity an assumption of Ensemble Methods?
I am using catboost regressor and lgbm regressor to perform regression on dataset. I want to know the assumptions of both the models. Where can I find assumptions for both the models?
Next I want to ...
0 votes · 2 answers · 644 views
Combining logistic regression and decision tree?
I'm working on a project classifying patients as having (1) or not having (0) a particular condition. Someone I work with has suggested fitting a decision tree on this data, and using the leaf node ...
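One common reading of that suggestion is to use the tree's leaf membership as a categorical feature for the logistic regression; a minimal scikit-learn sketch (synthetic data standing in for the patient records):

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import OneHotEncoder
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, y)
leaves = tree.apply(X).reshape(-1, 1)            # leaf index for every observation
leaf_dummies = OneHotEncoder(handle_unknown="ignore").fit_transform(leaves)

# Logistic regression on leaf membership (optionally concatenated with X itself)
logreg = LogisticRegression(max_iter=1000).fit(leaf_dummies, y)
```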
2 votes · 2 answers · 731 views
Do random forests use weak learners (like XGBoost) or fully grown trees?
So it sounds like boosting techniques (e.g. XGBoost) use weak learners (stumps) to gradually learn sequentially. This is not in dispute, I hope.
However, with bagging techniques (e.g. Random Forest) I'm ...
2 votes · 1 answer · 158 views
Why doesn't boosting assign higher weight to the "good" (low residual) models?
Extremely confused about the following:
Let's say we start out with a dumb weak learner. Since it's the 0th model and hasn't learned anything yet, we have a high residual, let's say of 10,000.
We produce ...
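Part of the confusion may be resolved by the actual weighting formula: in AdaBoost each learner $h_t$ does receive a coefficient that grows as its weighted error $\varepsilon_t$ shrinks,
$$\alpha_t = \frac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t}, \qquad F_T(x) = \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t\, h_t(x)\right),$$
while in gradient boosting the later learners are fit to the current residual, so earlier accurate learners keep their contribution through the additive sum rather than through an explicit accuracy weight.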
1 vote · 1 answer · 665 views
Subset Differences between Bagging, Random Forest, Boosting?
Per my understanding, there are 2 kinds of "subsets" that can be used when creating trees: 1) Subset of the dataset, 2) Subset of the features used per split.
The concepts that I'm comparing ...
1 vote · 1 answer · 518 views
Evaluating Feature Importance for a Super Learner Ensemble Meta-Model
I have been reading up on super learner ensemble methods that utilize multiple models and model configurations to make model predictions as good or better than any individual base model previously ...
0 votes · 0 answers · 55 views
Do the neural networks belonging to a deep ensemble need to be trained on the same training set?
As the title says, I was wondering if I have to train every neural network of a deep ensemble on a different training set or on the same one. I ask this question because I am getting weird results. ...
0 votes · 0 answers · 68 views
Math behind ensemble learning
I'm struggling to find some clear math behind ensemble learning.
I can simulate it very easily, e.g.:
...
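The standard piece of math behind averaging (see, e.g., ESL, Section 15.2): if the $M$ base estimators each have variance $\sigma^2$ and pairwise correlation $\rho$ at a point $x$, then
$$\operatorname{Var}\!\left(\frac{1}{M}\sum_{m=1}^{M}\hat f_m(x)\right) = \rho\,\sigma^2 + \frac{1-\rho}{M}\,\sigma^2,$$
so averaging shrinks the $\frac{1-\rho}{M}\sigma^2$ part as $M$ grows but leaves the correlated part $\rho\sigma^2$ untouched, which is exactly what a simulation like the one described should show.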
5 votes · 1 answer · 223 views
Boosting definition clarification
Regarding boosting in the context of machine learning. One definition I have encountered talks about turning multiple weak learners into one strong learner, and another talks about starting with a ...
1 vote · 0 answers · 42 views
Can someone explain why finding similar embeddings coming from two different nets gives bad recall?
I'm currently working on an ensemble of 5 differently trained networks using MinkLoc3D v2 as base-net.
I'm currently investigating the reason for lousy recall when I compare the extracted database ...
0 votes · 0 answers · 190 views
What happens to the accuracy of a decision tree after pruning?
What happens to the accuracy of a decision tree when it is pruned? Can it be higher than the accuracy of the fully grown decision tree?
0 votes · 0 answers · 132 views
Ensemble learning with different data sets
Consider model A, a deployed model that produces a probability of whether an event occurs or not for a population.
I want to improve this by building another model, model B, on top of model A. Model B should ...
3 votes · 1 answer · 579 views
How to prove error of ensemble model by using the Hoeffding's inequality?
In a binary classification setting,
the error between the target function $f$ and base learner (classifier) $h_i(x)$ is
$$P(h_i(x) \neq f(x)) = \mathcal{E}.$$
It is assumed that $T$ base classifiers are combined by a ...
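For reference, the standard derivation assumes the $T$ base classifiers err independently, each with probability $\mathcal{E} < 1/2$, and are combined by majority vote; the ensemble then errs only if at least half of them do, and Hoeffding's inequality bounds that tail:
$$P\big(H(x) \neq f(x)\big) = P\!\left(\sum_{i=1}^{T}\mathbb{1}\big[h_i(x)\neq f(x)\big] \ge \frac{T}{2}\right) \le \exp\!\left(-\frac{T}{2}\,(1-2\mathcal{E})^{2}\right),$$
which decays exponentially in $T$ as long as the independence assumption holds.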
3 votes · 0 answers · 52 views
Is model selection itself a model?
Suppose that I wanted to choose from, for example, $Y = aX + \epsilon$ and $Y = aX^2 + \epsilon$. Is this meaningfully different from fitting $Y = a_1X + a_2X^2 + \epsilon$ and heavily penalizing $...
0 votes · 1 answer · 71 views
How does accuracy increase in ensemble learning?
I have a doubt from a passage in the ensemble learning chapter of Aurélien Géron's book "Hands-On Machine Learning ...".
I do not understand
If you do the math, you will find that the ...
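The "math" referred to is presumably the probability that a majority of $n$ independent voters, each correct with probability 0.51, gets the answer right; it is a binomial tail and can be checked directly (a sketch with SciPy, assuming independence, which real classifiers rarely satisfy):

```python
from scipy.stats import binom

# P(strict majority correct) for n independent voters, each correct with prob. 0.51
for n in (1, 1000, 10000):
    p_majority = binom.sf(n // 2, n, 0.51)  # P(X > n/2), X ~ Binomial(n, 0.51)
    print(n, round(p_majority, 3))
```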