Questions tagged [ensemble-learning]
In machine learning, ensemble methods combine multiple algorithms to make a prediction. Bagging, boosting and stacking are some examples.
477 questions
3 votes · 1 answer · 597 views
Gradient boosting algorithm (steps) question
So far, I have read the following regarding boosting:
Boosting is an ensemble technique.
Learners are trained sequentially, where early learners fit simple models to the data.
The data are then analyzed for errors, that is, ...
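A minimal sketch of that sequential loop, assuming squared-error loss and shallow scikit-learn regression trees (the function name and hyperparameters here are illustrative, not from any particular text):

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost(X, y, n_rounds=100, learning_rate=0.1):
    """Sequentially fit shallow trees to the residuals (squared-error loss)."""
    pred = np.full(len(y), y.mean())       # round 0: a constant model
    trees = []
    for _ in range(n_rounds):
        residuals = y - pred               # "analyze the data for errors"
        tree = DecisionTreeRegressor(max_depth=2).fit(X, residuals)
        pred += learning_rate * tree.predict(X)   # shrink each correction
        trees.append(tree)
    return trees
```

For squared error the residuals coincide with the negative gradient of the loss, which is where the name "gradient boosting" comes from.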
9 votes · 2 answers · 4k views
Matrix Factorization Model for recommender systems how to determine number of latent features?
I am trying to design a matrix factorization technique for a simple user-item rating recommender system. I have two questions about this.
First, in a simple implementation that I saw of matrix ...
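A common heuristic is to treat the number of latent features as a tuning parameter and pick the rank that minimizes error on held-out ratings. A toy sketch in plain NumPy (assumptions: a dense rating matrix and mean-filled holdouts; a real recommender would use proper matrix completion):

```python
import numpy as np

def best_rank(R, holdout, ranks=(2, 5, 10, 20, 50)):
    """R: dense rating matrix; holdout: boolean mask of held-out entries."""
    train = R.copy()
    train[holdout] = R[~holdout].mean()        # crude fill for held-out cells
    U, s, Vt = np.linalg.svd(train, full_matrices=False)
    errors = {}
    for k in ranks:
        approx = (U[:, :k] * s[:k]) @ Vt[:k]   # rank-k reconstruction
        errors[k] = np.sqrt(np.mean((R[holdout] - approx[holdout]) ** 2))
    return min(errors, key=errors.get), errors
```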
0 votes · 1 answer · 714 views
Using a logistic model on the estimates of several other classification models
I'm working on a classification model that will predict whether a sales opportunity will end up 'won' or 'lost', given various attributes of the opportunity. I've been using my training data to build ...
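That setup is classic stacking with a logistic meta-model. A sketch with scikit-learn's StackingClassifier, where the base estimators are placeholders for whatever models were actually trained; cv=5 fits the logistic model on out-of-fold predictions, which avoids leaking training labels:

```python
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200)),
                ("svm", SVC(probability=True))],
    final_estimator=LogisticRegression(),  # logistic model over base estimates
    cv=5,                                  # out-of-fold predictions for the meta-model
)
# stack.fit(X_train, y_train); stack.predict_proba(X_test)
```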
3 votes · 1 answer · 274 views
Is there a well-defined class of ensemble methods?
Ensemble methodology's main aim is to somehow aggregate or summarize estimates from multiple models. In some cases this is aggregating different bootstrap estimates or Monte Carlo estimates, but ...
16 votes · 1 answer · 12k views
Using LASSO on random forest
I would like to create a random forest using the following process:
Build a tree on a random sample of the data and features, using information gain to determine splits
Terminate a leaf node if it ...
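A sketch of the selection step being proposed, assuming a regression forest: expose one prediction column per tree and let the LASSO zero trees out. Fitting the LASSO on the same data the forest saw risks optimism; a careful version would use out-of-bag or held-out predictions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LassoCV

def lasso_over_forest(X, y):
    """Fit a forest, then let LASSO pick a sparse weighting of its trees."""
    forest = RandomForestRegressor(n_estimators=500).fit(X, y)
    # one column of predictions per tree in the forest
    tree_preds = np.column_stack([t.predict(X) for t in forest.estimators_])
    lasso = LassoCV(cv=5).fit(tree_preds, y)   # many tree weights shrink to 0
    return forest, lasso
```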
7 votes · 2 answers · 2k views
On combining SVMs
Suppose we have a supervised training set $T=\{ (x_1, y_1),\dots, (x_n,y_n)\}$ where $x_i$ is an example and $y_i \in \{-1,+1\}$ is its label. Further suppose that examples are only observable through ...
3 votes · 0 answers · 169 views
One-against-all probability values into a multiple class probability value?
I have a 10-class classification problem. I've approached the problem as a set of one-against-all binary problems. For each class I've built an MLP neural network that provides a probability estimate ...
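The crudest coupling simply renormalizes the ten one-against-all outputs so they sum to one (more principled pairwise-coupling schemes exist); a sketch, assuming the per-class probabilities are stacked column-wise:

```python
import numpy as np

def ova_to_multiclass(p_ova):
    """p_ova: shape (n_samples, 10), P(class k vs. rest) from each binary MLP."""
    p = np.clip(p_ova, 1e-12, None)            # guard against all-zero rows
    return p / p.sum(axis=1, keepdims=True)    # rows now sum to 1
```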
12 votes · 3 answers · 3k views
Limits to tree-based ensemble methods in small n, large p problems?
Tree-based ensemble methods such as random forests and subsequent derivatives (e.g., conditional forests) all purport to be useful in so-called "small n, large p" problems for identifying relative ...
16 votes · 3 answers · 11k views
Ensemble time series model
I need to automate time-series forecasting, and I don't know in advance the features of those series (seasonality, trend, noise, etc).
My aim is not to get the best possible model for each series, ...
2 votes · 1 answer · 436 views
Definition of model diversity?
Two models are diverse if they make prediction errors on different instances. I know there are different measures to quantify diversity; however, I'm looking for a formal conceptual definition of what ...
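The most common operationalization is pairwise disagreement, the fraction of instances on which two models predict differently; a sketch:

```python
import numpy as np

def disagreement(pred_a, pred_b):
    """Fraction of instances where two models disagree -- one standard
    pairwise diversity measure (0 = identical, 1 = always different)."""
    return float(np.mean(np.asarray(pred_a) != np.asarray(pred_b)))
```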
4 votes · 2 answers · 2k views
Looking for examples or alternatives to R RuleFit ensemble package
Does anyone know of any good example code for the rulefit Rule Based Learning Ensembles package? The documentation is incredibly lacking. I was guided to the package by this paper.
If ...
3 votes · 1 answer · 1k views
Which are the most effective clustering ensembles?
In supervised learning, there are some ensemble methods that significantly outperform others (Adaboost or random forests, to mention a few).
A few years later, ensembles in unsupervised learning were also ...
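One widely cited family is evidence accumulation: run a base clusterer many times, record how often each pair of points lands in the same cluster, and cluster the resulting co-association matrix. A sketch assuming a recent scikit-learn (>= 1.2, where AgglomerativeClustering takes metric="precomputed"):

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

def consensus_cluster(X, n_clusters, n_runs=50):
    """Evidence accumulation: average co-assignment over many k-means runs,
    then cluster the co-association matrix."""
    n = len(X)
    coassoc = np.zeros((n, n))
    for seed in range(n_runs):
        labels = KMeans(n_clusters=n_clusters, n_init=1,
                        random_state=seed).fit_predict(X)
        coassoc += labels[:, None] == labels[None, :]
    coassoc /= n_runs
    final = AgglomerativeClustering(n_clusters=n_clusters,
                                    metric="precomputed", linkage="average")
    return final.fit_predict(1.0 - coassoc)    # distance = 1 - co-association
```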
5 votes · 1 answer · 2k views
Determining the number of weak classifiers to use in adaboost without overfitting?
I was thinking of using validation but am not quite sure how to go about it. Please list some papers or ideas on how. This is for a multi-class problem (using a one-vs-all approach). I think each class/...
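One standard recipe: fit a single long run and score every prefix of the ensemble on a validation set, which scikit-learn's staged_predict makes cheap. A sketch (the multi-class case works the same way, since AdaBoostClassifier handles it internally):

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score

def pick_n_estimators(X_tr, y_tr, X_val, y_val, max_rounds=500):
    """Fit once with many rounds, then score every prefix ensemble."""
    ada = AdaBoostClassifier(n_estimators=max_rounds).fit(X_tr, y_tr)
    val_acc = [accuracy_score(y_val, pred)
               for pred in ada.staged_predict(X_val)]
    return int(np.argmax(val_acc)) + 1   # best number of weak classifiers
```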
3 votes · 1 answer · 3k views
How to convert multiple ranking scores into a probability distribution?
I would like to create a topic distribution for a document.
The current model I am trying to implement is: for each sentence in the document, I am getting a topic assignment with a score, e.g. "...
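If the per-topic scores are arbitrary real numbers, a softmax over their per-document totals is one simple way to obtain a distribution (plain normalization also works when all scores are nonnegative); a sketch:

```python
import numpy as np

def topic_distribution(scores):
    """scores: dict mapping topic -> summed score over sentences."""
    topics = list(scores)
    s = np.array([scores[t] for t in topics], dtype=float)
    s -= s.max()                          # shift for numerical stability
    p = np.exp(s) / np.exp(s).sum()       # softmax: positive, sums to 1
    return dict(zip(topics, p))
```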
4 votes · 1 answer · 2k views
Concept of iterations in Adaboost
I can't seem to get my head around "iterations" in Adaboost.
Are they analogous to weak classifiers that are used for Boosting?
I've seen many examples of Adaboost where a programmer uses a single ...
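In the usual formulation, yes: each iteration fits exactly one new weak classifier on reweighted data, so the number of iterations equals the number of weak learners. In scikit-learn terms (recent versions, where the keyword is estimator):

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# n_estimators = number of boosting iterations = number of weak classifiers
ada = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                         n_estimators=50)
```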
2 votes · 1 answer · 1k views
Accuracy of classifiers with Adaboost
Does Adaboost ensure that the resulting accuracy is greater than or at least equal to the accuracies of the individual classifiers?
What happens if classifier A performs badly, the weights are updated accordingly, and the next ...
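No such guarantee holds on new data, but on the training set Adaboost's error shrinks geometrically as long as every weak classifier stays slightly better than chance. Writing the round-$t$ weighted error as $\varepsilon_t = 1/2 - \gamma_t$, the classical Freund–Schapire bound is

$$\mathrm{err}_{\mathrm{train}} \;\le\; \prod_t 2\sqrt{\varepsilon_t (1 - \varepsilon_t)} \;=\; \prod_t \sqrt{1 - 4\gamma_t^2} \;\le\; \exp\!\Bigl(-2 \sum_t \gamma_t^2\Bigr),$$

so a badly performing round (large $\varepsilon_t$, small $\gamma_t$) merely slows the decrease rather than breaking it, provided $\varepsilon_t < 1/2$.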
14 votes · 6 answers · 11k views
Resources for learning how to implement ensemble methods
I understand theoretically (sort of) how they would work, but am not sure how to go about actually making use of an ensemble method (such as voting, weighted mixtures, etc.).
What are good resources for ...
3 votes · 1 answer · 2k views
Weighting variables for an index
I have been tasked with trying to modify our current "index" which basically takes 4 observations per person and calculates a score based on what they achieve. Here is how the score is created (all ...
21 votes · 3 answers · 17k views
Stacking/ensembling models with caret
I often find myself training several different predictive models using caret in R. I'll train them all on the same cross validation folds, using ...
6 votes · 1 answer · 2k views
Ensembling regression models
I'm working on a securities pricing project and have a bunch of models I'd like to stack/ensemble together. I've been using simple linear regression in R (the lm() ...
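A Python analogue of stacking several regression models under a linear meta-model, sketched with scikit-learn's StackingRegressor (the base models are placeholders; a regularized meta-model such as ridge is often more stable than plain OLS when base predictions are highly correlated):

```python
from sklearn.ensemble import GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, RidgeCV

stack = StackingRegressor(
    estimators=[("gbm", GradientBoostingRegressor()),
                ("ols", LinearRegression())],
    final_estimator=RidgeCV(),   # ridge handles correlated base predictions
    cv=5,                        # meta-model trained on out-of-fold predictions
)
# stack.fit(X_train, y_train); stack.predict(X_test)
```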
8 votes · 2 answers · 4k views
Base classifiers for boosting
Boosting algorithms, such as AdaBoost, combine multiple 'weak' classifiers to form a single stronger classifier. Although in theory boosting should be possible with any base classifier, in practice it ...
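In scikit-learn (recent versions) the base learner is pluggable: decision stumps are the classic choice, and any classifier that accepts sample weights can be substituted, though stable low-variance learners often gain little from boosting. A sketch:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

stump_boost = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1))
logit_boost = AdaBoostClassifier(estimator=LogisticRegression())  # legal, rarely helpful
```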
2 votes · 1 answer · 1k views
What are the strongest boosting alternatives to Adaboost?
Whenever boosting is brought up, Adaboost is the first algorithm to be listed. What are the most popular boosting algorithms that aren't Adaboost?
9 votes · 3 answers · 6k views
How are classifications merged in an ensemble classifier?
How does an ensemble classifier merge the predictions of its constituent classifiers? I'm having difficulty finding a clear description. In some code examples I've found, the ensemble just averages ...
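The two standard merges are hard voting (majority over predicted labels) and soft voting (average the class probabilities, then take the argmax); a sketch of both, assuming integer-coded labels:

```python
import numpy as np

def hard_vote(label_preds):
    """label_preds: (n_models, n_samples) integer labels -> majority label."""
    L = np.asarray(label_preds)
    return np.array([np.bincount(L[:, i]).argmax() for i in range(L.shape[1])])

def soft_vote(prob_preds):
    """prob_preds: (n_models, n_samples, n_classes) -> average, then argmax."""
    return np.asarray(prob_preds).mean(axis=0).argmax(axis=1)
```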
7 votes · 3 answers · 23k views
Does ensembling (boosting) cause overfitting?
I'm using SPSS Statistics Base 20. Using Analyze $\rightarrow$ Regression $\rightarrow$ Automatic Linear Modeling I've input about 50 variables.
When using no boosting, the reported accuracy of the ...
22 votes · 4 answers · 12k views
Combining machine learning models
I'm kind of new to data mining/machine learning/etc. and have been reading about a couple of ways to combine multiple models and runs of the same model to improve predictions.
My impression from ...
302 votes · 8 answers · 215k views
Bagging, boosting and stacking in machine learning
What are the similarities and differences between these three methods:
Bagging,
Boosting,
Stacking?
Which is the best one, and why?
Can you give me an example of each?
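As a concrete anchor, here is one instance of each in scikit-learn (recent versions; the estimators and hyperparameters are illustrative): bagging trains copies of one learner on bootstrap resamples in parallel, boosting fits learners sequentially on reweighted data, and stacking trains a meta-model on the base models' predictions.

```python
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

bagging = BaggingClassifier(estimator=DecisionTreeClassifier())   # parallel, bootstrap
boosting = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1))                # sequential, reweighted
stacking = StackingClassifier(
    estimators=[("tree", DecisionTreeClassifier()),
                ("logit", LogisticRegression())],
    final_estimator=LogisticRegression())                         # meta-model on top
```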
23 votes · 2 answers · 4k views
On the "strength" of weak learners
I have several closely-related questions regarding weak learners in ensemble learning (e.g. boosting).
This may sound dumb, but what are the benefits of using weak as opposed to strong learners? (e.g. ...