Ensemble Modeling in RapidMiner
Here’s a great 10-minute video by Vijay Kotu explaining the concept of ensemble modeling and how to do it in RapidMiner.
My notes are below:
- Friend Example to explain Error Rates
- For a circle of three friends deciding by majority: P(Friend’s Circle wrong) = P(2 Friends wrong) + P(3 Friends wrong)
- Use the Binomial Distribution, treating each friend as an independent trial
- Individual friends have a higher error rate than a group of friends
- Diversity works and the math works!
- One strong model can be created from multiple models
- Individual models SHOULD be independent, and each classifier model should be better than chance on its own (error rate below 50%).
- Change/sample training data for each base model
- Use different models
- Change the parameters of the model (depth of trees, k value, etc)
- The example process in RapidMiner uses the Iris data set
- Use a Vote operator to combine the base models into a ‘stacked model’ that predicts by majority vote
- Bagging/Boosting Ensemble operators manipulate the rows of training data
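
To check the friend example for myself, here is a minimal Python sketch of the binomial math. It is not from the video and not RapidMiner code. The function name `majority_error` and the 20% individual error rate are my own assumptions for illustration, and it assumes the friends err independently and that the group size is odd so there are no ties.

```python
from math import comb

def majority_error(n: int, p: float) -> float:
    """Probability that more than half of n independent voters are wrong,
    when each voter is wrong with probability p (a binomial tail sum)."""
    k_min = n // 2 + 1  # smallest number of wrong voters that forms a majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))

p = 0.2  # assumed individual error rate, chosen only for illustration

# Three friends: P(2 wrong) + P(3 wrong) = 0.096 + 0.008, about 0.104
print(majority_error(3, p))

# If each friend is worse than a coin flip, the group does worse, not better
print(majority_error(3, 0.6))
```

With a 20% individual error rate the circle of three is wrong only about 10.4% of the time, which is the "diversity works" point. The second call shows why each base model needs an error rate below 50%: at 60% individual error, the majority is wrong about 64.8% of the time.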
Updated with 5 Minutes with Ingo
- Ingo goes over Bagging and Boosting Ensemble methods
- Ingo needs to lay off the sugar, LOL
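
For the "manipulate the rows of training data" idea behind bagging, here is a rough Python sketch of my own, not what the RapidMiner operators do internally. The helper names `bootstrap_sample` and `majority_vote` are hypothetical, and the ten integer "rows" are stand-in data. Each base model would be trained on its own resampled copy of the rows, and the predictions are then combined by majority vote. Boosting instead reweights the rows that earlier models got wrong, which is not shown here.

```python
import random
from collections import Counter

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, one resampled training set per base model."""
    return [rng.choice(rows) for _ in rows]

def majority_vote(predictions):
    """Return the most common label among the base models' predictions."""
    return Counter(predictions).most_common(1)[0][0]

rng = random.Random(42)   # fixed seed so the sketch is repeatable
rows = list(range(10))    # stand-in for 10 training rows (hypothetical data)

# One resampled training set per base model; some rows repeat, some are left out
samples = [bootstrap_sample(rows, rng) for _ in range(3)]
for sample in samples:
    print(sorted(sample))

# Combine three hypothetical base-model predictions for a single Iris example
print(majority_vote(["setosa", "versicolor", "setosa"]))  # prints "setosa"
```

Because every base model sees a slightly different set of rows, the models disagree in different places, which is what makes the vote useful.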