RapidMiner 5.0 Video Tutorial 13 – Parameter Optimization

In this RapidMiner video tutorial I show how to use the Parameter Optimization operator to tune the training parameters of a model. The example shows how RapidMiner iterates over the learning rate and momentum of a Neural Net operator to improve the performance of the trained model.
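For readers who want to see the idea outside the GUI, here is a minimal Python sketch of a grid search over learning rate and momentum. It is not what RapidMiner runs internally: `train_and_score` is a hypothetical stand-in that "trains" by minimizing a toy quadratic, and the grid values are made up for illustration.

```python
import itertools


def train_and_score(learning_rate, momentum, steps=50):
    """Toy stand-in for training (an assumption, not a neural net):
    minimize f(w) = (w - 3)^2 by gradient descent with momentum and
    return the final loss, where lower is better."""
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)
        velocity = momentum * velocity - learning_rate * gradient
        w += velocity
    return (w - 3.0) ** 2


def grid_search(learning_rates, momenta):
    """Try every (learning_rate, momentum) pair and keep the best one."""
    best_params, best_loss = None, float("inf")
    for lr, mom in itertools.product(learning_rates, momenta):
        loss = train_and_score(lr, mom)
        if loss < best_loss:
            best_params, best_loss = (lr, mom), loss
    return best_params, best_loss


# Hypothetical grid: 3 learning rates x 3 momentum values = 9 runs.
best_params, best_loss = grid_search([0.01, 0.1, 0.3], [0.0, 0.5, 0.9])
print("best (learning rate, momentum):", best_params, "loss:", best_loss)
```

In a real process the scoring function would train the learner and return a validated performance measure, so each grid point costs one full training run.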

Video #14 will be about web mining financial text data.

RapidMiner 5.0 Video Tutorial 6 – Creating a Decision Tree with RapidMiner 5.0

Calling all marketers! In this video we discuss how to build a decision tree with RapidMiner to help us find “sweet spots” in a particular market segment. This video tutorial uses the RapidMiner direct mail marketing data generator and a Split Validation operator to build the decision tree.

Note: This video was made using RapidMiner 5.0, but it’s still applicable to the current versions. The only thing that has changed is the look of the GUI, which has a more modern feel now.

Using the SVM RBF Kernel

Wow,

I’m happy to announce that today is the first of a two-part guest post series. Today’s guest post is by Marin Matijas, who gave a presentation at RCOMM 2010 about Short Term Load Forecasting using Support Vector Machines (SVM). I asked Marin to elaborate a little about his use of the Radial Basis Function (RBF) kernel in RapidMiner’s SVM operator, and here’s what he had to say! I did edit the post a bit for readability. Thanks Marin!

In my RCOMM 2010 presentation, titled “Application of Short Term Load Forecasting using Support Vector Machines in RapidMiner 5.0,” I showed how SVMs can be used to solve a volatile Load Forecasting problem.

Load Forecasting is an old problem, almost as old as modern stock exchange forecasting. I compare the two because both are time-series problems (and because we are all eagerly waiting for Tom’s videos with more insights on how to predict financial markets). The goal of Load Forecasting is to predict the values of an electricity (power) load in a given time interval. Typically the load for the day ahead is predicted on an hourly basis. Unlike predictions in the financial markets, where predicting the trend is often more important than the ‘exact’ value, here the goal is to predict the (exact) value of the load itself.

Depending on the problem, the Mean Absolute Percentage Error (MAPE) varies, but it is typically between 1% and 10% for 24 intervals or more. Good precision can be obtained because load does not fluctuate much. We typically consume more in winter than in autumn, and more on Monday morning than on Sunday evening, but when averaged, electricity consumption follows certain patterns.
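As a quick illustration of the error measure, here is a minimal Python sketch of MAPE using its standard definition (the mean of |actual - predicted| / |actual|, expressed as a percentage). The load values are made up for the example and are not from the presentation.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, which holds for electricity load
    but would cause a division by zero on other kinds of data.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical hourly loads (MW) and forecasts: errors of 10%, 5% and 0%.
actual_load = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(round(mape(actual_load, forecast), 2))  # about 5 percent
```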

Since load is serial in nature, with patterns repeating on a known basis, windowing was used to take advantage of this property. Support Vector Machines were chosen for the regression because they gave better results than the previously used method. Compared to Artificial Neural Networks, an SVM is much faster, an important characteristic with large datasets. One key setting for the SVM learner was the Radial Basis Function (RBF) kernel. It was chosen for three main reasons, discussed below.
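To make the windowing idea concrete, here is a small Python sketch that turns a series into (window, label) training examples. It only illustrates the general sliding-window technique; the function name, window size and horizon are assumptions and do not reproduce the settings used in the presentation.

```python
def make_windows(series, window_size, horizon=1):
    """Slide a fixed-size window over the series.

    Each example pairs `window_size` consecutive values with the value
    `horizon` steps after the end of the window, which becomes the label.
    """
    examples = []
    last_start = len(series) - window_size - horizon + 1
    for start in range(last_start):
        window = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        examples.append((window, label))
    return examples


# Hypothetical load values: predict each value from the three before it.
load = [1, 2, 3, 4, 5, 6]
for window, label in make_windows(load, window_size=3):
    print(window, "->", label)
```

Each window becomes one row of attributes for the regression learner, so a longer window gives the SVM more history at the cost of more attributes.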

The first reason is that it is good for non-linear problems. Looking at a typical graph of the electricity grid daily load, one can easily see that Load Forecasting is a non-linear problem (see graph below).

The other types of kernels, linear and sigmoid, may be used, but only under special conditions. The second reason is that RBF has a gamma parameter, which makes optimizing the SVM in RapidMiner a simpler task. The third reason is that RBF gave us better results (lower MAPE) than the other kernels, and it tends to be the standard kernel used in other research papers on Load Forecasting.
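For reference, the RBF kernel in its common form is K(x, y) = exp(-gamma * ||x - y||^2). The short Python sketch below evaluates it directly so the role of gamma is visible; the sample points and gamma values are arbitrary, and it does not claim to mirror the internals of RapidMiner's SVM operator.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF kernel in its common form: exp(-gamma * squared distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Arbitrary example points: a larger gamma makes similarity fall off
# faster with distance, so the model becomes more local.
point_a, point_b = [0.0, 0.0], [1.0, 1.0]
for gamma in (0.1, 0.5, 2.0):
    print(gamma, rbf_kernel(point_a, point_b, gamma))
```

Because gamma is a single number, it is a natural target for a parameter sweep alongside the SVM's other settings.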

This same kernel can be applied to a variety of other non-linear problems, e.g. forecasting options volatility, as a lot of problems are non-linear. The key takeaway when incorporating an RBF kernel in an SVM is how simple it is to optimize its parameters and the windowing in RapidMiner. I hope Tom will soon show in his video how simple it is to optimize parameters in RapidMiner, so you can create processes that utilize this powerful group of operators.

Cheerz,

Marin

mmatijas at hotmail.com