Using the SVM RBF Kernel

Wow,

I’m happy to announce that today is the first of a two-part guest post series. Today’s guest post is by Marin Matijas, who gave a presentation at RCOMM 2010 about Short Term Load Forecasting using Support Vector Machines (SVM). I asked Marin to elaborate a little on his use of the Radial Basis Function (RBF) kernel in RapidMiner’s SVM operator, and here’s what he had to say! I did edit the post a bit for readability. Thanks Marin!

In my RCOMM 2010 presentation, titled “Application of Short Term Load Forecasting using Support Vector Machines in RapidMiner 5.0,” I showed how SVMs can be used to solve a volatile Load Forecasting problem.

Load Forecasting is an old problem; it is almost as old as modern stock-exchange forecasting. I am comparing the two because both are time-series problems, which makes them similar (and also because we are all eagerly waiting for Tom’s videos with more insights on how to predict financial markets). The goal of Load Forecasting is to predict the exact values of an electricity (power) load in a given time interval. Typically, the load for the day ahead is predicted on an hourly basis. Unlike predictions in the financial markets, where trend prediction is often more important than the exact value, here the goal is to predict the exact value of the load itself.

Depending on the problem, the Mean Absolute Percentage Error (MAPE) varies, but it is typically between 1 and 10% for 24 intervals or more. Good precision can be obtained because load does not fluctuate much. Overall, we typically consume more in winter than in autumn, and more on Monday morning than on Sunday evening, but when averaged, electricity consumption follows certain patterns.
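To make the error measure concrete, here is a minimal sketch of how MAPE can be computed for one day of hourly forecasts. The function name `mape` and the sample numbers are made up for illustration; this is not the code behind RapidMiner's performance operators.

```python
def mape(actual, forecast):
    """Return the mean absolute percentage error, in percent.

    Assumes both sequences have the same length and that no actual
    value is zero (a load of zero would make the ratio undefined).
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("need at least one value")
    # Average the absolute relative error of each interval.
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical loads in MW for three intervals.
actual_load = [100.0, 200.0, 400.0]
forecast_load = [110.0, 190.0, 400.0]
print(round(mape(actual_load, forecast_load), 2))  # about 5 percent
```

Because each error is divided by the actual value, the same absolute miss counts for more in low-load hours than in peak hours.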

Since load is serial in nature, with patterns repeating on a known basis, windowing was used to take advantage of this property. Support Vector Machines were chosen for the regression, as they gave better results than the previously used method. Compared to Artificial Neural Networks, they are much faster, an important characteristic with large datasets. One key parameter used for the SVM learner was the Radial Basis Function (RBF) kernel. It was chosen for three main reasons, discussed below.
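For readers new to windowing, the idea can be sketched in a few lines of Python: a sliding window turns a series into rows of past values (the features) and one future value (the label), which a regression learner such as an SVM can then be trained on. The helper `make_windows` and its parameters are hypothetical names for this illustration and only approximate what RapidMiner's windowing operator does.

```python
def make_windows(series, window_size, horizon=1):
    """Turn a series into (features, label) training examples.

    features: `window_size` consecutive past values
    label:    the value `horizon` steps after the end of the window
    """
    examples = []
    # Number of windows that still have a label available.
    last_start = len(series) - window_size - horizon + 1
    for start in range(last_start):
        features = list(series[start:start + window_size])
        label = series[start + window_size + horizon - 1]
        examples.append((features, label))
    return examples


# Hypothetical hourly load values.
hourly_load = [1, 2, 3, 4, 5]
for features, label in make_windows(hourly_load, window_size=3):
    print(features, "->", label)
```

For day-ahead forecasting on hourly data, a window of 24 values or a multiple of it is a natural starting point, since it lets the learner see the daily pattern.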

The first reason is that it is good for non-linear problems. Looking at a typical graph of the electricity grid daily load, one can easily see that Load Forecasting is a non-linear problem (see graph below).

The other types of kernels, linear and sigmoid, may be used, but only under special conditions. The second reason is that RBF has a gamma parameter, which makes optimizing the SVM in RapidMiner a simpler task. The third reason is that RBF gave us better results (lower MAPE) than the other kernels, and it tends to be the standard kernel used in other research papers on Load Forecasting.
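As a reminder of what gamma controls, the RBF kernel is commonly written as K(x, y) = exp(-gamma * ||x - y||^2). The sketch below evaluates that standard form in plain Python; the function name `rbf_kernel` is chosen for this example, and the exact parameterization inside a given SVM implementation may differ.

```python
import math


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same number of features")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two hypothetical feature vectors (e.g. two load windows).
point_a = [0.0, 0.0]
point_b = [1.0, 1.0]
for gamma in (0.1, 0.5, 2.0):
    # Similarity shrinks as gamma grows.
    print(gamma, rbf_kernel(point_a, point_b, gamma))
```

A small gamma makes distant points still look similar and gives a smoother fit, while a large gamma makes the model react only to very close neighbours, which is why gamma is a natural parameter to optimize.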

This same kernel can be applied to a variety of other non-linear problems, e.g. forecasting options volatility, since many problems are non-linear. The key takeaway when incorporating an RBF kernel in an SVM is the simplicity of parameter and windowing optimization in RapidMiner. I hope Tom will soon show in his video how simple it is to optimize parameters in RapidMiner, so you can create processes that utilize this powerful group of operators.

Cheerz,

Marin

mmatijas at hotmail.com

RapidMiner Web Mining Extension Now Available!

RapidMiner released its Web Mining Extension on the Marketplace. It’s super easy to install with RapidMiner Studio. Just go to Extensions > Marketplace (Updates/Extensions) and search for Web Mining.

Select the Extension and then accept the Terms and Conditions. RapidMiner will then have to restart and you should see the latest set of operators in the Extension folder of your Operators.

Web Mining Extension Operators

Here’s what you get with the extension: a web crawler, single and multiple page extraction, scraping text out of HTML tags, and much, much more. My favorite operator is the Enrich by WebService Operator, which I use quite a bit for mashing up geolocation data (see my Tutorials on this).


RapidMiner and R – Together At Last!

It’s ALMOST here, the R extension in RapidMiner is just one more week away!!!

If you want a sneak peek of it, check out this intro video by Ralf Klinkenberg on the R extension.

With the new GUI in 5.X and now this extension, RapidMiner will blow the doors off any data modeling suite in 2011!

RapidMiner and R Update

This post is incredibly old, and the R extension in RapidMiner has been greatly overhauled. I would suggest checking out the updated video below on how to use the new Execute R script operator with RapidMiner.