In this video we start building a financial time series model using S&P 500 daily OHLCV data and the Windowing, Sliding Window Validation, and Forecasting Performance operators. This is Part 1.
This video can be viewed in HQ by clicking this link here. Please make sure you have QuickTime or another MP4-capable reader installed in your browser.
Here are the XLS training files.
PS: Check out my forums!
I love it! Can't wait for #10. Thanks again for taking your time to do these videos for us.
Question: What type of model is used to create the label when you are windowing? For example, when you show a window = 1 and the first example row is 1288.120, where does this value come from?
I'm confused about why this value would be used instead of the true close.
Brian: the Windowing operator is just that, a windowing operator. There is no model in it; it's purely a data preprocessor. The 1288.120 value shown for the label on row 1 is actually the close value from the next day (row 2). The label value shown on row 2 is the closing value from row 3, and so on. Why? Because the horizon value on the Windowing operator is set to 1, and I want to predict one day into the future.
@Tom: OK. So this is the same as predicting the close, using a lag(1) operator! I think I am following now.
@Brian: Yes, something like that but please don't think of the Windowing operator as just a modified lag operator. =)
I do think of it that way….so I will stay tuned!
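To make the exchange above concrete, here is a minimal sketch in plain Python of what windowing with a horizon does to a column of closes. It is only an illustration of the idea Tom describes (the label on row 1 is the close from row 2 when the horizon is 1), not RapidMiner's implementation. The function name `make_windows`, its parameters, and all sample closes other than 1288.12 are made up for the example.

```python
def make_windows(closes, window_size=1, horizon=1):
    """Turn a list of closes into (features, label) rows.

    features: the `window_size` consecutive closes in the window
    label:    the close `horizon` rows after the last row of the window
    """
    rows = []
    last_start = len(closes) - window_size - horizon
    for start in range(last_start + 1):
        end = start + window_size            # one past the last row in the window
        features = closes[start:end]
        label = closes[end - 1 + horizon]    # value `horizon` steps ahead
        rows.append((features, label))
    return rows


# Hypothetical closes; only 1288.12 echoes the value discussed above.
closes = [1280.50, 1288.12, 1291.30, 1285.75]

rows = make_windows(closes, window_size=1, horizon=1)
for features, label in rows:
    print(features, "->", label)
```

With a window of 1 this does look like a simple one-step lag, which is Brian's point. With a larger window each row carries several past closes as attributes, so the preprocessing produces more than a single lagged column.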
Hey Tom,
Great video once again. I have a quick question…how can you check or output the model's predictions? Can't seem to figure it out.
Cheers,
Phil
@Philmo: That will be part of my last video. I gotta leave the best for last! =)
Ok fair enough!
Hello, Tom!
Thanks a lot for your videos. I have just signed up to Dropbox, so this is a way to say thanks! I hope you've got more space :)
See you soon, and looking forward to new videos about RapidMiner :)
@Pathros: Thanks for the free space!
Tom, thanks for a great tutorial set. I'm finding it very well done and easy to follow. There is nothing else like it... keep up the good work.
ken
Hello, Tom!
I love your videos. I have just signed up to Dropbox, but I cannot access your shared folder for the referenced files. My email is leidj@cqupt.edu.cn. I hope you can add me to your shared friends.
@Ken: Thanks.
@Lei: I haven't set it up yet. It will happen sometime after I post the last video and when I have some free time. I'm insanely busy.
Hello Tom,
Nice videos. I am getting up to speed with RM 5 having spent time learning RM 4 and your videos have shown me some things I didn't know about.
@Andrew: Thanks. Rapidminer surprises me to this day with the things it can do.
PS. I signed up for Dropbox using your link…
keep up the good work! I’ve donated to the cause by signing up for Dropbox.
Hi Tom
thank you very much.
Your explanation is much easier and better than Dr. Ingo Mierswa
Waiting for your next video.
regards
Hi,
> Your explanation is much easier and better than Dr. Ingo Mierswa
Hey, I have feelings too ;-)
But you are of course right: Tom’s videos are great!
Cheers,
Ingo
Hi Tom,
Thanks for a great set of tutorials.
In your video, at around 10:51, you use Forecasting Performance under Evaluation. This is not available in my version of RapidMiner 5. How do you install it?
Kind Regards
Werner
@Werner: You need to install the Time Series extension. It’s contained in there.
Hi Tom,
Thanks for the quick reply. I have installed all the add-ins via Help / Update RapidMiner.
Mine shows Series (86) where yours shows Series (87).
I see Series (86) / Evaluation (2) / Validation (2). There is no Performance.
Is there another way of installing the extension, other than via the program itself?
@Werner: That is odd! I’m at a bit of a loss here, perhaps you can pose the question on the Rapid-I forums. Tell them you came from here.
Busy reading through the forums. Will keep you all up to date.
Hi Tom
Seems I am not the only one with the problem.
http://rapid-i.com/rapidforum/index.php/topic,2098.0.html
The problem is expected to be fixed with the next update for the time series extension, to be released in the next few weeks.
Hi Tom,
Just for the record, I am the third person with the same problem: I couldn’t find Forecasting Performance. Also, as we are not an enterprise customer, we have to wait until it gets solved in the next release. Can’t complain about a free service ;-)
So Tom, do you recommend another object to be used for this purpose for now?
@Ahmed: I use the regular community edition, and when I downloaded it with the Time Series plugin, it was there. I’m a bit stumped as to why it’s not included in your versions.
I have to admit that I had it before on my old Vista machine; now I am using Windows 7. I also have the latest update from RapidMiner, so I’m not sure which of the two is the reason. I will shortly try the first release of version 5.0, dated 2009-12-22, and let you know if it works again…
Yes, the problem is the new update from RapidMiner. It works again when I install the “5.0 Release Candidate version”.
http://sourceforge.net/projects/yale/files/
This is the one to be used:
rapidminer-5.0x32-install.exe 53.7 MB 2009-12-22 5,843
Rapid Miner English User Manual now available
http://rapid-i.com/content/view/26/201/lang,en/
or directly from
http://sourceforge.net/projects/yale/files/1.%20RapidMiner/5.0/rapidminer-5.0-manual-english_v1.0.pdf/download
Thanks Ron, this will help a lot of people.
Hi Tom,
Do you happen to know what Forecasting Performance actually measures? I am using the sliding window validation as instructed in your video, and I used to get a number that was the same as the correlation coefficient; I verified this by dumping the output into an Excel sheet. But now I am working on different data (the format is exactly the same), and I am getting 0.68 from the forecasting performance while the CC is 0.77 for the output data.
I posted this question in the forum, but no one answered me :(
I thought I may get something here…
Cheers,
Ahmed
Hi Ahmed, the Forecasting Performance operator reports the % forecasting accuracy on the training data set. The coefficient of correlation is a measure of how dependent the variables are on each other.
Thanks Tom for your speedy reply.
What I am after is how RapidMiner calculates the forecasting accuracy: is it Mean Absolute Percent Error, Mean Percent Error, Root Mean Squared Error, etc.?
What I found before is that the number I get from the forecast accuracy is the same as the CC I get when I dump the output file and compute CORREL(…) in Excel. So, in your experience, do you get the same, or what is it?
Hi Ahmed, I think Simon will answer your questions in the Rapid I forums.
Cheers Tom :)
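For readers following the question above, here is a minimal sketch in plain Python of the textbook definitions of the measures Ahmed names (MAPE, MPE, RMSE) next to the Pearson correlation that Excel's CORREL computes. It makes no claim about which formula RapidMiner's Forecasting Performance operator uses; the thread leaves that to the Rapid-I forums. The function names and the sample numbers are made up for illustration.

```python
import math


def mape(actual, predicted):
    """Mean Absolute Percent Error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mpe(actual, predicted):
    """Mean Percent Error, in percent (signed, so over- and under-shoots can cancel)."""
    return 100.0 * sum((a - p) / a for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root Mean Squared Error, in the units of the series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def pearson(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical actual closes and two hypothetical forecasts.
actual = [100.0, 102.0, 101.0, 105.0]
predicted = [101.0, 101.0, 103.0, 104.0]
shifted = [a + 5.0 for a in actual]   # always exactly 5 points too high

for name, forecast in (("predicted", predicted), ("shifted", shifted)):
    print(name, mape(actual, forecast), mpe(actual, forecast),
          rmse(actual, forecast), pearson(actual, forecast))
```

The `shifted` forecast is always 5 points too high, yet it moves in lockstep with the actual closes, so its correlation is 1 while its RMSE is 5. That is one reason an accuracy or error figure and a CORREL value computed on the same output need not agree.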
Hi Tom,
first of all: Thank you for the great work!
I can find the Forecasting Performance operator, but unfortunately I can’t drag it to the place where it should be.
My current version is 5.0 RC and the Series Data add-on is installed.
Do you know a solution for this?
Thank you!
@Marius: I’m not sure I quite understand your problem. Do you mean that you can’t drag the operator from the sidebar into the process window and then connect it?
Yes, that’s exactly the problem! I can’t drag the operator ‘forecasting performance’ from the sidebar into the process window. Therefore, I can’t connect it.
However, all the other operators work fine.
Do you know a solution?
Marius, I suggest you try reinstalling the extension and trying it again. I believe a commenter on this blog once had a problem with the performance evaluator not even being in the time series extension.
If that doesn’t work, I would post your question on the Rapid-I forum.
Hi Tom,
Great tutorials! As I am new(ish) to the topic and new to RapidMiner, I am so glad there is a “pause” button on the video for when you outspeed me ;-)
A friend and I are aiming for world domination (inspired by the novel “WCG”) through advanced financial forecasting, so be assured we will stay tuned :-)
Regards from Germany
I’m glad it’s useful to you. When you do dominate the world, please remember me and give me a nice position in your New World Order. Preferably one with lots of young pretty college girls supporting my work.
Ahmed, I’m having the same problem of not having the Forecasting Performance operator. Did you manage to fix this problem?
I just reinstalled RapidMiner on a new laptop, and the Forecasting Performance operator was there when I installed the Series plugin. Did you try reinstalling RM?
hello
Please send me documentation about RapidMiner.
I don’t have any RM documentation aside from my posts on the subject. Please check the Rapid-I.com site for any documentation.
Thomas…you are a life saver!!!
You’re welcome, Wff-daiblo.