Rapidminer 5.0 Video Tutorial #8 – Financial Time Series Data Discovery

In this video we review data discovery techniques for Financial Time Series data by calculating a Simple Moving Average (SMA), creating a non-linear trend line using a Neural Net operator, and creating a time series prediction line using a Neural Net operator.
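
For readers who want to check the Simple Moving Average outside RapidMiner, here is a minimal Python sketch. The function name `simple_moving_average` and the sample prices are made up for illustration; this is not the operator's actual implementation, only the textbook calculation (the mean of the last `window` values).

```python
from collections import deque


def simple_moving_average(values, window):
    """Return the SMA series; entries before the first full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    buf = deque()      # holds the values currently inside the window
    running = 0.0      # running sum of the values in buf
    for v in values:
        buf.append(v)
        running += v
        if len(buf) > window:
            running -= buf.popleft()  # drop the value that left the window
        out.append(running / window if len(buf) == window else None)
    return out


closes = [10.0, 11.0, 12.0, 13.0, 14.0]  # hypothetical closing prices
sma3 = simple_moving_average(closes, 3)
print(sma3)  # [None, None, 11.0, 12.0, 13.0]
```

The first two entries are `None` because a 3-period average needs three observations before it is defined.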

[flashvideo file=wp-content/uploads/2010/03/Rapidminer5-Vid8.mp4 /]

This video can be viewed in HQ by clicking this link here.  Please make sure you have QuickTime or another MP4-capable player installed in your browser.

Here are the XLS training files.

  1. S&P500 Training XLS
  • Ron McEwan

    Tom, I didn't see where I would refer you to Dropbox when I clicked on the link to set up an account. Do I do this after the setup page?

  • http://www.neuralmarkettrends.com Tom

    @Ron: If you click on the referral link it automatically knows that I referred you (I hope). I just uploaded the XLS data and XML files for this video to my public Dropbox.  Check it out and let me know if you can get to it.

  • Ron McEwan

    Thanks Tom, I saw the file updates and have saved them. I also wanted to mention that I use a free open source mp4 player to watch your videos. It is VLC Media Player at http://www.videolan.org/vlc/

  • Rafael Calastro

    Again, great job, Tom!
    The video quality is awesome!
    I hadn't used Dropbox before. It looks good!! (I clicked on your referral link to create my account ;) )
    I thought that when we say "predictive" we are talking about the future (not necessarily, right?). Can RapidMiner draw the probable chart line for the next 5 days?
    Coincidentally, this week I discovered that the company I work for buys a data mining service from "bossa nova" to predict the sales of a kind of product for a customer group to improve our margin
    ... and when I watched this last video I realized the similarity with "my world" (and how it isn't so hard to apply this tech in other places).

    Sorry for my bad English!! uaiahua… I'm Brazilian!!
    Best regards!
    Rafael Calastro

  • Ron McEwan

    Tom, I belong to a large online group of traders and finance professionals. They have a separate part of the forum just for neural net research. I would like to refer your site and Rapid Miner to the group. I just wanted to check with you first before I did that, and I would also have to check with the group's admin to see if he would approve it. Thanks
    Ron

  • http://www.neuralmarkettrends.com Tom

    @Ron: Thanks for the Dropbox space! Thanks for referring my site to your online forum of traders; it's OK with me.

    @Rafael: I predominantly use Rapidminer for financial modeling, BUT it can be applied to all areas of data discovery: banking, fraud detection, upselling, marketing, etc.

  • scbhush

    Hi Tom
     
    Thank you for this 8th video. I love it.
     
    regards

  • Soham Das

    @Ron, @Tom
    Can you post the link to the group you mentioned?
     
    Soham

  • Rene

    Great video. Watched all 8 of them these last two nights.

  • http://www.neuralmarkettrends.com Tom

    @Scbhush: You're welcome.
    @Soham: I don't know what group it even is.
    @Rene: Only two more to go.

  • Philmo

    Hey Tom,
    Great website. I can't wait to see the next videos. I have a quick question for you, and I am sure this can help others who are using RapidMiner for financial applications: how can you get RapidMiner to loop through a process and give you an overall prediction matrix? For example, say you have a data set, and you want to loop through it and train a learning process on the last 1000 values to predict the next point (or next few points), and then move the window forward and repeat this process over the length of the time series. This is commonly known as a walk-forward analysis. I want to see if my model is actually good and could be used in real life by feeding it the latest data point and predicting the next. I know there are a few functions such as Loop Data Sets and Loop and Average, but I can't figure out how to get them to work or give me overall results.

    Thanks in advance

  • http://www.neuralmarkettrends.com Tom

    @Philmo: If I understand you correctly, you'll need to use the Windowing operator for that. It takes a window snapshot (you can set the window size) of your training data, trains on the data, tests the data, then slides forward in the increment you want.  You should use the Forecasting Performance operator to determine how good the model is. I'll be discussing these operators in Vid Tut #9.
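
The walk-forward idea discussed above (train on a window, test on the next point, slide forward) can be sketched in plain Python. This is only an illustration under stated assumptions: `walk_forward_splits` and `naive_forecast` are hypothetical names, the "model" is a placeholder that repeats the last observed value, and nothing here reproduces RapidMiner's operators.

```python
def walk_forward_splits(n, train_size, horizon=1, step=1):
    """Yield (train_indices, test_indices) for a sliding training window."""
    start = 0
    while start + train_size + horizon <= n:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + horizon)
        yield train, test
        start += step  # slide the window forward


def naive_forecast(train_values):
    # Placeholder model (an assumption): predict the last observed value.
    return train_values[-1]


series = [1.0, 2.0, 4.0, 7.0, 11.0, 16.0]  # made-up data
errors = []
for train_idx, test_idx in walk_forward_splits(len(series), train_size=3):
    train = [series[i] for i in train_idx]
    actual = series[test_idx[0]]
    errors.append(abs(actual - naive_forecast(train)))

mae = sum(errors) / len(errors)  # overall result across all windows
print(errors, mae)  # [3.0, 4.0, 5.0] 4.0
```

Collecting one error per window and averaging at the end gives the "overall result" Philmo asks about; in practice `naive_forecast` would be replaced by whatever learner is being evaluated.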

  • Philmo

    Perfect! Thanks a lot.

  • Ahmed

    Astonishing video! Thanks Tom.
    I was wondering if it is possible to interpolate more than one model in Rapidminer for time series prediction, e.g. NN and regression. I want to use it for demand forecasting, and I guess using more than one model can help to reduce prediction error. Thanks once again
    -Ahmed

  • http://www.neuralmarkettrends.com Tom

    @Ahmed: You mean like stacking? I believe it's possible in RM, but I've never done it in my experiments; I usually default to a NN and/or an SVM.  There is a sample model in the RM tutorials that has an experiment with three or more different learners.  Check it out there.

  • Ahmed

     
    I meant a hybrid system where you can have more than one model and, at the last stage, interpolate the results from all of them in a nice way to improve overall performance and avoid a single model's pitfalls. Basically, this conceptual model is used very often to reduce prediction errors. I will have a look at the tutorial (I am new to RM). Hope you get my Dropbox referral ;) Thanks Tom for your help!

  • http://www.neuralmarkettrends.com Tom

    @Ahmed: You can run multiple models using the same data, each with its own performance operator, to judge which one is better, all in the same experiment, but I've never done what you suggest and interpolated between them in the last stage.  I think it is feasible in RM but I'd have to research it. :)
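
One simple way to combine models in a last stage, as Ahmed describes, is a weighted average of their predictions. The sketch below is an assumption about what such a blend could look like, not something shown in the video or a RapidMiner feature; `blend_predictions` and the sample numbers are hypothetical.

```python
def blend_predictions(predictions, weights=None):
    """Weighted average of several models' predictions, point by point.

    predictions: list of equal-length lists, one list per model.
    weights: optional list of non-negative weights, one per model.
    """
    if not predictions:
        raise ValueError("need at least one model's predictions")
    if weights is None:
        weights = [1.0] * len(predictions)  # equal weights by default
    if len(weights) != len(predictions):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    blended = []
    for point in zip(*predictions):  # one tuple of model outputs per time step
        blended.append(sum(w * p for w, p in zip(weights, point)) / total)
    return blended


nn_pred = [10.0, 20.0, 30.0]    # made-up neural net forecasts
reg_pred = [12.0, 18.0, 33.0]   # made-up regression forecasts
combined = blend_predictions([nn_pred, reg_pred])
print(combined)  # [11.0, 19.0, 31.5]
```

Whether a blend actually reduces error depends on the data, so it should be judged on out-of-sample results like any single model.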

  • Klaus

    Hi Tom,
    your videos are just what I was looking for to get started with Rapid Miner! I am looking forward to seeing your tutorials #9 and #10.
    One question about #8: could you explain what exactly the prediction graph shows? I do not quite understand the explanation of the Predict Series operator.

    Thanks for your great work!

    Best regards,
    Klaus

  • http://www.neuralmarkettrends.com Tom

    @Klaus: This is from the Operator Info (F1) in RM - 
    "This operator can be used for some basic series prediction operations. The given series must be univariate and must be encoded by examples, i.e. each point of time is encoded by the values in one single example. The values which should be predicted must be defined by the label attribute. Other attributes will be ignored. The operator creates time windows and learns a model from these windows to predict the value of the label column after a certain amount of values (horizon). After predicting a value, the window is moved with step size 1 and the next value is predicted. All predictions are kept and can be compared afterwards to the actual values in a series plot or with a performance evaluation operator."

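The windowing described in the quoted help text can be illustrated with a short Python sketch. This is one reading of that description, not RapidMiner's code: `make_windows`, `window_width`, and `horizon` are hypothetical names, and the series is made up. Each window of consecutive values is paired with the value `horizon` steps after the window ends, and the window moves forward by one step each time.

```python
def make_windows(series, window_width, horizon=1):
    """Pair each window with the value `horizon` steps after its end."""
    pairs = []
    last_start = len(series) - window_width - horizon
    for start in range(last_start + 1):  # step size 1
        window = series[start:start + window_width]
        label = series[start + window_width + horizon - 1]
        pairs.append((window, label))
    return pairs


series = [1, 2, 3, 4, 5, 6]  # made-up univariate series
pairs_h1 = make_windows(series, window_width=3, horizon=1)
pairs_h2 = make_windows(series, window_width=3, horizon=2)
print(pairs_h1)  # [([1, 2, 3], 4), ([2, 3, 4], 5), ([3, 4, 5], 6)]
print(pairs_h2)  # [([1, 2, 3], 5), ([2, 3, 4], 6)]
```

A learner would then be trained on the (window, label) pairs, and its predictions plotted against the actual labels, which is presumably what the prediction graph compares.
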
  • http://scientometrics.wordpress.com mathias

    Hi Tom, I can’t seem to access the xls and xml files in the dropbox. I was redirected to dropbox’s 404 page.

    Nevertheless, great videos!

  • http://www.neuralmarkettrends.com Tom

    @Mathias: I’ll have to check into that but I’ve been terribly busy and probably won’t be able to fix the problem for a while. Can you just view the video and build the model from that?

  • http://scientometrics.wordpress.com mathias

    @Tom: I did manage to build the model from viewing the video during the weekend. Thanks.

  • http://www.neuralmarkettrends.com Tom

    @mathias: great!

  • Mark Knecht

    Hi Tom. I am unable to get to any of the Video #8 stuff. All 3 links tell me 404 – not found.

  • http://www.neuralmarkettrends.com Tom

    @Mark: I think I’m going to have to rethink this Dropbox download; you’re not the first one it’s acting strange for. My next step, hopefully next week, is to make the source code, data, and videos available. Hopefully I won’t be so busy then!

  • David

    Unable to get to video 8. I signed up for Dropbox and am getting the 404 error mentioned. Any suggestions? I would really like to view the last 3 videos in the RM 5 series. GREAT job you did on these RM videos, very helpful!! Now I hope I can view the last 3.

  • Pathros

    @David: I think you can look for it on rapidminerresources.com
    ciao.

  • Ahmed

    Hi Tom,
    I have just noticed that in this video you get the same prediction accuracy for both XValidation and the out-of-sample data: at 8:21 you got a prediction accuracy of 0.464 +/- 0.108 when you were doing just XValidation, and at 14:03 you got exactly the same number when you evaluated the models on the out-of-sample data. I was expecting a different number, as you used a different test set. Please correct me if I am wrong.
    Appreciate your help,
    -Ahmed

  • Tom

    Hi Ahmed. In Vid #8? I don’t think I used XVal in video #8. Are you thinking of Vid #10?

  • http://www.neuralmarkettrends.com Tom

    @Ahmed: In Vid #10, I created an experiment that builds off the training data and then simultaneously tests the out-of-sample S&P500 data. It's one seamless training and prediction experiment. The Performance Vector results of 0.464 +/- 0.108 refer to the training data and not the out-of-sample data test.
    Hope that helps.

  • Ahmed

    Sorry for the wrong post, I thought I posted it in vid #10. Yes, I realized that both numbers are for the XValidation. It was funny, as I changed the test set 3 times in my experiment and got the same number every time; I thought I was going crazy. Any tips on how to get the performance vector results for the test set (out-of-sample data)?

    Cheers..

  • Ahmed

    That was simple: I just connected the Forecasting Performance operator to Apply Model and connected per and exa to the output. Sorry for the stupid question. This forum is really awesome!!!
    Ahmed

  • http://www.jcachat.com Jonathan Cachat

    Would it be possible to get a new download link for the example files?

    Thank you for providing the best and only understandable resources for learning RM!

  • http://marketextend.com Jamie Shiller

    The Dropbox links are broken. Can you please update them? :)

    Love your tutorials. Thanks!

  • Marvin

    Tom. Love the tutorials. Thanks.

  • http://www.neuralmarkettrends.com Tom

    @Jamie: I’m trying but I keep getting overloaded with work. =)