Rapidminer 5.0 Video Tutorial #1 – Introduction To Rapidminer

After a very long hiatus, I present to my readers my first Rapidminer 5.0 video tutorial. It's just a quick 10-minute introduction to the GUI and data import functions of Rapidminer 5.0. You’re gonna like the way it looks!

Video download link: Rapidminer 5.0 Video Tutorial #1

PS: I’m glad to be back guys. Leave me a comment if you want more, please stroke my fragile ego. LOL.

PPS: My Youtube Channel is here: Neuralmarkettrends1

PPPS: For those who want to follow along, see the original GE.xls file.

  • Alleinikov

    Great video! Thanks. Would like to see more.

  • Braulio

    Thanks for starting posting great information again!
     
    Keep the blog, keep the wonderful work
    Braulio

  • http://www.neuralmarkettrends.com Tom

    Thanks guys.

  • http://rapid-i.com Tobias

    Hello Thomas,
    thank you very much for your kind words. I am one of the guys involved in the development of RapidMiner 5 and I am really glad you like the RapidMiner 5 and the redesigned user interface.
    Your video is great – a really helpful resource for complete beginners as well as those who upgrade to RM 5. I am really looking forward to seeing more RM5 videos from you .. ;-)

    Best regards from Germany,
    Tobias

  • Mark

    Thanks for making these videos!
     

  • Devon Kyle

    Tom – so glad you've decided not to shut down this blog and you are updating your video tutorial library with new versions using ver 5 of rapidminer! Very interesting site.
    thank you

  • http://www.neuralmarkettrends.com Tom

    @Tobias: You guys did a great job on this version. The new design allows for faster development and mid-stream error checking BEFORE you run the model. I love that! I'm developing the next video and hope to roll a new video out every weekend; it's tough when you have kids running around :)
    @Mark & Kevin: Thanks guys.

  • Michael Black

    Hi Thomas,
    Thank you very much for producing these videos using Rapid Miner 5.0. My wife and I are in the process of discovering which tools may work well to complement our research for trading. We are making good use of your tutorials to learn Rapidminer. We love your teaching style.
    Besides Rapidminer, we are looking into commercial software, e.g. Statistica, Salford Sys, Tibco, SAS. Since one can't beat the price of Rapidminer, it seems this may work best, all things being equal.
    If, however, you know of differences or advantages that commercial vendors' software has that may not be available with Rapidminer, I would greatly appreciate your advice. Intuitive interface, accuracy and speed are the factors we are most concerned with now.
    Gratefully,
    Michael Black
    Eugene, Oregon

  • http://www.neuralmarkettrends.com Tom

    @Michael: I'm glad you like the tutorials. I did spend some time reviewing other products, but that was limited because some of the trading-related neural net products were too expensive to buy. That said, the new version of Rapidminer has gotten so much more intuitive that it's close to rivaling some of those commercial versions. RM does have commercial versions, especially an analytic server that allows you to run cron jobs of RM. Add in RM's native text mining capabilities and remote data repository fetching, and you can do some wild things.

  • Michael Black

    My wife and I attended the "Predictive Analytics World" conference in San Francisco a little over a week ago. It opened our eyes to various modeling strategies and techniques. We had a chance to talk to the vendors and found that useful; however, considering that many of the packages include the same techniques, such as neural nets, decision trees and "Mars", we came away confused. There is a great price disparity, from free (Rapid Miner) to SAS Enterprise, whose annual license fees run from tens of thousands to 100 thousand plus.
    I think that before we commit to learning Rapid Miner, we would like to exhaust our options so as not to use our time inefficiently. Price isn't so much an issue so long as we can see the return on investment on the horizon. The risk is in determining that utility in relation to the time spent learning the product.
    Thanks again for putting up this site. It is very much appreciated. If you can think of anything else, I would appreciate it very much.

  • http://www.neuralmarkettrends.com Tom

    @Michael: Good luck with your search! Let me know what you finally decide on.

  • Michael Black

    Will do and thanks again.

  • scbhushan

    HI TOM, THANK YOU VERY MUCH FOR THIS GREAT WORK. IT IS VERY HELPFUL.
     
    REGARDS

  • Jepse

    You totally hit the point of getting started with rapidminer!!
    Thanks for that!!!
     

  • Akash H

    Awesome, Tom. Keep up the good work, thanks a lot!

  • http://milktrader.net Milktrader

    I recently purchased Peltarion’s Synapse and the similarities are not insignificant. Synapse’s IDE is built on top of C# and I believe RapidMiner is built with Java.

    Appreciate your efforts in doing these tutorials as the greatest hurdle with these packages is understanding how to apply all of these modeling algorithms. I’m giving much attention to understanding the theory portion and have turned to several books as well as videolectures.net videos on machine learning.

  • http://www.neuralmarkettrends.com Tom

    @Milk Trader: I looked at Peltarion before but skipped it because of its cost. Does your version come with Genetic Algorithms?

  • http://milktrader.net Milktrader

    Yes, it has a genetic optimizer as well as a swarm optimizer (PSO). Its snippet architecture also lets you code your own optimizer (or find the C# code for one) and add it. For example, if you were particularly fond of simulated annealing, you could add it.

    When I was evaluating neural network platforms a couple of months back, Synapse ($679) came in at the low end of what I was reviewing. The others I looked at included MatLabs NN Toolbox, Neural Solutions, NeuralIntelligence and NeuralShell. Synapse came out on top because of its open architecture, robust algorithm selection, .NET deployment and relatively low cost (I was sold when they cut their price in half to 499 Euro).

    Of course, somehow Rapid Miner missed my radar. I looked at other cost-free solutions in R and Python and intend to pursue them, but an understanding of the underlying theory is more crucial when you build networks at the script level as opposed to in the drag-and-drop IDEs of a Synapse or Rapid Miner.

    I’m still very far away from developing my own proprietary learning algorithms, but when I do, I can code it in C# and drop it in the Synapse environment.

  • http://milktrader.net Milktrader

    Another interesting feature in Synapse that nudged me into the product is the ability to apply a C# filter to a data set. You can write a C# script to calculate the RSI, highest high (20), close above 30 ema, etc for an existing data set. It’s a bit more work initially, but long-term offers some distinct efficiency returns on analysis. It appears now that you are manipulating data and adding derivative columns (RSI, SlowSto, etc) in Excel, and then dropping them in Rapid Miner.
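For readers who would rather compute such derived columns in a script than in Excel, here is a minimal sketch in plain Python (not Synapse's C# filter API, which is not shown in this thread). The function names, the seeding of the EMA with the first value, and the use of simple averages for RSI (sometimes called Cutler's variant, rather than Wilder's smoothing) are all assumptions made for illustration.

```python
def highest_high(highs, window=20):
    """Rolling maximum of the last `window` highs; None until enough data exists."""
    out = []
    for i in range(len(highs)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(max(highs[i + 1 - window : i + 1]))
    return out


def ema(values, span=30):
    """Exponential moving average, seeded with the first value (an assumption)."""
    alpha = 2.0 / (span + 1)
    out = []
    prev = None
    for v in values:
        prev = v if prev is None else alpha * v + (1 - alpha) * prev
        out.append(prev)
    return out


def rsi(closes, period=14):
    """RSI from simple averages of gains and losses over `period` price changes."""
    out = [None] * len(closes)
    for i in range(period, len(closes)):
        changes = [closes[j] - closes[j - 1] for j in range(i - period + 1, i + 1)]
        gain = sum(c for c in changes if c > 0) / period
        loss = sum(-c for c in changes if c < 0) / period
        if gain == 0 and loss == 0:
            out[i] = 50.0  # flat prices: treated as neutral here (a convention choice)
        elif loss == 0:
            out[i] = 100.0
        else:
            out[i] = 100.0 - 100.0 / (1.0 + gain / loss)
    return out


def close_above_ema(closes, span=30):
    """Boolean column: is each close above its EMA?"""
    return [c > e for c, e in zip(closes, ema(closes, span))]
```

Each function returns a list the same length as its input, so the results can be written out as extra columns next to the original prices before importing the file into Rapid Miner.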

    One thing that has me very interested in Rapid Miner is its implementation of WEKA algorithms. I understand these NN algos are written in Java and though I wanted to pursue them, I decided to drop it since a Java implementation is far beyond my current programming skill level and I thought time would be better spent learning NN theory and rapid implementation of models.

  • JohnS

    Oh, nice! I develop in C# for a living, which makes that a plus, but the cost is still a bit of a deterrent for me. Add that to my low level of skill/knowledge in neural networks and learning algorithms.

  • http://milktrader.net Milktrader

    @JohnS are you on twitter? I’m @milktrader. Would be interested in your insights into C# algorithms and the like. There is actually a fairly rich C# neural network community out there. I’m sure you can implement a C# neural network without the Synapse IDE. It’s a visual IDE, much like Rapid Miner and is useful to those with less than professional grade programming skills.

    Do a search on C# and neural networks and you may find some interesting material. There are also a few Python and R implementations out there. I hadn’t found any Java implementations until I got turned on to Rapid Miner. It is a very impressive product, particularly given the cost.

  • http://www.neuralmarkettrends.com Tom

    Any of you guys want to build an automated trading bot with me? I’ve been looking for a development team to work with because I’m not a coder but more of a strategy guy.

  • http://milktrader.net Milktrader

    Automated Trading Bot.

    This is how I envision such a beast:

    First, a simple trading algorithm with simple rules that has shown it can have a positive expectancy in a particular market. For the system to qualify, it must first show decent backtest results, the ability to improve upon those results upon parameter optimization and then the ability to still win on unseen data (walk forward).

    Second, an intermarket neural network adviser that acts to advise our vetted system on whether to take signals or not. This is not a mission-critical role and a failure of the NN does not blow up the account. It is my view that NNs work best this way as nobody has shown they are fault-tolerant.

    Third, an allocation algorithm based on some acceptable metric, such as max gain (this means large drawdowns, btw) or probability of success.

    The complete automation would require access to a broker’s API and I believe C# is a common language interface for this purpose. I haven’t implemented one though.

    A collaboration effort can be focused on best practices and workflow. This way, individual participants could keep their proprietary system logic their own.

    I’m in. Though like you Tom, I’m more of a strategy guy and working to ramp up my programming skills.
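The walk-forward check mentioned in the first step can be sketched as a generator of rolling train/test windows. This is only an illustration under assumed names and a fixed-size rolling window; other schemes (for example an expanding training window) are equally valid, and nothing here is tied to a particular platform or broker API.

```python
def walk_forward_splits(n_samples, train_size, test_size):
    """Yield (train_indices, test_indices) pairs that roll forward through time.

    Parameters are optimized on each training window and then evaluated only
    on the unseen test window that immediately follows it.
    """
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # advance by one test window


# Example: 10 bars, optimize on 4, then trade the next 2 unseen bars.
splits = [(list(tr), list(te)) for tr, te in walk_forward_splits(10, 4, 2)]
```

Because each test window starts strictly after its training window ends, a system that only looks good in-sample will show it here.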

  • http://www.gametree.org Gman

    Tom,
    Hello, I’ve been following your videos since they first came out, and we’ve talked a couple of times. I started a trading group on http://www.maxinmontreal.com. That website was created for pokerbotting, but we have a strong team there trying to predict forex, and have created many interesting setups.

    You have talked to the team’s other co-founder, c1borg, in your forums. We have two very proficient RM model builders and two very strong programmers, and the rest of the team are good programmers. We have two experienced forex traders who make money independent of Rapidminer, and tomorrow we will admit a dedicated statistics member with a degree in statistics (predicting econometric time series is also his day job). We have built a complete backtesting system exclusively in Rapidminer that can backtest a system over many years, and we are starting to see some payoff on all of our hard work this past year. We have an SVN/Dropbox system for sharing revisions with the entire team, and tonight I am setting up a dedicated 3.2 GHz box for backtesting systems; all team members will have VNC access.

    If you would like to join a team, that has already hit the ground running, please create an account at http://www.maxinmontreal.com/forums, and we will get you Insider access to our private forums immediately.

  • JohnS

    @Tom, ok, you caught me… I started teaching myself regression analysis (this and neural nets are what we do where I work), which led into technical analysis. It started as an idea to help me find stocks I wanted to get into, but all the while I kept thinking that if I could teach myself the analytics side of it, the automated bot or “algorithmic trading” aspect would be a piece of cake.

    I’ve written automated web bots before and thought that’s what I would end up having to do here too. I knew I could make the app “function” if I could ever figure out the analytics aspect. That is why I’m here…

    @Milktrader, I’m not currently on twitter (only because all the logins I want are taken). I’ll do a search on the C# neural nets and see what I can find.

    You guys can count me in for some c# development. I was planning on doing it anyway, why not do it better with collaboration.

  • http://milktrader.net Milktrader

    @JohnS

    Here are some ideas:

    csharpNN
    csharpBOT
    MyRegression
    cSharpIsFun
    cSharpTrader
    ButIRegress

  • JohnS

    @Milktrader Thanks, I’ll have a look at them.

  • http://www.ranartech.com Arnab

    Hi Tom,
    Thanks for putting up such a nice tutorial. The tutorial on rapid-i.com is good, but this one is superb. I am a Java programmer and I was wondering if it is possible to get some information on how to use the rapid-i libraries in external Java programs. I could not find information related to this on the rapid-i website.

  • http://www.neuralmarkettrends.com Tom
  • http://predictive-marketing.com/index.php/blog Bob Hodgson

    Hi Tom,

    Thanks for this video. It is fantastic. I hope that you decide to continue to produce more videos. I only wish they were already available!

    Bob

  • http://www.neuralmarkettrends.com Tom

    @Bob: I’ll be producing a few more videos when I have more time, hopefully over the summer.

  • Stanley

    You’re a great teacher, thanks, it really helps!

  • Leonid

    Thanks for your great video. Very nice!!!
    God bless your work, and I wish you to be an evangelist not only for Rapid Miner :)
    I’ve watched tutorial with great pleasure!

  • Kreso

    Hey, thanks for the videos – could you please post original GE.xls file?

  • Simon

    Great stuff. This is by far the best resource I have found for open-source software.

  • shanta kumar

    It’s just wonderful!! Looking forward to more.

  • Sandy

    Awesome tutorial! Exactly what I needed to get started.

  • http://www.neuralmarkettrends.com Tom

    Thanks all.

  • Frank

    Hey great stuff, thanks for making these vids!

  • Vani

    Great video! I am new to Rapid Miner and am looking into text mining with Rapid Miner, in particular classifying similar texts together. Can you please guide me to a resource that can help me out?

  • http://www.neuralmarkettrends.com Tom

    @Vani: I’m just starting to dabble in Text Mining with Rapidminer and I’ll be posting about it over the next few weeks, as time permits ofc.

  • Vani

    Thank you very much! I look forward to your videos on text mining.

  • Mike

    Hi Tom,
    Thank you very much for your work. I am currently taking a course on the subject (complete noobie here), and your tutorials have been tremendously helpful!

  • abdel

    Hi Tom, RapidMiner is making data mining fun and practical, and your tutorials are really amazing.

  • Marc

    Tom,
    Modern predictive analytics are new to me. We have estimated things in the past but based on simple statistical models (e.g. average, median, etc).

    I am looking into predicting an ‘end’ date (or delay in days from ‘start’ date) for a yet-to-be completed process. Assume it looks like the following. I want to predict these missing end dates (shown with a ?):

    ID  attChar1  attChar2  attNum1  attNum2  dateStart  dateEnd
    1   USA       Brown     0        1        01/01/10   04/13/10
    2   UK        Blue      1        2        11/21/10   ?
    3   CAN       Red       3        0        06/16/09   08/15/09
    4   UK        Blue      1        2        04/16/08   11/10/08
    :
    :

    Is there a particular model algorithm that we should pursue initially? I am looking for the most accurate predictions possible.

    Thanks in advance
    Marc
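One way to frame the question above, as Marc himself suggests, is to predict the delay in days rather than the end date itself. The sketch below only prepares that target column from the sample rows; it does not choose or fit a model. The `MM/DD/YY` date format is inferred from the sample, and the helper and variable names are hypothetical.

```python
from datetime import datetime

DATE_FORMAT = "%m/%d/%y"  # assumed from the sample rows, e.g. 01/01/10


def delay_in_days(date_start, date_end):
    """Return days between start and end, or None when the end is unknown ('?')."""
    if date_end == "?":
        return None
    start = datetime.strptime(date_start, DATE_FORMAT)
    end = datetime.strptime(date_end, DATE_FORMAT)
    return (end - start).days


# Sample rows copied from the comment: (ID, dateStart, dateEnd)
rows = [
    (1, "01/01/10", "04/13/10"),
    (2, "11/21/10", "?"),
    (3, "06/16/09", "08/15/09"),
    (4, "04/16/08", "11/10/08"),
]

# Rows with a known delay can train a regression model; the rest are to be predicted.
delays = {row_id: delay_in_days(start, end) for row_id, start, end in rows}
training_ids = [row_id for row_id, d in delays.items() if d is not None]
to_predict_ids = [row_id for row_id, d in delays.items() if d is None]
```

With the delay as a numeric label, the remaining attributes become ordinary inputs to any regression learner, and a predicted end date is simply the start date plus the predicted delay.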

  • Anonymous

    You’re interested in predicting an end date? This is part of a time-series problem, right?

  • Pylot7

    I am a new miner. Loved your video. I am sure you took many hours off of my learning curve!!

  • http://twitter.com/neuralmarket Thomas Ott

    I’m glad you liked the video. Please check out the other ones too. We’ll have you mining with Rapidminer in no time.

  • Bee

    great job … really helpful

  • Navin

    Hi Tom,

    I want to step into the data mining industry. I just watched your first tutorial and learnt some things about it. I am confused by the modelling types. Can you elaborate on some of the widely used ones?
    And which tutorial of yours shows a result in data format with these modelling techniques?