September 2, 2010
What is Data Mining?
My old college professor, Dr. Stephan Kudyba, explains what Data Mining is for newbies.
His data mining class is the reason why I started Neural Market Trends.
I’m done with my draft presentation for RCOMM 2010 and hopefully will wrap it up over the weekend, and then it’s off to Germany in two weeks. It’s hard to believe, but RCOMM 2010 will soon be a reality. I’m looking forward to meeting the rest of the Rapid-I team and networking with some great minds.
Once I return from Germany I will post three new video tutorials and some Rapid-I goodies for my loyal readers. One of those videos will be about parameter optimization for sure!
Thanks for sticking with me through my hiatus. I’m very appreciative of your understanding!
I’ve been busy working on my presentation for the first Rapidminer Community Meeting and Conference (RCOMM 2010). If you find yourself in Dortmund, Germany on September 15th, please stop by to hear me speak! Of course, there are tons of other reasons to go to RCOMM 2010 and here are just a few that I find interesting:
Workshops
Pattern Recognition Engineering
Shafait, Reif, Kofler, Breuel; DFKI Kaiserslautern and Technical University Kaiserslautern
Collaborative Work with RapidMiner and RapidAnalytics
Dr. Simon Fischer; Rapid-I
Integrating R into RapidMiner*
Sebastian Land; Rapid-I
Training Sessions
*I’m really looking forward to the unveiling and integration of “R” into Rapidminer 5.X and I’ll be listening closely to Sebastian’s presentation.
See you in Germany!
Hi all!
I’m almost back! I’m easing my way back into posting again as work somewhat quiets down. Thank you so much for your emails and comments while I was away. I’m still terribly busy with engineering work but I promise I’ll try to answer all your questions and comments over the course of the next week.
On an exciting side note, I will be giving a talk at the very first Rapidminer Conference 2010 in Germany this September! The RM Team have put up a tentative schedule at www.rcomm2010.org and it’s looking to be 4 exciting days of training, talks, and workshops about our favorite, and powerful, open source data modeling suite.
I hope my readers can make it there between September 13th and 16th! It would be nice to meet you and finally meet the RM team! I hope that my liver doesn’t give out when I’m there! =)
Sorry folks, life is getting busy again and it looks like I’ll be traveling a lot for work. For some strange reason, work always gets awfully busy this time of year.
Thanks for sticking around and I hope to start posting again in a few months, probably after the summer.
In several of my video tutorials I assign different parameters for my learning model “on the fly.” Of course the question any astute reader/viewer should ask is, “why did you choose those parameters instead of another combination?”
That’s a great question and the answer is, “well, I just chose those parameters to illustrate my point for the video.” While this answer is not at all satisfying to the astute reader/viewer, it does lead us to ask the most important question of all, “what are the right parameters to choose?”
This can be answered very well if you use Rapidminer’s Parameter Optimization operator in your initial data discovery phase. This operator allows you to choose some or all of the parameters in your experiment and iterate over different values for them to meet some specific requirement on your part (e.g. performance).
For example, if you were using the Neural Net operator and didn’t know what learning rate and momentum to set to get the best classification accuracy, you would use the Parameter Optimization operator to iterate over different combinations of those parameters and find the one with the best accuracy.
Once the Parameter Optimization operator determines those values, you can input them into your experiment and truly optimize your model for performance! See below for an actual output from a parameter optimization model I’m working on. You can see that Rapidminer indicated that a momentum of 0.3 and a learning rate of 0.6 were the best parameter settings to maximize the accuracy rate and minimize the classification error.
While this operator is a fantastic feature (they’ve got evolutionary optimizers too!) for us data modelers, it’s a massive computer resource suck. I would advise anyone using this operator to have a very powerful server or computer, with oodles of memory, to run your iterations.
FYI, I plan on going over a short parameter optimization example in an upcoming webinar.
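For readers who think better in code, here is a rough sketch of the grid search idea behind this kind of parameter optimization. It is not Rapidminer's implementation: the `train_and_score` function is a hypothetical toy stand-in (gradient descent with momentum on a simple curve) for "train the model and measure its performance," and the parameter values in the grid are just illustrative.

```python
from itertools import product


def train_and_score(learning_rate, momentum, steps=25):
    """Toy stand-in for 'train a model and return its performance'.

    Minimizes f(w) = (w - 3)^2 with gradient descent plus momentum and
    returns the negative final loss, so a higher score is better.
    """
    w, velocity = 0.0, 0.0
    for _ in range(steps):
        gradient = 2.0 * (w - 3.0)
        velocity = momentum * velocity - learning_rate * gradient
        w += velocity
    return -((w - 3.0) ** 2)


def grid_search(score_fn, grid):
    """Score every combination of parameter values and keep the best one."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((score_fn(**params), params))
    best_score, best_params = max(results, key=lambda item: item[0])
    return best_params, best_score, results


# Illustrative grid: 4 x 4 = 16 training runs.
grid = {
    "learning_rate": [0.1, 0.3, 0.6, 0.9],
    "momentum": [0.0, 0.3, 0.6, 0.9],
}
best_params, best_score, results = grid_search(train_and_score, grid)
print("best parameters:", best_params)
print("runs performed:", len(results))
```

Notice that the number of runs is the product of the list lengths, so adding a third parameter with four values would mean 64 training runs instead of 16. That multiplication is where the resource suck comes from.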
I have some great news! If you’ve ever wanted to ask me a direct question and/or get more information on my past video tutorials, this will be your chance! I plan on hosting my first Neural Market Trends webinar this May!
I plan on going over a few of my video tutorials in more detail for the webinar course and will provide a long Q&A session to boot! Of course I can answer any other general questions you might have about Rapidminer and applications for it.
It should be noted that this will not be a free webinar! I do plan on charging for it, but I will provide an “early sign up” incentive that will be reasonable.
My readers might be wondering why I have a beautiful Rapid-I logo so prominently featured at the top right side of my site. It’s probably because the Rapid-I team has my logo so prominently featured on their partner page.
Yes, you read right. I’ve joined forces with the Rapid-I team and become a partner company to help spread the word and evangelize about Rapidminer, and their other exciting products too.
What does this mean for you? It means that I’ll be focusing on using Rapidminer exclusively on this blog from now on and hopefully rolling out some new and exciting features over the course of the year. I have a lot of things cooking that I’m not at liberty to post about right now, but 2010 is shaping up to be a good year for my loyal readers!
Just wanted to pass this along in case you haven’t seen it already: Rapidminer has released the Web Mining extension for version 5.0. All you need to do to install it is go to the pull down menu HELP > Update Rapidminer. It should find all the new extensions and let you select them.
As I wrote before, I’m trying to make Neural Market Trends a better reader experience by stripping out a lot of things and keeping things simple. As such, I’ve had a shiny new and unique branding logo created for my blog by Stefenie!
Stef did a fantastic job and helped me translate my vision for the Neural Market Trends brand into an elegant logo. She’s professional, responsive, and her graphic design skills are great. I highly recommend her for your future blog work or other graphic design needs! Big Thanks Stefenie!
Check out her blog to see some more of her talent!
For what it’s worth, there’s an interesting thread going on in the forums about the Fast Fourier Transform and how to build a Rapidminer model to find stock market cycles. Members Ronmac and AWC have been kind enough to share the Rapidminer XML code and data file for it. I fooled around with it and discovered some interesting patterns in the markets, especially using S&P500 data.
It would appear that there is a turning point in the market every 12 days or so. Very interesting and this model definitely requires more analysis by yours truly…
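To give a feel for how a Fourier transform surfaces a cycle length, here is a small sketch. It is not the forum members' Rapidminer model and it does not use S&P500 data: the series is synthetic, built on purpose with a 12-step cycle plus a weaker 30-step one, and `dominant_period` is a hypothetical helper. It uses a plain discrete Fourier transform for readability; an FFT computes the same coefficients, just faster.

```python
import cmath
import math


def dominant_period(series):
    """Return the cycle length (in samples) with the most spectral power."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # drop the constant offset

    best_k, best_power = 1, -1.0
    # Frequency index k means "k full cycles over the whole series".
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        power = abs(coeff) ** 2
        if power > best_power:
            best_k, best_power = k, power
    return n / best_k


# Synthetic series: a strong 12-step cycle plus a weaker 30-step cycle.
n_samples = 120
series = [
    math.sin(2 * math.pi * t / 12) + 0.4 * math.sin(2 * math.pi * t / 30)
    for t in range(n_samples)
]
period = dominant_period(series)
print("dominant period:", period)
```

On real price data you would normally detrend first (for example, by working with daily returns), since a strong trend tends to swamp the low frequencies.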
In this video we continue building a financial time series model, using S&P500 daily OHLCV data and the windowing, sliding window validation, and forecasting performance operators. We test the model with some out of sample S&P500 data.
This video can be viewed in HQ by clicking this link here. Please make sure you have Quicktime or another MP4 capable reader installed in your browser.
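If you want a rough picture of what the windowing and sliding window validation steps in this video do to the data, here is a minimal sketch. It is only an illustration of the general idea, not the Rapidminer operators themselves: the function names, window sizes, and closing prices are all made up for the example.

```python
def make_windows(series, window_size, horizon=1):
    """Turn a series into (past values, future value) training examples."""
    examples = []
    for start in range(len(series) - window_size - horizon + 1):
        window = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        examples.append((window, label))
    return examples


def sliding_splits(n_examples, train_size, test_size, step):
    """Yield (train, test) index ranges that slide forward through time."""
    start = 0
    while start + train_size + test_size <= n_examples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


# Made-up closing prices, purely for illustration.
closes = [100.0, 101.5, 101.0, 102.2, 103.0, 102.4,
          103.8, 104.1, 103.5, 105.0, 105.6, 104.9]

examples = make_windows(closes, window_size=3)
splits = list(sliding_splits(len(examples), train_size=4, test_size=2, step=2))

for train, test in splits:
    print("train on examples", list(train), "then test on", list(test))
```

The key point is that every test block comes strictly after its training block in time, so the model is never scored on data it could have peeked at.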