September 22, 2010

The Whirlwind that was RCOMM - Part 2

Wow,

Well, the jet lag finally caught up to me, so I apologize for this late post on RCOMM. Thursday morning was kicked off by yours truly, and I was deeply humbled that the Rapid-I team asked me to be one of their two invited speakers at RCOMM.

For my presentation I chose to talk about Forecasting Historical Volatility for Option Trading. The talk covered the creation, or rather recreation, of a model from a research paper that tried to predict the rise and fall of historical volatility and then use option volatility strategies to turn a profit. I built the RapidMiner model from this research paper back in 2007 after an astute NMT reader, who is also a full-time option trader, contacted me about collaborating on such an endeavor. Long story short, we test traded the model through summer 2007 and it seemed to be working fine until Bear Stearns blew up. We both got busy with the financial mess that began unfolding before us and the collaboration was put on indefinite hiatus.

When the Rapid-I team invited me to give a talk, I decided to talk about this experiment because it yielded some interesting results that perhaps the original researchers didn’t think about. The first thing I did was recreate the model using the newer Time Series Forecasting plugin and include the volatility time period from 2005 to 2010 for the S&P500. In doing so, I got results that differed from what the research paper predicted. I then drilled down into the details and retrained the model on two distinct time periods, 2005 to 2007 and 2007 to 2009, and the two showed very different results. With the benefit of time, I was able to determine that in times of orderly, low volatility the forecast of the historical volatility trend was more than 60% accurate. In times of high volatility it was only slightly better than a coin flip. It seems that this strategy for forecasting historical volatility does work, but only when the markets behave.
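For readers wondering what an accuracy figure like that measures, here is a rough Python sketch. It is not the RapidMiner process from my talk, just one common way to do the arithmetic. Everything in it is an assumption made for illustration: the function names are hypothetical, the 20-day window and 252 trading days per year are conventional defaults rather than the settings from the research paper, and "accuracy" is taken to mean how often the forecast got the up or down direction of volatility right.

```python
import math


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [math.log(b / a) for a, b in zip(prices, prices[1:])]


def historical_volatility(prices, window=20, periods_per_year=252):
    """Annualized rolling standard deviation of log returns.

    window and periods_per_year are assumed defaults, not values
    taken from the original research paper.
    """
    rets = log_returns(prices)
    vols = []
    for i in range(window, len(rets) + 1):
        chunk = rets[i - window:i]
        mean = sum(chunk) / window
        var = sum((r - mean) ** 2 for r in chunk) / (window - 1)
        vols.append(math.sqrt(var * periods_per_year))
    return vols


def directional_accuracy(actual, predicted):
    """Share of steps where the forecast called the direction correctly.

    predicted[i] is the forecast for actual[i]; both are compared with
    the previous actual value to decide "up" versus "not up".
    """
    hits = 0
    total = 0
    for prev, act, pred in zip(actual, actual[1:], predicted[1:]):
        total += 1
        if (act > prev) == (pred > prev):
            hits += 1
    return hits / total if total else 0.0
```

Scoring a forecast separately on a calm stretch and on a turbulent stretch with `directional_accuracy` is the kind of comparison that separates "better than 60%" from "coin flip".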

Marin Matijas followed my talk with a similar time series project, applying Short Term Load Forecasting using Support Vector Machines in RapidMiner 5.0. I was able to glean some interesting insight from his talk about how SVMs could supercharge the option trading system from my own talk, but that’s for another time. Check back next week for a guest post from Marin where he details a bit more about using an RBF function in an SVM for his time series analysis.

Following Marin’s talk there was a short break during which we chatted, networked, and drank lots of coffee. We began the next set of talks, which were about how data analysis in RapidMiner can be improved. Alexander Arimond presented on Distributed Pattern Recognition in Data Mining, then Marco Stolpe showed how stream mining can be integrated into RapidMiner (this was really amazing) in his Implementing Hierarchical Heavy Hitters in RapidMiner talk, and lastly for the morning we heard from Olaf Laber of Ingres Vectorwise about how the way databases use memory is about to be changed forever.

Sounds like a lot for one day, doesn’t it? Well, that was just the morning! We kicked off the afternoon with two workshops that included the unveiling of the “R” plugin by Sebastian Land and how to use RapidAnalytics by Simon Fischer. The rumor is that RapidAnalytics will be released as open source soon. If that’s true I’ll be installing it on the NMT server and pulling down lots of daily financial data!

Closing out RCOMM 2010 were two amazing text mining presentations. I realized that we are on the cusp of something amazing in text mining when I listened intently to Timur Fayruzov’s talk about using the RapidMiner Framework for Protein Interaction Extraction. Timur unveiled a working system that helps researchers, doctors, and other medical practitioners find protein interactions by text mining research papers. WOW. If that didn’t blow me away, Felix Jungermann’s talk about the creation of a new plugin for Information Extraction did. Under development is a new text mining plugin that attempts to extract information, not data, from text. This plugin will be a quantum leap for text mining in RapidMiner for sure and I’ll be checking for it regularly on the Rapid-I site.


