Breakout Stock Pattern Recognition

I love trading breakouts and piling into trends.  As such I’ve been spending an awful lot of time trying to understand patterns that the assets make prior to breaking out and forming new trends.

It’s not rocket science. From a chart perspective, breakouts happen when an asset hits a resistance line, then breaks through it, and continues higher.  The resistance line is typically a 52 week or all time high, or some other “line” in the asset’s price action based on some previous event in time.
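
To make the idea concrete, here is a minimal Python sketch of that definition. It is only an illustration and not the model discussed in this post: the function name `find_breakouts`, the use of closing prices only, and the 252-bar lookback (roughly 52 weeks of trading days) are all my assumptions.

```python
def find_breakouts(closes, lookback=252):
    """Return the indices where the close exceeds the highest close of the
    previous `lookback` bars, a simple stand-in for a resistance line.

    `lookback=252` approximates a 52 week high on daily bars (assumption).
    """
    breakouts = []
    for i in range(lookback, len(closes)):
        # Resistance is the highest close in the window just before bar i.
        resistance = max(closes[i - lookback:i])
        if closes[i] > resistance:
            breakouts.append(i)
    return breakouts


# Toy prices with a 3-bar lookback, purely for illustration.
toy_closes = [10, 11, 12, 11, 10, 11, 12.5, 13]
toy_breakouts = find_breakouts(toy_closes, lookback=3)
```

Note that this flags every bar that closes above the trailing high, so a strong trend produces a run of signals rather than a single one.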

Breakouts are also a function of news, such as fundamental changes in outlook of a particular asset.  Perhaps Apple came out with a new product like the iPoop and this causes investors, funds, traders, etc to pile into or out of the stock. When they do, they leave clues in the price/volume action on charts.

Devoid of reading the news, the only way we can see changes in trader/investor psychology is through these price and volume clues.  This is why I’m big on understanding candlesticks.  I like to look at charts and find where there are hammers, spinning tops, engulfing candles, and other “trend changing” formations.
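
As an illustration of how one of these formations can be checked mechanically, here is a small Python sketch for a bullish engulfing candle. Definitions vary between chartists, so treat the rule below (a down candle followed by an up candle whose body covers the prior body) as one common convention that I am assuming, and the function name `is_bullish_engulfing` as hypothetical.

```python
def is_bullish_engulfing(prev, curr):
    """Check two consecutive candles, each an (open, high, low, close) tuple.

    Assumed rule: the previous candle closed down, the current candle closed
    up, and the current body spans the whole previous body.
    """
    prev_open, _, _, prev_close = prev
    curr_open, _, _, curr_close = curr
    prev_is_down = prev_close < prev_open
    curr_is_up = curr_close > curr_open
    # The current body must open at or below the prior close and
    # close at or above the prior open.
    engulfs = curr_open <= prev_close and curr_close >= prev_open
    return prev_is_down and curr_is_up and engulfs
```

Only the open and close are used here; the highs and lows are kept in the tuple so the same data could feed a hammer or spinning top check.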

However, there is no way that I, as a human, can see the subtle price/volume changes in stocks that may be hiding clues about market psychology, but a computer can.  That’s where Rapidminer is helping me decipher these subtle clues and patterns.

Below is a chart of the ETF GWX.  It broke out around September 13th, 2010 and seems to be trending higher.  Ideally, a trend trader would probably get long above the August 6th spinning top and the resistance line around $26.40ish.  Now wouldn’t it have been better to have gotten long sometime around August 24th?

The answer is YES but would a trader do so? NO! You’d be concerned that the price could drop and you’d wait for the bounce to follow through and might even make a small trade to ride it back up to the resistance line.  However, once the resistance breaks, you can bet people will pile in (see the volume action in October).

So I’ve been working on a model to try and identify when a new trend, and a subsequent break of the resistance line, is going to happen.  When will the odds be in my favor? What is the right time to establish a position before the masses do?

Well, it ain’t easy but I think I’m making progress.  Take a look at the Rapidminer chart below:

The red line activity (my breakout signal) coincides with the breakout of GWX (blue line) for the Sept 13th period.  So the model is now identifying breakouts, which is progress in my book.  However, the trick is for the model to tell me to get long on August 24th, which it doesn’t do right now.

Rapidminer pushes the envelope with Information Extraction

I’ve been waiting for this plugin ever since Felix Jungermann showcased it at RCOMM 2010! It seems all the really cool things were showcased there and I wonder what’s in store for RCOMM 2011, but I digress.

Felix has been busy since then and he’s finally released a compiled .JAR file for his Information Extraction plugin.  This is a quantum leap in text mining, in my opinion, and will be the next big thing.  Now you can parse text data and find associated information with your word tokens.  Wild.

It’s really easy to get started with this.  Just download the .JAR file and save it under the /lib/plugins directory in your Rapid-I directory.  Then load up RM and it shows up as a new set of operators.

Have a fun sleepless weekend playing with this.

A Botched “R” plugin installation in Rapidminer – Solution

I decided to install the “R” plugin in Rapidminer recently and seriously botched the process.  I botched it so bad that Rapidminer got stuck in an installation loop that would ask me to “Exit – Restart Rapidminer” continuously.  I couldn’t get Rapidminer to load and I was stuck.  So what’s the course of action if something like this happens to you?

First and foremost, go to the experts.  I went to the Rapid-I forums and searched for “R plugin” and in about 10 seconds I found the answer I was looking for. Following Sebastian’s answer to a poster’s similar problem, I found my extensions.xml file and edited it.  Then I restarted Rapidminer and all was well again in the land of data analytics.

So if this happens to you, just search for the extensions.xml file in your \.RapidMiner5\managed\ directory and delete the offending plugin’s entry.

Particle Swarm Optimization in Python

I found and installed the ECSPY evolutionary computation package and fiddled around with it.  Now that I’ve learned how to define and use functions in Python, the example code (txt) for the Particle Swarm Optimization (PSO) chart below is beginning to make sense.
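
For readers who want to see the moving parts, here is a minimal PSO sketch in plain Python. To be clear, this is not the ECSPY example code and it does not use the ECSPY API; it is a bare-bones global-best PSO that I am assuming for illustration, and the function names, the sphere test function, and the parameter values (inertia 0.7, both acceleration coefficients 1.5) are my own choices.

```python
import random


def sphere(position):
    """Toy objective: sum of squares, minimized at the origin."""
    return sum(x * x for x in position)


def pso(objective, dim=2, n_particles=20, n_iters=200, bounds=(-5.0, 5.0),
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Minimize `objective` with a basic global-best particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, the pull toward the particle's own
                # best position, and the pull toward the swarm's best position.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                # Move the particle and keep it inside the search bounds.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value
    return global_best, global_best_val


best_position, best_value = pso(sphere)
```

The whole thing is just a handful of functions and loops, which is exactly why it starts to click once you know how to define and call a function.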

Long time Neural Market Trends readers might be wondering why I’m suddenly posting about Python and not Rapidminer? It’s a valid question and I do have answers.

First off, I always wanted to learn a programming language because I’ve felt that not knowing a programming language has held me back career wise, especially when I’m manipulating and data mining oodles of financial data.

Second, Python is a great way to get my feet wet learning programming! It’s fun and easy so far!  Ultimately the goal is to learn Java so I can truly extend Rapidminer by creating custom operators, but learning Java at this stage of the game is like swallowing a whole elephant at once; not going to happen!  So I’ll start with eating a Python first.

An Example of R and Rapidminer for Trading

Ingo over at the Rapid-I blog found this link from a Rapid-I forum member about using R and Rapidminer for Trading.  It’s a pretty wild process developed by Neural Concepts, and he goes into detail about the win/loss ratios for the system!  An utterly fascinating read and a job well done indeed!

It goes to show you that Rapidminer, with its growing plugin list, is flexible enough for ANY application you need!

Rapidminer Text Mining Videos

There’s a whole new set of text mining tutorial videos currently being produced, and they’re not by me!  Neil over at Vancouver Data Blog is rolling out 5 brand new tutorial videos over the course of the week on how to use Rapidminer for text mining.  His first video on how to load text in Rapidminer is a great way for novice text miners to get started and learn how to wield unstructured data.

I’m definitely checking out his posts this week, especially the ones toward Friday because they intersect with what I’m doing with my Twitter project, and you should too!

Got Big Data Analytic Needs? We Got Big Rapidminer Solutions!

Or should I say, Enterprise Solutions.  I posted yesterday about the amazing amount of answers to questions out there in the forums and blogs for new Rapidminer users.  Often these questions take the form of “Does Rapidminer do “X” or how does the XVal operator work?” Usually from those answers, the user can tinker with their processes and finish up their work.

But what if you’re looking to truly unleash the power of Rapidminer?

What if your company or special project needs a custom solution that’s open source, stable, platform independent, and has the technical expertise of the Rapidminer developers and power users behind it?

Then you need the Enterprise Solution.  The guys at Rapid-I provide a three tier consulting solution for small to large enterprises, all customized to their needs, and they got a solution for you.

  • Need support setting up automated data mining systems? They do that!
  • Need a custom operator designed? They do that!
  • Want to monitor your company’s brand without the hassle of servers and software setup? They do that!
  • Need some training? They do that too!

I can go on and on about the depth and breadth of their services and consulting capabilities, but I’ll let their great list of references, ranging from Siemens to Merrill Lynch and even the Cleveland Indians, speak to that instead.

From Rapidminer’s early incarnation as YALE, I watched this open source software evolve from version 3.4 to 5.0 and I’ve been amazed at its growing functionality and capabilities.  Now, with its new plugin architecture, I can analyze time series data easier than ever before, and text mine Twitter.

It’s been an awesome ride so far and it’s getting better every day.  Want to come along for the ride?

You Got Rapidminer Questions? We Got Answers

I just wanted to remind my readers, and any new readers, that between the Rapid-I forums, the Rapid-I Blog,  my blog, and my NMT forums, there is an amazing amount of answers to your questions!

First and foremost, the Rapid-I forums and my forums should be your first stop for questions not answered in our growing video library of tutorials.  If that fails, a well thought out (and written) email to yours truly is appreciated. I try to answer every email and forum post (if another reader hasn’t already) I get, but I do a lot of traveling for work and sometimes I forget or don’t answer right away. Please keep pestering me! =)

If you’re just interested in learning about new upcoming functionality in Rapidminer OR just seeing some great ways to extend this software, check out the Rapid-I blog too.  Ingo does a great job showcasing Rapidminer’s existing and future functionality and you should put their RSS feed in your reader.

Rapidminer 5.X Video Tutorial #12 – Using the Generate Attribute Operator to Create Trading Rules

In this video I highlight the often overlooked (I know I did) Generate Attribute operator to create trading rules. This operator is not just applicable to financial data but to any data set where you want to transform and create new attributes “on the fly.”
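
If you think better in code than in operators, the idea can be sketched in a few lines of Python. This is only an analogy and not Rapidminer’s expression syntax: the helper `generate_attribute`, the column names `close` and `sma`, and the “close above its moving average” rule are all hypothetical.

```python
def generate_attribute(rows, name, rule):
    """Return copies of `rows` (dicts) with a new attribute computed by `rule`."""
    return [dict(row, **{name: rule(row)}) for row in rows]


# Hypothetical example set: a closing price and a simple moving average.
example_set = [
    {"close": 26.10, "sma": 26.40},
    {"close": 26.80, "sma": 26.40},
]

# Hypothetical trading rule: go long when the close is above its average.
with_signal = generate_attribute(
    example_set,
    "signal",
    lambda row: "long" if row["close"] > row["sma"] else "flat",
)
```

The original rows are left untouched and each new row carries the extra `signal` attribute, which mirrors how a generated attribute sits alongside the existing ones.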

No sooner did I post this video than I received a notification of a new post at the Rapid-I blog about this operator. They’re going to extend nominal values in the Generate Attribute operator in Rapidminer 5.1.

What is the WhiBo plugin for Rapidminer?

Today’s guest post, about an awesome new plugin for Rapidminer, is from Milan Vukicevic.  Although I walked in at the very end of his presentation at RCOMM 2010, I sat down with Milan on my last day and he gave me a personal demo of WhiBo.  The application I see for this plugin in the financial world is its ability to build algorithms on new data, find patterns, and tweak parameters in ways that were never possible before. Thanks Milan!

WhiBo is a RapidMiner plug-in for component-based design and performance testing of data mining algorithms. Users can design whole algorithms simply by connecting components. These components are building blocks that represent crucial algorithmic steps that every algorithm of certain type should have.

WhiBo has an interactive GUI for design of component-based algorithms that can be designed and saved for reuse with just a few clicks, without having to write a single line of code. This way, data mining practitioners have more possibilities to construct and rebuild algorithms that better adapt to concrete data.

In comparison with traditional algorithms, which could only be adjusted by parameter tuning, this approach offers more significant possibilities of algorithm adjustment. A component repository for design and testing of Decision tree and Partitioning Clustering algorithms is provided. This repository allows users to design algorithms which can outperform traditional, well-known, algorithms. If needed, component-based design allows simple extension of the repository, but also definitions of new generic algorithms (e.g. neural networks, SVMs etc.). When combined with RapidMiner’s pre-processing and visualization operators, WhiBo becomes a powerful tool for pattern recognition and predictive analysis.

For more information about WhiBo and component-based approach in design and application of data mining algorithms, feel free to contact me at milan.vukicevic *AT* fon.bg.ac.rs, (remove *AT*). Installation instructions, detailed user and developer documentation and list of our publications can be found on www.whibo.fon.bg.ac.rs.

Using the SVM RBF Kernel

I’m happy to announce that today is the first of a two part guest post series. Today’s guest post is by Marin Matijas, who gave a presentation at RCOMM 2010 about Short Term Load Forecasting using Support Vector Machines (SVM). I asked Marin to elaborate a little about his use of the Radial Basis Function (RBF) in Rapidminer’s SVM operator and here’s what he had to say! I did edit the post a bit for readability.  Thanks Marin!

In my RCOMM 2010 presentation, titled “Application of Short Term Load Forecasting using Support Vector Machines in RapidMiner 5.0,” I showed how SVMs can be used to solve a volatile Load Forecasting problem.

Load Forecasting is an old problem, almost as old as modern stock exchange related forecasting. I am comparing these two because both are time-series problems, which makes them similar (and also because we are all eagerly waiting for Tom’s videos with more insights on how to predict financial markets).  The goal of Load Forecasting is to predict exact values of an electricity (power) load in a given time interval. Typically the load for the day ahead is predicted on an hourly basis. Unlike predictions in the financial markets, where trend prediction is often more important than the ‘exact’ value, here the goal is to predict the (exact) value of the load itself.

Depending on the problem, the Mean Absolute Percentage Error (MAPE) varies, but it is typically between 1 and 10% for 24 intervals or more. Good precision can be obtained because load does not fluctuate much. Overall, we typically consume more in winter than in autumn, and more on Monday morning than Sunday evening, but when averaged, electricity consumption follows certain patterns.
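
For readers unfamiliar with the metric, here is a short Python sketch of the standard MAPE formula: the mean of |actual - predicted| / |actual|, expressed as a percentage. The function name and the toy load values are illustrative assumptions, not figures from the presentation.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Assumes no actual value is zero, which holds for electricity load.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Toy hourly loads (arbitrary units) and forecasts, purely for illustration.
toy_error = mape([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
```

Here the three relative errors are 10%, 5%, and 0%, so the MAPE comes out at 5%.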

Since load is serial in nature, with patterns repeating on a known basis, windowing was used to take advantage of this property. Support Vector Machines were chosen for the regression, as they gave better results than the previously used method. Compared to Artificial Neural Networks, SVMs are much faster, an important characteristic with large datasets. One key parameter used for the SVM learner was the Radial Basis Function (RBF) kernel.  It was chosen for three main reasons, discussed below.
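
Windowing, in the sense used here, turns one long series into many training examples, each made of a run of past values plus the value to predict. The Python sketch below shows the general idea only; it is not RapidMiner’s Windowing operator, and the function name `make_windows` and its `window_size` and `horizon` parameters are assumed for illustration.

```python
def make_windows(series, window_size, horizon=1):
    """Slide a fixed-size window over `series`.

    Each example pairs `window_size` consecutive values with the value
    `horizon` steps after the end of the window (the regression label).
    """
    examples = []
    for start in range(len(series) - window_size - horizon + 1):
        window = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        examples.append((window, label))
    return examples


# Toy series: windows of 3 values, each predicting the next value.
toy_examples = make_windows([1, 2, 3, 4, 5], window_size=3)
```

With hourly load data, a window of 24 values and a horizon of 24 would pair each day with the same hour of the next day, although the sizes that work best depend on the data.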

The first reason is that it is good for non-linear problems. Looking at a typical graph of the electricity grid daily load, one can easily see that Load Forecasting is a non-linear problem (see graph below).

The other types of kernels, linear and sigmoid, may be used but only under special conditions.  The second reason is that RBF has a gamma parameter, which makes optimizing the SVM in Rapidminer a simpler task.  The third reason is that RBF gave us better results (lower MAPE) than other kernels, and it tends to be the standard kernel used in other research papers on Load Forecasting.
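
To show what gamma actually controls, here is the usual textbook form of the RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), as a small Python sketch. It illustrates the formula only and is not the code inside RapidMiner’s SVM operator; the function name is an assumption.

```python
import math


def rbf_kernel(x, y, gamma):
    """RBF (Gaussian) kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Identical points always score 1.0; similarity decays with distance,
# and a larger gamma makes it decay faster.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
near = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.5)
sharper = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=2.0)
```

A single knob that sets how quickly similarity falls off with distance is what keeps the parameter search small.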

This same kernel can be applied to a variety of other non-linear problems, e.g. forecasting options volatility, as a lot of problems are non-linear. The key takeaway when incorporating an RBF kernel in an SVM is how simple parameter and windowing optimization is in Rapidminer. I hope Tom will soon show in his video how simple it is to optimize parameters in RapidMiner, so you can create processes that utilize this powerful group of operators.

Cheerz,

Marin

mmatijas *at* hotmail.com