Happy Holidays and Happy New Year!

I’m taking the rest of the month off from blogging and will hopefully tie up a few loose ends before the year is over. That said, I wanted to wish my readers and friends a Happy Holiday and a Happy New Year! You guys made 2010 a banner year for this blog and I just wanted to say thanks!

All the best to you in 2011!

Cheers!

$1,000 Forex Experiment

I think I’ll restart my $100 Forex Experiment in 2011 but up the ante this time to $1,000. I probably won’t start trading until mid-January to early February, after I recompile and rebuild some neural net models. If you remember from my old $100 Forex Experiment, I witnessed about an 11% drawdown the first month as I was fiddling around with my system. I hope to avoid that in 2011, but you never know.
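For anyone wondering how a drawdown figure like that is measured: it is the drop from the highest account balance seen so far down to a later low, as a fraction of that high. Here is a minimal sketch. The balances are made up for illustration and are not the actual account history, and `max_drawdown` is just a name chosen for this example.

```python
def max_drawdown(equity):
    """Return the largest peak-to-trough decline as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest balance seen so far
        worst = max(worst, (peak - value) / peak)  # decline from that peak
    return worst

# Hypothetical daily balances for a $100 account (illustrative only).
balances = [100.0, 104.0, 98.0, 92.56, 101.0, 110.0]
print("Max drawdown: {0:.1%}".format(max_drawdown(balances)))
```

The drop from 104.00 to 92.56 is the one that counts here, even though the account later recovers to a new high.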

My process is simple. I use neural nets or SVMs to confirm that the trend remains intact for some currency pairs, and then use good ol’ support/resistance spots to go long or short as the case may be. I like to place limit orders for overnight moves to capitalize on London’s open and then close any trades before 8AM NYC time. Yes, it’s that simple.

Let’s see if I can recreate the 50% return again.

Particle Swarm Optimization in Python

I found and installed the ECSPY evolutionary computation package and fiddled around with it. Now that I’ve learned how to define and use functions in Python, the example code (txt) for the Particle Swarm Optimization (PSO) chart below is beginning to make sense.
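To show the idea behind PSO, here is a minimal sketch in plain Python. It does not use ECSPY or its API at all. The toy objective (`sphere`), the parameter values, and every function name are assumptions picked for illustration. Each particle keeps a velocity that is pulled toward its own best position so far and toward the best position found by the whole swarm.

```python
import random

def sphere(position):
    """Toy objective: sum of squares, minimized at the origin."""
    return sum(x * x for x in position)

def pso(objective, dims=2, particles=20, iterations=100,
        inertia=0.7, cognitive=1.5, social=1.5, bound=5.0, seed=42):
    rng = random.Random(seed)
    # Random starting positions inside [-bound, bound]; zero starting velocities.
    pos = [[rng.uniform(-bound, bound) for _ in range(dims)]
           for _ in range(particles)]
    vel = [[0.0] * dims for _ in range(particles)]
    best_pos = [p[:] for p in pos]            # each particle's personal best
    best_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    swarm_pos, swarm_val = best_pos[g][:], best_val[g]   # swarm-wide best

    for _ in range(iterations):
        for i in range(particles):
            for d in range(dims):
                r1, r2 = rng.random(), rng.random()
                # Keep some momentum, then pull toward both best positions.
                vel[i][d] = (inertia * vel[i][d]
                             + cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + social * r2 * (swarm_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            value = objective(pos[i])
            if value < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], value
                if value < swarm_val:
                    swarm_pos, swarm_val = pos[i][:], value
    return swarm_pos, swarm_val

best_position, best_value = pso(sphere)
print(best_position, best_value)
```

Swapping `sphere` for any other function that takes a list of numbers and returns a score to minimize should work the same way, though the inertia and pull weights may need tuning.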

Longtime Neural Market Trends readers might be wondering why I’m suddenly posting about Python and not Rapidminer. It’s a valid question and I do have answers.

First off, I’ve always wanted to learn a programming language because I’ve felt that not knowing one has held me back career-wise, especially when I’m manipulating and data mining oodles of financial data.

Second, Python is a great way to get my feet wet learning programming! It’s fun and easy so far! Ultimately the goal is to learn Java so I can truly extend Rapidminer by creating custom operators, but learning Java at this stage of the game is like swallowing a whole elephant at once; not going to happen! So I’ll start with eating a Python first.

A Simple INTC Chart

After yesterday’s “not so simple GOOG chart” adventure, I spent more time digging around the matplotlib website and discovered the matplotlib.finance module. Grabbing some python chart example code from the matplotlib site and applying the axis labeling lessons I learned the previous day, I was able to create this $INTC red/green candlestick chart.

Granted, I still don’t understand all the code in the example, but it’s becoming less of a mystery to me. There are bits and pieces I understand and I’m able to tweak things. After all, it would be silly to reinvent the wheel! Right?

A Not So Simple GOOG Chart

What I am about to post is a simple $GOOG chart created by a python script that I grabbed off Trainee Trader’s site.  While creating a line chart of GOOG’s stock price is really basic finance, it does represent something very important to me.  It means that I’m learning and troubleshooting python at a fast pace.

I just started learning python about a week ago with absolutely no knowledge of programming (except for Fortran 77 a long time ago), but I am armed with an engineer’s logic and enthusiasm. I’ve worked on about 15 exercises in a training course I’m taking, and have read up on all the neat things python can do. I installed python 2.6, matplotlib, and the ystockquote module from Corey Goldberg on my development machine and configured the correct paths. Then I tried to copy and paste TT’s google chart python script to see if I could recreate his chart. I typed it verbatim and fired it up in python hoping to see the chart!

Instead of seeing a nice GOOG chart, it crashed on me!!! Yep, it failed and I had no idea what the error messages were telling me.

Fast forward one week and I’m picking up this language quickly. So I took TT’s code again, checked out the documentation for the matplotlib module, troubleshot it, and got the damn thing to plot! Go me!

Of course a sophisticated programmer would laugh at my silly plot, but to me it’s awesomeness.

Update: Here’s the revised python script (txt): GOOGChart2.txt

Learning Python – The Hard Way?

I’m teaching myself, through hands-on exercises, how to code (script) in Python. So far it’s been a breeze, but that’s because I’ve found a great course for free. I’m reading and following along with “Learn Python the Hard Way,” written by Zed Shaw. Zed made his book available for free download in PDF and I highly recommend it to programming novices, such as myself.

Poking Python

So I’m finally getting around to poking around with Python again, and I created a scatter plot for Google.  I’m pretty impressed with the modules they have for this language. Perhaps I should sit down and spend some time learning it, not just dream about it.

An Example of R and Rapidminer for Trading

Ingo over at the Rapid-I blog found this link from a Rapid-I forum member about using R and Rapidminer for Trading. It’s a pretty wild process developed by Neural Concepts, and he goes into detail about the win/loss ratios for the system! An utterly fascinating read and a job well done indeed!

It goes to show you that Rapidminer, with its growing plugin list, is flexible enough for ANY application you need!

Can Twitter Sentiment Analysis Predict the Stock Market?

Ugly over at Uglychart.com just posted a link about research that sentiment mined over 10 million tweets from 2008 and was able to predict daily market behavior to an accuracy of 87.6%. While the post is vastly interesting from a text & sentiment mining perspective using social media, and the application of it to the stock market, I’m not 100% convinced it’s very viable.

Why? Well I tend to echo some of the comments left by readers at the bottom of the original post.  For example, once this “edge” is discovered by general market participants, it tends to get discounted and the edge goes away.  So what we read here today is probably already discounted by the market and is just routine “business as usual.”

Now, I certainly don’t mean we should abandon text & sentiment mining for the markets, but rather we should continue to use these tools to develop our own secret edges and evolve them as the market changes. Follow the advice of poker players and underarm deodorant manufacturers: never show your hand and never let them see you sweat.

Rapidminer Text Mining Videos

There’s a whole new set of text mining tutorial videos currently being produced, and they’re not by me!  Neil over at Vancouver Data Blog is rolling out 5 brand new tutorial videos over the course of the week on how to use Rapidminer for text mining.  His first video on how to load text in Rapidminer is a great way for novice text miners to get started and learn how to wield unstructured data.

I’m definitely checking out his posts this week, especially the ones toward Friday because they intersect with what I’m doing with my Twitter project, and you should too!

Tweeting Sentiment

I finally got my hands on some Twitter data from my collaborative partner and began the process of text mining it. The creation of the Rapidminer model and then its subsequent execution took all of 10 minutes. That’s the beauty of the Rapidminer system: you can build templates and have processes ready to go! Just add data!

But adding the data is usually the hardest and most time-consuming part of text mining, especially getting the right data in the right format! Since we’re working on a proof of concept model for now, my collaborative partner had to crawl Twitter, parse the tweets, and then hand-classify 1,500 Twitter posts into Positive, Neutral, and Negative labels! Whew!

Once I got the data I built a 10-fold cross-validation model to train and test the sentiment classification of the Tweets for accuracy. Then I identified the words most strongly correlated with each sentiment class. Our results are definitely promising: we achieved a near 80% classification accuracy and nailed all the correlated words. There were some issues with misclassification of positive sentiment as negative and vice versa, which we have to work on, but overall this is a great start.
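For readers new to cross-validation, here is a rough sketch of the idea in plain Python. It is not the Rapidminer process described above. The tiny word-voting classifier, the function names, and the ten made-up "tweets" are all assumptions for illustration, and it uses 5 folds instead of 10 because the toy data set is so small. The data is split into k folds, and each fold takes a turn as the test set while the model trains on the rest.

```python
import random
from collections import Counter, defaultdict

def k_fold_indices(n_items, k, seed=0):
    """Shuffle item indices and yield (train, test) index lists for each fold."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]

def train_word_votes(samples):
    """Count how often each word appears under each label."""
    votes = defaultdict(Counter)
    for text, label in samples:
        for word in text.lower().split():
            votes[word][label] += 1
    return votes

def predict(votes, text, default="Neutral"):
    """Pick the label with the most word votes; fall back to a default."""
    tally = Counter()
    for word in text.lower().split():
        tally.update(votes.get(word, {}))
    return tally.most_common(1)[0][0] if tally else default

# Made-up, hand-labeled examples standing in for real tweets.
data = [("love this rally", "Positive"), ("great earnings beat", "Positive"),
        ("love the great upgrade", "Positive"), ("awful guidance miss", "Negative"),
        ("hate this selloff", "Negative"), ("awful selloff today", "Negative"),
        ("market opens at nine", "Neutral"), ("fed meeting is today", "Neutral"),
        ("earnings call at nine", "Neutral"), ("hate the awful miss", "Negative")]

correct = 0
for train_idx, test_idx in k_fold_indices(len(data), k=5):
    votes = train_word_votes([data[i] for i in train_idx])
    correct += sum(predict(votes, data[i][0]) == data[i][1] for i in test_idx)
print("Cross-validated accuracy: {0:.0%}".format(correct / float(len(data))))
```

Because every example is tested exactly once by a model that never saw it during training, the resulting accuracy is a fairer estimate than scoring the model on its own training data.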

We now know how to fine tune the process/data, and hopefully squeeze out more accuracy between parameter optimization and better crawled data.

Now it’s back to civil engineering for a while, unless you guys want to hire me full time. :)