Learning Python helps me with Excel

I use Excel extensively at work and I’ve always wished I could write intricate macros, but I never understood the complexities of Visual Basic. After spending a few weeks learning Python, I’ve managed to write a VB macro that fills in an IF formula flagging whether a hammer candlestick formation has occurred. I use this as a macro function and apply it automatically to the data I download using TraderXl.

The formula generates a 1 if a hammer has formed or a 0 if not. It assumes you’ve downloaded 153 rows of a stock’s date, open, high, low, close, and volume into columns A, B, C, D, E and F respectively. The output goes in column P; you may modify it as you see fit.

Whoop-de-doo, right?

Well here it is:

' Column P holds the hammer flag; columns B to E hold open, high, low, close.
Range("P1").Value = "Hammer Formula"
' Relative to column P: RC[-14] = open, RC[-13] = high, RC[-12] = low, RC[-11] = close.
' Assigning the formula to the whole range fills rows 2 through 154 (153 data rows) in one step.
Range("P2:P154").FormulaR1C1 = _
"=IF(AND((RC[-13]-RC[-12])>(3*(RC[-14]-RC[-11])),(((RC[-11]-RC[-12])/(0.001+(RC[-13]-RC[-12])))>0.6),(((RC[-14]-RC[-12])/(0.001+(RC[-13]-RC[-12])))>0.6)),1,0)"
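
For anyone who finds the R1C1 offsets hard to read, here is a minimal Python sketch of the same three tests, assuming I have mapped the columns correctly (B = open, C = high, D = low, E = close). The function name is_hammer and the sample prices are made up for illustration; this is not part of the macro itself.

```python
def is_hammer(open_, high, low, close):
    """Return 1 if the bar passes the three tests in the worksheet formula, else 0."""
    bar_range = high - low
    # Test 1: the high-low range exceeds three times (open - close).
    # This mirrors the worksheet formula, which does not take an absolute value.
    long_range = bar_range > 3 * (open_ - close)
    # Tests 2 and 3: close and open both sit above 60% of the range.
    # The 0.001 keeps the division safe when high equals low.
    close_near_high = (close - low) / (0.001 + bar_range) > 0.6
    open_near_high = (open_ - low) / (0.001 + bar_range) > 0.6
    return 1 if (long_range and close_near_high and open_near_high) else 0


# Made-up prices for illustration only: (open, high, low, close).
print(is_hammer(10.0, 10.2, 9.0, 10.1))    # 1
print(is_hammer(10.0, 11.0, 9.9, 10.05))   # 0
```

The first bar has a long lower shadow with the open and close near the high, so it is flagged; the second has its body near the low, so it is not.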


R and Rapidminer Together = Disruptive Technology!

I’ve been teaching myself R now that I finally got Rapidminer’s R plugin to work. It’s a pretty slick program and easy to learn; I’ve picked up so many things quickly. I use the PerformanceAnalytics, Quantmod, and tseries packages for R extensively, and on top of that I’ve started to recreate A Physicist on Wall Street’s awesome Rapidminer + R Example for Trading tutorial. So far so good.

It’s fantastic that I can now download stock quotes, using the R plugin, right into Rapidminer and then model those time series. Yes, native R has a few learning algorithms, but they in no way match Rapidminer’s breadth and depth. Rapidminer’s ability to handle large data sets efficiently, combined with R’s statistical analysis and graphing powers, makes the Rapidminer and R combination a disruptive technology in my book.

Download it today and play with it; it will make your data shine in ways you can only dream of.


Text Mining Blog Keywords In Rapidminer and Google Analytics

So I finally got around to downloading some keyword data from Google Analytics for the period 2/17/11 through 3/17/11, just to see what’s driving my site traffic. I ran a simple text mining process in Rapidminer to build my keyword frequency list (it took me a few minutes) and generated keyword similarities. Of course I already know the biggest draw to my site is my Rapidminer tutorials, but what I’m looking for are subtler patterns in the keywords relative to the bounce rates and site visits.
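
If you don’t have Rapidminer handy, the frequency-list step can be roughed out in a few lines of Python. This is only a sketch of the idea, not my Rapidminer process: it splits phrases into lowercase words and counts them, with no stemming (so you would see rapidminer rather than Rapidmi) and no similarity calculation. The function name keyword_frequencies and the sample phrases are made up.

```python
import re
from collections import Counter


def keyword_frequencies(phrases):
    """Count how often each lowercase word appears across search phrases."""
    counts = Counter()
    for phrase in phrases:
        # Split on anything that is not a letter or digit.
        counts.update(re.findall(r"[a-z0-9]+", phrase.lower()))
    return counts


# Hypothetical search phrases, not the actual Google Analytics export.
sample = [
    "rapidminer tutorial",
    "rapidminer stock prediction",
    "stock tutorial rapidminer",
]
freq = keyword_frequencies(sample)
for word, count in freq.most_common():
    print(word, count)
```

A Counter keeps the example short; a real export would also carry visits and bounce rate per phrase, which this sketch ignores.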

So below are a few charts I generated from one month of keyword data.

The first chart I want to share with you is a bubble chart showing the site visits for a particular keyword vs the bounce rates. In this case the keyword is Rapidmi (the stemmed form of Rapidminer). The size of each bubble is set by the frequency of the word Rapidminer at that combination of site visits and bounce rate.

Visits vs Bounce rate with RapidMiner keyword

The second chart is visits vs bounce rate but with the keyword Tutorial as the bubble size.

Visits vs Bounce rate with Tutorial keyword

And the last chart is visits vs bounce rate but with the keyword Stock as the bubble size.

Visits vs Bounce rate with Stock keyword

It appears from the above exercise that the keywords Rapidminer and Tutorial drive a lot of traffic, but they have a relatively even keyword frequency distribution across the bounce rate: some people bounce immediately while others stick. The keyword Stock has an interesting bounce rate per visit distribution relative to the keyword frequency: it’s either 100%, 30 to 50%, or almost 0%.

What I find fascinating is the stickiness of the Rapidminer and Tutorial keyword frequencies relative to the 50% bounce rate and site visits. There’s a strong site visit (45 to 60) component for those keywords in the data, but I knew that already.

I’m attaching the Rapidminer process file in case you want to mine your own keywords (you have to supply your own data).

KeywordSimilarity
