Text Mining Blog Keywords In Rapidminer and Google Analytics

I finally got around to downloading some keyword data from Google Analytics for the period 2/17/11 through 3/17/11, just to see what’s driving my site traffic. I ran a simple text mining process in Rapidminer to build my keyword frequency list (it took only a few minutes) and generated keyword similarities. Of course I already know the biggest draw to my site: my Rapidminer tutorials. What I’m looking for are subtler patterns in the keywords relative to bounce rates and site visits.
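For readers without Rapidminer, the keyword frequency step can be roughly approximated in Python. This is only a minimal sketch, not the process I actually ran: the sample phrases are made up, the function names are hypothetical, and it skips the stemming that Rapidminer applies.

```python
import re
from collections import Counter

# Hypothetical keyword phrases; real input would come from the
# Google Analytics keyword export.
SAMPLE_KEYWORDS = [
    "rapidminer tutorial",
    "rapidminer stock prediction",
    "text mining tutorial",
]

def tokenize(phrase):
    """Lowercase a phrase and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", phrase.lower())

def keyword_frequencies(phrases):
    """Count how often each token appears across all phrases (no stemming)."""
    counts = Counter()
    for phrase in phrases:
        counts.update(tokenize(phrase))
    return counts

freqs = keyword_frequencies(SAMPLE_KEYWORDS)
for word, count in freqs.most_common():
    print(word, count)
```

Sorting by count with `most_common()` gives a quick view of which words dominate the search phrases.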

So below are a few charts I generated from one month of keyword data.

The first chart is a bubble chart of site visits vs bounce rate for a particular keyword. In this case the keyword is Rapidmi (the stemmed form of Rapidminer). The size of each bubble reflects the frequency of that word at the given site visit count and bounce rate.

Visits vs Bounce rate with RapidMiner keyword

The second chart is visits vs bounce rate but with the keyword Tutorial as the bubble size.

Visits vs Bounce rate with Tutorial keyword

And the last chart is visits vs bounce rate but with the keyword Stock as the bubble size.

Visits vs Bounce rate with Stock keyword
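The three charts above share the same construction: visits on one axis, bounce rate on the other, and the word's frequency as the bubble size. Here is a rough Python sketch of that idea, assuming rows of (keyword phrase, visits, bounce rate). The rows, the function names, and the choice of matplotlib are all assumptions for illustration; the original charts came out of Rapidminer.

```python
# Hypothetical rows: (keyword phrase, visits, bounce rate in percent).
ROWS = [
    ("rapidminer tutorial", 55, 48.0),
    ("rapidminer rapidminer download", 12, 100.0),
    ("stock prediction", 8, 35.0),
]

def bubble_points(rows, term):
    """Return (visits, bounce_rate, term_count) for rows containing term."""
    points = []
    for phrase, visits, bounce in rows:
        count = phrase.lower().split().count(term)
        if count > 0:
            points.append((visits, bounce, count))
    return points

def plot_bubbles(points, term):
    """Draw the bubble chart; needs matplotlib, imported only when called."""
    import matplotlib.pyplot as plt

    xs, ys, counts = zip(*points)
    # Scale the counts so the bubbles are visible; the factor is arbitrary.
    plt.scatter(xs, ys, s=[100 * c for c in counts], alpha=0.5)
    plt.xlabel("Visits")
    plt.ylabel("Bounce rate (%)")
    plt.title(f"Visits vs bounce rate, bubble size = '{term}' frequency")
    plt.show()

points = bubble_points(ROWS, "rapidminer")
print(points)
```

Calling `plot_bubbles(points, "rapidminer")` would draw the chart if matplotlib is installed.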

It appears from the above exercise that the keywords Rapidminer and Tutorial drive a lot of traffic, but their keyword frequency is distributed fairly evenly across bounce rates: some people bounce immediately while others stick. The keyword Stock has an interesting bounce rate distribution relative to keyword frequency: it’s either 100%, 30 to 50%, or almost 0%.
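One way to check a pattern like this without eyeballing a chart is to bucket the bounce rates for every phrase containing a given word. The sketch below does that in Python, assuming hypothetical (phrase, bounce rate) rows; the data and function names are made up for illustration and are not my actual numbers.

```python
from collections import Counter

# Hypothetical rows: (keyword phrase, bounce rate in percent).
ROWS = [
    ("stock market prediction", 100.0),
    ("stock tutorial", 40.0),
    ("rapidminer stock", 0.0),
    ("rapidminer tutorial", 50.0),
]

def bounce_bucket(rate, width=10):
    """Return the label of the bucket holding rate, e.g. '40-50%'."""
    # A rate of exactly 100 is folded into the top bucket.
    low = min(int(rate // width) * width, 100 - width)
    return f"{low}-{low + width}%"

def bounce_distribution(rows, term):
    """Count phrases containing term in each bounce rate bucket."""
    counts = Counter()
    for phrase, rate in rows:
        if term in phrase.lower().split():
            counts[bounce_bucket(rate)] += 1
    return counts

stock_dist = bounce_distribution(ROWS, "stock")
print(sorted(stock_dist.items()))
```

A word whose counts pile up in a few separated buckets, as Stock does in my data, will stand out immediately in this kind of table.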

What I find fascinating is how the keyword frequencies for Rapidminer and Tutorial stick around the 50% bounce rate across site visits. There’s a strong site visit component (45 to 60 visits) for those keywords in the data, but I knew that already.

I’m attaching the Rapidminer process file in case you want to mine your own keywords (you have to supply your own data).

