It’s been a while since my last update but I was extremely busy with work, Rapidminer World, and finally a nice vacation overseas.
- Working on a lot of new POCs
- Hosted the Hackathon with Tobias for Rapidminer World
- Had a chance to sit in on some great presentations at Rapidminer World
- Went swimming at Sandy Hook with my Fitbit; needless to say, it broke. Took a few weeks to replace it.
- Visited my extended family in China and Macau. Took a side trip to Taiwan, great times indeed!
- Drank way too much beer and ate way too much yummy food on vacation. Ouch!
- My daily steps broke through 1,000,000 (total).
The agenda for this year’s Rapidminer World (formerly RCOMM) is fast filling up with some great speakers and presentations; check out the agenda here. If you’re a student, you can get special pricing with a current and valid ID, so you have no excuse not to come!
Why am I pimping this? Because I had an utter blast at RCOMM 2010! I met so many cool people there and I got to listen to exciting applications and use cases. Rapidminer users were developing processes and applications whose potential the world is only now realizing. How cool is that, to be on the cutting edge?
You should go, you won’t regret it! I’m going to be there myself, so come by and meet the team. I’ll buy you a beer!
This is why I love the Rapidminer Community. There are so many smart people out there sharing their Rapidminer processes with us that it keeps upping the game. YouTube user Sean C just posted a video on how to use Rapidminer and Twitter OAuth in a set of seamless processes to mine FIFA-related Twitter feeds. Just plain awesome! Make sure to subscribe to Sean C’s channel. Sean even shares the XML source code here.
A couple of notable things happened this past week:
- Started recording some new tips and tricks videos
- Picked up some new clients for new POCs
- Broke my left pinkie toe as I was packing for our camping trip
- Fitness (walking) challenge is currently on hold for me. That includes kickboxing. Notice the trend down in the chart below
- Camped at Swartswood St. Park again. Everyone had a great time
Swartswood St. Park
Posted another tips and tricks video using Rapidminer. This one is about how to balance and sample data from a large data set (10 million rows).
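For readers who want the idea outside of Rapidminer, here is a minimal Python sketch of one common way to balance a data set: downsample every class to the size of the rarest one. This is not the process from the video. The function name, the `label` column, and the toy data are all made up for illustration.

```python
import random
from collections import Counter, defaultdict


def balance_by_downsampling(rows, label_key, seed=42):
    """Return a shuffled list with every class cut down to the size of the rarest class."""
    rng = random.Random(seed)  # fixed seed so the sample is repeatable

    # Group the rows by their class label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    # Every class gets sampled down to the size of the smallest class.
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))

    rng.shuffle(balanced)
    return balanced


# Toy data (hypothetical): 950 "no" rows and 50 "yes" rows.
rows = [{"id": i, "label": "yes" if i % 20 == 0 else "no"} for i in range(1000)]
balanced = balance_by_downsampling(rows, "label")
print(Counter(row["label"] for row in balanced))
```

On a truly large set (millions of rows) you would probably sample inside the database or in chunks rather than load everything into memory, but the balancing logic stays the same.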
Finally got around to making a few new videos. This one is about using the Union operator and it saved me from a Sunday night of hell. Hope you find this useful!
More videos to come, as time permits.
If you trade, or dabble in trading, you will know exactly what Daniel Negreanu is saying about pot odds, avoiding traps, and bet size when playing “small ball” Texas Hold’em.
Moral of the story: never bet $10 to make $2.
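To put a number on that, here is a quick Python sketch of the break-even math. It assumes a simple bet where you either win the reward or lose the full amount risked; the function names are mine, not anything from the video.

```python
def break_even_win_rate(risk: float, reward: float) -> float:
    """Minimum win probability at which risking `risk` to win `reward` breaks even."""
    if risk <= 0 or reward <= 0:
        raise ValueError("risk and reward must be positive")
    return risk / (risk + reward)


def expected_value(risk: float, reward: float, win_probability: float) -> float:
    """Average profit per bet: win `reward` with probability p, lose `risk` otherwise."""
    return win_probability * reward - (1 - win_probability) * risk


rate = break_even_win_rate(10, 2)
print(f"Break-even win rate: {rate:.1%}")
print(f"EV at a 70% win rate: ${expected_value(10, 2, 0.70):.2f}")
```

Risking $10 to make $2 only breaks even if you win 10 out of every 12 times, a bit over 83%. Win "only" 70% of the time and you lose money on average.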
I’ve been busy the past month at Rapidminer and in my personal life. I’m starting to use this blog as a way to keep a status update of what I’ve been doing and reading (see my pinboard).
Time period: June 1, 2014 to June 22, 2014
- Learned that the Set Operators in Rapidminer are your friend and can save you from a night of insanity.
- Worked on a big Proof of Concept (POC) for a consumer goods company.
- Participated in a Big Data panel discussion at Ramapo College for BAASANA’s 2014 Conference.
- Did Kickboxing three times, love it but it kills my body right now, so I have to go slow.
- Watched a bunch of World Cup Games, Go #USA!
- Walked almost 256,000 steps, or roughly 117.5 miles (see image below).
- Lost 5 lbs.
- Started reading “The Sun Also Rises” by Hemingway.
I got the chance to attend a trade show in Miami Beach for the Big Data Pavilion last week. I had a great time talking about how Rapidminer can be applied to various problems across so many industries, and the big hit of the day was my Craft Beer Twitter analysis. People really understood the power of Twitter and the social media conversation. Way cool.
The first night I was there, I took a walk out to the beach from my hotel and took this picture looking back at the hotel.
Then I took this picture looking at the ocean. The two lights are cruise liners.
Some stats from my Fitbit:
- Walked 224,000 steps in the past 21 days, avg 10,600+ steps a day.
- Walked 73,257 steps in the past 7 days (33.59 miles), climbed 186 floors, and burned 18,703 calories.
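If you want to check the arithmetic on numbers like these, a few lines of Python will do it. The figures are the ones quoted above; the helper names are just for illustration.

```python
def daily_average(total_steps: int, days: int) -> float:
    """Average steps per day over a period."""
    return total_steps / days


def steps_per_mile(total_steps: int, miles: float) -> float:
    """How many steps the tracker counts for one mile."""
    return total_steps / miles


avg_21 = daily_average(224_000, 21)   # the 21-day total
avg_7 = daily_average(73_257, 7)      # the 7-day total
pace = steps_per_mile(73_257, 33.59)  # 7-day steps over 7-day miles

print(f"21-day average: {avg_21:,.0f} steps a day")
print(f"7-day average:  {avg_7:,.0f} steps a day")
print(f"Steps per mile: {pace:,.0f}")
```

The 21-day average lands comfortably above 10,600 steps a day, and the 7-day numbers work out to a bit under 2,200 steps per mile.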
If you don’t collect data, you won’t be able to understand the problem and act appropriately on it.
I recently bought a wearable health monitoring device (Fitbit) so I can see how much of a slug I am. This all started when a co-worker of mine introduced me to the Argus App for the iPhone. The Argus App is a simple pedometer that measures all the steps you take during the day. It prompts you to start with a default goal of 10,000 steps a day.
I thought to myself that walking 10,000 steps a day would be a breeze; surely I walk at least 20,000 steps a day. Boy, was I wrong.
After I started monitoring myself I realized that on the best days, I’d walk only 3,000 steps – just over a mile.
This was indeed an eye opener for me, and I’ve decided to take up the challenge of walking 70,000 steps in 7 days. Whether or not I lose weight remains to be seen, but I’ll be doing more walking and that’s good.
The best part of this all? I’m collecting data on myself and using it to make changes in my life.
I love beer, especially craft beers, and I also love what people say about beers on Twitter. One evening I had a wild idea to see if I could find out which types of beer (pilsners, stouts, etc.) are popular in a particular region of the United States. I wanted to know if Pilsners are popular in the West and if Ales are popular in the South, etc.
Using Zapier, I extracted tweets using the #craftbeer search term and saved them, along with the location of the tweeter (if available). Then I did some ETL in Rapidminer, harnessed our text mining capabilities, and built a simple decision tree.
I should give you plenty of warning here because this is an ongoing personal project of mine, and my sample size is ridiculously small. Out of 2,000+ tweets, I ended up with about 80 after I cleaned out the noise and kept only those that mentioned words like stout, ale, etc. AND had a US location available. So take this for what it’s worth: an interesting exercise in mining Twitter, with an interesting result emerging. Maybe.
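I did the cleaning in Rapidminer, but the filtering idea is simple enough to sketch in plain Python. Treat this as a rough stand-in, not my actual process: the keyword list, the state-to-region mapping, the tweet fields, and the sample tweets are all made up for illustration, and it stops at counting rather than building a decision tree.

```python
import re
from collections import Counter

# Hypothetical keyword list and state-to-region mapping, for illustration only.
BEER_STYLES = ("pilsner", "stout", "ale", "lager", "porter", "ipa")
REGION_BY_STATE = {"CA": "West", "OR": "West", "TX": "South", "GA": "South", "NJ": "Northeast"}

# Match a style as a whole word, with an optional plural "s" ("ales", "stouts").
STYLE_PATTERN = re.compile(r"\b(" + "|".join(BEER_STYLES) + r")s?\b", re.IGNORECASE)


def style_and_region(tweet):
    """Return (style, region) for a tweet dict, or None if either one is missing."""
    match = STYLE_PATTERN.search(tweet.get("text", ""))
    location = tweet.get("location") or ""
    # Assumes locations look like "City, ST"; anything else is treated as unknown.
    state = location.split(",")[-1].strip().upper()
    region = REGION_BY_STATE.get(state)
    if match is None or region is None:
        return None
    return match.group(1).lower(), region


# Made-up sample tweets.
tweets = [
    {"text": "Loving this oatmeal stout #craftbeer", "location": "Portland, OR"},
    {"text": "Pale ales all weekend #craftbeer", "location": "Austin, TX"},
    {"text": "Great night out #craftbeer", "location": "Newark, NJ"},
    {"text": "Crisp pilsner on the patio #craftbeer", "location": None},
]

# Keep only tweets that have both a beer style and a known US region, then tally.
counts = Counter(filter(None, map(style_and_region, tweets)))
for (style, region), n in counts.items():
    print(f"{region}: {style} x{n}")
```

Only two of the four sample tweets survive the filter, which is the same kind of shrinkage that took my 2,000+ tweets down to about 80.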