Why I Left RapidMiner

For those who are wondering why I left RapidMiner, my dream job, there are no gory details to share.

The simple reason is that I got burnt out. My time at RapidMiner was some of the best learning and growth of my entire professional career. I solved problems, presented to C-suite executives, and worked with some of the best talent. The flipside was that it wasn’t easy, and it sure as hell wasn’t a smooth ride. I worked through some of the most tumultuous years at RapidMiner. We had three years of management changes and radical 180-degree strategy shifts while I was there. All this ‘chaos’ eventually took its toll on me.

Don’t get me wrong, I have no ill will towards anyone. I really miss my colleagues and friends, but I made my decision to leave in early May 2017. I had no real plan at that moment, I just needed some ‘me’ time. I quit in July 2017 and took a few weeks off to be with my family.

As luck would have it, my town’s Municipal Engineer approached me and asked if I’d like to work on a town project. He knew that in my former life as an Engineer I had a lot of stormwater management design experience, and this project needed some big stormwater analysis. I decided to take the job and simultaneously start a Data Science consultancy too. I ended up doing Engineering and RapidMiner/Data Science consulting from August 2017 until right about September 2018. What happened in September 2018? Well, that deserves its own post for another time.

I started taking on RapidMiner related consulting work and doing my stormwater analysis. I was living the life of a consultant: working strange hours, worrying about invoices, worrying about where the next job would come from, etc. Luckily I picked up more Engineering work from my Municipal Engineer to keep me afloat as I navigated the lean times.

In August of 2018, my stormwater project was approved by the NJDEP and Highlands Council as a Major Development in New Jersey. This was a first in Highlands and NJDEP history: a brand new Community Center and Shelter got the green light to be constructed, all because we engineered a better stormwater management system. Although my work will never be seen – it’s all underground – I can take solace that the system will recharge 133% of the clean rainwater falling over the development area into a depleting aquifer. I know that this project will benefit the Community and the environment, and that makes me happy.

Making a positive impact can be really hard at times but the reward is immeasurable.

So there you have it. Nothing to see. It was time to move on to the next adventure. I can take solace that all the RapidMiner adventures and friends will always be a part of me. As Ingo from RapidMiner would say, “onward and upward.”

Let’s go.

The Fallacy of Twitter Bots

I’m going to be the first to admit that I use Python to send out Tweets to my followers. I have a few scripts that parse RSS feeds and do retweets on an hourly basis. They work fine but they do get ‘gamed’ occasionally. That’s the problem with automation, isn’t it? Getting gamed can cause all kinds of havoc for your brand and reputation, so you have to be careful.

Has this happened to me? Not really, but there have been a few embarrassing retweets and silly parsed advertisements in lieu of good articles.

Why bother with Twitter automation in the first place? Simple, everyone wants to be an ‘influencer’, myself included. Yet using automated methods to gain ‘eyeballs’ comes with a price. You end up sacrificing quality for quantity. You end up diluting your brand and losing the signal. In the end you get nothing but noise!

Signal vs Noise

At one time I tested/used @randal_olson‘s TwitterFollowBot to increase my follower count. It worked well and I started growing my followers in large clips. The script’s logic is pretty simple: it follows people based on a hashtag (or the followers of a Twitter handle) that you supply, at about 100 people per run.

The goal here is to get a ‘follow back’ from the people you just followed, then auto-mute them. If, after a week or so, they don’t follow you back, you run another routine that ‘unfollows’ them and puts them on a blacklist so they’re never ‘autofollowed’ again.
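The follow/mute/unfollow bookkeeping described above can be sketched with plain Python sets. This is an illustrative sketch, not TwitterFollowBot’s actual code: all names here are my own, and the real bot persists its state to cache files and talks to the Twitter API.

```python
# Illustrative sketch of the follow-back bookkeeping (not the real
# TwitterFollowBot code, which keeps state in cache files and calls
# the Twitter API).

def prune_non_followers(following, followers, blacklist):
    """Unfollow anyone who didn't follow back, and blacklist them
    so they are never auto-followed again."""
    non_followers = following - followers
    return following - non_followers, blacklist | non_followers

following = {"alice", "bob", "carol"}  # accounts the bot followed
followers = {"alice"}                  # accounts that followed back

following, blacklist = prune_non_followers(following, followers, set())
print(sorted(following))   # ['alice']
print(sorted(blacklist))   # ['bob', 'carol']
```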

You run this script every few hours for a week and MY GAWD, does your following list explode! The noise becomes unbearable, even after muting them. You end up with cranks, conspiracy theorists, crypto-currency shills, and bots (most likely Russian bots). Yes, you do get a lot of follow backs, but the quality signal of people you should really follow and interact with gets completely lost!

I stopped that experiment a few months ago and started unfollowing the noise. My following count is now below 1,000, but I feel that’s still too much. I want to get that number down to about 500. Of course, this caused my follower count to drop too. There are a lot of Twitter users who run ‘you unfollow me, so I unfollow you’ scripts as well. LOL.

Possible solutions

Just stop it. Stop all the Retweeting, TwitterBot following, and parsing. Instead do one or more of the following:

  1. Create a curated list of great links that you filter through. I know that @maoxian has done this over the years and it’s invaluable because he puts the time and effort in to filtering out the noise.
  2. Write a Python script to parse RSS feeds but write the links to a file so you can review them later and tweet accordingly (more signal, less noise).
  3. Write a Python script to find ‘true’ influencers on Twitter and interact with them personally. Perhaps create a ranking system.
  4. Something else that I’ll remember after I post this article

I guess the lesson here is that we can’t automate the human touch. You can automate a lot of the heavy lifting, but in the end it’s us who bring meaning and value to everything we do.

Extract Blog Post Links from RSS feeds

As part of my goal of automation here, I wrote a small script to extract blog post links from RSS feeds using Python. I did this to extract the title and link of blog posts from a particular date range in my RSS feed. In theory it should be pretty easy, but I’ve come to find that time was not my friend.

What tripped me up was how some functions in python handle time objects. Read on to learn more!

What it does

This script first scrapes my RSS feed, then uses a 7-day date range to extract the posted blog titles and links, and then writes them to a markdown file. Super simple, and you’ll need the feedparser library installed.

The real trick here is not the loop, but the timetuple(). This is where I first got tripped up.

I first created a variable for today’s date and another variable for 7 days before, like so:
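The code for that step isn’t shown in this copy of the post; judging from the outputs below, it was presumably along these lines:

```python
import datetime

# Today's date, and the date seven days earlier
today = datetime.date.today()
week_ago = today - datetime.timedelta(days=7)

print(repr(today))     # e.g. datetime.date(2018, 9, 8)
print(repr(week_ago))  # e.g. datetime.date(2018, 9, 1)
```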

The output of today becomes this: datetime.date(2018, 9, 8)
The output of week_ago becomes this: datetime.date(2018, 9, 1)

So far so good! The idea was to use a logic function like if post.date >= week_ago AND post.date <= today, then extract stuff.

So I parsed my feed and then using the built in time parsing features of feedparser, I wrote my logic function.

BOOM, it didn’t work. After sleuthing the problem, I found that the dates extracted by feedparser were time.struct_time objects, whereas my variables today and week_ago were datetime objects.

Enter timetuple() to the rescue. timetuple() converts a datetime object into a time.struct_time object by just doing this:
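The snippet isn’t shown in this copy of the post; the conversion is presumably just:

```python
import datetime

today = datetime.date.today()
week_ago = today - datetime.timedelta(days=7)

# timetuple() converts a datetime.date into a time.struct_time,
# the same type feedparser returns in entry.published_parsed
today_tt = today.timetuple()
week_ago_tt = week_ago.timetuple()

print(type(today_tt))           # <class 'time.struct_time'>
print(week_ago_tt <= today_tt)  # True: struct_time compares like a tuple
```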

After that, it was straightforward to do the loop and write out the results, see below.

Python Script
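The original listing isn’t included in this copy of the post, so what follows is a reconstruction based on the description above. The function names, the placeholder feed URL, and the output filename are my own assumptions; only the overall flow (feedparser, timetuple(), a date-range check, markdown output) comes from the post.

```python
import datetime

def in_range(published, start, end):
    """True if `published` (a time.struct_time, as feedparser returns in
    entry.published_parsed) falls between dates start and end inclusive.
    struct_time compares like a tuple, so <= works directly after
    converting the dates with timetuple()."""
    return start.timetuple() <= published <= end.timetuple()

def links_to_markdown(entries, start, end):
    """Render matching entries as markdown list items. Each entry needs
    .title, .link, and .published_parsed (feedparser's entry shape)."""
    return [
        f"* [{e.title}]({e.link})"
        for e in entries
        if in_range(e.published_parsed, start, end)
    ]

# With feedparser installed (pip install feedparser), the full flow is:
#
#   import feedparser
#   feed = feedparser.parse("https://example.com/feed.xml")  # your feed URL
#   today = datetime.date.today()
#   week_ago = today - datetime.timedelta(days=7)
#   with open("weekly_links.md", "w") as f:
#       f.write("\n".join(links_to_markdown(feed.entries, week_ago, today)))
```

Note one subtlety: today.timetuple() is midnight today, so posts published later today fall outside the range; extend the end date by a day if that matters for your feed.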