Current State of Affairs
Lately, it seems the only time I remember to post is when I’m on the train to or from Boston. I’m not neglecting my blog on purpose, but it is suffering from a lack of updates because working as a Consultant for RapidMiner is busy and exciting. Busy because I just got back from a whirlwind week of travel that started in Dallas, continued to Newark DE, then Las Vegas, and ended in Portland OR. Exciting because the more I use RapidMiner professionally, the more I’m amazed by its ability to manipulate data.
I recently started using operators that I hadn’t tried before, and I’m loving all the ETL functions I can do with them. I just started using the Generate Extract operator to mine location data from Twitter, and I love the fact that I can use regular expressions to extract the state abbreviation (CA) from a location like "San Diego, CA". Once I’ve extracted the state abbreviations, I use the Replace-Dictionary operator to replace each abbreviation with the region the state belongs to. CA becomes West, NJ becomes Northeast, and so forth.
How cool is that?
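For readers who want to see the extract-then-replace idea outside of RapidMiner, here is a minimal Python sketch of the same two steps. It is not how the RapidMiner operators are implemented. The regular expression, the function names (`extract_state`, `location_to_region`), and the small `STATE_TO_REGION` dictionary are all my own assumptions for illustration, and a real mapping would need to cover every state.

```python
import re

# Hypothetical, deliberately incomplete lookup table (an assumption for
# illustration); a real dictionary would list every state abbreviation.
STATE_TO_REGION = {
    "CA": "West",
    "OR": "West",
    "NJ": "Northeast",
    "TX": "South",
}

# Assumed pattern: a comma, optional whitespace, then a two-letter
# uppercase code at the very end of the location string.
STATE_PATTERN = re.compile(r",\s*([A-Z]{2})\s*$")


def extract_state(location):
    """Return the trailing state abbreviation, or None if there is none."""
    match = STATE_PATTERN.search(location)
    return match.group(1) if match else None


def location_to_region(location):
    """Extract the state, then look it up in the dictionary.

    Returns None when no state is found or the state is not in the table.
    """
    state = extract_state(location)
    return STATE_TO_REGION.get(state)


if __name__ == "__main__":
    for place in ["San Diego, CA", "Newark, NJ", "somewhere on earth"]:
        print(place, "->", extract_state(place), "->", location_to_region(place))
```

Free-text Twitter locations are messy, so a pattern this strict will skip anything that does not end in a comma followed by a two-letter code. That is a trade-off I chose for the sketch: fewer matches, but fewer false positives too.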