Lately, it seems the only time I remember to post is when I’m on the train to or from Boston. While I’m not neglecting my blog, it is suffering from a lack of updates, because working as a consultant for RapidMiner is busy and exciting. Busy because I just got back from a whirlwind week of travel that started in Dallas, continued to Newark, DE, then Las Vegas, and ended in Portland, OR. Exciting because the more I use RapidMiner professionally, the more I’m amazed by its ability to manipulate data.
I recently started using operators that I hadn’t used before, and I’m loving all the ETL functions I can do with them. For example, I’ve been using the Generate Extract operator to mine location data from Twitter, and I love that I can extract the state abbreviation (CA) from “San Diego, CA” using regular expressions. Once I’ve extracted the state abbreviations, I use the Replace-Dictionary operator to replace each one with the region the state belongs to. CA becomes West, NJ becomes Northeast, and so forth.
How cool is that?
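
If you want to try the same idea outside of RapidMiner, here is a rough Python sketch of the two steps: pull the state abbreviation out of a location string with a regular expression, then look it up in a dictionary of regions. To be clear, this is not how the RapidMiner operators work internally. The regex, the function names, and the pass-through behavior for unmapped states are all my own assumptions, and the dictionary only holds the two mappings mentioned above (CA and NJ).

```python
import re

# Illustrative mapping only: CA -> West and NJ -> Northeast come from the post.
# A real dictionary would need an entry for every state you care about.
STATE_TO_REGION = {
    "CA": "West",
    "NJ": "Northeast",
}

# Assumed pattern: a comma, optional whitespace, then two uppercase letters
# at the very end of the string, as in "San Diego, CA".
STATE_PATTERN = re.compile(r",\s*([A-Z]{2})\s*$")


def extract_state(location):
    """Return the two-letter code at the end of a location, or None."""
    match = STATE_PATTERN.search(location)
    return match.group(1) if match else None


def location_to_region(location):
    """Map a location string to a region via its state abbreviation.

    Returns None when no abbreviation is found. Abbreviations missing from
    the dictionary are returned unchanged (an assumption of this sketch).
    """
    state = extract_state(location)
    if state is None:
        return None
    return STATE_TO_REGION.get(state, state)


if __name__ == "__main__":
    for loc in ["San Diego, CA", "Newark, NJ", "somewhere on earth"]:
        print(loc, "->", location_to_region(loc))
```

Free-text Twitter locations are messy, so a pattern this strict will miss plenty of them (lowercase "ca", full state names, emoji, and so on). Treat it as a starting point, not a finished cleaner.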