I spent last week in Dortmund at the RapidMiner office – the place where it all started – to take part in Radoop training. It was a
Radoop is an abstraction layer on top of Hadoop that takes all the heavy coding and configuration out of it. Sure, you can write Pig scripts in Radoop if you want to, but do you really want to spend time writing Hive SQL? The Radoop operators translate your operations into Hive SQL and then send them down to the cluster, all within the friendly RapidMiner GUI environment. It sits on top of Pig, Hive, MapReduce, YARN, and Impala, and once Spark is integrated by the end of the year, it will sit on top of that too.
We also had some intense discussions on the integration of Kerberos and other security layers, a topic that keeps coming up more and more in my conversations with companies. Fortunately for us, Kerberos integration is almost done! On the last day of training we spun up a Hadoop cluster on Amazon Web Services (AWS) and hooked Radoop into it. This step was a bit more involved, but within an hour we were up and connected to AWS! For lack of a better word, it was damn cool, and I can’t wait to spin up my own Hadoop cluster on AWS.
So I leave you with this quick post and some photos from the RapidMiner Dortmund office and the Radoop training.
Here are Frank and Marius hard at work. They were kind enough to let me sit in their office while I was there.