RapidMiner released its Web Mining Extension on the Marketplace. It’s super easy to install with RapidMiner Studio. Just go to Extensions > Marketplace (Updates/Extensions) and search for Web Mining.
Select the Extension and accept the Terms and Conditions. RapidMiner will then need to restart, after which you should see the new set of operators in the Extensions folder of your Operators panel.
Web Mining Extension Operators
Here’s what you get with the extension: a web crawler, single and multiple page extraction, scraping text out of HTML tags, and much more. My favorite operator is the Enrich by WebService Operator, which I use quite a bit for mashing up geolocation data (see my Tutorials on this).
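If you're wondering what "scraping text out of HTML tags" boils down to, here is a rough Python sketch of the general idea. To be clear, this is not how the extension is implemented and it doesn't use any RapidMiner API; the class `TagTextExtractor`, the helper `extract_tag_text`, and the sample HTML are all made up for illustration, and it only relies on Python's standard `html.parser` module.

```python
from html.parser import HTMLParser


class TagTextExtractor(HTMLParser):
    """Hypothetical helper: collect the text inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.depth = 0      # how many target tags we are currently inside
        self.chunks = []    # text pieces for the tag currently being read
        self.results = []   # finished text, one entry per outermost target tag

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag == self.tag and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # Join the pieces and normalize whitespace
                self.results.append(" ".join("".join(self.chunks).split()))
                self.chunks = []

    def handle_data(self, data):
        # Keep text only while we are inside the target tag
        if self.depth > 0:
            self.chunks.append(data)


def extract_tag_text(html, tag):
    """Return the text content of each `tag` element found in `html`."""
    parser = TagTextExtractor(tag)
    parser.feed(html)
    parser.close()
    return parser.results


# Made-up sample page, just for illustration
sample_html = (
    "<html><body><h1>Web Mining</h1>"
    "<p>First <b>bold</b> paragraph.</p><p>Second one.</p></body></html>"
)
print(extract_tag_text(sample_html, "p"))
```

The extension's operators let you do this kind of thing by dragging and configuring operators rather than writing code, so treat the sketch as a mental model only.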