The Sudden Interest in Data Science Platforms

I’ve been at this startup thing for a few years now and I’ve seen a thing or two. If you read KDNuggets, you’ll stumble across the Gartner Hype Cycle. Right now Big Data is entering the trough of disillusionment. While that sounds sad, it kinda makes sense.

For years we’ve been hearing how Big Data will unlock all kinds of insights in a corporation’s data. Everyone raced to stand up clusters, jammed all kinds of data into them, and then stumbled when extracting insight. The cluster became hard to tame, hard to use, and seemed like a big waste of money.

Of course, RapidMiner Radoop came along and actually delivered on this promise, but many companies decided to use a single tool to extract their insight. Maybe it was PySpark or Pig scripts? Maybe something else completely. They married themselves to one or two ways of getting insight.

Now many companies are realizing they’re not just an R shop, they’re an R, Python, and Spark shop. Now they need to use all three or more tools in the Data Science toolkit to get anything done. Now they’re looking around for a platform to bring all these tools together.

Imagine their surprise when they find RapidMiner. We’ve been a Data Science platform from day 1. Ninety percent of the time you can do all your data science and model building right in the Studio platform. The rest of the time you might need some esoteric algorithm to finish your work. So, if you married yourself to one tool and that esoteric algorithm wasn’t available, you were SOL.

With RapidMiner it’s always been different. Need that Tweedie algorithm in R? Use the R Scripting extension and pull it in. Need to do some PySpark on your cluster? Put that script right inside Radoop’s Spark Script operator.

It’s that easy. After all, isn’t that what a real Data Science platform is supposed to do?


Originally published at community.rapidminer.com on April 20, 2017.
