Neural Market Trends

Spark vs Hadoop

There’s a lot of hullabaloo about Spark vs Hadoop for Big Data these days. If you’re rushing to stand up a Big Data cluster, you’ve probably heard about this new Spark technology. The simplest way to think about the difference is that Hadoop is built for batch jobs, while Spark can do both batch and stream processing. However, the biggest promise of Spark is the ability to code in Scala, Python, and soon R.

Dynamic programming languages like Python have opened up new ways to program, letting you develop algorithms interactively non-stop instead of the write/compile/test/debug cycle of C, not to mention chasing the inevitable memory management bugs. (Smart Data Collective)
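To make the batch versus stream distinction a little more concrete, here is a minimal sketch in plain Python. It does not use the Spark or Hadoop APIs at all; the function names `batch_word_count` and `streaming_word_count`, the `batch_size` parameter, and the sample sentences are all made up for illustration. The batch version reads everything and returns one answer, while the streaming version hands back a running total after each small chunk of input.

```python
from collections import Counter
from typing import Iterable, Iterator


def batch_word_count(lines: Iterable[str]) -> Counter:
    """Batch style: read the whole dataset, then produce one final result."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def streaming_word_count(lines: Iterable[str], batch_size: int = 2) -> Iterator[Counter]:
    """Stream style: emit a running total after every small chunk of input."""
    counts = Counter()
    chunk = []
    for line in lines:
        chunk.append(line)
        if len(chunk) == batch_size:
            for item in chunk:
                counts.update(item.lower().split())
            chunk = []
            yield Counter(counts)  # snapshot of the totals so far
    if chunk:  # flush a final partial chunk
        for item in chunk:
            counts.update(item.lower().split())
        yield Counter(counts)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from any real cluster.
    sample = ["spark and hadoop", "hadoop stores data", "spark streams data"]
    print(batch_word_count(sample))
    for snapshot in streaming_word_count(sample):
        print(snapshot)
```

Once all the input has arrived, the last streaming snapshot matches the batch result. The difference is only that the streaming version gives you usable answers along the way.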

While I don’t see Spark supplanting Hadoop, since both can work off the same HDFS data storage system, I do see Spark being leveraged to make that Hadoop elephant dance on a pinhead.

As Mr. Schmitz so eloquently pointed out in the comments, Hadoop and Spark don’t supplant each other; they coexist. What I meant to say in my last paragraph is that Spark will really let you leverage your Hadoop environment!
