Neural Market Trends

Infrastructure Investigation

Big Data and Infrastructure

I have a daily downtime routine. Every evening I set aside about an hour and think. I sit or walk around the house and ruminate about all sorts of random things. Sometimes it’s with a glass of wine, and more often it’s with a cup of black tea and milk. Sometimes my mind wanders to what I did that day or what I didn’t finish. Other times I get inspired to write a new blog post or create a new tutorial. Sometimes it’s an epiphany, like the impact of Big Data on Infrastructure.

It’s no secret that I came from the Infrastructure field. I spent many years designing and managing infrastructure projects as a Civil Engineer. Some projects were big, some were small. I traveled to the remotest parts of Montana and North Dakota and worked all over the country. I’ve inspected bridges, roads, and sewers, and written countless reports.

Most of these reports were to highlight deficiencies in some bit of infrastructure. We’d take an inventory of the structure, take photos, and make measurements. Then the report would go to an agency where they’d use it to get budget monies to fix the problem.

Then I moved to the machine learning startup world, and here I am today. My first move was into Pre-sales, and right before I transferred to the Marketing group I fielded some interesting queries from potential customers. One was from a major freight railroad and the other from a railroad car inspection company. Both of these organizations capture sensor data and make measurements on their infrastructure assets. They measure temperature of rail gauges, wear patterns, widths, and hours of use.

Big Data Migration

The most interesting part of these queries? The data was migrating from standalone reports into Hadoop clusters. For the first time, at least since I had been in the industry, data was coming together from all over the place. The only problem was getting the data out to work on it!

There are now many ways to get the data out and work with it (e.g., Spark, Hive, RapidMiner), but engineering professionals don’t understand them. Ask any manager in an Infrastructure firm what Hadoop is and they won’t know. Some might have heard of data science and data mining, but they might not know what all the hoopla is about.

The hoopla is this.

Engineers use data to design all kinds of things. Imagine if they had access to a deeper pool of stress-strain data for bridges. Or for rails. What if researchers could adjust the mixture ratios of concrete or temper steel differently to extract more performance, based on terabytes of data from a central research Hadoop cluster?
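To make that a little more concrete, here is a minimal sketch in Python of what a pooled calculation might look like. Everything in it is an assumption for illustration: the lab names and the stress-strain readings are made up, and the helper `elastic_modulus` is a hypothetical function that fits a straight line through the origin, which is only reasonable in the linear elastic range of a material.

```python
# Hypothetical stress-strain readings pooled from two test reports.
# All values are invented for illustration: (strain, stress in MPa).
reports = {
    "lab_a": [(0.0005, 101.0), (0.0010, 199.0), (0.0015, 302.0)],
    "lab_b": [(0.0004, 79.0), (0.0008, 161.0), (0.0012, 241.0)],
}


def elastic_modulus(points):
    """Least-squares slope through the origin: E = sum(stress*strain) / sum(strain^2)."""
    num = sum(stress * strain for strain, stress in points)
    den = sum(strain * strain for strain, _ in points)
    return num / den


# Estimate the modulus per report, then from all readings combined.
per_lab = {name: elastic_modulus(pts) for name, pts in reports.items()}
pooled = [point for pts in reports.values() for point in pts]
pooled_e = elastic_modulus(pooled)

for name, value in per_lab.items():
    print(f"{name}: E ~ {value:,.0f} MPa")
print(f"pooled: E ~ {pooled_e:,.0f} MPa")
```

With six readings this is a toy. The point is only that once the readings live in one place instead of in separate reports, the same few lines work on all of them at once.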

These scenarios are not far-fetched at all. I went to a presentation two years ago on forecasting flooding from Hurricane Sandy-type events in the New York area. The room was filled with engineers and a presenter from Stevens Institute of Technology. The presenter said they run several wave function calculations to help state governments like New Jersey and New York predict where the flooding is going to occur and how severe it will be.

After the presentation I asked him where they get their data from and he said from a group of computers tied together in a cluster.

The Future

The reality is that more Infrastructure companies are collecting ever-increasing amounts of data. They’re using drones to do bridge inspections, tying river gauges together via the Internet, and using more sensors than ever. These sensors (aka IoT) collect and stream this data somewhere. In the old days it was an Access database. Today it’s a more robust database, and one day it will be a big Hadoop cluster. The average Civil Engineer of my time hasn’t heard of a Hadoop cluster, but they’ve heard of Big Data and wonder what it’s about.

Soon they’ll crush the silos of their data stores, unlock innovation, and build their own clusters.

Imagine the world we can build then.
