Making AI Happen Without Getting Fired

From H2O.ai

I watched Mike Gualtieri’s keynote presentation from H2O World San Francisco (2019) and found it very insightful from a non-technical, MBA-type perspective. The gist of the presentation is to look at all the business connections involved in doing data science. It’s not just about the problem at hand but about setting yourself up for success and, as he puts it, not getting fired!

My notes from the video are below (emphasis mine):

  • Set the proper expectations
  • There is a difference between Pure AI and Pragmatic AI
  • Pure AI is like what you see in movies (e.g., Ex Machina)
  • Pragmatic AI is machine learning. Highly specialized in one thing but does it really well
  • Choose more than one use case
  • The use case you choose could fail. Choose many different kinds
  • Drop the ones that don’t work and optimize the ones that do
  • Ask for comprehensive data access
  • Data will be in silos
  • Get faster with AutoML
  • Data Scientists aren’t expensive, they need better tools to be more efficient
  • Three segments of ML tools
    • Multimodel (drag and drop like RapidMiner/KNIME)
    • Notebook-based (like Jupyter Notebook)
    • Automation-focused (like Driverless AI)
  • Use them to augment your work, go faster
  • Warning: Data-savvy users can use these tools to build ML. Can be dangerous but they can vet use cases
  • Know when to quit
  • Sometimes the use case won’t work. There is no signal in the data and you must quit
  • Stop wasting time
  • Keep production models fresh
  • When code is written, it’s written the same way and runs the same forever
  • ML models decay, so you need to figure out how to keep them fresh at scale
  • Model staging, A/B testing, Monitoring
  • Model deployment via collaboration with DevOps
  • Get Business and IT engaged early
  • They have meetings with business and IT, get ducks in a row
  • Ask yourself how it is going to be deployed and how it will impact business processes
  • Ignore the model to protect the jewels
  • You don’t have to do what the model tells you to do (e.g., false positives)
  • Knowledge Engineering: AI and Humans working together
  • Explainability is important
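
To make the "keep production models fresh" note a bit more concrete, here is a minimal sketch of a decay check. It is my own illustration, not something shown in the keynote: the function name `needs_refresh`, the 5-point tolerance, and the idea of comparing recent accuracy to the accuracy measured at deployment are all assumptions.

```python
from statistics import mean


def needs_refresh(baseline_accuracy, recent_outcomes, tolerance=0.05):
    """Flag a production model whose recent accuracy has decayed.

    baseline_accuracy: accuracy measured when the model was deployed.
    recent_outcomes:   list of booleans, True where a production prediction
                       matched the actual outcome once it became known.
    tolerance:         hypothetical allowed drop before we call it decay.
    """
    if not recent_outcomes:
        return False  # no feedback yet, nothing to judge
    recent_accuracy = mean(1.0 if ok else 0.0 for ok in recent_outcomes)
    return (baseline_accuracy - recent_accuracy) > tolerance


# A model deployed at 90% accuracy that is now right 8 times out of 10
stale = needs_refresh(0.90, [True] * 8 + [False] * 2)
# The same model still right 9 times out of 10
fresh = needs_refresh(0.90, [True] * 9 + [False])
```

In practice a check like this would sit behind the monitoring step and trigger retraining or a staged A/B test rather than an automatic swap.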

Latest Musings from a Traveling Sales Engineer

Writers gonna write! February is shaping up to be a big traveling month for me. I was in Mountain View at the end of January for Sales Kick Off and then stayed for H2O World in San Francisco at the beginning of the month. Topping that off so far was a trip to Orlando for HIMSS last week. All incredible events but my traveling isn’t done yet! I plan to be all over the eastern seaboard in the coming week!

The cool thing about being a part of Sales Kick Off is that we got to interact directly with the Makers. They pumped us for info on what we hear from the field, the enhancements and the ‘nice to have’ features in both Driverless AI and the Open Source H2O. A Customer to Maker feedback loop is critical for survival. It makes for a happy customer and it pushes us Makers to innovate. If you don’t innovate, you die.

On the surface it’s a strange thing to have two seemingly opposed products at the company you work for. Why should you pay for a commercial product if you already have a great set of open source algorithms and tools to use? That’s a great question and, surprisingly, one we don’t face very often. Our open source software is what built our brand; it opened the door to customers.

Yes, 1,000s of organizations use our open source and will probably never become Enterprise customers, and a large percentage of them will never buy Driverless AI, but that’s ok.

It’s ok because they trust H2O to build and produce accurate and powerful models, for whatever they do. On the other hand, it’s the brand and reputation we built in Open Source that leads to Customers buying Driverless AI. They know and trust our fast customer response teams. They enjoy a recurring ‘touch point’, at a cadence they choose, with our Customer Success team to see ‘how’s it going’ and help ‘get models into production.’

The best part? Driverless AI isn’t just our Open Source product slapped with a new GUI, it’s a product that’s built completely from the ground up. Just like I wrote here:

H2O.ai brings that clarity in the confusing AI space in a very simple way. Create world class algorithms, build an open source platform, and create an automated modeling platform. Our algorithms are used by over 14,000 organizations and by other software vendors ‘under the hood.’ We took all that knowledge and poured it into our automated modeling platform. We asked our Kaggle Grandmasters if you had the chance to build the best automated modeling platform in the world, what would it look like?

From ‘Finding Clarity in the Automated Modeling Space’ via H2O.ai blog

Driverless AI is built by Kaggle Grandmasters to solve highly complex data science problems. It has safeguards and processes in place to deal with Feature Generation, Feature Selection, Target Leakage, Transformations, Validation, Hyperparameter Tuning, etc. It’s really quite an impressive platform.

Driverless AI tends to innovate at what feels like the speed of light. Recently, a new customer asked when we would have a confusion matrix available in the GUI. As it stood at the time, the confusion matrix was exposed in the log file. Good but not great. I went back to the Makers, filed a ticket, and in about two weeks it was available in the GUI. We have a hyperactive and intense group of Makers that are DYING for feedback to keep building cool stuff.

Crazy right?

In all my time as a Sales Engineer (which isn’t that long), I’ve found that listening to and prioritizing a customer’s needs and wishes is what builds your brand and creates a thriving community. Making money from that is really just an affirmation of your brand. After all, what else do you want in your life?

Makers Gonna Make and Writers Gonna Write!

Getting Started in Data Science Part 2

I’m finally getting around to writing Part 2 of Getting Started in Data Science. The first part can be found here. In it, I made suggestions for university students interested in the field of Data Science. I even made a video about it.

Pick Two, Master One

Pick two computer languages and become proficient in one and a master at the other one. Or, pick a platform like H2O-Flow or RapidMiner and a language. Become a master at one but proficient in the other. This way you can set yourself apart from other students or applicants. 
