May 5, 2017 Labeling Training Data Correctly

Labeling Training Data Correctly

When you’re dealing with a classification problem in machine learning, good labeled data is crucial. The more time you spend labeling training data correctly, the better, because how well your model performs, both in testing and once it’s deployed, depends on those labels. Always remember: garbage in, garbage out.
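To make "labeling correctly" a bit more concrete, here is a minimal sketch of one common sanity check: have several people label each item, take the majority vote, and flag the items where they disagree. This is my own illustration, not something from the podcast. The function name `aggregate_labels`, the `min_agreement` threshold of 0.6, and the sample image IDs are all made up for the example.

```python
from collections import Counter


def aggregate_labels(annotations, min_agreement=0.6):
    """Majority-vote the labels for each item and flag low-agreement items.

    annotations: dict mapping an item id to a list of labels, one per annotator.
    min_agreement: hypothetical cutoff; items whose winning label has a
        smaller share of the votes are flagged for manual review.
    Returns (labels, needs_review).
    """
    labels = {}
    needs_review = []
    for item_id, votes in annotations.items():
        if not votes:
            # No labels at all: nothing to vote on, so send it to review.
            needs_review.append(item_id)
            continue
        top_label, top_count = Counter(votes).most_common(1)[0]
        labels[item_id] = top_label
        # Share of annotators who chose the winning label.
        if top_count / len(votes) < min_agreement:
            needs_review.append(item_id)
    return labels, needs_review


# Made-up example: three annotators labeling three images.
annotations = {
    "img_001": ["car", "car", "car"],
    "img_002": ["car", "truck", "car"],
    "img_003": ["pedestrian", "cyclist", "car"],
}
labels, needs_review = aggregate_labels(annotations)
print(labels)
print(needs_review)
```

With this sample data only `img_003` is flagged, since no label gets more than one of its three votes. Items like that are the ones worth a second look before they go into a training set.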

Thoughts on labeling data

I recently listened to a great O’Reilly podcast on this subject. They interviewed Lukas Biewald, Chief Data Scientist and Founder of CrowdFlower. CrowdFlower provides its clients with top-notch labeled training data for various machine learning tasks, and they’re busy!

A few bits that caught my ear: how much of the training data goes into deep learning, and that they’re seeing more labeled image data for self-driving cars.

The best part of the interview was Lukas’s discussion of using a Raspberry Pi with TensorFlow! How cool is that?

The Podcast

https://soundcloud.com/oreilly-radar/data-preparation-in-the-age-of-deep-learning?in=oreilly-radar/sets/the-oreilly-data-show-podcast


Originally published at Neural Market Trends.


