Neural Market Trends

Fix Spelling Mistakes in Text Processing with RapidMiner

The RapidMiner Community has some really talented data scientists. I recently came across a response by Unicorn lionelderkrikor on how to fix general spelling errors using a bit of Python and RapidMiner.

The goal here was to correct things like “verrry goood!” to “very good!”, or “yah!” to “yes!”. These are the typical annoying text processing tasks that every data scientist needs to do time and time again.

RapidMiner did the heavy text processing, and Lionel used the Python TextBlob library to write two simple functions that corrected the majority of mistakes. Note, I said majority. In some cases, if you wrote ‘verrrrrrrrrrrrrrrrrrrrrrrrrrrrrryyyyyyyyyyyyyyyyyy goooooooooooooooooooooooooddddddddddddd’, the TextBlob library couldn’t figure it out, and I completely understand why. If you wrote that in a Tweet, I’d take away your smartphone and spank you with it.
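
Lionel's actual functions live in the Community post and aren't reproduced here. As a rough sketch of what two such functions might look like, here is one that squeezes runs of repeated letters and one that hands the result to TextBlob's spelling corrector. The function names `collapse_repeats` and `correct_spelling` are my own, and the regex step is an assumption on my part, not necessarily what Lionel did.

```python
import re


def collapse_repeats(text: str) -> str:
    """Shrink any run of 3+ identical letters down to 2 (hypothetical helper).

    English words rarely contain a letter three times in a row, so
    'goood' becomes 'good' and 'verrry' becomes 'verry'.
    """
    return re.sub(r"([A-Za-z])\1{2,}", r"\1\1", text)


def correct_spelling(text: str) -> str:
    """Collapse repeated letters, then let TextBlob attempt a correction."""
    # Imported here so collapse_repeats still works without TextBlob installed.
    from textblob import TextBlob

    squeezed = collapse_repeats(text)
    # TextBlob.correct() returns a new TextBlob with its best-guess spelling.
    return str(TextBlob(squeezed).correct())


if __name__ == "__main__":
    # Requires: pip install textblob
    print(correct_spelling("verrry goood!"))
```

The squeeze step only gets you part of the way ("verrry" is still "verry"), so the spelling corrector has to do the rest, and its guesses are not guaranteed to be right. That is consistent with "majority" rather than "all".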

Check out the Community post and grab Lionel’s XML to play with it yourself.
