
Orange 3 is impressive

I’ve been keeping a lazy eye on Orange over the years, and its (fairly) recent update has made it quite an impressive contender in the Data Science visual platform space. While it’s not RapidMiner, it does have a lot of great things going for it. First, its entire core was rewritten to tightly integrate with Scikit-Learn and Python. It has a decent time series add-on, which comes stock with ARIMA. It has a really good Text Processing add-on that gives the user finer control than RapidMiner’s, and it has a great GEO Map natively.

Sure, there is no production server or native Hadoop connectivity, but all of that can be worked around by creating a new Widget in Python that calls some Orange classes and exports Pickle files.
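To give a feel for the Pickle part, here is a minimal sketch using only Python's standard library. The `model` dict and the `score` function are hypothetical stand-ins for whatever trained model you would actually export; nothing here is Orange-specific.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a trained model: the parameters of a linear scorer.
model = {"weights": [0.5, -1.0], "bias": 0.25}

def score(model, row):
    """Apply the linear scorer to one row of feature values."""
    return sum(w * x for w, x in zip(model["weights"], row)) + model["bias"]

# Export the model to a Pickle file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Later, or in another Python process, load it back and score new data.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(score(restored, [2.0, 1.0]))
```

One caveat: unpickling can run arbitrary code, so only load Pickle files that come from a source you trust.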

Testing out Orange 3

I took a few minutes yesterday and this morning to build a simple Twitter Text Processing flow. Just like in RapidMiner, you have to connect Widgets together. Each Widget does a specific task. In this process, I connect to Twitter, do a search for ‘RapidMiner’, and extract the corpus. I use the NLTK package to do my stopword filtering and convert the text via TF-IDF. Whenever you connect to a widget, the process executes that widget, so you don’t have to hit ‘play’ all the time.
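If you want to see what the stopword filtering and TF-IDF steps boil down to outside of Orange, here is a rough sketch in plain Python. It makes a few assumptions: the tiny `STOPWORDS` set stands in for NLTK's much longer English list, the tweets are made up, and the weighting is the textbook formula (term frequency times the log of inverse document frequency), which is not necessarily the exact variant Orange's widget applies.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword set; NLTK's English list is much longer.
STOPWORDS = {"a", "an", "and", "the", "is", "it", "to", "of", "in", "for", "with", "i"}

def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf  = count of the term in the document / number of tokens in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Count each term once per document it appears in.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Made-up tweets, for illustration only.
tweets = [
    "RapidMiner is great for text mining",
    "Orange is great with Python",
    "Text mining in Python",
]
vectors = tf_idf(tweets)
```

Words that show up in only one tweet (like "rapidminer") end up with a higher weight than words shared across tweets (like "great"), which is the whole point of TF-IDF.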

From there I do two more things: I create a word cloud and run hierarchical clustering on the corpus.
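For the curious, the idea behind hierarchical clustering can be sketched in a few lines of plain Python. This is my own illustration, not what Orange runs internally: it assumes cosine distance and average linkage, the document vectors are invented term weights, and it stops at a fixed number of clusters instead of drawing a dendrogram the way the Orange widget does.

```python
import math

def cosine_distance(a, b):
    """1 - cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def agglomerate(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    # Start with every document in its own cluster.
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        # Average linkage: mean distance over all cross-cluster pairs.
        dists = [cosine_distance(vectors[i], vectors[j]) for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > n_clusters:
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]

# Invented term weights for four short documents.
docs = [
    {"orange": 1.0, "python": 0.5},
    {"orange": 0.8, "python": 0.7},
    {"rapidminer": 1.0, "server": 0.4},
    {"rapidminer": 0.9, "hadoop": 0.3},
]
result = agglomerate(docs, 2)
print(result)
```

The two Orange-flavored documents end up in one cluster and the two RapidMiner-flavored ones in the other, since documents with no shared terms sit at the maximum distance of 1.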

There is a pretty rich set of ETL Widgets too, and if you don’t find what you need, you can just use the Python Script widget to write your own code.

Some of the negatives I’ve encountered are that it crashes when I try to install too many add-ons and that it doesn’t feel stable enough on a Windows machine, but overall it’s quite impressive. I’m going to continue to tinker around with this software and write about it.
