Neural Market Trends

Working with the Instagram API, JSONPath, and RapidMiner

I used Scott’s excellent RapidMiner Instagram API tutorial to build one of my clients a simple hashtag/keyword tool for brand marketing. The problem with that process was that it used the Instagram API and needed access tokens. First, getting an access token from Instagram was an utter pain in the butt, and second, the Instagram API is being deprecated.

Fast forward six months and my niece in South Africa has started a TravelBlog site. She mostly posts her videos and photos to Instagram and was very interested in using the hashtag tool I built. So I made some changes for her and put it into production. A few times a week she uploads a spreadsheet to a shared folder and consumes the results via a spreadsheet in an output folder.

How does it work?

It works simply: I have a RapidMiner Server watching the upload folder, and when it sees a new spreadsheet, it triggers the process to extract hashtag metadata. In about 25 seconds, a new spreadsheet is written back to an output folder showing how popular the tags she chose are.
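
For readers who want a feel for the folder-watching idea outside of RapidMiner Server, here is a rough Python sketch of one polling pass. It is not how RapidMiner implements its trigger; the `find_new_spreadsheets` helper, the `.xlsx` extension, and the file names are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def find_new_spreadsheets(upload_dir, seen):
    """Return spreadsheets in upload_dir that were not seen before.

    `seen` is a set of file names that is updated in place, so calling
    this repeatedly (for example from a scheduler) only reports each
    file once.
    """
    new_files = sorted(
        p for p in Path(upload_dir).glob("*.xlsx") if p.name not in seen
    )
    seen.update(p.name for p in new_files)
    return new_files


# Demo in a throwaway folder standing in for the shared upload folder.
with tempfile.TemporaryDirectory() as tmp:
    seen = set()
    (Path(tmp) / "tags_week1.xlsx").touch()
    first = [p.name for p in find_new_spreadsheets(tmp, seen)]
    (Path(tmp) / "tags_week2.xlsx").touch()
    second = [p.name for p in find_new_spreadsheets(tmp, seen)]

print(first, second)
```

Each file returned by a pass would then be handed to the hashtag process, and the result written to the output folder.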


The next version of the tool was meant to incorporate keyword suggestions based on the tags she uploaded. So I started working on an updated process incorporating hypernyms and hyponyms from RapidMiner’s WordNet Extension. I built the entire process and started testing it. Then POOF! DISASTER!

Either I got myself rate-limited or the API just broke. Not sure which, but I’m leaning toward the former. Now what?

The solution

The solution came from extracting the JSON information associated with each hashtag by accessing the following URL: {HASHTAG}/?__a=1

I had to use some of the built-in RapidMiner functionality for working with JSONPath and I ended up learning some new tricks. The JSONPath online evaluator really helped me here.
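
To give a sense of what that extraction looks like, here is a small Python sketch rather than the RapidMiner process itself. The JSON layout and the field names (`data`, `hashtag`, `media_count`) are invented for illustration, since the real response is not shown in this post. The `lookup` helper is also hypothetical and only understands plain dotted paths, which is a small subset of JSONPath.

```python
import json

# Hypothetical payload: the real response layout is not shown in this post,
# so these field names are assumptions made for illustration only.
SAMPLE = '{"data": {"hashtag": {"name": "travel", "media_count": 12345}}}'


def lookup(document, path):
    """Resolve a simple dotted path such as '$.data.hashtag.media_count'.

    Only child-member steps are supported (no wildcards, filters, or
    array indexes), which is a small subset of JSONPath.
    """
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    node = document
    for key in path[1:].split("."):
        if not key:
            continue  # skip the empty piece before the first dot
        node = node[key]
    return node


doc = json.loads(SAMPLE)
count = lookup(doc, "$.data.hashtag.media_count")
print(count)
```

An online evaluator is still the quickest way to work out the right path against a real response; a helper like this only shows the idea of walking from the root down to one value.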

With a bit of tweaking the original hashtag tool was back in production and the day was saved.

However, this got me thinking about how I could be a better Internet citizen when it comes to extracting data from the Instagrams of the world. I think the solution would be to download the actual JSON file and maybe store it in a database. From there I could use a simple JSONPath expression to extract the hashtag count and store the results in another table.

I could even log a timestamp and with some cron scheduling, build up a comprehensive database for the growth and/or decline of hashtags.
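
As a rough sketch of that idea, the Python below keeps the raw JSON in one SQLite table and the extracted count, with its timestamp, in another. Everything here is an assumption for illustration: the table names, the `store_snapshot` helper, the JSON layout (`data.hashtag.media_count`), and the sample numbers are made up, and a plain dictionary lookup stands in for the JSONPath step.

```python
import json
import sqlite3
from datetime import datetime, timezone


def init_db(conn):
    # One table for the downloaded JSON, one for the extracted counts.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_json ("
        "id INTEGER PRIMARY KEY, hashtag TEXT NOT NULL, "
        "fetched_at TEXT NOT NULL, body TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hashtag_counts ("
        "raw_id INTEGER NOT NULL REFERENCES raw_json(id), "
        "hashtag TEXT NOT NULL, fetched_at TEXT NOT NULL, "
        "media_count INTEGER NOT NULL)"
    )


def store_snapshot(conn, hashtag, body, fetched_at=None):
    """Save the raw JSON, then extract and save the count with a timestamp."""
    fetched_at = fetched_at or datetime.now(timezone.utc).isoformat()
    cur = conn.execute(
        "INSERT INTO raw_json (hashtag, fetched_at, body) VALUES (?, ?, ?)",
        (hashtag, fetched_at, body),
    )
    # Hypothetical layout; stands in for the JSONPath extraction.
    count = json.loads(body)["data"]["hashtag"]["media_count"]
    conn.execute(
        "INSERT INTO hashtag_counts (raw_id, hashtag, fetched_at, media_count) "
        "VALUES (?, ?, ?, ?)",
        (cur.lastrowid, hashtag, fetched_at, count),
    )
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")
init_db(conn)
# Made-up snapshots a week apart, as a scheduled job might collect them.
store_snapshot(conn, "travel", '{"data": {"hashtag": {"media_count": 100}}}',
               "2019-01-01T00:00:00+00:00")
store_snapshot(conn, "travel", '{"data": {"hashtag": {"media_count": 130}}}',
               "2019-01-08T00:00:00+00:00")
history = conn.execute(
    "SELECT fetched_at, media_count FROM hashtag_counts "
    "WHERE hashtag = ? ORDER BY fetched_at",
    ("travel",),
).fetchall()
print(history)
```

Keeping the raw JSON means the counts can be re-extracted later if the path changes, without hitting the site again. A cron entry would simply run the download and `store_snapshot` step on a schedule.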

The majority of these processes are just ETL, and there is very little machine learning. However, by combining the new LDA operators with the excellent Text Processing RapidMiner has, I think I could come up with a better hashtag suggestion tool.
