Plugging in, Python, and RapidMiner

Python, Pyspark, Automation

I’ve struggled for a while trying to build an embedded visualization for my auto-generated blog posts. I tried D3js (javascript), Bokeh (python), and Plotly (various languages), and quickly got frustrated looking for something that would let me easily create and auto-embed a chart.

In general, D3js has no barriers to embedding, but it’s a pain in the butt to code javascript for a non-coder like me. Bokeh uses python, which is kind of nice since I know python, but it’s very hard to auto-embed a visualization on the fly. Plus, the generated visualization is thousands of lines of autogenerated code, and cutting and pasting that code into a markdown post is a no-no for me.

Next I investigated Plotly. While not 100% perfect, I liked it from the get-go. Its syntax is very easy to learn and you can code it using javascript, python, pandas, and R. Since I tend to avoid R, I tried coding against their javascript and python/pandas APIs. The same frustrations I had coding D3js came back with their javascript API, so I focused completely on their python/pandas API.

That was a success. When I wrote out the python/pandas code and then embedded it in my RapidMiner process (see below), I successfully generated a static PNG image from that process and auto-embedded it into my markdown post.

The only snag I ran into is that I needed to get an API token from Plotly to autogenerate the static image. You can see in the code below that I “X’d” it out, but it was pretty easy to get one once you create an account with Plotly.

If you check out the python code I put into the RapidMiner Execute Python operator, you’ll notice that I use macros to alter the name of the autogenerated files. This is crucial if I want to “set it and forget it” autoposting in a production sense (like using the RapidMiner Server), but that’s a post for another day.
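As a rough illustration of the macro idea on its own (a minimal sketch, not the process code from this post): the snippet below builds an image file name from a per-run value and then the markdown line that embeds it. The function names, the `chart` prefix, and the `/images` folder are hypothetical. Inside RapidMiner the per-run value would presumably come from a macro written as `%{macro_name}` in the script text; here it is just a plain string so the snippet runs anywhere.

```python
def image_filename(macro_value, prefix="chart"):
    """Build a PNG file name from a per-run value (hypothetical naming scheme)."""
    # Replace anything that is not a letter, digit, hyphen, or underscore,
    # so the name is safe to use both on disk and in a URL.
    safe = "".join(c if c.isalnum() or c in "-_" else "-" for c in macro_value)
    return f"{prefix}-{safe}.png"


def markdown_embed(filename, image_dir="/images", alt="chart"):
    """Return the markdown line that embeds the generated image."""
    return f"![{alt}]({image_dir}/{filename})"


# Stand-in for a RapidMiner macro value (assumption: a date-like string).
run_stamp = "2017-01-15"

name = image_filename(run_stamp)
print(name)                  # chart-2017-01-15.png
print(markdown_embed(name))  # ![chart](/images/chart-2017-01-15.png)
```

Because the file name depends only on the per-run value, each scheduled run writes a new image and the generated post points at the matching file, without any hand editing.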

Here’s the python code in RapidMiner: