Tag Keras


Introduction to Keras

This is a really great introductory video on Keras and how simple it makes calling complex deep learning libraries like TensorFlow. The 13-year-old author builds a great deep learning model in under 100 lines of code.

I do have some questions about the AUC score being 1 (that always raises red flags in my mind), but he does share the code on GitHub so everyone can follow along.

This is why I like Keras a lot: like H2O.ai, it makes the complex work of coding very simple and accessible.
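On that AUC of 1: AUC is the probability that a randomly chosen positive example gets a higher score than a randomly chosen negative one, so a value of 1 means the model never misranks a single pair. The sketch below is only an illustration of that definition, not code from the video; the function name `auc_score` and the toy scores are made up.

```python
def auc_score(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Hypothetical toy data: every positive outscores every negative.
labels = [0, 0, 1, 1]
perfect = auc_score(labels, [0.1, 0.2, 0.8, 0.9])
# Same labels, but one negative is scored above one positive.
imperfect = auc_score(labels, [0.1, 0.85, 0.8, 0.9])
print(perfect, imperfect)
```

On real data a single misranked pair is enough to pull the score below 1, which is why a perfect score usually makes me look for leakage or an overly easy test set.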


Deep Learning with TensorFlow and Keras

A really good technical introduction to deep learning, based on @AndrewYNg's Coursera course, that touches on stochastic gradient descent for optimization, which is really cool in my book. Sample processes are provided in the original article, but I think you should just take the Coursera course. :)

The following problems are taken from a few assignments from the Coursera courses Introduction to Deep Learning (by Higher School of Economics) and Neural Networks and Deep Learning (by Prof Andrew Ng, deeplearning.ai). The problem descriptions are taken straightaway from the assignments.
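To make the stochastic gradient descent idea mentioned above concrete, here is a minimal sketch that fits a straight line by updating the weights on one shuffled example at a time. It is not taken from the article or the course assignments; the function name, learning rate, and toy data are all assumptions chosen for illustration.

```python
import numpy as np

def sgd_linear_regression(X, y, lr=0.05, epochs=200, seed=0):
    """Fit y ~ X @ w + b by updating on one shuffled example at a time."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            error = X[i] @ w + b - y[i]   # prediction error on one sample
            w -= lr * error * X[i]        # gradient of 0.5 * error**2 w.r.t. w
            b -= lr * error               # gradient of 0.5 * error**2 w.r.t. b
    return w, b

# Hypothetical noiseless data generated from y = 3x + 1.
X = np.linspace(-1.0, 1.0, 20).reshape(-1, 1)
y = 3.0 * X[:, 0] + 1.0
w, b = sgd_linear_regression(X, y)
print(w, b)
```

The "stochastic" part is the per-example update in shuffled order; batch gradient descent would instead average the gradient over all 20 points before each step.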


Installing Keras Deep Learning with RapidMiner

If you watched this great introduction on Deep Learning with Keras and RapidMiner here, you probably want to try it out! I'll warn you though, it's a bit tricky to get this up and running on a Windows machine.

The best place to read up on how to do it is in this KB thread at the RapidMiner Community. Installing Keras is heavily dependent on the Python version, additional Python modules, and the operating system version too. Of course, if you get stuck, the KB article is a great place to post your questions.


Introduction to Deep Learning

This is a great introduction to Deep Learning. I know I learned a few things from Phillip.

Some key concepts

  • RapidMiner can now do GPU deep learning. Supports NVIDIA.
  • Easy: Using already loaded Neural Net operators.
  • Harder: Using H2O.ai Deep Learning operator.
  • Hardest: Using Keras with RapidMiner.
  • Keras requires more complex setup with RapidMiner.
  • CNN, RNN, LSTM, etc. are now available via RapidMiner GUI.
  • Keras supports TensorFlow, CNTK, and Theano.
  • Need Python v3.5 (see installation guide here).
  • 'Deep' discussion on Recurrent and Convolutional.


Here's a nice graphic from KDnuggets on some basic activation functions. These activation functions are what switch a neuron on or off in a neural net model.
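For a feel of what these functions do to a neuron's input, here is a small NumPy sketch. I'm assuming the usual three (sigmoid, tanh, and ReLU); the graphic itself may show a different set, and the sample inputs are arbitrary.

```python
import numpy as np

def sigmoid(x):
    # Squashes any input into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes any input into the range (-1, 1)
    return np.tanh(x)

def relu(x):
    # Passes positive inputs through and zeroes out negative ones
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
for name, fn in [("sigmoid", sigmoid), ("tanh", tanh), ("relu", relu)]:
    print(name, fn(x))
```

ReLU is the closest to a literal on/off switch, while sigmoid and tanh behave more like dimmers.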


Keras and NLTK

I've been doing a lot more Python hacking, especially around text mining and using the deep learning library Keras and NLTK. Normally I'd do most of my work in RapidMiner, but I wanted to do some grunt work and learn something along the way. It was really about educating myself on Recurrent Neural Networks (RNNs) and doing it the hard way, I guess.


As usual, I went to Google to do some sleuthing about how to text mine using an LSTM implementation in Keras, and boy did I find some goodies. The best tutorials are easy to understand and follow along with. My introduction to Deep Learning with Keras was via Jason's excellent tutorial called Text Generation with LSTM Recurrent Neural Networks in Python with Keras. Jason took a very easy-to-digest approach to using Keras to read in the Alice in Wonderland book character by character and then try to generate some text in the 'style' of what was written before. It was a great proof of concept but fraught with some strange results. He acknowledges that and offers some additional guidance at the end of the tutorial, mainly removing punctuation and training for more epochs.

The text processing is one thing but the model optimization is another. Since I have a crappy laptop I can just forget about optimizing a Keras script, so I went the text processing route and used NLTK. Now that I've been around the text mining/processing block a bunch of times, the NLTK Python library makes more sense in this application. I much prefer using the RapidMiner Text Processing implementation for 90% of what I do with text, but every so often you need something special and atypical.

Initial Results

The first results were terrible, as my tweet can attest to!

lol. the results of my #deeplearning #keras poetry model: "thethethethethethethethethethethethethethethethethethethethethethethethethethe"

— Thomas Ott (@neuralmarket) August 10, 2017

So I added a short function to Jason's script that preprocesses a new file loaded with haikus. I removed all punctuation and stop words with the express goal of generating haiku. While this script was learning I started to dig around the Internet for some other interesting and related posts on LSTMs, NLTK and text generation until I found Click-O-Tron. That cracked me up. Leave it to us humans to take some cool piece of technology and implement it for lulz.


I have grandiose dreams of using this script, so I would need to put it in production one day. This is where everything got to be a pain in the ass. My first thought was to run the training on a smaller machine and then use the trained weights to autogenerate new haikus in a separate script. This is not an atypical type of implementation. Right now I don't care if this will take days to train.

While Python is great in many ways, dealing with libraries on one machine might be different on another machine and hardware, especially when dealing with GPUs and stuff like that. It gets tricky and annoying considering I work on many different workstations these days. I have a crappy little Acer laptop with an AMD processor that I use to cron Python scripts for my Twitter related work. I do most of my 'hacking' on a larger laptop that happens to have an Intel processor. To transfer my scripts from one machine to another I always have to make sure that every single Python package is installed on each machine. PITA! Despite these annoyances, I ended up learning A LOT about Deep Learning architectures, their applications, and their shortcomings. In the end, it's another tool in a Data Science toolkit; just don't expect it to be a miracle savior.
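One way to take some of the sting out of the "is everything installed on this machine?" problem is a quick check before running anything. The sketch below uses only the standard library; the `missing_packages` helper is a hypothetical name, and the package list is simply the top-level packages the script further down imports.

```python
import importlib.util

def missing_packages(required):
    """Return the names in `required` that cannot be imported on this machine."""
    return [name for name in required if importlib.util.find_spec(name) is None]

# Top-level packages imported by the haiku script.
REQUIRED = ["numpy", "nltk", "keras"]

if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

This only tells you a package is present, not that the versions match between machines, so a pinned requirements file is probably still worth keeping alongside it.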

Additional reading list

  • http://h6o6.com/2013/03/using-python-and-the-nltk-to-find-haikus-in-the-public-twitter-stream/
  • https://github.com/fchollet/keras/blob/master/examples/lstm_text_generation.py

The Python Script


import numpy
import os
import sys
import nltk
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import LSTM
from keras.callbacks import ModelCheckpoint
from keras.utils import np_utils
import string
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import re

# look at https://gist.github.com/ameyavilankar/10347201#file-preprocess-py-L1

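# Build the stop word set once; requires the NLTK 'stopwords' corpus,
# e.g. via nltk.download('stopwords')
STOP_WORDS = set(stopwords.words('english'))

def preprocess(sentence):
    # lowercase, keep only word characters (drops punctuation), remove stop words
    sentence = sentence.lower()
    tokenizer = RegexpTokenizer(r'\w+')
    tokens = tokenizer.tokenize(sentence)
    filtered_words = [token for token in tokens if token not in STOP_WORDS]
    return " ".join(filtered_words)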

# load ascii text; preprocess() below converts it to lowercase
filename = "haikus.txt"
with open(filename) as f:
    sentence = f.read()
#raw_text = raw_text.lower()
#raw_text = nltk.sent_tokenize(raw_text)

sentence = preprocess(sentence)

#print (sentence)

raw_text = sentence

#print (raw_text)

# create mapping of unique chars to integers
chars = sorted(list(set(raw_text)))

#print (chars)
char_to_int = dict((c, i) for i, c in enumerate(chars))

#print (char_to_int)

n_chars = len(raw_text)
n_vocab = len(chars)
print ("Total Characters: ", n_chars)
print ("Total Vocab: ", n_vocab)

# prepare the dataset of input to output pairs encoded as integers
seq_length = 300
dataX = []
dataY = []
for i in range(0, n_chars - seq_length, 1):
    seq_in = raw_text[i:i + seq_length]
    seq_out = raw_text[i + seq_length]
    dataX.append([char_to_int[char] for char in seq_in])
    # store the next character as the training target
    dataY.append(char_to_int[seq_out])
n_patterns = len(dataX)
print ("Total Patterns: ", n_patterns)

# reshape X to be [samples, time steps, features]
X = numpy.reshape(dataX, (n_patterns, seq_length, 1))
# normalize
X = X / float(n_vocab)
# one hot encode the output variable
y = np_utils.to_categorical(dataY)

# define the LSTM model
model = Sequential()
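# return_sequences stays at its default (False) so the LSTM emits one vector
# per sequence, matching the one-hot target for the single next character
model.add(LSTM(256, input_shape=(X.shape[1], X.shape[2])))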
model.add(Dense(y.shape[1], activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam')

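# define the checkpoint; this filename pattern is an assumption, so adjust
# the name and extension to whatever your Keras version expects
filepath = "weights-improvement-{epoch:02d}-{loss:.4f}.hdf5"
checkpoint = ModelCheckpoint(filepath, monitor='loss', verbose=1, save_best_only=True, mode='min')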
callbacks_list = [checkpoint]

model.fit(X, y, epochs=3, batch_size=256, callbacks=callbacks_list)
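The script above stops after training. For the separate generation script, here is a rough sketch of the sampling loop. It is an assumption-heavy illustration rather than the code I actually ran: `predict_fn` stands in for something like `model.predict` on a model rebuilt with the same layers and restored with `model.load_weights(...)`, and `sample_index`, `generate`, and `fake_predict` are hypothetical names. The temperature idea follows the Keras text generation example in the reading list.

```python
import numpy as np

def sample_index(probs, temperature=1.0, rng=None):
    """Draw an index from `probs` after rescaling by `temperature`.

    Low temperatures approach argmax; higher ones add variety, which can
    help with loops like "thethethe".
    """
    if rng is None:
        rng = np.random.default_rng()
    logits = np.log(np.asarray(probs, dtype="float64") + 1e-12) / temperature
    weights = np.exp(logits - logits.max())
    return int(rng.choice(len(weights), p=weights / weights.sum()))

def generate(predict_fn, seed, int_to_char, n_vocab, length=50,
             temperature=1.0, rng=None):
    """Generate `length` characters from a seed list of integer-encoded chars."""
    pattern = list(seed)
    out = []
    for _ in range(length):
        # Same shape and scaling as the training input: [1, time steps, 1]
        x = np.reshape(pattern, (1, len(pattern), 1)) / float(n_vocab)
        probs = predict_fn(x)[0]
        idx = sample_index(probs, temperature, rng)
        out.append(int_to_char[idx])
        pattern = pattern[1:] + [idx]   # slide the window forward
    return "".join(out)

# Stand-in for the trained network: uniform probabilities over 3 characters.
def fake_predict(x):
    return np.full((1, 3), 1.0 / 3.0)

int_to_char = {0: "a", 1: "b", 2: " "}
text = generate(fake_predict, seed=[0, 1, 2, 0], int_to_char=int_to_char,
                n_vocab=3, length=20, rng=np.random.default_rng(0))
print(repr(text))
```

With the real model, `int_to_char` would be the inverse of `char_to_int` and the seed would be a `seq_length`-long slice of the encoded haiku text.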


Neural Market Trends is the online home of Thomas Ott.