Building an AI financial market model – Lesson III

Welcome back! In this lesson we’ll explain how to build your first YALE experiment. Building the experiment takes just a few mouse clicks but how you put the pieces together requires a little thought.

Although this lesson isn’t hard, you have to build your YALE experiment logically and put all its “operators” in the right order. If you don’t, your experiment will crash or, worse, give you weird results. As I said in Lesson I, YALE has a very steep learning curve, and I spent many hours pulling out my hair trying to figure out how to make it work.

A basic YALE experiment consists of the following operators:

  1. The root
  2. A data loader
  3. A data visualizer
  4. A data validator
  5. A model creator
  6. A model writer
  7. A model applier
  8. A performance evaluator
  9. Final Experiment

The Root

Every YALE experiment has a Root operator. The Root operator is simply where your experiment starts out. Your entire YALE experiment will “roll up” to this operator in the end.

Now, click on the YALE icon to load the program and select “Blank Experiment.” Once you’ve done that, your screen should look like this:

YALE Root

Data Loader

A blank YALE experiment won’t do us any good; we have to fill it up with operators to help us analyze data and build a model! Before we can build a model, we need to load in our data. Without the data there is nothing to analyze and nothing to build a model from, duh!

YALE requires you to have a data loading operator in your experiment “tree”. Since we are using an Excel spreadsheet we need to use an Excel compatible data loader.

Do the following steps: Right click the Root Operator > Select New Operator > IO > Examples > ExcelExampleSource.

YALE Data Loader

You should then see the ExcelExampleSource operator directly below your Root operator.

Data Visualizer

Once you’ve selected your data loader, you want to add another operator that will let you manipulate and visualize your data.

Tip: This operator is not necessary, but I find it useful for seeing patterns in the data. Skip this step if you want to.

Once again do the following steps: Right click the Root Operator > Select New Operator > Visualization > ExampleVisualizer.

YALE Data Visualizer

Once you click the ExampleVisualizer you’ll see it directly below your ExcelExampleSource operator. When you run the YALE experiment, the data gets loaded in and the ExampleVisualizer lets you click on any data visualizations YALE creates for you for more information. I like this operator a lot because it lets me find specific dates in the data charts when I see anything out of the ordinary.

Data Validator

The next step, and a very important one, is to split your data into training and validation sets. When you build a model, you build it from training data. You essentially train the model to learn the relationships between your input data and your output data. The validation set is used to test the trained model on data it has not seen, to make sure it makes sense. YALE has an operator that automatically splits your data randomly into training and validation sets and then feeds those sets into the learning algorithm.

To find the Data Validation operator, do the following steps: Right click the Root operator, select New Operator > Validation > XValidation.

YALE Data Validator
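If the training/validation idea feels abstract, here is a minimal Python sketch of a random k-fold split, the general technique behind cross-validation. This is not YALE's code: the function name `k_fold_splits`, the fold count, and the seed are all assumptions made up for illustration.

```python
import random

def k_fold_splits(examples, k=5, seed=42):
    """Yield (training, validation) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; k and seed are not YALE settings.
    """
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)  # random split, repeatable via the seed
    # Deal the shuffled examples into k folds of near-equal size.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        # Everything outside the held-out fold becomes training data.
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        yield training, validation

rows = list(range(20))  # stand-in for 20 rows of spreadsheet data
for training, validation in k_fold_splits(rows, k=5):
    print(len(training), "training rows,", len(validation), "validation rows")
```

Each row lands in the validation set exactly once, so every row gets used both to train and to test, just never in the same round.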

Model Creator

Next we’ll add in the model creator, also known as the learning algorithm. Before we can place the learning algorithm, we have to create a “split” in the experiment. We’re doing this because we’ll use the same XValidation data set (training and validation) to build the model and test its performance later.

To do this we have to use something called an Operator Chain. An Operator Chain, in my dictionary, is nothing more than a fancy name for a branch in your experiment tree.

To find the Operator Chain, do the following steps: Right click on the XValidation operator > Select New Operator > Operator Chain.

YALE Operator Chain

Now, the step we’ve all been waiting for! We’re going to place the learning algorithm into our experiment. YALE has several different algorithms available for you to use. Some are regression based, others use machine learning, and the one we’ll use is a classification algorithm.

A classification algorithm takes your data and classifies your output data into categories based on your input. Huh? This simply means that the algorithm takes your data and groups it into similar categories (this is very handy when modeling trends because you want to find emerging trends before everyone else does).

When you use prediction data (when you want to see if the trend has changed) against your model, YALE will look for similar patterns and then give you the output signal (UP, DOWN, RANGE, etc) based on the categories your model learned.

The learning algorithm we’ll use is a classifier called “IBk”. To install it into your experiment, Right click the Operator Chain > Select New Operator > Learner > Lazy > IBk.

YALE Model Creator
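IBk is Weka's k-nearest-neighbour learner: to classify a new row, it looks up the k most similar rows it has already seen and takes a vote. Below is a minimal Python sketch of that idea, not Weka's or YALE's implementation. The function name `knn_predict`, the two input features, and every number in the data are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(training, query, k=3):
    """Return the majority label among the k training rows nearest to query.

    training is a list of (features, label) pairs; hypothetical helper.
    """
    by_distance = sorted(training, key=lambda ex: math.dist(ex[0], query))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Invented rows: two numeric inputs per row, plus the trend label.
training = [
    ((2.0, 3.0), "UP"),
    ((1.8, 2.5), "UP"),
    ((-2.0, -3.1), "DOWN"),
    ((-1.5, -2.2), "DOWN"),
    ((0.1, -0.1), "RANGE"),
    ((-0.2, 0.2), "RANGE"),
]

print(knn_predict(training, (1.9, 2.8)))    # UP
print(knn_predict(training, (-1.8, -2.5)))  # DOWN
```

This is why the learner sits under “Lazy”: nothing is really computed at training time, and all the work happens when a prediction is requested.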

Model Writer

Now, after your model is learned, you want to write it to a file. This way you can load it anytime you want and test new data against it.

To install the Model Writer, right click your Operator Chain > New Operator > IO > Models > ModelWriter.

YALE Model Writer
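The write-once, reload-later idea can be sketched in Python with the standard `pickle` module. YALE's ModelWriter uses its own file format, so treat this purely as an analogy: the helper names, the file name `gold_model.pkl`, and the contents of the `model` dictionary are all assumptions.

```python
import os
import pickle
import tempfile

def write_model(model, path):
    """Save a learned model to disk (hypothetical helper)."""
    with open(path, "wb") as f:
        pickle.dump(model, f)

def read_model(path):
    """Load a previously saved model back into memory (hypothetical helper)."""
    with open(path, "rb") as f:
        return pickle.load(f)

# A stand-in "model": for a lazy learner, it is mostly the stored training rows.
model = {
    "learner": "k-nearest-neighbour",
    "k": 3,
    "training": [((2.0, 3.0), "UP"), ((-2.0, -3.1), "DOWN")],
}

path = os.path.join(tempfile.mkdtemp(), "gold_model.pkl")
write_model(model, path)
restored = read_model(path)
print(restored == model)  # True
```

Only load pickle files you created yourself; unpickling a file from an untrusted source can run arbitrary code.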

Great! The model portion of the experiment is now done! If you really wanted to, you could learn a model right now, but how would you know if it’s any good? We need some performance measures so we can determine if our model is good enough to make some predictions! That’s handled in the Model Applier and Performance Evaluator sections below!

Model Applier

Once again we need to create a branch in our experiment, this time to handle the performance evaluation. We’ll split off from the same place as before, the XValidation operator. Follow the same steps as in the Model Creator section to install a new Operator Chain:

Right click on the XValidation operator > Select New Operator > Operator Chain.

YALE Operator Chain

Once you’ve done that, we have to install an operator that will take your newly learned model and apply it to the XValidation data sets. This operator is called, surprisingly, a Model Applier.

Right click on the second Operator Chain > Select New Operator > ModelApplier.

YALE Model Applier
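To see why XValidation needs two branches, here is a minimal Python sketch of a cross-validation loop: the first step learns a model from the training rows, and the second applies it to the held-out rows and scores the result. It is a sketch under assumptions, not YALE's internals. The learner is deliberately trivial (it always predicts the most common label) and all function names and data are invented.

```python
from collections import Counter

def learn(training):
    """First branch: learn a 'model'. Here it is just the most common label."""
    labels = [label for _, label in training]
    return Counter(labels).most_common(1)[0][0]

def apply_model(model, examples):
    """Second branch, step one: predict a label for every held-out row."""
    return [model for _ in examples]

def evaluate(predictions, examples):
    """Second branch, step two: fraction of predictions that were correct."""
    correct = sum(p == label for p, (_, label) in zip(predictions, examples))
    return correct / len(examples)

def cross_validate(examples, k):
    """Run both branches once per fold and average the scores."""
    folds = [examples[i::k] for i in range(k)]
    scores = []
    for i, validation in enumerate(folds):
        training = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        model = learn(training)                       # first branch
        predictions = apply_model(model, validation)  # second branch
        scores.append(evaluate(predictions, validation))
    return sum(scores) / len(scores)

# Invented data: 8 rows labelled UP followed by 4 labelled DOWN.
examples = [((i,), "UP") for i in range(8)] + [((i,), "DOWN") for i in range(4)]
score = cross_validate(examples, k=4)
print(round(score, 3))  # 0.667
```

The model is never scored on rows it was trained on, which is the whole point of splitting the experiment into a learning branch and an applying branch.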

Performance Evaluator

The last operator you’ll need is the Performance Evaluator. This operator gives you the option to see the model’s prediction accuracy, correlation, squared correlation, and a host of other performance measures. This lets you know, right off the bat, if your model is any good. I would never build a YALE experiment without some sort of performance evaluation measures!

YALE Performance Evaluator
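For a feel of what those measures are, here is a minimal Python sketch of prediction accuracy, correlation, and squared correlation. The formulas are the standard textbook ones (Pearson correlation), which I am assuming match what YALE reports; the function names and all the data are made up.

```python
import math

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)

def correlation(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented classification results: four of five predictions are right.
actual_labels = ["UP", "UP", "DOWN", "RANGE", "DOWN"]
predicted_labels = ["UP", "DOWN", "DOWN", "RANGE", "DOWN"]
print(accuracy(actual_labels, predicted_labels))  # 0.8

# Invented numeric results, for the correlation measures.
actual_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.5, 1.9, 3.2, 3.8]
r = correlation(actual_values, predicted_values)
print(r, r ** 2)  # correlation and squared correlation
```

Accuracy fits label outputs like UP and DOWN, while the correlation measures only make sense when the model predicts numbers.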

Final Experiment

If you’ve followed along and done everything correctly, your final experiment framework should look like the image below. If it doesn’t, then please go back and make fixes. Any deviation from this framework could produce experiment errors, which cause baldness.

YALE Final Experiment
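If the screenshot is hard to read, the same tree can be written down as a nested structure. The Python sketch below only mirrors the operator order described in this lesson; it is not YALE's XML format. Placing the Performance Evaluator inside the second Operator Chain, right after the Model Applier, is my assumption, and the `render` helper is invented for illustration.

```python
# Each node is (operator name, list of child nodes).
EXPERIMENT = ("Root", [
    ("ExcelExampleSource", []),
    ("ExampleVisualizer", []),
    ("XValidation", [
        ("OperatorChain", [            # first branch: learn and save the model
            ("IBk", []),
            ("ModelWriter", []),
        ]),
        ("OperatorChain", [            # second branch: apply and score the model
            ("ModelApplier", []),
            ("PerformanceEvaluator", []),  # assumed position
        ]),
    ]),
])

def render(node, depth=0):
    """Return the tree as a list of indented lines, one per operator."""
    name, children = node
    lines = ["  " * depth + name]
    for child in children:
        lines.extend(render(child, depth + 1))
    return lines

tree_lines = render(EXPERIMENT)
print("\n".join(tree_lines))
```

Reading the printout top to bottom gives the order in which the operators run, which is a quick way to check your own tree against the lesson.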

There you have it, you’ve built your first YALE experiment! Of course you can cheat and just download the XML file here (in zip format): Gold XML

In Lesson IV we’ll cover setting your preferences and running the model for the first time. In Lesson V, we’ll cover how to interpret the results and check out some of the data visualization capabilities YALE has.

Thanks so much for attending the Neural Market Trends University! Please feel free to leave me a comment if you have any questions.

  • Michael Barck

    This description is totally helpful – thank you very much! I have just begun to get the ideas behind YALE and must admit the software is extremely powerful but also slightly hard to use for beginners. But after reaching a certain point in knowledge, I would say it is both more powerful and more useful than standard data mining software like SAS or SPSS. And a lot cheaper, too.

    Reading your detailed tutorial on this basic YALE experiment, together with the help facilities built into YALE itself, lets me make a better start on my own experiments. It would be great if you could write future tutorials to help others get started with this fascinating topic.

    Best regards,
    Michael

  • http://www.neuralmarkettrends.com Tom

    Michael,

    Thanks for your kind words! It took me a long time to learn to build a basic model in YALE. After losing quite a bit of hair, I decided to share my knowledge with other people to help them “jump start” their YALE learning.

    Please check back next week when I post Lesson IV and over time as I show readers how to build a prediction experiment and a genetic algorithm experiment.

  • Shane B

    In the latest version of RapidMiner, I don’t see the classifier IBk that you describe (Operator Chain > Select New Operator > Learner > Lazy > IBk.) I do however see one called W-IBk here: Operator Chain > Select New Operator > Learner > Supervised > Weka > Lazy > W-IBk. Is this classifier suitable for this tutorial?

  • http://www.neuralmarkettrends.com Tom

    Shane,

    You are correct, they moved the IBk operator to Operator Chain > Select New Operator > Learner > Supervised > Weka > Lazy > W-IBk.

    I’m still not 100% sold on RapidMiner. I still have issues with it. You might want to consider downloading YALE 3.4.

  • Matt

    My hair, I was losing it!! By luck I’ve found your great tutorial.

    P.S. a tutorial for the text plugin of YALE (aka Rapidminer) would stop my hair’s fall better than the floor :-)

  • http://www.neuralmarkettrends.com Tom

    Matt, you’re welcome. Check back tomorrow and I should have Lesson I of the Evolutionary Weighting Tutorial up.

  • Filip

    Thank you for your work -> you will have good karma :-)

  • http://www.neuralmarkettrends.com Tom

    Thank you Filip.

  • Kay

    The manual is fantastic!

    I just downloaded the latest version of RapidMiner, but the screenshots no longer correspond to the latest version of the product.
    Are updated screenshots available somewhere so I can follow your example?

  • http://www.neuralmarkettrends.com Tom

    You can check out my video tutorials, they are in sync with the latest RM download.