Welcome back! In this lesson we’ll explain how to build your first YALE experiment. Building the experiment takes just a few mouse clicks, but how you put the pieces together requires a little thought.
Although this lesson isn’t hard, you have to build your YALE experiment logically and put all of its “operators” in the right order. If you don’t, your experiment will crash or, worse, give you weird results. As I said in Lesson I, YALE has a very steep learning curve, and I spent many hours pulling out my hair trying to figure out how to make it work.
A basic YALE experiment consists of the following operators:
- The root
- A data loader
- A data visualizer
- A data validator
- A model creator
- A model writer
- A model applier
- A performance evaluator
- Final Experiment
Every YALE experiment has a Root operator. The Root operator is just where your experiment starts out. Your entire YALE experiment will “roll up” to this operator in the end.
Now, click on the YALE icon to load the program and select “Blank Experiment.” Once you’ve done that, your screen should look like this:
A blank YALE experiment won’t do us any good; we have to fill it with operators to help us analyze data and build a model! Before we can build a model, we need to load in our data. Without the data we can’t analyze anything or build a model, duh!
YALE requires you to have a data loading operator in your experiment “tree.” Since we are using an Excel spreadsheet, we need to use an Excel-compatible data loader.
Do the following steps: Right click the Root Operator > Select New Operator > IO > Examples > ExcelExampleSource.
You should then see the ExcelExampleSource operator directly below your Root operator.
Once you’ve selected your data loader, you want to add another operator that will let you manipulate and visualize your data.
Tip: This operator is not necessary, but I find it useful for seeing patterns in the data. Skip this step if you want to.
Once again do the following steps: Right click the Root Operator > Select New Operator > Visualization > ExampleVisualizer.
Once you click the ExampleVisualizer, you’ll see it directly below your ExcelExampleSource operator. When you run the YALE experiment, the data gets loaded in and the ExampleVisualizer lets you click on any data visualizations YALE creates for you for more information. I like this operator a lot because it lets me find specific dates in the data charts when I see anything out of the ordinary.
The next step, and a very important one, is to split your data into something called training and validation sets. When you build a model, you build it from training data. You essentially train the model to learn the relationships in your input data that explain the output data. The validation set is used to test the trained model to make sure it makes sense. YALE has an operator that automatically splits your data randomly into training and validation sets and then feeds those sets into the learning algorithm.
To find the Data Validation operator, do the following steps: Right click the Root operator, select New Operator > Validation > XValidation.
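To make the splitting idea concrete, here is a minimal Python sketch of a random k-fold split. It assumes XValidation does something along the lines of k-fold cross-validation; the function name `k_fold_splits` and its parameters are made up for illustration and are not YALE settings.

```python
import random

def k_fold_splits(n_rows, k=10, seed=42):
    """Yield (training_indices, validation_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)       # assign rows to folds at random
    folds = [indices[i::k] for i in range(k)]  # k roughly equal slices
    for i, validation in enumerate(folds):
        # Every fold except the current one becomes training data.
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

for training, validation in k_fold_splits(n_rows=10, k=5):
    print(len(training), "training rows,", len(validation), "validation rows")
```

With 10 rows and 5 folds, each round trains on 8 rows and validates on the remaining 2, and every row ends up in a validation set exactly once.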
Next we’ll add in the model creator, also known as the learning algorithm. Before we can place the learning algorithm, we have to create a “split” in the experiment. We’re doing this because we’ll use the same XValidation data set (training and validation) to build the model and test its performance later.
To do this we have to use something called an Operator Chain. An Operator Chain, in my dictionary, is nothing more than a fancy name for a branch in your experiment tree.
To find the Operator Chain, do the following steps: Right click on the XValidation operator > Select New Operator > Operator Chain.
Now, the step we’ve all been waiting for! We’re going to place the learning algorithm into our experiment. YALE has several different algorithms available for you to use. Some are regression based, others use machine learning, and the one we’ll use is a classification algorithm.
A classification algorithm takes your data and classifies your output data into categories based on your input. Huh? This simply means that the algorithm takes your data and groups it into similar categories (this is very handy when modeling trends because you want to find emerging trends before everyone else does).
When you run prediction data against your model (when you want to see if the trend has changed), YALE will look for similar patterns and then give you the output signal (UP, DOWN, RANGE, etc.) based on the categories your model learned.
The learning algorithm we’ll use is a classifier called “IBk.” To install it into your experiment, right click the Operator Chain > Select New Operator > Learner > Lazy > IBk.
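As far as I know, IBk is the k-nearest-neighbour learner that comes from the Weka library, which is why it sits under the “Lazy” menu. The Python sketch below shows the general idea only, not YALE’s implementation: the function `knn_predict` and the toy data are made up for illustration.

```python
from collections import Counter
import math

def knn_predict(training_rows, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    # training_rows is a list of (features, label) pairs.
    nearest = sorted(training_rows, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy, made-up data: two input features -> trend label.
training_rows = [
    ((1.0, 1.2), "UP"),
    ((0.9, 1.0), "UP"),
    ((1.1, 0.8), "UP"),
    ((-1.0, -0.9), "DOWN"),
    ((-1.2, -1.1), "DOWN"),
    ((-0.8, -1.0), "DOWN"),
]

print(knn_predict(training_rows, (0.8, 0.9)))    # lands among the UP rows
print(knn_predict(training_rows, (-0.9, -1.2)))  # lands among the DOWN rows
```

A lazy learner like this does no real work at training time; it simply stores the examples and compares new rows against them when asked for a prediction.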
Now, after your model is learned, you want to write it to a file; this way you can load it anytime you want and test new data against it.
To install the Model Writer, right click your Operator Chain > New Operator > IO > Models > ModelWriter.
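The write-now, load-later idea looks roughly like this in Python. YALE’s ModelWriter uses its own file format, so JSON here is only a stand-in, and the model contents and the file name `model.json` are hypothetical.

```python
import json
import os
import tempfile

# A lazy learner's "model" is essentially its stored training examples.
model = {
    "k": 3,
    "examples": [
        {"features": [1.0, 1.2], "label": "UP"},
        {"features": [-1.0, -0.9], "label": "DOWN"},
    ],
}

# Hypothetical file name inside a fresh temporary directory.
model_path = os.path.join(tempfile.mkdtemp(), "model.json")

with open(model_path, "w") as handle:
    json.dump(model, handle)      # write the learned model to disk

with open(model_path) as handle:
    reloaded = json.load(handle)  # later: load it back to test new data

print(reloaded == model)
```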
Great! The model portion of the experiment is now done! If you really wanted to, you could learn a model right now, but how would you know if it’s any good? We need some performance measures so we can determine if our model is good enough to make some predictions! That’s handled in the Model Applier and Performance Evaluator section; see below!
Once again we need to create a branch in our experiment to handle the performance evaluation. To do that, we’ll need to split off in the same place as before; we’ll split off from the XValidation operator. Follow the steps in the Data Validation section to install a new Operator Chain.
Right click on the XValidation operator > Select New Operator > Operator Chain.
Once you’ve done that, we have to install an operator that will take your newly learned model and apply it to the XValidation data sets. This operator is called, surprisingly, a Model Applier.
Right click on the second Operator Chain > Select New Operator > ModelApplier.
The last operator you’ll need is the Performance Evaluator. This operator gives you the option to see the model’s prediction accuracy, correlation, squared correlation, and a host of other performance measures. This lets you know, right off the bat, if your model is any good. I would never build a YALE experiment without some sort of performance evaluation measures!
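For a feel of what those measures mean, here is a small Python sketch of accuracy, correlation, and squared correlation. These are the textbook definitions, assumed to match what the Performance Evaluator reports; the function names and sample values are made up.

```python
import math

def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)

def correlation(xs, ys):
    """Pearson correlation between two equally long numeric sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

actual = ["UP", "DOWN", "DOWN", "RANGE"]
predicted = ["UP", "DOWN", "UP", "RANGE"]
print(accuracy(actual, predicted))  # 3 of 4 correct -> 0.75

r = correlation([1.0, 2.0, 3.0, 4.0], [1.5, 1.9, 3.2, 3.8])
print(r, r ** 2)  # correlation and squared correlation
```

Accuracy suits label predictions such as UP and DOWN, while correlation and squared correlation only make sense when the predicted values are numbers.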
If you’ve followed along and done everything correctly, your final experiment framework should look like the image below. If it doesn’t, then please go back and make fixes. Any deviation from this framework could produce experiment errors, which cause baldness.
There you have it: you’ve built your first YALE experiment! Of course, you can cheat and just download the XML file here (in zip format): Gold XML
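If you want to double-check your tree without the screenshot, the Python sketch below rebuilds the operator hierarchy from this lesson and prints it as an outline. Treat it as a sketch under assumptions: the nested `<operator name="..." class="...">` layout, the root class name `Experiment`, and the chain names `LearningChain` and `EvaluationChain` are my guesses, the class names are simply copied from the menu entries, and putting the Performance Evaluator in the second chain is my reading of the steps above. Compare it against the downloaded XML file before relying on it.

```python
import xml.etree.ElementTree as ET

def add_operator(parent, name, operator_class):
    """Attach a child <operator> element and return it."""
    return ET.SubElement(parent, "operator", {"name": name, "class": operator_class})

# Assumed layout: nested <operator> elements; the root class name is a guess.
root = ET.Element("operator", {"name": "Root", "class": "Experiment"})
add_operator(root, "ExcelExampleSource", "ExcelExampleSource")
add_operator(root, "ExampleVisualizer", "ExampleVisualizer")
validation = add_operator(root, "XValidation", "XValidation")

# First branch: learn the model and write it to a file.
learning_chain = add_operator(validation, "LearningChain", "OperatorChain")
add_operator(learning_chain, "IBk", "IBk")  # real class name may differ from the menu label
add_operator(learning_chain, "ModelWriter", "ModelWriter")

# Second branch: apply the model and measure its performance.
evaluation_chain = add_operator(validation, "EvaluationChain", "OperatorChain")
add_operator(evaluation_chain, "ModelApplier", "ModelApplier")
add_operator(evaluation_chain, "PerformanceEvaluator", "PerformanceEvaluator")

def outline(element, depth=0):
    """Return the operator names as an indented outline, one per line."""
    lines = ["  " * depth + element.get("name")]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

print("\n".join(outline(root)))
```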
In Lesson IV we’ll cover setting your preferences and running the model for the first time. In Lesson V, we’ll cover how to interpret the results and check out some of the data visualization capabilities YALE has.
Thanks so much for attending the Neural Market Trends University! Please feel free to leave me a comment if you have any questions.