Welcome back! In this lesson we will set our preferences to make sure everything loads in correctly before we create the model. By now, you should’ve read and built the experiment framework, as described in Lesson III. If you haven’t, then what I’ll post here might not make a lot of sense.
We’ll cover 4 items in this lesson:
- Data Loading Preferences
- Model Saving Preferences
- Performance Preferences
- Run the Experiment
Data Loading Preferences
Before you run the experiment for the first time, you have to tell it where to find your data. If you click on the ExcelExampleSource operator, you’ll see the following preferences and toggle box.
I highlighted the important preferences with red dots in the above image to avoid confusion. First, select your data spreadsheet (Gold Final Input.xls) from your file folders, check the “first_row_as_names” box, enter the number 9 into the label_column field, and enter the number 1 into the id_column field.
It’s important that you get this step right. What you are doing is telling the experiment where your output variable is located and what your reference id is. The label_column field is the Excel column number of your output variable and the id_column field should be your date column number. Remember this because you’ll have to fill these fields in for your other experiments.
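If it helps to see what those two numbers mean, here is a small Python sketch of the idea. It is not YALE code, and it assumes the column numbers count from 1, as the settings above suggest. The header row is made up for illustration: only the date column and GC Trend come from this lesson, and the "Input" names and the pick_columns helper are hypothetical.

```python
def pick_columns(header, label_column, id_column):
    """Translate 1-based column numbers into the column names they point at."""
    for name, number in (("label_column", label_column), ("id_column", id_column)):
        if not 1 <= number <= len(header):
            raise ValueError(f"{name}={number} is outside 1..{len(header)}")
    # Spreadsheet columns are counted from 1, Python lists from 0.
    return header[label_column - 1], header[id_column - 1]


# Hypothetical header row: date first, seven made-up inputs, GC Trend ninth.
header = ["Date"] + [f"Input {i}" for i in range(1, 8)] + ["GC Trend"]

label_name, id_name = pick_columns(header, label_column=9, id_column=1)
print("label:", label_name, "| id:", id_name)
```

If the label that comes back is not your output variable, the column number is off, which is exactly what the breakpoint check later in this lesson is meant to catch.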
Next, you should create a breakpoint in the experiment, which is nothing more than a pause in the experiment’s run. We’re doing this because we want the experiment to pause right after it’s loaded the data. Why? With a breakpoint at this point in the experiment, you can inspect the loaded data and make sure the experiment is reading your output variable correctly.
Tip: You can skip this step, but I highly advise that you don’t. You can create breakpoints at any step in the experiment if you choose, but they’re most valuable during the data loading stage.
Model Saving Preferences
Scroll down to the ModelWriter operator and click on it. You’ll see only one field, which lets you select the path where your model will be saved. Click on it > select your data directory > type “gold_final.mod” and hit enter. Done!
Performance Preferences
Now we reach the final step: setting the performance preferences. Scroll down to the Performance Evaluator operator and click on it. You should see several fields available with check boxes. Scroll down and check the absolute error, relative error, correlation, square correlation, accuracy, and classification error boxes. Make sure the field with the pull-down menu is set to correlation. Refer to the image below for the setup.
You’re done now. Let’s run the experiment!
Run the Experiment
This is the best part: all your hard work is about to pay off! Find the “play” button and click it!
The experiment should load your data in a flash and then reach the breakpoint we discussed. The experiment will automatically switch to the results screen, which should look like this:
This is where the fun in data analysis begins! This results screen (only if you used the breakpoint) will tell you what the model sees as your output variable (label column). If it’s not GC Trend, press the stop button, go back to the ExcelExampleSource operator, and check your preferences.
Take a moment and click on the “plot view” option. Here you can create scatter plots, self-organizing maps, or histograms to your heart’s content. Go ahead and create a scatter plot, choosing whatever you want for the X, Y, and Point Colors. YALE should automatically create a plot for you with several dots. These dots are from your id_column preference, in this case the date.
Remember we added in the data visualization operator? Doing this allows us to click on any one of those scatter points and find out more about that data point. Adding this operator lets you determine the data composition of outliers and/or specific information about a data point of your choosing.
When you’re all done, you’ll have to resume the experiment. Click on the resume button.
Now the experiment will create the model and determine its performance. This step could take a few minutes, depending on the size of your data. While you’re waiting, take a moment to subscribe to my RSS feed (shameless plug).
When the experiment finishes you should see the information in the results tab be replaced with the following screen:
I’m not going to discuss the importance of the statistical measures here, but I will tell you that in building a classification model like this one, a high correlation is good. The correlation can be positive or negative, and the closer it is to 1 (or -1), the better.
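For the curious, here is a rough Python sketch of what a correlation figure measures. It uses the textbook Pearson formula, which I'm assuming is close to what YALE reports (its exact calculation may differ), and the actual and predicted numbers are made up purely for illustration. Squaring the result gives the squared correlation.

```python
import math


def pearson(xs, ys):
    """Textbook Pearson correlation between two equal-length lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 values each")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Made-up values, for illustration only.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

r = pearson(actual, predicted)
print("correlation:", round(r, 3))
print("squared correlation:", round(r * r, 3))
```

Predictions that rise and fall in step with the actual values push the result toward 1, and predictions that move in the opposite direction push it toward -1.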
Congratulations! You’ve finished your first YALE experiment and built your first model! In Lesson V, I will show you how to build a prediction experiment and we’ll finally predict some current trends for Gold.
As always, if you have any questions regarding this lesson or the topics covered so far, please leave a comment or email me.