Building an AI financial market model – Lesson I

A lot of readers ask me how they can build their own neural net or AI learned model using the open source (free) Y.A.L.E. software. YALE is a very powerful data mining and machine learning program, but its learning curve is like jumping off a cliff and into an underwater cave; it’s deep and scary!

Have no fear, you too can build a financial model to help you identify trends in different equity, currency, or asset markets, in a few easy lessons. Follow along with Tom! I make modeling FUN!

First off, building a learned and predictive model requires a lot of “up front work” so in the first few lessons we’ll concentrate on the setup. Once the setup is complete, all you need to do in the future is update some data and then optimize the model again with a few mouse clicks.

Before we get started you’ll need to ask yourself the following questions:

  1. Do I have the right software?
  2. What do I want to model? What are my input and output variables?
  3. Do I have the financial data to model with?
  4. Is the data in the right format?

Do I have the right software?

Visit the YALE website and download the free software. Just follow the on-screen instructions and install the system to your hard drive. If you’re running Windows, like me, you’ll have to download the entire Windows executable file.

What do I want to model? What are my input and output variables?

This is the biggest question you’ll have to ask yourself. What exactly do you want to know? In the example I’ll use for this lesson, I posed the following question: “What is the current trend of Gold?” Once you know what you want to model, identifying your input and output variables becomes easier.

Whoa! Hold on Jack! What are input and output variables? Your output variable is nothing more than what you want to know. For example, if you want to know what the current trend of Gold is, your output variable might be a simple word such as UP, DOWN, or RANGE, or BUY, SELL, or HOLD. In the end you want to know if the data you analyze gives you an output of BUY, HOLD, UP, DOWN, etc.
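To make the idea of an output variable concrete, here is a minimal Python sketch of one way you could turn a series of closing prices into an UP, DOWN, or RANGE label. The rule itself (compare the latest close with the close a few periods back), the function name `trend_label`, and the `lookback` and `threshold` values are all my assumptions for illustration; this is not how YALE assigns labels.

```python
def trend_label(closes, lookback=4, threshold=0.02):
    """Label the latest close as UP, DOWN, or RANGE.

    Compares the latest close with the close `lookback` periods earlier.
    Both `lookback` and `threshold` are illustrative assumptions.
    """
    if len(closes) <= lookback:
        raise ValueError("need more closes than the lookback")
    # Fractional change over the lookback window, e.g. 0.05 means +5%.
    change = closes[-1] / closes[-1 - lookback] - 1.0
    if change > threshold:
        return "UP"
    if change < -threshold:
        return "DOWN"
    return "RANGE"
```

Whatever rule you pick, the point is the same: every row of data ends up with one simple word that the model will try to learn.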

Next, we have the more difficult input variables. These variables are what “drive” the price and trend of your output variable. For example, we are interested in Gold; what might drive the price of Gold? You might say some factors are bond yields, the price of oil, the level of the S&P 500 index, or even Silver prices. You’re probably correct in these assumptions, and all of these prices, yields, or index levels might very well become part of your input variables as you try to find relationships between them and the trend of Gold.

In our lesson example we’ll use seven input drivers to build the Gold trend model. They are:

  1. 1-Month Libor
  2. 10 Year Treasury Note
  3. 30 Year Treasury Bond
  4. Canadian Dollar (currency)
  5. Russell 2000 index
  6. German DAX index
  7. Philadelphia Gold and Silver Index

Do I have the financial data to model with?

Next, we need to find data to model with. I use a data downloading program called TraderXL Pro which lets me download daily and weekly closing prices of currencies, stocks, ETFs, and futures. I plug in the symbols for my seven input variables and download them into an Excel spreadsheet.

Tip: Yahoo allows you to download free historical data so if you’re looking for data to play with, check out the historical prices link for a specific asset symbol.

What I typically do is download two sets of data: one data set for model building and a second set to test the model against. At a minimum, download a date range starting 2 years and 3 months ago and ending 3 months ago for your model building data set, and a second date range starting 3 months ago and running through yesterday (or whatever the last market close day was) to test your model against.
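If you’d rather not work out those dates by hand, the two ranges can be sketched in a few lines of Python. This is only an illustration: the function names `months_before` and `data_ranges` are mine, I end the model building range one day before the test range starts so the two sets don’t share a day, and nothing here adjusts for weekends or market holidays.

```python
import calendar
from datetime import date, timedelta


def months_before(day, months):
    """Return the date `months` calendar months before `day`.

    The day of month is clamped when the target month is shorter
    (for example May 31 minus 3 months gives the last day of February).
    """
    index = day.year * 12 + (day.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def data_ranges(today, build_months=24, test_months=3):
    """Return ((build_start, build_end), (test_start, test_end))."""
    test_start = months_before(today, test_months)
    build_start = months_before(test_start, build_months)
    build_end = test_start - timedelta(days=1)  # assumption: keep the sets disjoint
    test_end = today - timedelta(days=1)        # "yesterday", not holiday-aware
    return (build_start, build_end), (test_start, test_end)
```

Calling `data_ranges(date.today())` gives you the start and end dates to type into whatever download tool you use. Raise `build_months` if you want more history than the 2 year minimum.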

The reason you want at least 2 years of data is so that you capture rising bond yields, market sell-offs, and other financial events that will influence the current trend you are looking for. For our lesson example I downloaded nearly 7 years of closing data.

Tip: I download weekly closing data to build my trend model and then daily data for my test data set. This way I can see on a daily basis whether my input variables’ daily closing prices could change any trends quickly!
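If your data source only gives you daily closes, you can derive weekly closes yourself. The sketch below is one way to do it in Python, assuming a "weekly close" means the close of the last trading day in each ISO calendar week; the function name `weekly_closes` is hypothetical, and a download tool may define its weeks differently.

```python
from datetime import date


def weekly_closes(daily):
    """Collapse (date, close) pairs into one close per ISO week.

    Keeps the close of the last trading day seen in each week.
    """
    last_in_week = {}
    for day, close in sorted(daily):
        iso = day.isocalendar()
        # Later days in the same (ISO year, ISO week) overwrite earlier ones.
        last_in_week[(iso[0], iso[1])] = (day, close)
    return [last_in_week[key] for key in sorted(last_in_week)]
```

The input does not need to be sorted, and a short week (a Friday holiday, say) simply uses the last day that traded.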

Is the data in the right format?

YALE allows you to read Excel, CSV, and a host of other file types, so finding a format that’s compatible with YALE is easy. For this lesson we’ll stick to using Excel because it’s easy for everyone to understand. I’ve uploaded the example spreadsheet we’ll be using to build our Gold trend model: GA-Gold.xls
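If you go the CSV route instead, it’s worth a quick sanity check before you load the file. Here is a rough Python sketch that assumes one row per date, one numeric column per input variable, and one label column. The column names below are made up for illustration and are not taken from GA-Gold.xls, so swap in whatever headers your own file uses.

```python
import csv
import io

# Hypothetical column names; replace with the headers in your own file.
INPUT_COLUMNS = [
    "LIBOR_1M", "TREASURY_10Y", "TREASURY_30Y", "CAD",
    "RUSSELL_2000", "DAX", "GOLD_SILVER_INDEX",
]
LABEL_COLUMN = "TREND"


def check_rows(handle):
    """Return a list of problems found in a CSV that has a header row."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [c for c in INPUT_COLUMNS + [LABEL_COLUMN] if c not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]
    problems = []
    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        for column in INPUT_COLUMNS:
            try:
                float(row[column])
            except (TypeError, ValueError):
                problems.append(f"line {line_no}: {column} is not a number")
        if not row[LABEL_COLUMN]:
            problems.append(f"line {line_no}: {LABEL_COLUMN} is empty")
    return problems
```

Open your file with `open("yourfile.csv", newline="")` and pass the handle to `check_rows`; an empty list means every row has seven numbers and a label.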

Stay tuned for Lesson II where I discuss the first step in building the trend model.