Stephen Smith's Blog

Musings on Machine Learning…

The Road to TensorFlow – Part 4: The Stock Market

with 6 comments


This is the fourth article in my series on Google TensorFlow and we still won’t get to TensorFlow in this article. We’ve covered Linux, Python and various Python libraries so far. Last time we started to use Python libraries to load stock market data ready to feed into some sort of Neural Network model constructed using TensorFlow. In this article we’re going to take a bit of a side trip into looking at a number of issues, theory and logistics around playing with the stock market.

One thing to remember is that this discussion isn’t pure Mathematics. These are all theories that provide some guidance, they might be based on a lot of historical study, but that doesn’t mean they will be true tomorrow, or even that everyone believes them today. One good reference for this stuff is the Udacity course “Machine Learning for Trading”.


Is This a Suitable Problem for AI?

The first question to ask is whether trading stocks is a suitable problem? After all people can’t predict what the stock market will do tomorrow, so why would we think a computer can? Most AI problems, like image recognition or machine translation, know the problem is solvable since people solve these problems. So they know that if they can successfully model what people are doing, then they should be able to get similar results. In this case we are attempting a problem people can’t solve (but some are better at guessing than others), and hoping that fancy algorithms and big data will perhaps give us an edge. This idea could well be a fantasy since predicting the future in general is impossible. It would be nice if we could do as well as a stock picking cat, but that cat did beat a team of professionals.

Hedge Funds

Hedge funds are typically high risk funds that perform risky trading strategies for small select clienteles. There are many types of these funds that trade in all sorts of things using all sorts of strategies. However, the ones we are interested in, in this article, are the ones that perform high volume computer trading of stocks. Typically, these are driven by algorithms with little or no human oversight and typically the Hedge fund has an extremely favorable arrangement with a given stock market to allow their computers in the stock markets data center and further that they have extremely low transaction fees. Being in the stock market’s data center means they see everything first since they have no latency. Then don’t have to wait 10ms or whatever for information to make it to your location over the internet. Using very high powered computers they can profit by trading during these latencies (possibly taking your profit).

If these advantages weren’t enough, some Hedge funds have negotiated the right to filter all stock market transactions before they happen and optionally execute the trade themselves again allowing them room to make small profits by inserting themselves into other people’s transactions.

The main takeaway from this, is that unless you are such a Hedge fund, you are at a considerable disadvantage. This is one of the main reasons that day traders have all but disappeared. Hedge funds were able to manipulate them and generally profit from the day traders.

The other thing to remember is that Hedge funds are large and capable of manipulating the market. Often they will play against known trading strategies by over selling or buying to make it look like something is happening and then tricking people into doing things that are a bad idea and profiting from it.


The Efficient Market Hypothesis

The Efficient Market Hypothesis (EMH) states that asset prices fully reflect all available information. There are weaker and stronger forms of this hypothesis, but the basic premise is that you can’t beat the market and you may as well put all your money in an index fund that just matches market performance. Basically that it is futile to try and find undervalued stocks to buy or overvalued stocks to sell.

One claim is that Hedge funds contribute to making the markets efficient. Since they trade so quickly any new information is incorporated into the prices of stocks instantly as far as you can tell. Maybe so, but it does rub the wrong way that someone is profiting this way.

Not everyone believes the EMH, but at the same time it has been proven out time and time again especially in the large heavily traded world markets.


This is the Capital Asset Pricing Model that is often used in portfolio management to manage risk, but its also often used in stock trading. A simple form of this equation is:

ri(t) = βi * rm(t) + αi

This says that the return for a given stock i at a point of time t given by ri(t) is equal to a constant βi times the market return at time t given by rm(t) plus a constant αi. Where the expected value of αi is zero. There is usually another term for the base interest rate, but that is effectively zero these days.

The upshot of this is that stocks move with the market and not individually. Each stocks beta can be determined from the stocks history and then this gives a pretty good model for stock returns. This is bad if you have some special insight into a stock, for instance if you are an expert in its industry or perhaps have a good idea of the future trend. For instance if you know something bad is going to happen, you want to short the stock, but if the market goes up that day, it could overwhelm the individual stocks bad news and you lose on your position.

If you believe the EMH then alpha will always be zero or go to zero before you can capitalize on it.

The first Hedge founds came up with a clever scheme to avoid this. If you have two stocks, one you think is going to go up (positive alpha) and another that you think will go down (negative alpha) then you can buy/sell these stocks in pairs by choosing weights of the positions that cause the two beta component to cancel out. This way you eliminate the market from the equation and can concentrate on just the stocks. This is in fact where Hedge funds got their name, using two stocks to hedge their market exposure. This worked for awhile and then others figured out ways to exploit this and it caused a market crash and bailout for a number of funds when it failed. Now this buy/sell pair strategy doesn’t work. As most strategies seem to stop working once they are widely enough known.

Finding alpha is an interesting pursuit. For Hedge funds it could be via illegal insider information as dramatized in the TV series “Billions”. Or it could be via semi-legal methods like hiring a guy in China to sit by the road and could the number of trucks that come out of a factory. Certainly studying Apple’s suppliers and factories is a huge industry in trying to gather information on secretive Apple.

The Fundamental Law of Active Management

The fundamental law of active management is the following:

Performance = skill * square root(breadth)

This basically says that the performance of a portfolio manager is equal to his skill times the square root of the number of trades he makes. This law basically says that a poor portfolio manager can make up for his stupidity via volume.

For instance, Warren Buffet is really smart (high skill) and gets a really great return. His breadth is really small, he just buys 120 stocks and holds them. So his breadth is 120. Suppose a Hedge fund has developed a computer algorithm for stock trading that is 1/1000 as smart as Warren Buffet. Then if you do the math with this formula it comes out that the Hedge fund needs to trade 120,000,000 times a year to match Warren Buffet’s performance. The scary part is that there are lots of Hedge funds that employ this strategy. They have low grade (not very smart) algorithms that can get the same return as Warren Buffet by doing huge numbers of trades.

Adjusted Close

In the previous blog posting we read in the history of adjusted closes for all thirty components of the Dow Jones index. There was also a close price returned, why did we use the adjusted close rather than the real close? If a stock does well, its price goes up and the stock gets too expensive. To help with this every now and then a company will split its stock. They will issue say 2 new stocks for each old stocks. Everyone gets these, then now have twice the stocks at half the price. So from people’s point of view they still have the same value and nothing has changed. Stocks also issue dividends. Whenever a stock does this its prices goes up the value of the dividend before payment and then goes back down right after payment. Again to stock owners this is all well and good and understood. But these two things cause havoc to developing stock market pricing models and algorithms. Without knowing anything else a stock split looks catastrophic. So to help with this, stock markets provide the adjusted close which will adjust historical data for stock splits and dividends so they don’t mess up charts and algorithms. Generally, quite a nice feature of stock feeds. If you compare adjusted close and close they will be the same back to the last event of this nature and at that point will diverge.

Stock Prices

Stock prices don’t by themselves tell you anything about a company and can’t be used to directly compare companies. A company’s value is the stock price times the number of shares. But all companies have issued different numbers of shares and have completely different histories of stock splits, additional share offerings, etc. One way to deal with this is to normalize the stock market data, for instance you could divide all the share prices in a history by the first price. This will cause the stock price to start at 1 and then evolve from there. This does provide one way to compare performance graphically. When doing AI we tend to have to normalize data since the algorithms we are going to use generally don’t like working on large ranges of numbers. We’ll talk more about that later.

Testing with Real Money

We’re not going to test anything with real money. However, most algorithms need real testing in the real market. What we are going to look at doesn’t worry about transaction fees. It also doesn’t worry about some market logistics, since we are only looking at closing prices. You can’t get the previous close price at the next day’s open due to after hours trading and in general how stock order books work. Also if you are a big Hedge fund then actually performing your trades may affect the market. I might have a brilliant algorithm that makes me lots of money in a simulator, but if I run it in the real market, the market may react and counter what I’m doing. Worst sometimes Hedge funds have caused market crashes, or caused the stock market circuit breakers to kick in as a result of their actions.


This was a really quick introduction to the stock market concepts we’ll be talking about. If you are interested, you can follow the links in the article to learn more.

Written by smist08

September 2, 2016 at 11:32 pm

6 Responses

Subscribe to comments with RSS.

  1. This arc is excellent Stephen, thanks for sharing!

    Aslan Kanzas

    September 5, 2016 at 11:04 pm

  2. […] quickly covered a number of preliminary topics including Linux, Python, Python Libraries and some Stock Market theory. Now we are ready to start talking about Neural Networks and […]

  3. […] after a long journey through Linux, Python, Python Libraries, the Stock Market, an Introduction to Neural Networks and training Neural Networks we are now ready to look at a […]

  4. […] I spent quite a bit of time playing and blogging about predicting the stock market with TensorFlow, this is where I started. The data was all numeric, so it was quite easy to get […]

  5. Since you are not mentioning anything of Tensorflow, why titled with Tensorflow?
    Good article anyway!

    Sunding Wei

    March 23, 2017 at 9:00 am

    • Its part of a series, and I just needed one article on some domain background.


      March 23, 2017 at 5:57 pm

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: