Stephen Smith's Blog

Musings on Machine Learning…

The Road to TensorFlow – Part 5: An Introduction to Neural Networks



We’ve now quickly covered a number of preliminary topics including Linux, Python, Python Libraries and some Stock Market theory. Now we are ready to start talking about Neural Networks and TensorFlow.


TensorFlow is Google’s open source platform for performing the types of numerical computations required by Neural Networks. It isn’t specific to Neural Networks, but has a lot of supporting functions to help with their development. If you had another application that required lots of matrix algebra, then perhaps TensorFlow would also work for you. TensorFlow supports optimized mathematical operations that can either run on your native CPU or be offloaded to a GPU. Google has even developed a custom processor chip, the Tensor Processing Unit, to run TensorFlow operations in its data centers.

TensorFlow now powers quite a few Google products for things like speech recognition and photo recognition, and even helps rank some Google search results.

Biological Versus the Mechanical

A lot of AI researchers like to distance themselves from modelling exactly how biological neurons work, preferring to borrow only certain ideas. They point out that achieving manned flight required taking some ideas from birds, like wing design, while throwing away others, like flapping. Similarly, for neural networks they take some ideas and throw others away.

If you are interested in a more precise simulation of the brain, check out the University of Waterloo’s Nengo project. This is a very interesting simulation of the brain that has been able to solve a number of problems. In this discussion we’ll be looking at what is more typically done these days in neural networks, which tends to be to take the ideas where the math works out easiest and skip the rest.

From Neurons to Matrix Equations

Consider a bunch of neurons in the brain as depicted in the following diagram.


Inputs come into each neuron, and if the weighted sum of the signals it receives is high enough then its outputs fire (with a certain strength), feeding into another layer of neurons. This rather simplistic model of neurons and the brain is what we will model for our initial neural networks.

We will take some sort of vector of inputs and feed them into an input layer of neurons which based on the weighted sums of these inputs will fire with some strength into the next layer of neurons. In neural networks any layers of neurons that aren’t externally connected to inputs or outputs are called hidden layers. The following diagram shows this model.

Notice that all the inputs connect to all the neurons in the next layer. In a biological brain there aren’t nearly that many connections, but here, when we train this model to determine the weights, some weights will be zero (or very small), corresponding to there not really being a connection. Having a fixed, complete set of connections is really just a convenience that makes the math easier and more uniform.

If you work out the math of doing all these weighted sums, you quickly realize you are just doing matrix algebra, and you can get the input to the next layer by multiplying the inputs to this layer by a matrix. So:

Output of Layer = A x (Input of Layer)

Where A is the matrix of weights. That’s simple and easy to calculate (just ignoring for now where the elements of the matrix A come from).

If you remember your matrix algebra you will realize that if you do this to each layer, then since this is just linear, you can multiply all the matrices together and reduce the multiple layer problem to a single layer problem. So in this simple view there is no value in multiple layers. Additionally, linear models are overly simple and can be constructed and solved quite easily. Also, the output is unbounded; it can come out at any magnitude, which real neurons clearly can’t.
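
To see this collapse concretely, here is a minimal numpy sketch (an illustration only, not code from this series) showing two linear layers reducing to a single combined matrix:

import numpy as np

# Two layers of weights, chosen arbitrarily for illustration.
A1 = np.random.rand(4, 3)
A2 = np.random.rand(2, 4)
x = np.random.rand(3)

# Applying the layers one after the other...
two_layers = A2 @ (A1 @ x)

# ...is exactly the same as applying the single combined matrix A2 A1.
one_layer = (A2 @ A1) @ x

print(np.allclose(two_layers, one_layer))  # True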

What most neural networks do is add a non-linear activation function to this equation. The activation function maps the output value back into a valid range, introduces a non-linearity so the whole equation doesn’t just collapse back to one layer, and adds flexibility in how the model can produce values. The new form of the equation then becomes:

Output of Layer = ActivationFunction( A x (Input of Layer) + b )

Where b is a bias vector that allows the output to be shifted into the range of the activation function. The simplest activation function is the rectifier function defined as f(x) = max(0, x). This basically returns x if x is positive and 0 if x is negative. This is good if we only want positive values as output, it is really simple, and it does behave like some biological networks. On the downside, it isn’t invertible so we can’t run the network backwards (useful for sanity checking), it isn’t differentiable everywhere (differentiability helps with solving for the weights) and it doesn’t provide an upper bound on the output. All that being said, ReLU (Rectified Linear Unit) neural networks are currently the most popular. A smooth version of ReLU is the softplus function f(x) = ln(1 + e^x). Other choices of activation function include the logistic sigmoid (from probability theory) and the hyperbolic tangent (tanh), which we will use.
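
As a quick illustration (again just a numpy sketch, not code from this series), here are the activation functions mentioned above, plus one layer evaluated with tanh:

import numpy as np

def relu(x):
    return np.maximum(0, x)       # f(x) = max(0, x)

def softplus(x):
    return np.log(1 + np.exp(x))  # smooth version of ReLU

# One layer with a tanh activation: output = tanh(A x + b).
A = np.random.rand(2, 3)  # weight matrix (2 outputs, 3 inputs)
b = np.random.rand(2)     # bias vector
x = np.random.rand(3)     # input vector

output = np.tanh(A @ x + b)
print(output)             # every element lies between -1 and 1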

We’re still a bit theoretical at this point, but once we consider what the inputs look like and what we want for an output, we can start to solve for the bits in the middle. If we have good values for the various A matrices and b vectors, then with some matrix multiplication, addition and simple function evaluation we can get solutions, and as it turns out both modern CPUs and especially GPUs are really good at this.

Stock Market Example

We’ll now start looking at this with a simple stock market example to get an idea how this all works. Suppose we want to feed in the last 30 adjusted closing prices for the 30 stocks that compose the Dow Jones index, and we want our neural network to output the next day’s closing prices for these 30 stocks. We will start simple to give the basic ideas, then we’ll look at making this model more sophisticated. Let’s see how we can go about this.

Our Input Vector

For any Neural Network we have to feed in a vector of floating point numbers. So let’s consider feeding in a vector consisting of the last 30 adjusted closing prices of the first Dow component, followed by the last 30 adjusted closes of the next component, and so on. This means our input vector will contain 900 elements: the last 30 adjusted closes of each of the 30 Dow stocks.

You can do this but it causes problems because the activation function we are going to use returns values between -1 and 1. Typically neural networks work best with values in this range (or maybe 0 to 1 if only positive values are required). So to make this work you need to normalize the input data to something that works better. We are going to do three things:

  1. Divide each stock’s price by the first price we have in its history so it starts at 1.
  2. Rather than use the actual stock price, we’ll use the stock price change (of the price normalized by #1).
  3. If NaN is returned in the historical data, we will back fill it from the next good value. Fortunately, Pandas provides a function to do this:
    trainData.fillna(method='backfill', inplace=True)

This then puts all the values nicely in range and makes them fairly uniform. The reason for step 3 is that when we go to train the neural network we want to train it with lots of historical data, and if we don’t back fill we can’t go back very far. Visa, in its current corporate incarnation, only went public in 2008 and was then added to the Dow in 2013 (replacing Bank of America), so there is no Visa historical data from before 2008. Actually, I chose tanh as the activation function after switching to price changes; originally I used ReLU with real prices but it tended to be rather unstable.
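
Putting the three steps together might look something like the following Pandas sketch (assuming trainData is the DataFrame of adjusted closes we loaded in the previous post):

# 3. Back fill any NaNs from the next good value.
trainData.fillna(method='backfill', inplace=True)

# 1. Divide each stock's prices by its first price so every series starts at 1.
normed = trainData / trainData.iloc[0]

# 2. Use the day-to-day change of the normalized price rather than the price
#    itself; the first row has no previous day, so drop it.
changes = normed.diff().dropna()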

Our Output Vector

Our output vector will be the next price changes for the 30 Dow component stocks. Then we just need to undo the first normalization above in order to use them.


This article was a quick introduction to the equations we are going to solve with TensorFlow and what motivates them. We started to look at how we input data into the model and we will continue next time with finding all the various matrix components by framing it as an optimization problem.

Written by smist08

September 8, 2016 at 3:49 pm

The Road to TensorFlow – Part 4: The Stock Market



This is the fourth article in my series on Google TensorFlow and we still won’t get to TensorFlow in this article. We’ve covered Linux, Python and various Python libraries so far. Last time we started to use Python libraries to load stock market data ready to feed into some sort of Neural Network model constructed using TensorFlow. In this article we’re going to take a bit of a side trip into looking at a number of issues, theory and logistics around playing with the stock market.

One thing to remember is that this discussion isn’t pure Mathematics. These are all theories that provide some guidance, they might be based on a lot of historical study, but that doesn’t mean they will be true tomorrow, or even that everyone believes them today. One good reference for this stuff is the Udacity course “Machine Learning for Trading”.


Is This a Suitable Problem for AI?

The first question to ask is whether trading stocks is a suitable problem for AI at all. After all, people can’t predict what the stock market will do tomorrow, so why would we think a computer can? With most AI problems, like image recognition or machine translation, we know the problem is solvable since people solve them, so if we can successfully model what people are doing, we should be able to get similar results. In this case we are attempting a problem people can’t solve (though some are better at guessing than others), and hoping that fancy algorithms and big data will perhaps give us an edge. This idea could well be a fantasy, since predicting the future in general is impossible. It would be nice if we could do as well as a stock picking cat, but that cat did beat a team of professionals.

Hedge Funds

Hedge funds are typically high risk funds that perform risky trading strategies for a small, select clientele. There are many types of these funds that trade in all sorts of things using all sorts of strategies. However, the ones we are interested in here are the ones that perform high volume computer trading of stocks. Typically, these are driven by algorithms with little or no human oversight, and the Hedge fund has an extremely favorable arrangement with a given stock market that allows their computers into the stock market’s data center, along with extremely low transaction fees. Being in the stock market’s data center means they see everything first since they have essentially no latency; they don’t have to wait 10ms or whatever for information to make it to your location over the internet. Using very high powered computers they can profit by trading during these latencies (possibly taking your profit).

If these advantages weren’t enough, some Hedge funds have negotiated the right to filter all stock market transactions before they happen and optionally execute the trade themselves, again allowing them room to make small profits by inserting themselves into other people’s transactions.

The main takeaway from this is that unless you are such a Hedge fund, you are at a considerable disadvantage. This is one of the main reasons that day traders have all but disappeared; Hedge funds were able to manipulate and generally profit from the day traders.

The other thing to remember is that Hedge funds are large and capable of manipulating the market. Often they will play against known trading strategies by over selling or buying to make it look like something is happening, tricking people into doing things that are a bad idea and profiting from it.


The Efficient Market Hypothesis

The Efficient Market Hypothesis (EMH) states that asset prices fully reflect all available information. There are weaker and stronger forms of this hypothesis, but the basic premise is that you can’t beat the market and you may as well put all your money in an index fund that just matches market performance. Basically, it is futile to try to find undervalued stocks to buy or overvalued stocks to sell.

One claim is that Hedge funds contribute to making the markets efficient: since they trade so quickly, any new information is incorporated into the prices of stocks instantly, as far as you can tell. Maybe so, but it does rub me the wrong way that someone is profiting this way.

Not everyone believes the EMH, but at the same time it has been borne out time and time again, especially in the large, heavily traded world markets.


CAPM

The Capital Asset Pricing Model (CAPM) is often used in portfolio management to manage risk, but it’s also often used in stock trading. A simple form of the equation is:

r_i(t) = β_i * r_m(t) + α_i

This says that the return of a given stock i at time t, written r_i(t), is equal to a constant β_i times the market return at time t, r_m(t), plus a constant α_i, where the expected value of α_i is zero. There is usually another term for the base interest rate, but that is effectively zero these days.

The upshot of this is that stocks move with the market and not individually. Each stock’s beta can be determined from the stock’s history, and then this gives a pretty good model for stock returns. This is bad if you have some special insight into a stock, for instance if you are an expert in its industry or perhaps have a good idea of the future trend. If you know something bad is going to happen you want to short the stock, but if the market goes up that day it could overwhelm the individual stock’s bad news and you lose on your position.
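
In practice β_i and α_i can be estimated with a simple linear regression of the stock’s daily returns against the market’s. A quick numpy sketch with made-up data (illustration only):

import numpy as np

# Daily returns for the market and for one stock (synthetic data).
market_returns = np.random.normal(0, 0.01, 250)
stock_returns = 1.3 * market_returns + np.random.normal(0, 0.005, 250)

# Least squares fit of r_i(t) = beta_i * r_m(t) + alpha_i.
beta, alpha = np.polyfit(market_returns, stock_returns, 1)
print(beta, alpha)  # beta comes out near 1.3, alpha near 0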

If you believe the EMH then alpha will always be zero or go to zero before you can capitalize on it.

The first Hedge funds came up with a clever scheme to avoid this. If you have two stocks, one you think is going to go up (positive alpha) and another that you think will go down (negative alpha), then you can buy/sell these stocks in pairs, choosing weights for the positions that cause the two beta components to cancel out. This way you eliminate the market from the equation and can concentrate on just the stocks. This is in fact where Hedge funds got their name: using two stocks to hedge their market exposure. This worked for a while, then others figured out ways to exploit it, and when it failed it caused a market crash and a bailout for a number of funds. Now this buy/sell pair strategy doesn’t work, as most strategies seem to stop working once they are widely enough known.
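
The weights for such a pair are chosen so the beta terms cancel. A small sketch of the arithmetic (the betas are made up):

# Stock we expect to rise (long) and stock we expect to fall (short).
beta_long, beta_short = 1.2, 0.8

# Pick the long position size, then size the short position so that
# w_long * beta_long + w_short * beta_short = 0.
w_long = 100.0
w_short = -w_long * beta_long / beta_short

combined_beta = w_long * beta_long + w_short * beta_short
print(w_short, combined_beta)  # combined beta is 0, so market moves cancel out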

Finding alpha is an interesting pursuit. For Hedge funds it could be via illegal insider information, as dramatized in the TV series “Billions”. Or it could be via semi-legal methods like hiring a guy in China to sit by the road and count the number of trucks that come out of a factory. Certainly studying Apple’s suppliers and factories is a huge industry for those trying to gather information on secretive Apple.

The Fundamental Law of Active Management

The fundamental law of active management is the following:

Performance = skill * sqrt(breadth)

This says that the performance of a portfolio manager is equal to his skill times the square root of his breadth, the number of trades he makes. Basically, a poor portfolio manager can make up for his stupidity via volume.

For instance, Warren Buffett is really smart (high skill) and gets a really great return. His breadth is really small: he just buys 120 stocks and holds them, so his breadth is 120. Suppose a Hedge fund has developed a computer algorithm for stock trading that is 1/1000 as smart as Warren Buffett. If you do the math with this formula, it comes out that the Hedge fund needs to trade 120,000,000 times a year to match Warren Buffett’s performance. The scary part is that there are lots of Hedge funds that employ this strategy. They have low grade (not very smart) algorithms that can get the same return as Warren Buffett by doing huge numbers of trades.
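
The arithmetic behind that claim is just the law solved for breadth:

buffett_breadth = 120
skill_ratio = 1.0 / 1000.0  # the fund is 1/1000 as skilled

# skill * sqrt(B) must equal 1 * sqrt(120), so B = 120 * 1000**2.
required_breadth = buffett_breadth / skill_ratio ** 2
print(required_breadth)  # 120,000,000 trades a year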

Adjusted Close

In the previous blog posting we read in the history of adjusted closes for all thirty components of the Dow Jones index. There was also a close price returned, so why did we use the adjusted close rather than the real close? If a stock does well, its price goes up and the stock gets too expensive. To help with this, every now and then a company will split its stock, issuing say two new shares for each old share. Everyone now has twice the shares at half the price, so from an owner’s point of view they still have the same value and nothing has changed. Companies also issue dividends. Whenever they do, the stock’s price goes up by the value of the dividend before payment and then drops back right after payment. Again, to stock owners this is all well and good and understood. But these two things cause havoc for stock market pricing models and algorithms; without knowing anything else, a stock split looks catastrophic. So to help with this, stock markets provide the adjusted close, which adjusts historical data for stock splits and dividends so they don’t mess up charts and algorithms. Generally, quite a nice feature of stock feeds. If you compare the adjusted close and close they will be the same back to the last event of this nature and will diverge before that point.

Stock Prices

Stock prices don’t by themselves tell you anything about a company and can’t be used to directly compare companies. A company’s value is the stock price times the number of shares. But all companies have issued different numbers of shares and have completely different histories of stock splits, additional share offerings, etc. One way to deal with this is to normalize the stock market data; for instance, you could divide all the share prices in a history by the first price. This will cause the stock price to start at 1 and then evolve from there. This provides one way to compare performance graphically. When doing AI we tend to have to normalize data anyway, since the algorithms we are going to use generally don’t like working on large ranges of numbers. We’ll talk more about that later.
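
With a Pandas DataFrame of prices (one column per stock, as we loaded last time) this normalization and a comparison plot take only a couple of lines. A sketch, where prices stands in for whatever DataFrame you have loaded:

import matplotlib.pyplot as plt

# Divide every column by its first value so each series starts at 1
# and the stocks can be compared directly.
normed = prices / prices.iloc[0]

normed.plot()
plt.ylabel('Price relative to start')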

Testing with Real Money

We’re not going to test anything with real money. However, most algorithms need real testing in the real market. What we are going to look at doesn’t worry about transaction fees. It also doesn’t worry about some market logistics, since we are only looking at closing prices; you can’t get the previous close price at the next day’s open due to after hours trading and, in general, how stock order books work. Also, if you are a big Hedge fund then actually performing your trades may affect the market. I might have a brilliant algorithm that makes me lots of money in a simulator, but if I run it in the real market, the market may react and counter what I’m doing. Worse, sometimes Hedge funds have caused market crashes, or caused the stock market circuit breakers to kick in, as a result of their actions.


This was a really quick introduction to the stock market concepts we’ll be talking about. If you are interested, you can follow the links in the article to learn more.

Written by smist08

September 2, 2016 at 11:32 pm

The Road to TensorFlow – Part 3: Python Libraries



Continuing on with my long and winding journey to learn TensorFlow, we started with Linux then went on to Python. Today we will be looking at a number of necessary Python libraries.

My background is Mathematics and I’ve always had an interest in Numerical Analysis and Scientific Computing, but I mostly left these behind when I left University. As I learned Python and started to play with it and its attendant libraries, I was very pleasantly surprised to find all my favorite numerical algorithms (and many more) available, now part of the fairly standard Python libraries. Many of these core libraries are still written in their original Fortran or C code, but are tailored to fit very well into the Python ecosystem. All of this is open source software, made possible to a certain degree by the good work of the GNU Fortran and C compilers.

These libraries led to quite a few diversions from my primary task of learning TensorFlow, but I found this to be quite a wonderful world to become conversant in.

As I completed the TensorFlow tutorials and a Udacity course, I wanted a different problem to play with, rather than the image recognition and speech analysis projects that seem pretty standard. To use these, you need quite a bit of data to train your algorithms with, so I thought: why not do something with stock market data? After all, you can easily (and freely) get gobs of stock market data via web service calls.

Some Useful Libraries

Here are a few of the libraries that I found useful to help with machine learning and TensorFlow.

Numpy – this is the fundamental Python numerical package that most other libraries are built over. It includes a powerful N-dimensional array object, useful linear algebra, Fourier transform and random number capabilities, and much more.

Scipy – is built on numpy and includes most numerical algorithms you’ve ever heard of including numerical integration, ODE solvers, optimization, interpolation, special functions and signal processing.

Matplotlib – is a very powerful 2D plotting library that is very useful for visualizing your results.

Pandas – was originally written as a library to manipulate stock market data and perform the standard things market technical analysts like to do, but now it markets itself as a general purpose data analysis library.

Sympy – is a library for performing symbolic mathematics. Although I’m not using this in relation to TensorFlow (currently), it is a fascinating tool for performing symbolic algebra and calculus.

IPython – is interactive Python, where you program in interactive web-based notebooks. A useful tool to play with, but I tend to do my real programming in an IDE. Still, if you want to quickly play with something, this is lots of fun.

Pickle – although this is a standard library, I thought I’d highlight it since we are about to use it. This library lets you easily save and load Python objects to disk files.

Scikit-learn – is a collection of machine learning algorithms for things like clustering, classification and regression. I.e. neural networks aren’t the only way to accomplish these tasks.

There are many more Python libraries for things like writing GUI programs, performing web requests, processing web data, accessing databases, etc. We’ll talk about those as we need them. Since Python has such a large community of users and contributors, there are tons of good web pages, blogs, books, courses and forums on all of these. Google is your friend.

Some Code Finally

So let’s use all of this to load some stock market data which will then be ready for our TensorFlow model. We are going to use Pandas to load some recent prices for the Dow 30 stocks and we’ll use matplotlib to display a graph of their values. This graph is a bit too busy, since 30 stocks is really too many to display at once. Also, we haven’t normalized the data at all, so this doesn’t give any real way to compare them. It really only shows we’ve loaded a bunch of data which is hopefully correct.

In this snippet we only load a small bit of history, so it’s reasonably quick, but when we want large amounts of data we will want to cache it. So when we do the web services call to get the data, we pickle it to a file (Python speak for serializing our data object and saving it to a file). If the file exists we just read it from the file and skip the web service call. To refresh the data from the web service, just delete the stocks.pickle file.

We get the data from Yahoo Finance. We could use Yahoo’s Python library directly, but I thought I might use the Pandas DataReader general purpose API to make it easy to switch to Google if Verizon shuts down (or strangles) this service now that they own Yahoo. The Web Services call returns the open, high, low, volume, close and adjusted close which is why we have the couple of lines to clean up the data and only keep the adjusted close. I’ll talk more about the stock market and what the adjusted close is next time.

The program wants to get TrainDataSetSize prices for each stock which is set to 50 below. But due to weekends and holidays, you can’t just subtract 50 from today’s date to get that. So I use a simple heuristic to ensure I get more data than that (which massively overestimates).

import time
import math
import os
from datetime import date
from datetime import timedelta
import numpy as np
import matplotlib
import pandas as pd
import pandas_datareader as pdr
from pandas_datareader import data, wb
from six.moves import cPickle as pickle

TrainDataSetSize = 50

# Load the Dow 30 stocks from Yahoo into a Pandas DataFrame

dow30 = ['AXP', 'AAPL', 'BA', 'CAT', 'CSCO', 'CVX', 'DD', 'XOM',
          'GE', 'GS', 'HD', 'IBM', 'INTC', 'JNJ', 'KO', 'JPM',
          'MCD', 'MMM', 'MRK', 'MSFT', 'NKE', 'PFE', 'PG',
          'TRV', 'UNH', 'UTX', 'VZ', 'V', 'WMT', 'DIS']

stock_filename = 'stocks.pickle'
if os.path.exists(stock_filename):
    try:
        with open(stock_filename, 'rb') as f:
            trainData = pickle.load(f)
    except Exception as e:
        print('Unable to process data from', stock_filename, ':', e)
        raise
    print('%s already present - Skipping requesting/pickling.' % stock_filename)
else:
    # Request more calendar days than TrainDataSetSize trading days so that
    # weekends and holidays don't leave us short (a deliberate overestimate;
    # the exact margin is a reconstruction of the original heuristic).
    endDate =
    startDate = endDate - timedelta(days=TrainDataSetSize * 4 + 10)
    f =, 'yahoo', startDate, endDate)
    # The call returns open, high, low, close, volume and adjusted close;
    # keep just the adjusted close.
    cleanData = f.ix['Adj Close']
    trainData = pd.DataFrame(cleanData)
    print('Pickling %s.' % stock_filename)
    try:
        with open(stock_filename, 'wb') as f:
            pickle.dump(trainData, f, pickle.HIGHEST_PROTOCOL)
    except Exception as e:
        print('Unable to save data to', stock_filename, ':', e)




Generally, I think this is a fairly short bit of code to accomplish all this. This is one of the beauties of Python: it is so compact.



This was a quick introduction to the Python libraries we’ll be using in addition to TensorFlow. Hopefully the quick sample program gave a taste of how we will be using them; it is in fact how we will be getting training data for our TensorFlow model.



Written by smist08

August 30, 2016 at 10:49 pm

The Road to TensorFlow – Part 2: Python



This is part 2 of my blog series on playing with TensorFlow. Last time I blogged on getting Linux going in a VM. This time we will be talking about the Python programming language. The API for TensorFlow is primarily aimed at Python, and in fact much of the work in AI, scientific computing, numerical computing and data research takes place in Python. There is a C++ API as well, but this seems like a good chance to give Python a try.

Python is an interpreted language that is very rich in supporting various programming paradigms like object oriented, procedural and functional. Python is open source and runs on many platforms. Most Linux distributions and macOS come with some version of Python pre-installed. Python is very interoperable and can work with most other programming systems, and there are a huge number of libraries of functionality available to the Python programmer. Python is oriented to getting things done quickly with a minimum of code and a minimum of fuss. The name Python is a tribute to the comedy troupe Monty Python and there are many references to them throughout the documentation.


Installation and Versions

Although I generally like Python, it has one really big problem that is generally a pain in the ass when setting up new systems and browsing documentation. The newest version of Python as of this writing is 3.5.2, which is the one I wanted to use along with all the attendant libraries. However, if you type python in a terminal window you get 2.7.12. This is because when Python went to version 3 it broke source code compatibility, so the decision was made to maintain version 2 going forward while everyone updated their programs and scripts to version 3. Version 3.0 was released in 2008 and this mess is still going on eight years later. The latest Python 2.x, namely 2.7.12, was just released in June 2016 and seems to be quite actively developed by a good sized community. So generally, to get anything Python 3.x you need to add a 3 to the end: to run Python 3.5.2 in a terminal window you type python3, and similarly the IDE is IDLE3 and the package installer is pip3. This makes it very easy to make a mistake and get the wrong thing. Worse, the naming isn’t entirely consistent across all packages; there are several I’ve run into where you add a 2 for the 2.x version and the version 3 one is just the plain name. As a result, I always get a certain amount of Python 2.x stuff accidentally installed by mistake (which doesn’t hurt anything, just wastes time and disk space). This also leads to a bit of confusion when you Google for information, in that you have to be careful to get 3.x info rather than 2.x info, as the wrong one may or may not work and may or may not be a best practice.

On Ubuntu Linux I just used apt-get to install the various packages I needed; I’ll talk about these a bit more in the next posting. Another option for installing Python and all the scientific libraries is to use the Anaconda distribution, which is quite a good way to get everything in Python installed all at once. I used Anaconda to install Python on Windows 10 and it worked really well; you just don’t get fine control over what it does, and it creates a separate installation to keep everything apart from anything already installed.

Python the Language

Python is a very large language; it has everything from object orientation to functional programming to huge built in libraries. It does have a number of quirks though. For instance, the way you define blocks is via indentation rather than using curly brackets or perhaps end block statements. So indentation isn’t just a style guideline, it’s fundamental to how the program works. In the following bit of code:

for i in range(10):
    a = i * 8
    print( i, a )
a = 8

the two indented statements are part of the for loop and the out-dented assignment is outside the loop. You don’t define variables; they are defined when first assigned to, and you can’t use a variable without assigning it first (or an exception will be thrown). There are a lot of built in types, including dictionaries and lists, but no array type (the numpy library adds these). Notice how the for loop uses in, rather than to, to do a basic loop.
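
Here is a short illustrative snippet of those built in types in action (the values are made up):

# Lists and dictionaries are built in...
stocks = ['AAPL', 'MSFT', 'IBM']         # a list
prices = {'AAPL': 107.7, 'MSFT': 57.9}   # a dictionary

for symbol in stocks:                    # 'in' again, this time iterating a list
    print(symbol, prices.get(symbol, 'n/a'))

# ...but arrays come from numpy.
import numpy as np
a = np.array([1.0, 2.0, 3.0])
print(a * 8)                             # elementwise, no explicit loop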

I don’t want to get too much into the language since it is quite large. If you are interested there are many good sites on the web to teach Python and the O’Reilly book “Learning Python” is recommended (but quite long).

Since Python is interpreted, you don’t need to wait for any compile steps, so the coding, testing, debugging cycle is quite quick. Writing tight loops in Python will be slower than C, but generally Python gives you quite good libraries to do most of what you want, and those libraries tend to be written in C or Fortran and are very fast. So far I haven’t found speed to be an issue. TensorFlow’s core is also written in C++ for speed, plus it has the ability to run on NVidia graphics cards for an extra boost.


This was my quick intro to Python. I’ll talk more about relevant parts of Python as I go along in this series. I generally like Python and so far my only big complaint is the confusion between the version 2 world and the version 3 world.


Written by smist08

August 26, 2016 at 11:10 pm

Posted in Artificial Intelligence


The Road to TensorFlow – Part 1: Linux



There have been some remarkable advancements in Artificial Intelligence type algorithms lately. I blogged on this a little while ago here. Whether it’s computers reading handwriting, understanding speech, driving cars or winning at games like Go, there seems to be a continual flood of stories of new amazing accomplishments. I thought I’d spend a bit of time getting to know how this was all coming about by doing a bit of reading and playing with the various technologies.

I wanted to play with Neural Network technology, so thought the Google TensorFlow open source toolkit would be a good place to start. This led me down the road to quite a few new (to me) technologies. So I thought I’d write a few blog posts on my road to getting some working TensorFlow programs. This might take quite a few articles covering Linux, Python, Python libraries like Pandas, Stock Market technical analysis, and then TensorFlow.


The first obstacle I ran into was that TensorFlow had no install image for Windows. After a bit of Googling, I found you need to run it on MacOS or Linux. I haven’t played with Linux in a few years and I’d been meaning to give it a try.

I happened to have just read about a web site that provides VirtualBox and VMWare images of all sorts of versions of Linux, all ready to go. So I thought I’d give this a try. I downloaded and installed VirtualBox and downloaded a copy of 64-bit Ubuntu Linux. Since I didn’t choose anything special I got Canonical’s Unity Desktop. Since I was trying new things, I figured oh well, let’s get going.

Things went pretty well at first. I figured out how to install things on Ubuntu, which uses APT (Advanced Packaging Tool), a command line utility for installing packages into Ubuntu Linux. This worked pretty well and the only problems I had were particular to installing Python, which I’ll talk about when I get to Python. I got TensorFlow installed and was able to complete the tutorial, I got the IDLE3 IDE for Python going, and all seemed good; I felt I was making good progress.

Then Ubuntu installed an update for me (which, like Windows, is run automatically by default). This updated many packages on my virtual image and in the process broke the Unity desktop. Now the desktop wouldn’t come up and all I could do was run a single terminal window, so at least I could get my work off the machine. I Googled the problem and many people had it, but none of the solutions worked for me and I couldn’t resolve the problem. I don’t know if it’s just that Unity is finicky and buggy or if it’s a problem with running in a VirtualBox VM. Perhaps something with video drivers, who knows.

Anyway, I figured to heck with Ubuntu and switched to Red Hat’s Fedora Linux. I chose a standard simple Gnome desktop and swore to never touch Unity again. I also realized that now I’m retired, I’m not a commercial user, so I can freely use VMWare; I switched to VMWare as well, since I wondered if my previous problem was caused by VirtualBox. Installing TensorFlow on Fedora, however, proved quite difficult. The dependencies in the TensorFlow install assume the packages that Ubuntu installs by default, and apparently these are quite different than on Fedora. So after madly installing things that I didn’t really think were necessary (like the GNU Fortran compiler), I gave up on Fedora.

So I went back to the same site and downloaded an Ubuntu image with the Gnome desktop. This has been working great. I got everything re-installed quite quickly and was back to being productive. I like Gnome much better than Unity and I haven’t had any problems. Similarly, I think VMWare works a bit better than VirtualBox, and I think I get a bit better performance in this configuration.

I have Python along with all the Python scientific and numerical computing libraries working. I have TensorFlow working. I spend most of my time in Terminal windows and the IDLE3 IDE, but occasionally use FireFox and some of the other programs pre-installed with the distribution.


I’m greatly enjoying working with Linux again, and I’m considering replacing my currently broken desktop computer with something inexpensive natively running Linux. I haven’t really enjoyed the direction Windows has taken after Windows 7 and I’m thinking of perhaps doing most of my computing on Linux and MacOS.


I am enjoying using Linux again, in spite of my initial problems with Ubuntu’s Unity Desktop and then with Fedora (running TensorFlow). Now that I have a good system that seems to be stable and working well, I’m pretty happy with it. I’m also glad to be free of things like App stores and it’s nice to feel in control of my environment when running Linux. Anyway, this was the small first step to TensorFlow.

Written by smist08

August 23, 2016 at 11:40 pm

UI Testing in Swift



To round out my blog series on an introduction to Swift, this posting will be covering UI Testing. Previously we created a simple Swift program to draw a Koch Snowflake, adding some unit tests and then added some performance tests.

The source code for the project is on Google Drive here.

UI Testing actually runs the program like an end user would run it, and if you switch to the simulator while the test is running you can watch these actions take place. Unlike many other UI testing frameworks, this one just interacts with the screen controls; if done properly, there is no code doing things at specific (x,y) co-ordinates. The magic that makes this work is the iOS accessibility layer that was created to help people with disabilities use Apple products. For instance, the VoiceOver feature that reads the screen needs to interact with the controls in the same way as our UI Tests.

This then means that UI Tests also provide a good means for testing some of the accessibility aspects of our iOS applications. Fully supporting accessibility is an often neglected area and really deserves more consideration. The great thing here is that by making your UI Tests thorough you are also validating that many accessibility technologies will also work.

UI Testing in XCode

When you create a new Swift project in XCode and select unit testing, you also get a skeletal group for UI Tests with some setup and a dummy test. You create your test by selecting an empty (or not) test, pressing record and then performing the test manually. When you close the simulator, a bunch of recorded code will be pasted into your project. This is a great starting point for writing more thorough tests. You then use all the same XCTAssert type functions as in the unit testing framework to check for problems.

[Screenshot: recording a UI test in XCode]


Not Having Accessibility Setup Correctly

If you haven’t set an accessibility identifier for your control, you won’t get the correct code recorded. Recording will try its best, but it will give you something that probably won’t work. This happened to me; I kept the bad code from the first attempt in the file, commented out, so you can see it. Generally, if the accessibility is set up right, the code is simple, complete and will work. If not, you will find some things you did don’t get recorded and other things hit hardware or synchronization problems (strange errors which, if you Google them, have workarounds, but it all becomes quite complicated).

[Screenshot: the recorded test code, including the commented-out bad first attempt]

Keyboards and other Hardware

I performed my tests on my MacBook which of course has a fixed keyboard. When recording tests, make sure you use the iOS keyboard (that is on the screen). Generally, you want the tests to use all the iOS stuff and not the macOS stuff which makes using the simulator easier. Another approach is to access text fields via the clipboard using cut/paste so as to avoid the keyboard entirely. I tend to think for a good UI test you should test all the cases, but perhaps not on every text field. Also beware text already in text boxes that may need to be cleared first. One way to do this (probably the best way) is to add a clear button in the text boxes properties and then press this. In the recorded sample I hit the delete key a couple of times. Note that tapping a field usually doesn’t select all the text.


Beware that if you cause something to pop up or be created, chances are your test code will run faster than that and start using things before they are created. You will need to add wait loops to wait for controls to exist before using them. This case doesn’t happen in the Koch snowflake program. Generally, you don’t want to insert sleep type statements to wait a couple of seconds; these slow down your UI tests, can prove unreliable, and lead to investigating a lot of falsely failed tests. It’s always better to look for specific events and to proceed quickly.

The Test

The code for the test is below. The setUp and tearDown methods were generated by XCode and I didn’t change them. The code for the testExample routine was generated by recording, then I cleaned up a bit of noise. The intent is that it sets the fractal level to 3 and then to 4. If you watch the simulator while the test is running, you can see this happen. Unfortunately, there isn’t really a good way to validate that the drawing is correct, so this is really only a “runs without crashing” sort of test, unless you observe it manually.

//  KochSnowFlakeUITests.swift
//  KochSnowFlakeUITests
//  Created by Stephen Smith on 2016-05-13.
//  Copyright © 2016 Stephen Smith. All rights reserved.

import XCTest

class KochSnowFlakeUITests: XCTestCase {

    override func setUp() {
        super.setUp()

        // Put setup code here. This method is called before the invocation of each
        // test method in the class.
        // In UI tests it is usually best to stop immediately when a failure occurs.
        continueAfterFailure = false

        // UI tests must launch the application that they test.
        // Doing this in setup will make sure it happens for each test method.
        XCUIApplication().launch()

        // In UI tests it's important to set the initial state -
        // such as interface orientation - required for your tests before they run.
        // The setUp method is a good place to do this.
    }

    override func tearDown() {
        // Put teardown code here. This method is called after the invocation of
        // each test method in the class.
        super.tearDown()
    }

    func testExample() {
        let app = XCUIApplication()

        // Set the fractal level to 3 and then to 4. The accessibility identifier
        // "fractalLevelTextField" is assumed to be what was set on the text field
        // (the recorded interaction was lost in extraction and is reconstructed
        // here); the delete taps clear the text already in the field.
        let fractalLevelTextField = app.textFields["fractalLevelTextField"]
        let deleteKey = app.keys["delete"]
        let returnButton = app.buttons["Return"]

        fractalLevelTextField.tap()
        deleteKey.tap()
        fractalLevelTextField.typeText("3")
        returnButton.tap()

        fractalLevelTextField.tap()
        deleteKey.tap()
        fractalLevelTextField.typeText("4")
        returnButton.tap()
    }
}


The UI testing support built into XCode and Swift is quite nice, certainly comparable to some quite expensive packages available in the Windows world. Since iOS and macOS are a controlled environment and the accessibility support is good, the whole package works well. The main thing to watch out for is the proliferation of Apple hardware to check. It appears that going forward, Apple is spending quite a bit of time ensuring automated testing works well on their development platform.

Written by smist08

June 15, 2016 at 4:09 pm

Posted in Mobility, programming


Performance Testing in Swift

with one comment


A couple of blog posts ago I covered writing my first Swift program for iOS so that I could draw a Koch Snowflake on an iPad or an iPhone. Then last time I covered adding unit tests to that project. This time I’m going to add performance tests.

In the process of adding performance tests, I had to refactor the test project and we’ll also look at why that was and how it makes it better going forwards as more tests are added. I’ll also mention a few things that should be done if this project gets a bit bigger.

I put an updated version of the Koch Snowflake project on Google Drive here.

Performance Tests in XCode

Of course you could instrument your program yourself and perhaps write the performance results out to a file or something; for that matter, you can drill down into the Swift test case class and have a look at its implementation. But XCode gives you enough support that you generally don’t need to. If you add self.measureBlock {} around code in a unit test, then the time taken by the code inside the measureBlock will be recorded and reported inside XCode, as shown in the following screenshot.

[Screenshot: performance test timings reported in XCode]

Actually it does a bit more than that. When you add measureBlock to a unit test, then when you run that unit test, it won’t just be run once, but will be run ten times, so that the average and standard deviation will be recorded. Due to this it is crucial that any performance tests are idempotent. You can also set a baseline, so the percentage deviation from the baseline gets recorded. This is shown in the following screenshot that is a drill down from the previous screen shot.

[Screenshot: drill-down showing the average, standard deviation and baseline for a performance test]

Hence XCode gives a fairly painless way to add some performance metrics to your unit tests.

Test Case Organization

Generally, you want your unit tests to run against every build of your product, so you want them to run in a second or two. Once the performance tests get longer, you will probably want to separate them off into a separate test group and then run that group perhaps once over night. I haven’t done that, but if the project gets any bigger then I will.

In fact, the test framework inside XCode is quite good for performing integration tests (which would run against real databases and real servers), but since these may require some setup or be quite time consuming, you could also set these to run once per night.

There is also a separate framework for UI testing, which again is too slow to run against every build, but makes sense to run every night.

Refactoring the Unit Tests

For the performance test, I wanted to record the time it takes to draw the Koch Snowflake at various fractal levels. To do this I wanted something like the previous testInitialViewController routine, but it contained a lot of setup code. The unit test framework includes a setUp function that is called before each unit test is run and a tearDown routine that is called after it finishes, so I moved the creation of the graphics context to setUp, along with the code to get the view controller started. Then it was fairly easy to add tests for fractal levels 3 through 7.

Last time I just had 2 unit tests, each was quite large and performed multiple things. Now we’ve split things up into more unit tests that do less, which is generally a better practice. This was actually forced on me since you can only have one measureBlock in any unit test, so I couldn’t performance test the different fractal levels in the same unit test (at least with separate timings). Really I should break up the turtle graphics tests into multiple unit tests, perhaps next time.

The reason I went all the way to fractal level 7 was that the performance reports in XCode show only 2 (or sometimes 3) decimal places of the number of seconds a test takes. For my fractal, the drawing is quite quick, so I needed to go this high to get some longer test times recorded (kind of a good problem to have). I could have gone higher or put the drawing in an additional loop, but thought this was sufficient.

//  KochSnowFlakeTests.swift
//  KochSnowFlakeTests
//  Created by Stephen Smith on 2016-05-13.
//  Copyright © 2016 Stephen Smith. All rights reserved.

import XCTest

@testable import KochSnowFlake

class KochSnowFlakeTests: XCTestCase {
    var storyboard:UIStoryboard!
    var viewController:ViewController!

    override func setUp() {
        super.setUp()
        // Put setup code here. This method is called before the invocation of each test method in the class.

        // The turtle graphics and drawing tests need a valid graphics context.
        UIGraphicsBeginImageContextWithOptions(CGSize(width: 50, height: 50), false, 20);

        // Load the view controller from the storyboard and force its view to load.
        self.storyboard = UIStoryboard(name: "Main", bundle: nil)
        self.viewController = storyboard.instantiateInitialViewController() as! ViewController
        _ = viewController.view
    }

    override func tearDown() {
        // Put teardown code here. This method is called after the invocation of each test method in the class.
        UIGraphicsEndImageContext()
        super.tearDown()
    }

    func testTurtleGraphics() {
        // Test the turtle graphics library.
        // Note we need a valid graphics context to do this.
        let context = UIGraphicsGetCurrentContext();
        let tg = TurtleGraphics(inContext: context!);
        XCTAssert(tg.x == 50, "Initial X value should be 50");
        XCTAssertEqual(tg.y, 150, "Initial Y value should be 150");
        XCTAssertEqual(tg.angle, 0, "Initial angle should be 0");

        // The move and turn calls below are reconstructed; the assertions imply
        // them. Move 10 along the current (0 degree) heading.
        tg.move(10)
        XCTAssertEqual(tg.x, 60, "X should be incremented to 60");
        XCTAssertEqual(tg.y, 150, "Y value should still be 150");
        XCTAssertEqual(tg.angle, 0, "Angle should still be 0");

        // Turn to 90 degrees and move 10; y increases by 10.
        tg.turn(90)
        tg.move(10)
        XCTAssertEqualWithAccuracy(tg.x, 60, accuracy: 0.0001, "X should still be 60");
        XCTAssertEqualWithAccuracy(tg.y, 160, accuracy: 0.0001, "Y value should be 160");
        XCTAssertEqual(tg.angle, 90, "Angle should be 90");

        // Turn back to 45 degrees and move 10; x and y each increase by 10*sqrt(2)/2.
        tg.turn(-45)
        tg.move(10)
        XCTAssertEqualWithAccuracy(tg.x, 60 + 10 * sqrt(2) / 2, accuracy: 0.0001, "X should be 60+10*sqrt(2)/2");
        XCTAssertEqualWithAccuracy(tg.y, 160 + 10 * sqrt(2) / 2, accuracy: 0.0001, "Y value should be 160+10*sqrt(2)/2");
        XCTAssertEqual(tg.angle, 45, "Angle should be 45");
    }

    func testPerformanceLevel3() {
        // Test that the storyboard is connected to the view controller and
        // that we can create and use the view and controls.
        viewController.fractalLevelTextField.text = "3"
        self.measureBlock {
            self.viewController.fracView.drawRect(CGRect(x:0, y:0, width: 50, height: 50))
        }
        XCTAssertTrue(viewController.fracView.level == 3)
        // (The original had one further call here just to get 100% code coverage.)
    }

    func testPerformanceLevel4() {
        // This is an example of a performance test case.
        viewController.fractalLevelTextField.text = "4"
        self.measureBlock {
            self.viewController.fracView.drawRect(CGRect(x:0, y:0, width: 50, height: 50))
        }
    }

    func testPerformanceLevel5() {
        viewController.fractalLevelTextField.text = "5"
        self.measureBlock {
            self.viewController.fracView.drawRect(CGRect(x:0, y:0, width: 50, height: 50))
        }
    }

    func testPerformanceLevel6() {
        viewController.fractalLevelTextField.text = "6"
        self.measureBlock {
            self.viewController.fracView.drawRect(CGRect(x:0, y:0, width: 50, height: 50))
        }
    }

    func testPerformanceLevel7() {
        viewController.fractalLevelTextField.text = "7"
        self.measureBlock {
            self.viewController.fracView.drawRect(CGRect(x:0, y:0, width: 50, height: 50))
        }
    }
}


I found adding performance tests to my fractal iOS application quite easy. XCode gives quite nice support to perform these tests painlessly, hopefully motivating more programmers to include them.

At this point I’m not going to optimize the code as it is running fast enough. But if I ever take on drawing more sophisticated or complicated fractals, then drawing speed becomes really important. Some things to consider would be how efficient the recursive algorithm is, and whether I’m using floating point and integer arithmetic efficiently (or whether there are unnecessary conversions or perhaps too much precision being used).

Written by smist08

June 7, 2016 at 2:15 am