Stephen Smith's Blog

Musings on Machine Learning…

Archive for June 2017

Learning in Brains and Computers


In the last couple of articles we considered whether the brain is a computer and then what its operating system looks like. In this article we’ll look at how the brain learns and compare that to how learning works in a modern AI system. As we noted before, our DNA doesn’t contain much seed data for the brain; nearly everything we know needs to be learned. This tends to be why, as animals become more and more advanced, their childhoods become longer and longer. Besides growing to our full size, we also need that time to learn what we will need to survive on our own as adults when we leave our parents. Similarly, AI systems start without any knowledge, just a seed of random data, and then we train them so they can do the job we desire, like driving a car. However, how we train an AI system is quite different from how we train a child, though there are similarities.

How the Brain Learns

For our purposes we are looking at what happens at the neuron level during learning, rather than considering higher level theories of educational methods. As we are trained, two things happen. One is that when something is reinforced, the neural connections involved are strengthened; similarly, if a pathway isn’t used, it weakens over time. This is controlled by the amount of chemical transmitter at the junction between neurons. The other is that neurons grow new connections. To some degree the brain is always re-wiring itself by growing new connections. As mentioned before, thousands of neurons die each day and not all of them are replaced, so as we age we have fewer neurons. But this is counterbalanced by a lifetime of learning in which we have continuously grown new neural connections, so as we age we may have fewer neurons, but far more neural connections. This is partly why staying mentally active and pursuing lifelong learning is so important for maintaining mental health into older age.

Interestingly, this is also how memory works. The same neural strength adjustment and connection growth is how we encode memories. The system is a bit more complex since we have a short term memory system from which some data is later encoded into long term memory, but the basic mechanisms are the same. This is also why we forget things: if we don’t access a memory, the neural connections will weaken over time and eventually the memory is lost.

A further feature of biological learning is how the feedback loop works. We get information through our senses and can use that for learning, but it’s been shown that learning is much more effective if it leads to action, and the action in turn provides feedback. For instance, being shown a picture of a dog and told it’s a dog is far less effective than being given a dog you can interact with, by touching and petting. Having exploratory action attached to learning appears to make learning far more effective, especially at young ages. We could call this an input – learn – action loop with feedback, rather than just an input – learn loop.

How AIs Learn

Let’s look specifically at Neural Networks, which have a lot of similarities with the brain. In this case we represent all the connections between neurons as weights in a matrix, where zero represents no connection and a non-zero weight represents a connection that we can strengthen by making it larger or weaken by making it smaller.
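
As a rough sketch in Python (with NumPy, and with made-up sizes and weights), the connections from three input neurons to two output neurons can be stored as a 3 x 2 weight matrix, where a zero entry means no connection:

    import numpy as np

    # Connections from 3 input neurons to 2 output neurons.
    # A zero weight means "no connection"; larger magnitudes
    # mean stronger connections.
    weights = np.array([[ 0.5, 0.0],
                        [-0.3, 0.8],
                        [ 0.0, 0.1]])

    inputs = np.array([1.0, 0.0, 1.0])   # activity of the input neurons
    outputs = inputs @ weights           # each output sums its weighted inputs
    print(outputs)                       # [0.5 0.1]

Strengthening or weakening a connection is then just changing one number in this matrix.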

To train a Neural Network we need a set of data where we know the answers. Suppose we want to train a Neural Network to recognize handwritten numbers. What we need is a large database of images of handwritten digits along with the number each image represents. We then seed the Neural Network with random weights, feed each image through it, and compare its output to the correct answer. Sophisticated algorithms like Stochastic Gradient Descent adjust the weights in the matrix to produce better results. If we do this enough, we can get very good results from our Neural Network. We often apply some other adjustments as well, such as setting small weights to zero so they really don’t represent a connection, or penalizing large weights, since these lead to overfitting.
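
Below is a toy sketch of that training loop in Python. The data here is random stand-in data rather than real digit images, and the “network” is just a single layer, but the shape of the process is the same: seed random weights, feed images through, compare to the correct answers, and let Stochastic Gradient Descent nudge the weights:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    # Stand-in data: each row plays the role of a flattened 28x28 image,
    # and labels play the role of the correct digit (0-9) for each image.
    rng = np.random.default_rng(0)
    images = rng.random((1000, 784))
    labels = rng.integers(0, 10, size=1000)

    W = rng.normal(0, 0.01, (784, 10))    # seed with small random weights
    learning_rate = 0.1

    for epoch in range(10):
        for i in range(0, len(images), 32):      # mini-batches of 32
            x, y = images[i:i+32], labels[i:i+32]
            probs = softmax(x @ W)               # feed images through the network
            probs[np.arange(len(y)), y] -= 1     # gradient of the cross-entropy error
            grad = x.T @ probs / len(y)
            W -= learning_rate * grad            # adjust weights to reduce the error

A real system would add more layers and use a framework like TensorFlow, but the core loop of feed-forward, compare, adjust is exactly this.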

This may seem like a lot of work, and it is, but it can be done in a few hours or days on a fast modern computer, using GPUs if necessary to speed things up. It relies on the fact that we can adjust weights instantly, since they are just floating point numbers in a matrix, unlike the brain, which needs to make structural changes to its neurons.

A Comparison

Effectively training a Neural Network to recognize the handwritten decimal digits (0-9) requires a training database of around 100,000 images. One of the reasons AI has become so successful in recent years is the creation of many such huge databases for training.

Although it might feel like it to a parent, it doesn’t require showing a toddler 100,000 images for them to learn their basic numbers. What it does take is more time and a certain amount of repetition. Also the effectiveness is increased if the child can handle the digits (like with blocks) or draw the digits with crayons.

It does take longer to train a toddler than an AI, but this is largely because growing neural connections is a slower process than executing an algorithm on a fast computer which doesn’t have any other distractions. But the toddler will quickly become more effective at performing the task than the AI.

Comparing learning to recognize digits like this may not be a fair comparison, since the toddler is first learning to distinguish objects in their visual field and to recognize objects when they are rotated and seen from different angles. So the input into learning digits for a brain probably isn’t a set of pixels straight off the optic nerve; the brain will already have applied a number of previously learned algorithms to present a higher level representation of the digit before being asked to identify it. In the same way, perhaps our AI algorithms for identifying digits in isolation from pixelated images are useful for AI applications, but aren’t useful on the road to true intelligence, and perhaps we shouldn’t be using these algorithms in so much isolation. We won’t start approaching strong AI until we get many more of these systems working together. For instance, in self driving cars the system has to break a scene up into separate objects before trying to identify them, and building such a system requires several Neural Networks working together.

Is AI Learning Wrong?

It would appear that the learning algorithm used by the toddler is far superior to the learning algorithm used in the computer. The toddler learns quite quickly from just a few examples, and the quality of the result often beats that of a Neural Network. The algorithms used in AI, like Stochastic Gradient Descent, tend to be very brute force: find new values of the weights that reduce the error, then keep iterating, reducing the error until you get a good enough result. If you don’t get a good enough result, fiddle with the model and try again (we now have meta-algorithms to fiddle with the model for us as well). But is this really the right approach? It is certainly effective, but it seems to lack elegance, and it doesn’t work in as varied circumstances as biological learning does. Is there a more elegant and efficient learning algorithm just waiting to be discovered?
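
In symbols, the whole brute force trick is one update rule applied over and over. Each iteration nudges the weights w a small step against the gradient of the error:

    w ← w − η ∇L(w)

where L(w) is the error on a batch of training samples and η is a small learning rate. Repeat that millions of times and you have modern machine learning.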

Some argue that a passive AI will never work, that an AI needs a way to manipulate its world in order to add that action feedback loop to the learning process. This could well be the case. After all, we are training our AI to recognize a bunch of pixels out of context and in isolation. If you add the action feedback, you can handle and manipulate a digit to see it from different angles and orientations. This way you get far more benefit from each individual training case, rather than relying on brute force and millions of separate samples.
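
A crude computational stand-in for “handling” a digit is data augmentation: generating rotated and shifted variants of each training image so that one example teaches more. A minimal sketch, assuming image is a 28 x 28 NumPy array and using SciPy’s image routines:

    import numpy as np
    from scipy.ndimage import rotate, shift

    def augment(image, rng):
        """Return a randomly rotated and shifted copy of a digit image."""
        angle = rng.uniform(-15, 15)           # small random rotation
        img = rotate(image, angle, reshape=False, mode="constant")
        dx, dy = rng.uniform(-2, 2, size=2)    # small random translation
        return shift(img, (dy, dx), mode="constant")

    rng = np.random.default_rng(0)
    image = rng.random((28, 28))               # stand-in for a real digit image
    variants = [augment(image, rng) for _ in range(5)]

This is still a long way from true interaction with the world, but it shows how manipulating the training data squeezes more learning out of each example.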


There are a lot of similarities in how the brain learns versus how we train AIs, but there are also fundamental differences. AIs rely much more on brute force and on the volume of training data, whereas the brain requires fewer examples but makes much more out of each one. For AI to advance we really need to be building systems of multiple Neural Networks rather than focusing so much on individual applications; we are starting to see this take shape in applications like self-driving cars. We also need to give AIs a way to manipulate their environment, even if this just means manipulating the images they are given as training data and incorporating that manipulation into the training algorithms, making them much more effective and less reliant on sheer data volume. I also think biological brains are hiding some algorithmic tricks that we still need to learn, and that these learning improvements will make progress advance in leaps and bounds.


Written by smist08

June 23, 2017 at 6:49 pm

The Brain’s Operating System



Last time we posited that the human brain, and in fact any biological brain, is really a type of computer. If that is so, then how is it programmed? What is its operating system? What are the algorithms that facilitate learning and intelligent action? In this article we’ll start to look at some of the properties of this biological operating system and how it compares to a modern computer operating system like Windows.


You can install Windows from a DVD that contains about 3.8 gigabytes of information. You install it on a computer whose construction required a huge specification of its own, including the microprocessor, BIOS, memory and all the other components. So the amount of information required for the combined computer and operating system is far greater than 3.8 gigabytes.

The human genome, which is like the DVD for a human, is used both to construct the physical human and to provide any initial programming for the brain. The genome contains only 3.2 gigabytes of information, and this has to cover building the heart, liver, kidneys, legs, eyes and ears as well as the brain, plus any initial information stored in the brain.

Compared to modern computer operating systems, video games and ERP systems, this is an amazingly compact specification for something as complex as a human body.

This is partly why higher mammals like humans require so much learning as children. The amount of initial information we are born with is very small and limited to the basest survival requirements like knowing how to breathe and eat. Plus perhaps a few primal reflexes like a fear of snakes.


Windows runs on very reliable hardware and yet still crashes. The equivalent of a Blue Screen of Death (BSOD) in a human or animal would likely be fatal in the wild. In diseases like epilepsy an epileptic fit could be considered a biological BSOD, but in most healthy humans this doesn’t happen. You could also argue that if you reboot your computer every night your chance of a BSOD is much lower, and similarly a human who sleeps every night works much more reliably than one who doesn’t.

The brain has a further challenge: neural cells are dying all the time. It’s estimated that about 9000 neurons die each day, yet the brain keeps functioning quite well in spite of this. It was originally thought that we were born with all our neurons and that after that they just died off, but more modern research has shown that new neurons are in fact produced, though not uniformly across the brain. Imagine how well Windows would run if 9000 transistors stopped working in your CPU every day and a few random new transistors were added every now and then. How many BSODs would that cause? It’s a huge testament to the brain’s operating system that it can run so reliably for so long under these conditions.

It is believed that the brain’s algorithms must be quite simple to operate under these conditions, and able to easily adapt to changing configurations. Interestingly, in training Neural Networks it’s been found that removing and adding neurons during the training process reduces overfitting and produces better results, even on perfectly reliable hardware. We talked about this “dropout” technique in this article about TensorFlow. So to some degree, perhaps this “unreliability” actually led to better intelligence in biological systems.
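
As a minimal sketch of the idea (inverted dropout, in plain NumPy rather than TensorFlow): during each training pass a random subset of neurons is simply silenced, a little like neurons dying off:

    import numpy as np

    def dropout(activations, keep_prob, rng):
        """Randomly zero out neurons during training (inverted dropout)."""
        mask = rng.random(activations.shape) < keep_prob
        return activations * mask / keep_prob   # rescale so the expected value is unchanged

    rng = np.random.default_rng(0)
    layer = rng.random(10)            # activations of a hidden layer
    print(dropout(layer, 0.8, rng))   # roughly 20% of neurons silenced this pass

Because a different random subset is dropped on every pass, no single neuron can be relied on too heavily, which is exactly what reduces overfitting.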


Most modern computers are roughly based on the von Neumann architecture, and if they do support parallelism it is through multiple von Neumann computers synchronized together (i.e. with multiple cores). The neurons in the brain don’t follow this architecture; they operate much more independently and in parallel than the logic gates in a computer do. This is more a limitation of the skill of the programmer than a constraint on how computer hardware can be put together. As a programmer, it’s hard enough to program and debug a computer that executes one instruction at a time in a simple linear fashion with simple flow of control statements; programming a system where everything happens at once is beyond my ability as a programmer. Part of this comes down to economics. Biology programmed the brain’s algorithms over millions of years using trial and error driven by natural selection, whereas I’m required to produce a new program in usually less than one year (now three months in Internet time). Obviously if I put in a project plan to produce a new highly parallel super ERP system in even ten years, it would be rejected as taking too long, never mind taking a million years.

The brain does have synchronization mechanisms; it isn’t a totally uncontrolled environment. Usually when we study biological systems they look mushy at first, but as we study them in more detail we find there is an elegant design to them and that they tend to build on modular building blocks. With the brain we are just starting to understand these building blocks and how they combine to make the whole.

In studying more primitive nervous systems (those without brains), we find two typical simple neural connections: one is a sensor connected directly to a single action, the other is a sensor connected to a coordinator that does something like activating two flippers together to swim straight. These simple coordinators are the start of the brain’s synchronization mechanisms that lead to much more sophisticated coordinated behaviour.

Recurrence and Memory

The brain is also recurrent: it feeds its output back in as input, iterating in a sense to reach a solution. There is also memory, though the brain’s memory is a bit different from a computer’s. Computers have a memory bank that is separate from the executing program; the program accesses the memory, but they are two separate systems. In the brain there is only one system, and the neurons act as both logic gates (instruction executors) and memory. To some degree you can do this in a computer, but it’s considered bad programming practice; as programmers we are trained never to hard code data in our programs, and it will be pointed out to us at any code review. The brain, however, does this as a matter of course. And unlike our computer programs, which need a programmer assigned to do a maintenance update, the brain can change this data dynamically whenever it likes.
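
A minimal sketch of this recurrence in Python (with made-up sizes and random weights): the network’s own previous output is mixed back in with each new input, so the same weights serve as both program and memory:

    import numpy as np

    rng = np.random.default_rng(0)
    W_in = rng.normal(0, 0.5, (4, 8))    # input-to-hidden connections
    W_rec = rng.normal(0, 0.5, (8, 8))   # hidden-to-hidden (the feedback loop)

    state = np.zeros(8)                  # the network's "memory"
    for step in range(5):
        x = rng.random(4)                           # new sensory input
        state = np.tanh(x @ W_in + state @ W_rec)   # output fed back in as input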

This leads to another fundamental difference between computers and humans: we update computers by re-installing software. The brain is never reinstalled; what you get initially is what you live with. It can learn and adapt, but never be reinstalled. This eliminates one of tech support’s main solutions to problems, namely reinstalling the operating system. The other tech support solution, turning the computer off and on again, is quite difficult with biological brains and involves a defibrillator.


This article was an initial look at the similarities and differences between the brain and a modern computer. We looked at a few properties of the brain’s operating system and how they compare to something like Windows. Next time we’ll start to look at how the brain is programmed, namely how we learn.


Written by smist08

June 16, 2017 at 6:52 pm

Is the Brain Really a Computer?



There is a lot of debate about whether the human brain is really a computer, or something more than a computer, or something quite different from a computer. In this article I’m going to look at some of these arguments, many of them positing behaviours of the brain that are claimed to be impossible for a computer to exhibit.

Some of the arguments tend to be based on a need for humans to somehow be special, similar to the passion of people who stuck to the idea that the Earth was the center of the universe because we were somehow special and they couldn’t bear the idea that we were located on one insignificant planet orbiting one of billions of suns in our galaxy in a universe of billions of galaxies.

Other arguments are based around human behaviours like humour, saying it would be impossible to program a computer to create or really appreciate humour.

We’ll look at some of these arguments and consider them in the context of what we’ve been looking at in complex emergent behaviour of simple iterated systems.

The Brain Looks Like a Computer

As biologists study the workings of the brain, it looks very structurally similar to a modern computer, in the sense that a neuron has a number of inputs through synapses and dendrites that conduct input signals into the cell body, which then performs a summing and limiting function to decide whether to fire an output signal through the axon into other neurons. This structure is very similar to the basic logic gates that modern processing units are composed of. It also seems like a very simple and logical comparison, and often the simplest and most straightforward theory is the correct one.
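
That summing and limiting behaviour is easy to caricature in code. Here is a sketch of a single artificial neuron in Python, with made-up weights; it fires only if the weighted sum of its inputs exceeds a threshold:

    import numpy as np

    def neuron(inputs, weights, threshold):
        """Sum the weighted inputs and fire (output 1) only past a threshold."""
        return 1 if np.dot(inputs, weights) > threshold else 0

    # Three synapses with different strengths.
    print(neuron([1, 0, 1], [0.4, 0.9, 0.3], threshold=0.5))  # fires: 0.7 > 0.5
    print(neuron([0, 1, 0], [0.4, 0.9, 0.3], threshold=1.0))  # silent: 0.9 < 1.0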

Emotional Computers

One argument against the brain being a computer is that computers are logical and not emotional. How could a computer program be humorous? How could a computer program appreciate humor? How could a computer program ever be jealous? A lot of these arguments were once used to highlight how humans are different from animals, with the claim being that animals never find anything funny or exhibit jealousy; that these are strictly human traits and show how we are special and fundamentally different from animals. However, modern animal research shows that animals do exhibit these behaviours and that we aren’t special in these regards. In fact anyone who owns two or more dogs will certainly see a lot of jealousy exhibited, and any dog owner knows that dogs do find some things exceedingly funny. I think the people who promote these ideas have put on blinders, and have some deep down need to be special, to avoid all the rather clear evidence to the contrary.

There is now a branch of AI that is looking to add emotion to computer systems, so that personal assistants can be humorous and can understand and take into account our emotional state, making them better assistants. I tend to think that, long term, this forcing of emotion into chat-bots and such is unnecessary, and that as these programs become more complex we will see emotions start to surface as emergent properties, like some of the emergent behaviour we talked about here and here.

Quantum Complexity

Another argument is that the billions of neurons in the brain would amount to a computer if they worked only electrically and chemically, but that this wouldn’t be enough to produce human intelligence. The argument here is that neurons hide in their structure small constructs that operate at the quantum level, and that these combine to form some sort of new, much more powerful computing structure that might be like a computer or might not. If it is like a computer, then it’s many orders of magnitude more complex than current computer hardware, so AI can’t be anywhere close yet; or else the quantum nature of these behaviours is beyond a Turing machine and much more powerful.

The problem with this argument is that neurons have been studied in great depth by biologists and nothing like this has been found. Further, neurons don’t contain any way to network or communicate such quantum processes with other cells. And we’ve studied and simulated much simpler life forms that have just a few neurons and managed to accurately simulate their behaviour, indicating that we have a fairly good idea of how neurons work.

I think these arguments tend to be blind to how complex a few billion neurons already are, and how complex the emergent properties of such a system can be.

Something Undiscovered

Perhaps a more religious argument is that there is some force or dimension that science hasn’t discovered; perhaps intelligence doesn’t reside entirely in the brain, but in something like a soul, and it is having this soul that leads to human level intelligence. Religious thinkers started grappling with this argument back in the 1600s, where it was usually referred to as Cartesian Dualism. We understand how the neurons in the brain control the body through our nervous system; the question becomes, how does the soul interact with or affect the brain?

What science has shown is that if the interaction were through a known force like electromagnetism or the weak nuclear force, then we would be able to detect it in action, and it has never been observed. What is then posited is that it must be via a force science hasn’t discovered yet. However, quantum field theory eliminates this possibility. There can certainly be undiscovered forces, but thanks to experiments in devices like the Large Hadron Collider we know that any undiscovered force strong enough to do this job would be detectable, with interactions like nuclear explosions going off (i.e. very hard to miss). This is because if a force interacts with a particle like an electron, quantum field theory says you can produce the carrier particle for that force by crashing an electron into an anti-electron (positron) with sufficient energy. We’ve now done this with all the known particles to very high energy levels, so we know there is no low energy unknown force that could be doing this. Incidentally, this is basically the same argument used to argue that life after death is impossible, because we would be able to detect something at the point of death.


As biologists study the brain, it does appear that the brain acts like a computer, and as our studies get more and more detailed we are steadily eliminating the contending theories. Further, being a computer doesn’t limit us in any way, because we know how complex and amazing emergent behaviour can be when simple systems are iterated.


Written by smist08

June 14, 2017 at 9:05 pm

Posted in Artificial Intelligence


Intelligence Through Emergent Behaviour – Part 2



Last time we looked at a simple physical dynamical system, namely Taylor-Couette fluid flow, where a very simple experiment led to more and more complicated solutions as a single parameter, the speed of the inner cylinder, was varied. In this article we are going to look at an example from mathematics and an example from computer science. What we are interested in is how very complicated behaviour results from very simply stated problems. We are looking for insight into how something as complicated as human intelligence can result from the simple behaviour of billions of neurons, or, for that matter, whether intelligence can arise from the simple behaviour of billions of logic gates in a modern computer. The amazing thing is that as we study nature we see this phenomenon more and more, whether it’s fractals appearing in nature or the chaotic behaviour of ecological systems. It turns out that much of the richness and complexity of our environment can result from a few very simple rules.

The Mandelbrot Set

Everyone has seen fantastic images of the Mandelbrot Set and its endlessly detailed zoom-ins. To see them along with more fabulous graphics, have a look at the Wikipedia article here.

The definition of the Mandelbrot Set is remarkably simple. It is the set of complex numbers c for which the iterated quadratic map

z_{n+1} = z_n^2 + c

starting from z_0 = 0 remains bounded rather than escaping to infinity. In the graphics, black represents the points that remain bounded, and the colors indicate how fast the other points diverge. The Mandelbrot Set is a fractal, meaning that as you zoom in on it you see similar structures recurring at all magnifications. The key point is that we get this infinite complexity out of such a simple defining equation. We are used to simple formulas like quadratics leading to simple, predictable behaviour like parabolas; however, once you start iterating simple formulas you get this amazingly rich complexity.
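
A minimal escape-time sketch in Python: for each point c, iterate the map starting from z = 0 and record how many steps it takes for |z| to exceed 2 (once it does, the orbit is guaranteed to diverge); points that never escape are in the set:

    import numpy as np

    def mandelbrot(c, max_iter=100):
        """Return the number of iterations before z escapes, or max_iter."""
        z = 0
        for n in range(max_iter):
            z = z * z + c
            if abs(z) > 2:       # once |z| > 2 the orbit is guaranteed to diverge
                return n
        return max_iter

    # Sample a grid of complex values of c and compute escape times;
    # points scoring max_iter are (approximately) in the Mandelbrot Set.
    xs = np.linspace(-2.0, 0.5, 80)
    ys = np.linspace(-1.25, 1.25, 40)
    image = np.array([[mandelbrot(complex(x, y)) for x in xs] for y in ys])

Colouring each pixel by its escape count produces exactly the familiar pictures.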

The Game of Life

Another well known example of complex behaviour arising from very simple rules is John Conway’s Game of Life. The definition of the game is quite simple, so I’ll just quote the Wikipedia entry:

“The universe of the Game of Life is an infinite two-dimensional orthogonal grid of square cells, each of which is in one of two possible states, alive or dead, or “populated” or “unpopulated”. Every cell interacts with its eight neighbours, which are the cells that are horizontally, vertically, or diagonally adjacent. At each step in time, the following transitions occur:

1. Any live cell with fewer than two live neighbours dies, as if caused by underpopulation.
2. Any live cell with two or three live neighbours lives on to the next generation.
3. Any live cell with more than three live neighbours dies, as if by overpopulation.
4. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.

The initial pattern constitutes the seed of the system. The first generation is created by applying the above rules simultaneously to every cell in the seed—births and deaths occur simultaneously, and the discrete moment at which this happens is sometimes called a tick (in other words, each generation is a pure function of the preceding one). The rules continue to be applied repeatedly to create further generations.”

For lots of examples and images have a look at the full Wikipedia article here. [Animated gif: a set pattern spawning gliders that fly off diagonally.]
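
Here is a minimal sketch of one generation of the game in Python, implementing the four rules quoted above. The real game is played on an infinite grid; this sketch uses a small finite grid whose edges wrap around:

    import numpy as np

    def step(grid):
        """Apply Conway's rules once; grid is a 2D array of 0s and 1s."""
        # Count each cell's eight neighbours by summing shifted copies
        # of the grid (np.roll wraps around the edges).
        neighbours = sum(np.roll(np.roll(grid, dy, axis=0), dx, axis=1)
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                         if (dy, dx) != (0, 0))
        # A live cell survives with 2 or 3 neighbours; a dead cell with
        # exactly 3 neighbours becomes alive; everything else dies.
        return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(int)

    glider = np.zeros((10, 10), dtype=int)
    glider[1, 2] = glider[2, 3] = glider[3, 1] = glider[3, 2] = glider[3, 3] = 1
    for _ in range(4):
        glider = step(glider)   # the glider moves one cell diagonally every 4 steps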

Some starting patterns just die out; others are extremely chaotic. Some people tried to “solve” the Game of Life, i.e. given an initial configuration, find a formula that predicts what will happen without running the full simulation. But it was shown that you can actually build a Universal Turing Machine inside the game, and hence solving this is equivalent to the Halting Problem and is therefore impossible.


The Mandelbrot Set is one example of many where a very simple problem statement leads to infinitely complex solutions. The Game of Life shows how another simple statement leads to extremely complex and unpredictable behaviour. Further, since you can create a Universal Turing Machine in the Game of Life, it is Turing complete, meaning you could perform any computation using the Game of Life. This partly shows how influential Turing’s work is, and how many things we may not think of as computers are in fact equivalent to computers.

Our brain consists of billions of neurons following very simple rules that get applied over and over in an iterative manner, and as we’ve seen, this can lead to very complicated, rich and stable patterns emerging. Our thesis, then, is that this is the foundation of intelligence.



Written by smist08

June 2, 2017 at 6:27 pm