
The world's most advanced version of Pong...
If you believe in Intelligent Design, this article will certainly offend your sensibilities. Today, we are going to take a look at machine learning and we have some example software to accompany the article.
Intelligent Design You Say
Proponents of Intelligent Design put forward the argument that complex systems and solutions cannot arise as the result of random variation over time. On the face of it, it seems like a valid argument as it can be difficult to resolve the complexity of the interplay between various systems and completely random activity.
In saying this, that does not mean that it is true. This belief generally arises due to the inability of a person or group to visualize how such a process functions. Not that the process does not function.
To demonstrate this I created a small program to demonstrate machine learning. You can download the application and source code here. You will need Visual Studio 2010 to compile the application from source, but you can run the application without it. Its written in C# so that we can focus on the program itself, rather than the handling and conversion of variables that we would have in C or C++. The compiled version targets x86 (32 bit platforms).
I thought we should keep this simple, so I created a clone of Pong, a very simple tennis type game from 1972 by Atari Inc.
The principle is simple. A ball bounces around the screen, at each edge there is a paddle which can move up and down. If the ball hits the paddle, it bounces back towards the opposite side of the screen. If it misses the paddle, the other player scores a point and the ball is reset.
We generally think that it takes intelligence to play a game like this and that intelligence, according to Intelligent Design proponents, is the product of yet another intelligence at work.
Let's prove that wrong shall we.
I modified the Pong clone replacing the player controls with an Artificial Intelligence. A neural network trained by a genetic algorithm.
For those following the source code, you can ignore everything except for Marx_Pong_AI.cs. The rest of the code merely creates Pong and passes information to this class. I have kept the program simple, polling data rather than raising events. This is to reduce complexity and make the code easier to follow. The class also shares data between instances to cope with two players, so there is some logic to determine which player is calling the class.
Neural Network
Neural networks are dumb. That's a very important point to note. The basic unit of a neural network is the Perceptron. The biological equivalent of the Perceptron is the Neuron. A Perceptron is very simple to implement. It takes a series of inputs, multiplies each of those inputs by a weight and sums the result. The weight is just a value that determines how much influence that input has to the overall equation.
Its a bit like mixing a cocktail. If each of the inputs are drinks, the weights control the flow of a particular drink and its contribution to the final cocktail.
The result is passed to an equation that determines the output. This can be done in a variety of ways. We could select a binary output by stating that values above a certain limit (or threshold) should take the value of 1 and anything else should take a value of 0.
We could also pass it to an equation that will reduce the value to between 0 and 1, providing all the ranges in between. For those following the code, have a look at the 'Hidden Network' under the Neural Network. You will see code similar to this:
private void hiddenNetwork1()
{
activation = (input1 * weight1) + (input2 * weight2)...
output = 1.0 / (1.0 + Math.Exp(-activation));
}
The output is in the range of 0 to 1 and we have six of these which makes up our neural network. Whilst all of these methods are receiving the same input values, they all have different weights which means they have different outputs. So, for any given series of inputs, we have six different outputs from the series of methods.
Another important point to note is the loss of information at the output stage. We cannot tell, by examining the output, what contribution each of the inputs had. This information loss has implications when it comes to adjusting the weights to refine the output. Being effectively blind after this stage, weight adjustment becomes more of a guess than science.
Whilst different configurations are possible, we shall focus on the configuration incorporated into Pong. This is a one layer, feed-forward, neural network. This is not important at this stage, nor even for the purposes of this article, I only include for those interested.
All of the outputs are passed to an output Perceptron which is just the same as the hidden network outlined earlier. The output of this Perceptron is reduced to a binary impulse (0 or 1) to indicate that the paddle should move left or right.
Initially, we flood this neural network with random weights. Every time the ball hits the paddle, we capture those weights and pass it the Genetic Algorithm. Once we have the weights of two successful hits, the Genetic Algorithm takes over the refining of the weights towards a successful solution.
By setting the weights to specific values, the output (left or right), is determined by the input. The neural network is therefore doing nothing special, it is simply reacting to the input.
Genetic Algorithm
A Genetic Algorithm is an algorithm that modifies it's input each time it is executed to arrive at a new solution. The mechanism of this modification should be driven by methods derived from principles of genetics. Namely, parents, children, crossover and mutation.
In the modified version of Pong that we are using as our example, initial parents are selected when the ball strikes a paddle. The rationale being that the weights used when the strike occurred contain part of a solution.
Once the two parents are populated, every time the A.I. misses the ball the existing weights are discarded and replaced with a variation of the parents. In Genetic Algorithm parlance, we create a new child and the existing child dies. A child maintains similar values in the parents, as these are most likely to represent good values, and randomly generates new values for the rest. If we compare this to a genetic code, we keep similar genes and mutate everything else.
To cope with the situation where all the weights are similar, but the ball keeps going out of play, we introduce the concept of death rate. A ball that keeps going out of play is similar to a population crash, where a high death rate of children reduces the numbers of a population.
In genetics, this is most like the process of inbreeding. Such events would indicate a population crash and major elements of the genetic code should be altered. We tend to look at the biological changes in some children from inbreeding as having birth defects. What is really happening is a major branching of the genetic code that gives rise to new species. We can see this in human DNA where the code can be traced by to a genetic Adam and Eve, or local variations such as Asian facial features, etc.
By default, the application will begin to mutate major elements it has kept if four balls are missed in a row. A solution is viable if it does not miss this value of balls.
We can say that it has learned to survive.
One interesting thing to note is that this 'evolution' process is not one way. A change in a major element could remove a successful strategy. Suddenly, the paddle will no longer hit the ball with a high success rate and effectively be reset to original state.
Further, aiming for perfection by lowering the death rate, to provoke faster mutations, is normally a recipe for disaster. This genetic algorithm is essentially brute forcing the weights to come to an acceptable solution. The odds of randomly generating a perfect solution, indeed if one is even possible given the inputs, is astronomically low.
The key is finding an acceptable survival rate, not perfection.
Using The Application
The application is straightforward. Once loaded it is in a paused state. By clicking “Control' and selecting 'Play', the application will begin and start developing a solution. Just let it run and it will soon learn to play by itself. This can take anywhere between 10 minutes to an hour. One side will usually learn faster than the other, that is perfectly normal. If they don't seem to be learning, restart which will populate the program with new values.
At any point, you can pause the application and save the weights of either the right or left player. If you want to play one file against another, start the application, disable Learning in the menu and load the files for left and right. Clicking 'Play' will then play the solutions against each other. I have included two solutions, one for each player, in the zip archive.
You can change the death rate to increase or decrease the rate of mutation. This is useful if you do not like a solution and want to provoke a change, or even to determine the effects of a high death rate.
To restart the learning process, just reopen the application.
The score is kept along the top and a percentage value of how successful the entire series of mutations is kept along the bottom. This bottom value also changes between green and red. A green value indicates that the current child is successful a red value indicates it is missing the ball. If you observe the value on the bottom being red for most of the time, it means that the failure rate is high, if it green most of the time it means it is successful.
Conclusion
This application demonstrates that to achieve intelligent outcomes, an originating intelligence is not required. We can go from random values and dumb structures to intelligent behavior just through random interactions.
Increasing complexity does not change this fact. Given enough time, enough permutations any complex behavior or system can randomly appear.
The next time you are tempted into thinking of Intelligent Design, remember this little example and that the universe is one huge random generator.
It was inevitable it would happen at some point.