- Source For This Article: Download source files and projects - 300 Kb
This article explains the concepts behind backward propagation neural networks in such a way that even a reader with no prior knowledge of neural networks can follow the required theory. The related project demonstrates the design and implementation of a fully working 'BackProp' neural network library, which I call the BrainNet library. You will find the theory, illustrations and concepts here, along with an explanation of the neural network library project. The full source code of the library and the related demo projects (a simple pattern detector, a handwriting detection pad, an XML-based neural network processing language, etc.) is in the associated zip file.
- 1. Overview
- 2. Before We Begin.
- 3. Understanding Neural Networks
- 4. How A Neural Network Actually Works
- 5. Designing BrainNet Neural Network Library
- 5.1. The UML Model
- 5.2. A Neuron In BrainNet Library
- 5.3. The Strategy Of A Neuron
- 5.4. A Neural Network In BrainNet library
- 5.5. Training The Network
- 5.6. Running The Network
- 5.7. Creating A Network
- What is Next
- Appendix A: Small Dose Of Spiritual Programming!!
- Solution Architect: "Well, you learned something about neural networks?"
- (Dumb?) Developer: "No, I'm smart enough. I love using others' code."
- Solution Architect: "But if you don't understand the concepts, how can you optimize and reuse others' code?"
- (Dumb?) Developer: "Err.. I feel that most others can code better than me, so why should I optimize?"
- Understand the basic theory behind neural networks (backward propagation neural networks in particular)
- Understand how neural networks actually 'work'
- Understand in more detail, the design and source code of BrainNet library.
- Understand in more detail, how to use BrainNet Library in your projects.
- Think about new possibilities of neural network programming
- Put forward some concepts to optimize and generalize BrainNet library.
- Q) Why did you select an object-oriented programming model for this neural network library?
- Answer - The focus is on the understandability of basic concepts, not on performance.
- Q) Is this neural network library fully optimized?
- Answer - Not yet, we are still in the beta stage. The focus is on readability, so the code is flattened so that even a beginner can understand it. Suggestions and modifications are always welcome. Send your modifications, hacks and suggestions to email@example.com
- Q) Can this library be used in projects?
- Answer - You can use it, as long as your usage conforms to the specifications in the associated license notice (see the source code). I also request that you send me a notification (and the modified code) if you hack it or use it in any of your projects.
The first article in this article series is titled "BrainNet Neural Network Library - Part I - Learn Neural Network Programming step by step And Develop a Simple Handwriting Detection System".
If you are really a beginner, it will help you a lot, and may provide you a step by step approach towards understanding neural networks.
This is my second article about Neural Networks in general and the BrainNet Neural Network Library in particular. This article explains Neural Networks and their working in more detail, and in a very simple way. Then I will explain the design concepts of BrainNet library.
The BrainNet Neural Network library is designed and implemented using object-oriented concepts.
Before understanding how neurons and neural networks actually work, let us revisit the structure of a neural network. As I mentioned earlier, a neural network consists of several layers, and each layer has a number of neurons in it. Neurons in one layer are connected to multiple or all neurons in the next layer. Input is fed to the neurons in the input layer, and output is obtained from the neurons in the last layer.
Fig: A fully connected 4-4-2 neural network with 4 neurons in the input layer, 4 neurons in the hidden layer and 2 neurons in the output layer.

An artificial neural network can learn from a set of samples.

For training a neural network, first you provide a set of inputs and outputs. For example, if you need a neural network to detect fractures from an X-ray of a bone, first you train the network with a number of samples. You provide an X-ray, along with the information on whether that particular X-ray shows a fracture or not. After training the network a number of times with a number of samples like this (probably thousands of samples), it is assumed that the neural network can 'detect' whether a given X-ray indicates a fracture in the bone (this is just an example). The concept of training a network is detailed in my first article. Later in this article, we will discuss the theory behind network learning.
As we already discussed, the basic component in a neural network is a neuron. First of all, let us have a very brief look towards biological neurons, and their corresponding artificial models.
The four basic components of a biological neuron are
Dendrites - Dendrites are hair like extensions of a neuron, and each dendrite can bring some input to the neuron (from neurons in the previous layer). These inputs are given to the soma.
Soma - Soma is responsible for processing these inputs, and the output is provided to other neurons through the axon and synapses.
Axon - The axon is responsible for carrying the output of soma to other neurons, through the synapses
Synapses - The synapses of one neuron are connected to the dendrites of neurons in the next layer. The connections between neurons are possible because of synapses and dendrites.
A single neuron is connected to multiple neurons (mostly, all neurons) in the next layer. Also, a neuron in one layer can accept inputs from more than one neuron (mostly, all neurons) in the previous layer.
Now, let us have a look at the model of an artificial neuron.
An artificial neuron consists of various inputs, much like the biological neuron. Instead of Soma and Axon, we have a summation unit and a transfer function unit. The output of one neuron can be given as input to multiple neurons.
Please note that for an artificial neuron, we have a weight value associated with each input. Now, let us have a look at the working of a neuron.
When inputs are fed to the neuron, the summation unit will initially find the net-value. For finding the Net Value, the product of each input value and corresponding connection weight is calculated.
That is, the input value x(i) of each input to the neuron is multiplied by the associated connection weight w(i). In the simplest case, these products are summed and fed to the transfer function. See the pseudo code below; it is simpler to understand.
Also, a neuron has a bias value, which affects the net value. A bias of a neuron is set to a random value, when the network is initialized. We will change the connection weights and bias of all neurons in the network (other than neurons in the input layer), during training phase.
I.e, if x is the input, and w is the associated weight, then pseudo code for net value calculation is as follows.

    netValue = 0
    for i = 0 to neuron.inputs.count - 1
        netValue = netValue + x(i) * w(i)
    next
    netValue = netValue + Bias

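The pseudo code above can be tried out directly. The following is a minimal Python sketch of the same summation, not code from the BrainNet library; the function name `net_value` and the sample inputs, weights and bias are assumptions chosen only for illustration.

```python
def net_value(inputs, weights, bias):
    """Sum of input * weight over all connections, plus the neuron's bias."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one connection weight")
    total = 0.0
    for x, w in zip(inputs, weights):
        total += x * w
    return total + bias


# Hypothetical neuron with two inputs: 1.0 * 0.5 + 0.0 * -0.25 + 0.25
print(net_value([1.0, 0.0], [0.5, -0.25], 0.25))  # prints 0.75
```

Note that an input of 0 contributes nothing to the net value, so for such an input only the bias and the other connections decide the result.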
Transfer Function

A transfer function is a simple function that uses the net value to generate an output. This output is then propagated to the neurons in the next layer. We can use various types of transfer functions, as shown below.

Hard Limit Transfer Function: A simple hard limit function will output 1 if the net value is 0.5 or more, and will output 0 if the net value is less than 0.5, as shown.

    if (netValue < 0.5)
        output = 0
    else
        output = 1

Sigmoid Transfer Function: Another type of transfer function is the sigmoid transfer function. A sigmoid transfer function takes a net value as input and produces an output between 0 and 1, as shown.

    output = 1 / (1 + Exp(-netValue))

The implementation of the summation unit and the transfer function unit may vary in different networks.
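To compare the two transfer functions side by side, here is a small Python sketch (again an illustration, not BrainNet code). The names `hard_limit` and `sigmoid` and the `threshold` parameter are assumptions; the formulas are the ones given above.

```python
import math


def hard_limit(net_value, threshold=0.5):
    """Step function: 0 below the threshold, 1 at or above it."""
    return 0.0 if net_value < threshold else 1.0


def sigmoid(net_value):
    """Smooth function that maps any net value into the range 0..1."""
    return 1.0 / (1.0 + math.exp(-net_value))


for value in (-2.0, 0.0, 0.5, 2.0):
    print(value, hard_limit(value), round(sigmoid(value), 4))
```

The hard limit function jumps from 0 to 1, while the sigmoid changes gradually. That gradual change is what the delta calculation described later in this article relies on. For very large negative net values, `math.exp` can overflow, so a production implementation would need to guard against that.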
Thus, a neural network is constructed from such basic models, called neurons, arranged together in layers and connected to each other as explained earlier. Now let us see how all these neurons work together inside a neural network.
- Training the network - by providing inputs and corresponding outputs.
- In this phase, we train a neural network with samples to perform a particular task.
- Running the network - by providing the input to obtain the output.
- In this phase, we will provide an input to the network, and obtain the output. The output may not be accurate always. Generally speaking, the accuracy of the output during running phase depends a lot on the samples we provided during the training phase, and the number of times we trained the network.
Training is the process of adjusting the connection weights and bias of all neurons in the network (other than neurons in the input layer), to enable the network to produce expected output for all input sets.
Now, let us see how the training actually happens. Consider a small 2-2-1 network, which we are going to train with the AND truth table. As you know, the AND truth table is:

| A | B | Output |
|---|---|--------|
| 0 | 0 | 0 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 1 |

Fig: A 2-2-1 Neural Network and Truth Table Of AND

In the above network, N1 and N2 are neurons in the input layer, N3 and N4 are neurons in the hidden layer, and N5 is the neuron in the output layer. The inputs are fed to N1 and N2. Each neuron in each layer is connected to all neurons in the next layer. We call the above network a 2-2-1 network, based on the number of neurons in each layer.
First, let us see how we train our 2-2-1 network with the first condition in the truth table, i.e., when A=0 and B=0, the output is 0.
Step 1 - Feeding The Inputs

Initially, we feed the inputs to the neural network. This is done by simply setting the outputs of the neurons in layer 1 to the input values we need to feed. In the above example, our inputs are 0, 0 and the expected output is 0, so we set the output of neuron N1 to 0 and the output of N2 to 0.

Have a look at this pseudo code, and it will make things clear. Inputs is the input array. The number of elements in the Inputs array should match the number of neurons in the input layer.

    i = 0
    For Each someNeuron In InputLayer
        someNeuron.OutputValue = Inputs(i)
        i = i + 1
    Next

Step 2 - Finding The Output Of The Network

We have already seen how we calculate the output of a single neuron. As per our above example, the outputs of neurons N1 and N2 will act as the inputs of N3 and N4.

Finding the output of a neural network involves calculating the outputs of all hidden layers and the output layer. As we discussed earlier, a neural network can have a number of hidden layers.

    'Find output of all neurons in all hidden layers
    For Each layer In HiddenLayers
        For Each neuron In layer.Neurons
            neuron.UpdateOutput()
        Next
    Next

    'Find output of all neurons in output layer
    For Each neuron In OutputLayer.Neurons
        neuron.UpdateOutput()
    Next

The UpdateOutput() function of a single neuron works exactly as we discussed earlier. First, the net value is calculated by the summation unit, and then it is provided to a transfer function to obtain the output of the neuron. Pseudo code is again shown below.
Summation Unit works like this:

    Dim netValue As Single = bias
    For Each InputNeuron connected to ThisNeuron
        netValue = netValue + (Weight Associated With InputNeuron * _
                               Output of InputNeuron)
    Next

That is, as per our above example, let us calculate the net value of neuron N3. We know that N1 and N2 are connected to N3.
- Net Value Of N3 = N3.Bias + (N1.Output * Weight Of Connection From N1 to N3) + (N2.Output * Weight Of Connection From N2 to N3)
- Net Value Of N4 = N4.Bias + (N1.Output * Weight Of Connection From N1 to N4) + (N2.Output * Weight Of Connection From N2 to N4)
Now, let us see how we are generating the output, using Transfer unit. Here, we are using the sigmoid transfer function. This is exactly as we discussed earlier.

    Output of Neuron = 1 / (1 + Exp(-NetValue))

Now, the outputs of N3 and N4 will be passed to each neuron in the next layer as inputs. This process of propagating the output of one layer as the input to the next layer is the forward propagation part of the training phase.
Thus, after step 2, we just found the output of each neuron in each layer - starting from the first hidden layer to the output layer. The output of the network is simply the output of all neurons in the output layer.
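Steps 1 and 2 can be sketched in Python for the 2-2-1 example. This is an illustration under stated assumptions, not the library's implementation: the function `layer_outputs`, the list layout `weights[j][i]` (weight from neuron i in the previous layer to neuron j in this layer) and all weight and bias values are hypothetical.

```python
import math


def sigmoid(net_value):
    return 1.0 / (1.0 + math.exp(-net_value))


def layer_outputs(prev_outputs, weights, biases):
    """Forward-propagate one layer: net value first, then the sigmoid."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        net = bias
        for out, w in zip(prev_outputs, neuron_weights):
            net += out * w
        outputs.append(sigmoid(net))
    return outputs


# Hypothetical free parameters for N3, N4 (hidden) and N5 (output).
hidden_weights = [[0.1, -0.2], [0.4, 0.3]]
hidden_biases = [0.0, 0.0]
output_weights = [[0.5, -0.5]]
output_biases = [0.0]

inputs = [0.0, 0.0]  # Step 1: N1 and N2 simply output A=0 and B=0
hidden = layer_outputs(inputs, hidden_weights, hidden_biases)       # N3, N4
network_output = layer_outputs(hidden, output_weights, output_biases)  # N5

print(hidden, network_output)  # prints [0.5, 0.5] [0.5]
```

With these made-up parameters the network answers 0.5 for the input (0, 0), although the AND table expects 0. Correcting such an error is the job of the remaining training steps.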
Step 3 - Calculating The Error or Delta

In this step, we will calculate the error of the network. The error, or delta, can be stated as the difference between the expected output and the obtained output. For example, when we find the output value of the network for the first time, the output will most probably be wrong. We need to get 0 as the output for inputs A=0 and B=0, but the output may be some other value, like 0.55, depending on the random values assigned to the bias and connection weights of each neuron.
Now let us see, how we can calculate the error. Let us see how to calculate the error or delta of each neuron in all the layers.
- First we will calculate the error or delta of each neuron in the output layer.
- The delta value thus calculated will be used to calculate the error or delta of neurons in the previous layer (i.e, the last hidden layer)
- The delta value of all neurons in the last hidden layer is used to calculate the error or delta of all neurons in the previous layer (i.e, second last hidden layer)
- This process is continued, till we reach the first hidden layer (delta of input layer is not calculated).
Time to see how things actually work. The general equation for finding the delta of a neuron is

    Neuron.Delta = Neuron.Output * (1 - Neuron.Output) * ErrorFactor

Now, let us see how the error factor is calculated for each neuron. The error factor of neurons in the output layer can be calculated directly (since we know the expected output of each neuron in the output layer).
For a neuron in output layer,

    ErrorFactor Of An Output Layer Neuron = ExpectedOutput - Neuron's Actual Output

That is, with respect to our above example, if the output of N5 is 0.5 and the expected output is 0, then the error factor = 0 - 0.5 = -0.5.
For a neuron in a hidden layer, the error factor calculation is somewhat different. To calculate the error factor of a neuron in a hidden layer,
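Putting the general delta equation and the output layer error factor together gives a short worked example. The Python sketch below uses the same numbers as the text (actual output 0.5, expected output 0); the function name `output_delta` is an assumption.

```python
def output_delta(actual, expected):
    """Delta of an output layer neuron that uses a sigmoid transfer function."""
    error_factor = expected - actual
    return actual * (1.0 - actual) * error_factor


# N5 produced 0.5 but the AND table expects 0:
# 0.5 * (1 - 0.5) * (0 - 0.5)
print(output_delta(0.5, 0.0))  # prints -0.125
```

The negative sign says that the output was too high, so the later weight and bias updates will push it down.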
- First the delta of each neuron to which this neuron is connected is multiplied with the weight of this connection
- These products are summed up together to obtain the error factor of a hidden layer neuron

    'Calculating the error factor of a neuron in a hidden layer
    For Each Neuron N to which ThisNeuron Is Connected
        'Sum up all the delta * weight
        errorFactor = errorFactor + (N.DeltaValue * _
                      Weight Of Connection From ThisNeuron To N)
    Next

To illustrate this, consider a neuron X1 (ThisNeuron), which is a hidden layer neuron. X1 is connected to neurons Y1, Y2, Y3 and Y4, which are neurons in the next layer.
- Error Factor of X1 = (Y1.Delta * Weight Of Connection From X1 To Y1) + (Y2.Delta * Weight Of Connection From X1 To Y2) + (Y3.Delta * Weight Of Connection From X1 To Y3) + (Y4.Delta * Weight Of Connection From X1 To Y4)
- X1.Delta = X1.Output * (1 - X1.Output) * ErrorFactor Of X1
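The X1 illustration can be checked with a few lines of Python. This is only a sketch; the function name `hidden_delta` and the delta and weight values for Y1 to Y4 are made up for the example.

```python
def hidden_delta(output, next_deltas, forward_weights):
    """Delta of a hidden neuron.

    next_deltas[k] is the delta of the k-th neuron in the next layer and
    forward_weights[k] is the weight of the connection from this neuron to it.
    """
    error_factor = 0.0
    for delta, weight in zip(next_deltas, forward_weights):
        error_factor += delta * weight
    return output * (1.0 - output) * error_factor


# Hypothetical values: X1.Output = 0.5, deltas of Y1..Y4 and the four
# connection weights from X1 to Y1..Y4.
x1_delta = hidden_delta(0.5, [0.5, -0.25, 0.0, 0.25], [1.0, 2.0, 0.5, -1.0])
print(x1_delta)  # error factor is -0.25, so this prints -0.0625
```

Because the error factor of a hidden neuron needs the deltas of the next layer, the deltas must be computed from the output layer backwards, which is where backward propagation gets its name.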
Step 4 - Adjusting The Weights and Bias

After calculating the delta of all neurons in all layers, we should correct the weights and bias with respect to the error or delta, to produce a more accurate output next time. Connection weights and bias together are called free parameters. Remember that a neuron has to update more than one weight because, as we already discussed, there is a weight associated with each connection to a neuron.
See the pseudo code for updating the free parameters of all neurons in all layers

    'Update free parameters of all neurons in hidden layer
    For Each layer In HiddenLayers
        For Each neuron In layer.Neurons
            neuron.UpdateFreeParams()
        Next
    Next

    'Update free parameters of all neurons in output layer
    For Each neuron In OutputLayer.Neurons
        neuron.UpdateFreeParams()
    Next

The UpdateFreeParams() function simply does two things.
- Find the new bias of a neuron, based on the delta we calculated above
- Update the connection weights based on the delta we calculated above

    New Bias Value = Old Bias Value + _
                     LEARNING_RATE * 1 * Delta

Now let us see how to update the connection weights. The new weight associated with an input neuron can be calculated as shown below.

    New Weight = Old Weight + LEARNING_RATE * 1 * Output Of InputNeuron * Delta

As a neuron can have more than one input, the above step should be performed for all input neurons connected to this neuron.

    For Each InputNeuron N connected to ThisNeuron
        New Weight of N = Old Weight of N + _
                          LEARNING_RATE * 1 * N.Output * ThisNeuron.Delta
    Next

Now, after step 4, we have a better network. This process is repeated for all other entries in the AND truth table, probably more than a thousand times, to train the network 'well'.
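The update rules of step 4 can be sketched in Python as follows. This is an illustration, not BrainNet code: the function name `update_free_params`, the learning rate of 0.5 and the sample values are assumptions. The constant factor 1 from the formulas above is left out because it does not change the result.

```python
LEARNING_RATE = 0.5  # hypothetical value; the article does not fix one here


def update_free_params(bias, weights, input_outputs, delta,
                       learning_rate=LEARNING_RATE):
    """Return the new bias and the new list of connection weights."""
    new_bias = bias + learning_rate * delta
    new_weights = [
        w + learning_rate * out * delta
        for w, out in zip(weights, input_outputs)
    ]
    return new_bias, new_weights


# N5 with delta -0.125, fed by N3 and N4, which both output 0.5.
new_bias, new_weights = update_free_params(0.0, [0.5, -0.5], [0.5, 0.5], -0.125)
print(new_bias, new_weights)  # prints -0.0625 [0.46875, -0.53125]
```

A single update changes the free parameters only slightly. The learning rate controls how big each step is, which is one reason why training needs many repetitions.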
- Providing the inputs to the network exactly as described earlier in Step 1 above
- Calculating the outputs as explained in Step 2 above
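To see both phases working together, the sketch below trains a 2-2-1 network on the AND truth table and then runs it. It is a compact Python illustration of steps 1 to 4, not the BrainNet implementation. The `Neuron` class, the `run` and `train` functions, the random initial range of -1 to 1, the learning rate of 0.5, the seed and the 10000 training passes are all assumptions. The input layer is represented only by its values, so `layers` holds just the hidden and output layers.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Neuron:
    def __init__(self, n_inputs, rng):
        # Free parameters start at random values (assumed range -1..1).
        self.weights = [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        self.bias = rng.uniform(-1.0, 1.0)
        self.output = 0.0
        self.delta = 0.0


def run(layers, inputs):
    """Steps 1 and 2: feed the inputs and propagate forward."""
    outputs = list(inputs)
    for layer in layers:
        for n in layer:
            net = n.bias + sum(w * o for w, o in zip(n.weights, outputs))
            n.output = sigmoid(net)
        outputs = [n.output for n in layer]
    return outputs


def train(layers, inputs, expected, rate=0.5):
    run(layers, inputs)
    # Step 3: deltas of the output layer, then the hidden layers backwards.
    for n, target in zip(layers[-1], expected):
        n.delta = n.output * (1.0 - n.output) * (target - n.output)
    for li in range(len(layers) - 2, -1, -1):
        for i, n in enumerate(layers[li]):
            error_factor = sum(nxt.delta * nxt.weights[i]
                               for nxt in layers[li + 1])
            n.delta = n.output * (1.0 - n.output) * error_factor
    # Step 4: update the bias and weights of every neuron.
    prev_outputs = list(inputs)
    for layer in layers:
        for n in layer:
            n.bias += rate * n.delta
            n.weights = [w + rate * o * n.delta
                         for w, o in zip(n.weights, prev_outputs)]
        prev_outputs = [n.output for n in layer]


AND_TABLE = [([0, 0], [0]), ([0, 1], [0]), ([1, 0], [0]), ([1, 1], [1])]

rng = random.Random(42)
layers = [[Neuron(2, rng) for _ in range(2)], [Neuron(2, rng)]]

for _ in range(10000):
    for inputs, expected in AND_TABLE:
        train(layers, inputs, expected)

for inputs, expected in AND_TABLE:
    print(inputs, expected, round(run(layers, inputs)[0], 3))
```

After training, the printed outputs should be close to the expected column without being exactly 0 or 1, which matches the earlier remark that the output of the running phase is not always perfectly accurate.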
Now, let us see how these concepts are implemented in BrainNet Neural Network Library.
We are simply mapping the above concepts to the library. Hence, the following code and explanation is very easy to understand, if you read the above concepts regarding Neural Networks.
Fig: A Partial Model of BrainNet Framework
As we discussed earlier, a Neural Network consists of various Neuron Layers, and each Neuron Layer has various Neurons. A Neuron has a strategy - which decides how it should perform tasks like summation, activation, error calculation, bias adjustment, weight adjustment etc.
To summarize the UML diagram above:
- INeuron, INeuronStrategy, INeuralNetwork and INetworkFactory are interfaces
- A Neuron should implement the INeuron interface
- A Neural Network should implement the INeuralNetwork interface
- A Neuron has a strategy, and a strategy should implement the INeuronStrategy interface. We have a concrete implementation of INeuronStrategy, called BackPropNeuronStrategy (for a backward propagation neural network).
- A Neural Network is initialized and connections between layers are made by a neural network factory. A Factory should implement the INetworkFactory interface. We have a concrete implementation of INetworkFactory, called BackPropNetworkFactory, for creating Backward Propagation neural networks.
The major interfaces in the model are summarized below.
- INetworkFactory - An interface to define a neural network factory
- INeuron - The interface for defining a neuron
- INeuronStrategy - The interface for defining the strategy of the neuron
- INeuralNetwork - The interface for defining a neural network
The major classes in the model are briefed below.
The elements in the INeuron interface are detailed below.

    'The interface for defining a neuron
    Public Interface INeuron

        'The current bias of this neuron
        Property BiasValue() As Single

        'The current output of this neuron
        Property OutputValue() As Single

        'The current delta value of this neuron
        Property DeltaValue() As Single

        'A list of neurons to which this neuron is connected
        ReadOnly Property ForwardConnections() As NeuronCollection

        'Gets a list of neurons connected to this neuron
        ReadOnly Property Inputs() As NeuronConnections

        'Gets or sets the strategy of this neuron
        Property Strategy() As INeuronStrategy

        'Method to update the output of a neuron
        Sub UpdateOutput()

        'Method to find new delta value
        Sub UpdateDelta(ByVal errorFactor As Single)

        'Method to update free parameters
        Sub UpdateFreeParams()

    End Interface

A concrete neuron will implement the INeuron interface. The Neuron class is a concrete implementation of INeuron. The Strategy property of a Neuron holds its current strategy. The Inputs property holds the references to the neurons (in the previous layer) connected to this neuron. ForwardConnections holds references to the neurons (in the next layer) to which this neuron is connected.
Now, have a look at the Neuron class by extracting the source code zip of the BrainNet library. Let us inspect three major functions implemented in the Neuron class - UpdateOutput, UpdateDelta and UpdateFreeParams. These functions are called by the NeuralNetwork class while training and running the network. We will see later how the functions in the NeuralNetwork class call these functions.
These functions use the current strategy object of the neuron to perform operations.
- UpdateDelta - Find the new delta of this neuron using the current strategy. Error factor (remember that this will vary based on the layer of a neuron) will be passed to the UpdateDelta function, from the functions in Neural Network class.
- UpdateOutput - Find the new output of the neuron, by finding the net value, and then by invoking the activation function - as defined in the current strategy.
- UpdateFreeParams - Updating free parameters includes calling the functions according to the current strategy of this neuron to find new bias and to update weights.

    'Calculate the error value
    Public Sub UpdateDelta(ByVal errorFactor As Single) Implements _
            NeuralFramework.INeuron.UpdateDelta

        If _strategy Is Nothing Then Throw New StrategyNotInitializedException("", Nothing)

        'Error factor is found and passed to this
        DeltaValue = Strategy.FindDelta(OutputValue, errorFactor)

    End Sub

    'Calculate the output
    Public Sub UpdateOutput() _
            Implements NeuralFramework.INeuron.UpdateOutput

        If _strategy Is Nothing Then Throw New StrategyNotInitializedException("..", Nothing)

        Dim netValue As Single = Strategy.FindNetValue(Inputs, BiasValue)
        OutputValue = Strategy.Activation(netValue)

    End Sub

    'Calculate the free parameters
    Public Sub UpdateFreeParams() _
            Implements NeuralFramework.INeuron.UpdateFreeParams

        If _strategy Is Nothing Then Throw New StrategyNotInitializedException("..", Nothing)

        BiasValue = Strategy.FindNewBias(BiasValue, DeltaValue)
        Strategy.UpdateWeights(Inputs, DeltaValue)

    End Sub

The elements in the INeuronStrategy interface, along with their descriptions, are given below.

    'The interface for defining the strategy of a neuron
    Public Interface INeuronStrategy

        'Function to find the delta or error rate of this INeuron
        Function FindDelta(ByVal output As Single, _
                ByVal errorFactor As Single) As Single

        'Activation Function, or ThreshHold function
        Function Activation(ByVal value As Single) As Single

        'Summation Function for finding the net value
        Function FindNetValue(ByVal inputs As NeuronConnections, _
                ByVal bias As Single) As Single

        'Function for calculating new bias
        Function FindNewBias(ByVal bias As Single, _
                ByVal delta As Single) As Single

        'Function for updating weights
        Sub UpdateWeights(ByRef connections As NeuronConnections, _
                ByVal delta As Single)

    End Interface

Have a look at the BackPropNeuronStrategy class, in the code, and see how these functions are implemented as we described earlier. It is pretty easy to understand.
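If you do not have the source open, the idea of a strategy object can be sketched as follows. This is a hypothetical Python counterpart written from the formulas in this article, not a transcription of the BackPropNeuronStrategy class. The method names mirror INeuronStrategy, but the representation of a connection as an `(output of input neuron, weight)` pair, the default learning rate and the fact that `update_weights` returns a new list instead of modifying the connections in place are all assumptions.

```python
import math


class BackPropStrategy:
    """Hypothetical sketch of a back-propagation neuron strategy."""

    def __init__(self, learning_rate=0.5):
        self.learning_rate = learning_rate

    def find_net_value(self, inputs, bias):
        # inputs is a list of (output_of_input_neuron, weight) pairs.
        return bias + sum(out * weight for out, weight in inputs)

    def activation(self, value):
        return 1.0 / (1.0 + math.exp(-value))

    def find_delta(self, output, error_factor):
        return output * (1.0 - output) * error_factor

    def find_new_bias(self, bias, delta):
        return bias + self.learning_rate * delta

    def update_weights(self, inputs, delta):
        return [(out, weight + self.learning_rate * out * delta)
                for out, weight in inputs]


strategy = BackPropStrategy()
connections = [(0.5, 0.5), (0.5, -0.5)]
output = strategy.activation(strategy.find_net_value(connections, 0.0))
delta = strategy.find_delta(output, 0.0 - output)
print(output, delta, strategy.update_weights(connections, delta))
```

Keeping these five operations behind one interface means a neuron does not need to know which learning algorithm it uses; swapping the strategy object changes the behavior without touching the neuron class.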
Now, let us see how the Neural Network is implemented. Any concrete neural network should implement the INeuralNetwork interface. INeuralNetwork interface is shown below.

    Public Interface INeuralNetwork

        'Method to train a network
        Sub TrainNetwork(ByVal t As TrainingData)

        'This function can be used for connecting two neurons together
        Sub ConnectNeurons(ByVal source As INeuron, _
                ByVal destination As INeuron, ByVal weight As Single)

        'This function can be used for connecting
        'two neurons together with random weight
        Sub ConnectNeurons(ByVal source As INeuron, _
                ByVal destination As INeuron)

        'This function can be used for connecting neurons
        'in two layers together with random weights
        Sub ConnectLayers(ByVal layer1 As NeuronLayer, _
                ByVal layer2 As NeuronLayer)

        'This function can be used for connecting all
        'neurons in all layers together
        Sub ConnectLayers()

        'This function may be used for running the network
        Function RunNetwork(ByVal inputs As ArrayList) As ArrayList

        'This function may be used to obtain the output list
        Function GetOutput() As ArrayList

        ReadOnly Property Layers() As NeuronLayerCollection

        'Gets the first (input) layer
        ReadOnly Property InputLayer() As NeuronLayer

        'Gets the last (output) layer
        ReadOnly Property OutputLayer() As NeuronLayer

    End Interface

There are two interesting functions, TrainNetwork and RunNetwork, for training and running the network. The input to the TrainNetwork function is an object of TrainingData class. The TrainingData class has two properties of type ArrayList - Inputs and Outputs. To train the network, we put the input values to the Inputs array list, and corresponding output values are filled to the Outputs array list.
- Step1: Find the output of hidden layer neurons and output layer neurons
- Step2: Finding Delta
- 2.1) find the delta (error rate) of output layer
- 2.2) Calculate delta of all the hidden layers, backwards
- Step3: Update the free parameters of hidden and output layers
Have a look at how this goes inside the TrainNetwork function in the NeuralNetwork class; it is heavily commented. Part of the TrainNetwork function is shown below.

    Dim i As Long
    Dim someNeuron As INeuron

    i = 0

    'Give our inputs to the first layer.
    't is an object of TrainingData class
    For Each someNeuron In InputLayer
        someNeuron.OutputValue = t.Inputs(i)
        i = i + 1
    Next

    'Step1: Find the output of hidden layer
    'neurons and output layer neurons
    Dim nl As NeuronLayer
    Dim count As Long = 1

    For count = 1 To _layers.Count - 1
        nl = _layers(count)
        For Each someNeuron In nl
            someNeuron.UpdateOutput()
        Next
    Next

    'Step2: Finding Delta

    '2.1) Find the delta (error rate) of output layer
    i = 0
    For Each someNeuron In OutputLayer
        'Find the target-output value and pass it
        someNeuron.UpdateDelta(t.Outputs(i) - _
                               someNeuron.OutputValue)
        i = i + 1
    Next

    '2.2) Calculate delta of all the hidden layers, backwards
    Dim layer As Long
    Dim currentLayer As NeuronLayer

    For i = _layers.Count - 2 To 1 Step -1
        currentLayer = _layers(i)
        For Each someNeuron In currentLayer
            Dim errorFactor As Single = 0
            Dim connectedNeuron As INeuron
            For Each connectedNeuron In _
                    someNeuron.ForwardConnections
                'Sum up all the delta * weight
                errorFactor = _
                    errorFactor + (connectedNeuron.DeltaValue * _
                    connectedNeuron.Inputs.Weight(someNeuron))
            Next
            someNeuron.UpdateDelta(errorFactor)
        Next
    Next

    'Step3: Update the free parameters of hidden and output layers
    For i = 1 To _layers.Count - 1
        For Each someNeuron In _layers(i)
            someNeuron.UpdateFreeParams()
        Next
    Next

Running the network is pretty simple. For running the network, we just feed the inputs to the first layer, and calculate the outputs, just as explained earlier during the training phase. Here is some part of the RunNetwork function.

    Dim someNeuron As INeuron
    Dim i As Long = 0

    For Each someNeuron In InputLayer
        someNeuron.OutputValue = CType(inputs(i), System.Single)
        i += 1
    Next

    'Step1: Find the output of each hidden neuron layer
    Dim nl As NeuronLayer

    For i = 1 To _layers.Count - 1
        nl = _layers(i)
        For Each someNeuron In nl
            someNeuron.UpdateOutput()
        Next
    Next


    'Demo routine to create a network. The input parameter is a list of
    'long values that represent the number of neurons in each layer
    Public Sub CreateNetwork(ByVal neuronsInLayers As ArrayList)

        Dim bnn As New NeuralNetwork()
        Dim neurons As Long
        Dim strategy As New BackPropNeuronStrategy()

        'neuronsInLayers is an ArrayList which holds
        'the number of neurons in each layer
        For Each neurons In neuronsInLayers

            Dim layer As NeuronLayer
            Dim i As Long

            layer = New NeuronLayer()

            'Add the neurons to this layer
            For i = 0 To neurons - 1
                layer.Add(New Neuron(strategy))
            Next

            bnn.Layers.Add(layer)
        Next

        'Connect all layers together
        bnn.ConnectLayers()

        'Now the network is ready, do other stuff here

    End Sub

Or better, you can use the BackPropNetworkFactory class to create a network easily. Have a look at the BackPropNetworkFactory class. It has two overloaded CreateNetwork functions for creating a neural network.
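The same construction idea (one layer per entry in a list of neuron counts, then fully connecting neighboring layers) can be sketched outside the library. The Python below is an illustration only; `create_network`, the dictionary layout of a neuron, the random range of -1 to 1 and the `seed` parameter are assumptions and do not reflect how BackPropNetworkFactory is implemented.

```python
import random


def create_network(neurons_in_layers, seed=None):
    """Build fully connected layers from a list of neuron counts.

    Each neuron is a dict with a random bias and one random weight per
    neuron in the previous layer. Input layer neurons have no incoming
    connections, so their weight lists are empty.
    """
    rng = random.Random(seed)
    layers = []
    for index, count in enumerate(neurons_in_layers):
        n_inputs = 0 if index == 0 else neurons_in_layers[index - 1]
        layer = []
        for _ in range(count):
            layer.append({
                "bias": rng.uniform(-1.0, 1.0),
                "weights": [rng.uniform(-1.0, 1.0) for _ in range(n_inputs)],
            })
        layers.append(layer)
    return layers


# A 4-4-2 network like the one in the first figure.
network = create_network([4, 4, 2], seed=1)
print([len(layer) for layer in network])  # prints [4, 4, 2]
```

Storing the incoming weights on the receiving neuron matches the way the training pseudo code reads them: each neuron looks up the weight of every input connected to it.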
- This article is much like a 'Developers Guide' of BrainNet neural network library.
- Have a look at my previous article if you haven't done that yet. It is more or less a 'user's guide' for this library - for more information regarding how to use this BrainNet Library in your own projects, and to see the demo projects in action.
Experiment yourself with the library, and try to optimize it a little bit, or even better, create a neural network yourself using this as an example. In my next article,
- I will explain how to create an XML based language yourself, for creating, training and processing neural networks.
- Explain the concept of some classes in the framework that I haven't mentioned in this article (like NXML interpreter, NetworkSerializer etc).
- You may visit my website http://amazedsaint.blogspot.com/ for a lot of tech resources, code and projects
- Read all the articles I have published so far at http://amazedsaint.blogspot.com/. You'll find articles about design patterns, neural networks, security, hacking and more.
- You can subscribe to the XML Atom feed of my technical articles blog to track new posts.