MNIST Handwritten Digit Classifier
An implementation of a multilayer neural network using Python's numpy library. The implementation is a modified version of Michael Nielsen's implementation in the Neural Networks and Deep Learning book.
Why a modified implementation?
This book and Stanford's Machine Learning Course by Prof. Andrew Ng are recommended as good resources for beginners. At times, it got confusing to me while referring to both resources:
- The Stanford course uses MATLAB, which has 1-indexed vectors and matrices.
- The book uses Python's numpy library, which has 0-indexed vectors and arrays.
Furthermore, some parameters of a neural network are not defined for the input layer, so at first I couldn't get the hang of the implementation in Python. For example, according to the book, the bias vector of the second layer of the neural network is referred to as biases[0], because the input layer (the first layer) has no bias vector. So indexing got weird between numpy and MATLAB.

Brief Background:
For total beginners who landed up here before reading anything about Neural Networks:
![A sigmoid neuron](http://www.codesec.net/app_attach/201610/02/20161002_422_478227_0.png!web)
Usually, neural networks are made up of building blocks known as Sigmoid Neurons. These are named so because their output follows the Sigmoid Function. The x_j are inputs, which are weighted by the w_j weights, and the neuron has its intrinsic bias b. The output of the neuron is known as its activation, a. A neural network is built by stacking layers of neurons, and is defined by the weights of its connections and the biases of its neurons. Activations are a result dependent on a particular input.
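For instance, a single sigmoid neuron can be computed with numpy as in this minimal sketch (the input, weight, and bias values below are made-up examples, not from the repository):

```python
import numpy as np

def sigmoid(z):
    """Sigmoid function: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# A neuron with two inputs x_j, two weights w_j, and an intrinsic bias b.
x = np.array([0.5, -1.2])   # inputs
w = np.array([0.8, 0.3])    # weights
b = 0.1                     # bias

a = sigmoid(np.dot(w, x) + b)   # activation of the neuron
print(a)
```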
Naming and Indexing Convention:
I have followed a particular convention in indexing quantities. Dimensions of the quantities are listed according to this figure.
![Indexing convention for the example network](http://www.codesec.net/app_attach/201610/02/20161002_422_478227_1.png!web)
Layers:
The input layer is the 0th layer, and the output layer is the Lth layer. Number of layers: N_L = L + 1. The running example below uses:

```
sizes = [2, 3, 1]
```

Weights:
Weights in this neural network implementation are a list of matrices (numpy.ndarrays). weights[l] is the matrix of weights entering the lth layer of the network (denoted as w^l). An element of this matrix is denoted as w^l_jk. It belongs to the jth row, which is the collection of all weights entering the jth neuron from all neurons (0 to k) of the (l-1)th layer. No weights enter the input layer, hence weights[0] is redundant; weights[1] is the collection of weights entering layer 1, and so on.

```
weights = [ [[]],           # weights[0]: redundant, no weights enter the input layer
            [[a, b],        # weights[1]: 3 x 2 matrix entering layer 1
             [c, d],
             [e, f]],
            [[p, q, r]] ]   # weights[2]: 1 x 3 matrix entering layer 2
```

Biases:
Biases in this neural network implementation are a list of column vectors (numpy.ndarrays). biases[l] is the vector of biases of the neurons in the lth layer of the network (denoted as b^l). An element of this vector is denoted as b^l_j, the bias of the jth neuron in the layer. The input layer has no biases, hence biases[0] is redundant; biases[1] holds the biases of the neurons of layer 1, and so on.

```
biases = [ [[],             # biases[0]: redundant, the input layer has no biases
            []],
           [[0],            # biases[1]: biases of the 3 neurons of layer 1
            [1],
            [2]],
           [[0]] ]          # biases[2]: bias of the single neuron of layer 2
```

'Z's:
For an input vector x to a layer l, z is defined as: z^l = w^l · x + b^l. The input layer provides the x vector as input to layer 1 and itself has no input, weights or biases, hence zs[0] is redundant. The dimensions of zs are the same as those of biases.

Activations:
Activations of the lth layer are the outputs of the neurons of the lth layer, which serve as input to the (l+1)th layer. The dimensions of biases, zs and activations are similar. The input layer provides the x vector as input to layer 1, hence activations[0] can be related to x, the input training example.
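To make the convention concrete, here is a minimal feedforward sketch under these assumptions (illustrative code, not the repository's implementation; the random initialization and variable names are placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

sizes = [2, 3, 1]   # layer 0 (input), layer 1, layer 2 (output)

# weights[0] and biases[0] are redundant placeholders for the input layer.
weights = [np.array([[]])] + [np.random.randn(sizes[l], sizes[l - 1])
                              for l in range(1, len(sizes))]
biases = [np.array([[]])] + [np.random.randn(sizes[l], 1)
                             for l in range(1, len(sizes))]

x = np.random.randn(sizes[0], 1)   # a single training example

zs = [np.array([[]])]   # zs[0] is redundant, like biases[0]
activations = [x]       # activations[0] is the input itself

for l in range(1, len(sizes)):
    # z^l = w^l . a^(l-1) + b^l
    z = np.dot(weights[l], activations[l - 1]) + biases[l]
    zs.append(z)
    activations.append(sigmoid(z))

print(activations[-1])   # output of the network
```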