Multi-Layer Perceptron
This guide walks you through building a 3-layer neural network from scratch, i.e. without any deep-learning libraries; only NumPy is used for the array math (and Matplotlib for plotting).
A Neural Network generally takes the form:
- Input Layer
- Hidden Layer(s)
- Output Layer
For this tutorial we will implement a 3-layer NN with a 2-\(n_h\)-1 architecture, where \(n_h\) is the number of hidden nodes.
Some specifications:
- Activation function: Sigmoid
- Learning rate: 0.005
- Number of hidden nodes: \(n_h\) = 2, 4, 6, 8, 10
- Error: Mean Squared Error (MSE)
Step 1: Load the given train and test datasets.
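A minimal sketch of this step, assuming the datasets are CSV files named train.csv and test.csv (hypothetical names) with two feature columns followed by a 0/1 label:
import numpy as np

# Hypothetical file names and layout: two feature columns, then a 0/1 label.
train_data = np.genfromtxt('train.csv', delimiter=',')
test_data = np.genfromtxt('test.csv', delimiter=',')

# Keep the labels as column vectors so they match the network's output shape.
trainX, trainY = train_data[:, :2], train_data[:, 2:3]
testX, testY = test_data[:, :2], test_data[:, 2:3]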
Step 2: Feature normalize your train and test datasets:
For each feature \(x_i\), your normalized feature \(y_i\) will be:
\(y_i = \frac{x_i - m_i}{s_i}\)
where \(m_i\) is the mean and \(s_i\) is the standard deviation of the given feature.
Use the same \(m_i\) and \(s_i\) values to normalize your test dataset. Remember: you only transform the test set with these training-set statistics; you do not fit it using its own \(m_i\) and \(s_i\) values.
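A minimal sketch of this step, assuming trainX and testX are the feature arrays loaded above:
# Fit the normalization statistics on the training features only.
m = trainX.mean(axis=0)    # per-feature mean
s = trainX.std(axis=0)     # per-feature standard deviation

trainX = (trainX - m) / s  # fit + transform on train
testX = (testX - m) / s    # transform only on test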
Step 3: Split the train dataset into train and validation sets (an 80/20 split should work), as sketched below.
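One way to do the split with NumPy alone, shuffling first so the validation set is not biased by row order:
# Shuffle indices, then take the first 80% for training and the rest for validation.
rng = np.random.default_rng(0)
idx = rng.permutation(len(trainX))
cut = int(0.8 * len(trainX))

trainX, valX = trainX[idx[:cut]], trainX[idx[cut:]]
trainY, valY = trainY[idx[:cut]], trainY[idx[cut:]]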
Step 4: The Multi-Layer Perceptron class and functions:
Our sigmoid activation function and its derivative are defined as follows:
import numpy as np

def sigmoid(t):
    return 1 / (1 + np.exp(-t))

def sigmoid_derivative(g):
    # g is assumed to already be a sigmoid *output*, since
    # sigmoid'(t) = sigmoid(t) * (1 - sigmoid(t)).
    return g * (1 - g)
Our Multi-Layer Perceptron Class is defined as follows:
class MultiLayerPerceptron:
    def __init__(self, x, y, nh):
        self.input = x
        # random weight initialization: input -> hidden
        self.weights1 = np.random.rand(self.input.shape[1], nh)
        # random weight initialization: hidden -> output
        self.weights2 = np.random.rand(nh, 1)
        self.y = y
        self.output = np.zeros(self.y.shape)

    def feedforward(self):
        self.layer1 = sigmoid(np.dot(self.input, self.weights1))
        self.output = sigmoid(np.dot(self.layer1, self.weights2))
        return self.output

    def backpropagation(self):
        '''
        This function calculates the gradients and updates the
        weights for both the input-to-hidden and the
        hidden-to-output layers.
        '''
        del_w2 = np.dot(self.layer1.T, (2 * (self.y - self.output)
                        * sigmoid_derivative(self.output)))
        del_w1 = np.dot(self.input.T, (np.dot(2 * (self.y - self.output)
                        * sigmoid_derivative(self.output),
                        self.weights2.T) * sigmoid_derivative(self.layer1)))
        # weight update rule: the sign is positive rather than negative
        # because the derivative of (y - output)^2 already carries a
        # negative sign; learning rate = 0.005 as specified above
        self.weights1 += 0.005 * del_w1
        self.weights2 += 0.005 * del_w2

    def train(self, X, y):
        self.input = X
        self.y = y
        self.output = self.feedforward()
        self.backpropagation()

    def validation(self, x):
        '''
        This function runs a forward pass on held-out samples
        (validation or test) using the current weights.
        '''
        self.l1 = sigmoid(np.dot(x, self.weights1))
        self.out = sigmoid(np.dot(self.l1, self.weights2))
        return self.out
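To make the sign convention in backpropagation explicit, here is a sketch of the chain rule for the hidden-to-output weights, writing the loss as \(L = (y - \hat{y})^2\), with \(\hat{y} = \sigma(z_2)\) and \(z_2 = a_1 W_2\) (where \(a_1\) is the hidden-layer activation):
\[
\frac{\partial L}{\partial W_2} = \frac{\partial L}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial z_2} \cdot \frac{\partial z_2}{\partial W_2} = -2\,(y - \hat{y})\,\sigma'(z_2)\,a_1
\]
The gradient-descent step \(W_2 \leftarrow W_2 - \eta\,\partial L / \partial W_2\) therefore becomes \(W_2 \leftarrow W_2 + \eta \cdot 2\,(y - \hat{y})\,\sigma'(z_2)\,a_1\), which is exactly the += update (del_w2) in the code; del_w1 follows from one more application of the chain rule through the hidden layer.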
Step 5: Stopping criterion:
The model stops training once the improvement in validation loss per iteration falls below a small threshold (\(10^{-6}\)). The code snippet below implements this.
# choose a hidden-layer size (the guide tries nh = 2, 4, 6, 8, 10)
nh = 4
MLP = MultiLayerPerceptron(trainX, trainY, nh)
iteration = 0
training_loss, validation_loss, test_loss_ = [], [], []

while True:
    iteration += 1
    print('--------------Iteration #{}--------------'.format(iteration))
    # training loss (mean squared error)
    training_loss.append(np.mean(np.square(trainY - MLP.feedforward())))
    print('Loss: ', training_loss[-1])
    # validation loss before this training step
    prev_val_loss = np.mean(np.square(valY - MLP.validation(valX)))
    MLP.train(trainX, trainY)
    # validation loss after this training step
    new_val_loss = np.mean(np.square(valY - MLP.validation(valX)))
    print('Validation loss: ', new_val_loss)
    validation_loss.append(new_val_loss)
    # test loss using the updated weights (tracked for plotting only)
    test_loss_.append(np.mean(np.square(testY - MLP.validation(testX))))
    # stop once the validation loss no longer decreases meaningfully
    if (prev_val_loss - new_val_loss) < 0.000001:
        break
Step 6: Final test loss using the trained weights:
test_loss = np.mean(np.square(testY - MLP.validation(testX)))
Step 7: Train and test accuracies:
# test accuracy
correct = 0
for row, label in zip(testX, testY.ravel()):
    pred = MLP.validation(row).item()  # scalar network output
    if pred > 0.5 and label == 1.0:
        correct += 1
    elif pred < 0.5 and label == 0.0:
        correct += 1
print('Test Accuracy: {}'.format(correct / len(testX)))

# training accuracy
correct = 0
for row, label in zip(trainX, trainY.ravel()):
    pred = MLP.validation(row).item()
    if pred > 0.5 and label == 1.0:
        correct += 1
    elif pred < 0.5 and label == 0.0:
        correct += 1
print('Training Accuracy: {}'.format(correct / len(trainX)))
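Equivalently, since validation accepts a full matrix, the test accuracy can be computed in one vectorized pass (a sketch, assuming the same arrays as above):
# threshold the network outputs at 0.5 and compare with the labels
test_preds = (MLP.validation(testX) > 0.5).astype(float)
print('Test Accuracy: {}'.format(np.mean(test_preds == testY)))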
Step 8: Train and test plots
import matplotlib.pyplot as plt

plt.figure(1)
plt.plot(training_loss, label='train')
plt.plot(validation_loss, label='validation')
plt.plot(test_loss_, label='test')
plt.xlabel('Iteration')
plt.ylabel('Mean squared error')
plt.legend()
plt.show()
The learning curves below correspond to different numbers of hidden nodes (2, 4, 6, 8, and 10); a sketch of the loop that produces them follows.
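One way to reproduce these curves is to wrap the training loop from Step 5 in a loop over the hidden-layer sizes. This is only a sketch; train_one_model is a hypothetical helper, not part of the original code:
# Hypothetical wrapper: run the Step 5 training loop for one hidden size
# and return the recorded validation-loss curve.
def train_one_model(nh):
    mlp = MultiLayerPerceptron(trainX, trainY, nh)
    losses = []
    prev = np.inf
    while True:
        mlp.train(trainX, trainY)
        val_loss = np.mean(np.square(valY - mlp.validation(valX)))
        losses.append(val_loss)
        if (prev - val_loss) < 0.000001:
            break
        prev = val_loss
    return losses

for nh in (2, 4, 6, 8, 10):
    plt.plot(train_one_model(nh), label='nh = {}'.format(nh))
plt.xlabel('Iteration')
plt.ylabel('Validation MSE')
plt.legend()
plt.show()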