Forward Propagation

Now that we are familiar with the structure of deep feedforward networks, let’s see how a single forward pass through the network works.

Consider a two-layer neural network with one hidden layer. The input values $x_1$ and $x_2$ are:

x1 = 0.2
x2 = 0.8

We start with some random weights and biases between layer 0 and layer 1:

# Layer 0 to Layer 1
b1 = 0.4
b2 = 0.56
b3 = 0.64

w11 = 1.0
w12 = 0.7
w13 = 0.34
w21 = 0.5
w22 = 0.8
w23 = 0.6

initLayer

This animation shows a single pass through the network:

video

A hidden unit in the hidden layer looks like this:

HiddenUnit

Here we use the sigmoid function as the activation function $h(x)$.

import numpy as np

def sigmoid(x):
    # Squash any real number into the range (0, 1)
    return 1 / (1 + np.exp(-x))
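As a quick sanity check of the activation function, the sketch below evaluates it at a few points. The variable names `mid`, `high`, and `low` are illustrative and not part of the original walkthrough. The sigmoid maps 0 to exactly 0.5, approaches 1 for large positive inputs, and approaches 0 for large negative inputs.

```python
import numpy as np

def sigmoid(x):
    # Squash any real number into the range (0, 1)
    return 1 / (1 + np.exp(-x))

mid = sigmoid(0.0)     # the midpoint of the curve
high = sigmoid(10.0)   # large positive input, close to 1
low = sigmoid(-10.0)   # large negative input, close to 0

print(mid, high, low)
```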

The first layer of nodes is the input layer, which does nothing but pass the input signals on. Notice that we add a bias node; its weight into the first hidden node is $b_1$.

Moving to the second layer, let’s calculate the input of the first node:

\[a_1 = w_{1,1}x_1+w_{2,1}x_2 + b_1\]

The input nodes have values of 0.2 and 0.8. The weight from the bias node in the input layer to the first node in the hidden layer is $b_1 = 0.4$. The weight from the first input node is $w_{1,1} = 1.0$, and the weight from the second input node is $w_{2,1} = 0.5$. So the combined input $a_1$ is:

\[a_1 = 1 * 0.2 + 0.5 * 0.8 + 0.4\] \[a_1 = 0.2 + 0.4 + 0.4\] \[a_1 =1\]

a1 = w11 * x1 + w21 * x2 + b1
print(a1)

1.0

Then we calculate the output of this node using the activation function $h(x) = \mathrm{sigmoid}(x)$.

\[z_1 = \mathrm{sigmoid}(a_1)\] \[z_1 = \frac{1}{1 + e^{-1}}\] \[z_1 = 0.7310585786300049\]

z1 = sigmoid(a1)
print(z1)

0.7310585786300049

Cool! We have finished calculating the first node’s output.

The remaining two nodes can be calculated in the same way.
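In general, hidden node $j$ computes

\[a_j = w_{1,j}x_1 + w_{2,j}x_2 + b_j, \qquad z_j = h(a_j)\]

A minimal sketch of that pattern as a reusable function is shown here. The helper `node_output` is a hypothetical name introduced only for illustration; the values are the ones defined earlier in this section.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def node_output(inputs, weights, bias):
    # Hypothetical helper: weighted sum of the inputs plus the bias,
    # followed by the sigmoid activation.
    a = sum(w * x for w, x in zip(weights, inputs)) + bias
    return a, sigmoid(a)

x = [0.2, 0.8]

# Each call passes the weights that feed one hidden node, plus its bias.
a1, z1 = node_output(x, [1.0, 0.5], 0.4)    # w11, w21, b1
a2, z2 = node_output(x, [0.7, 0.8], 0.56)   # w12, w22, b2
a3, z3 = node_output(x, [0.34, 0.6], 0.64)  # w13, w23, b3

print(z1, z2, z3)
```

The results should agree with the step-by-step calculations in this section up to floating-point rounding.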

The second node of the hidden layer:

\[a_2 = w_{1,2}x_1+w_{2,2}x_2 + b_2\] \[a_2 = 0.7 * 0.2 + 0.8 * 0.8 + 0.56 = 1.3400000000000003\] \[z_2 = \mathrm{sigmoid}(a_2) = 0.7924899414403644\]

a2 = w12 * x1 + w22 * x2 + b2
print(a2)

1.3400000000000003

z2 = sigmoid(a2)
print(z2)

0.7924899414403644

The third node of the hidden layer:

\[a_3 = w_{1,3}x_1+w_{2,3}x_2 + b_3\] \[a_3 = 0.34 * 0.2 + 0.6 * 0.8 + 0.64 = 1.1880000000000002\] \[z_3 = \mathrm{sigmoid}(a_3) = 0.7663831750110293\]

a3 = w13 * x1 + w23 * x2 + b3
print(a3)

1.1880000000000002

z3 = sigmoid(a3)
print(z3)

0.7663831750110293

For each node we have to write out the corresponding formula by hand. Even with only three nodes, it is easy to get an index wrong. Our networks will become more and more complex, possibly with hundreds or thousands of nodes. If we keep working this way, it will be hard to make sure nothing goes wrong.

So how do we simplify this? Writing the calculation with matrices both simplifies the formulas and speeds up the computation. Let’s see how in the next section.
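As a brief preview, here is a minimal sketch of the same hidden-layer computation in matrix form. The layout is an assumption made for illustration: the names `W`, `x`, `b`, `a`, and `z` are not from the original, and `W` is arranged with one row per input node and one column per hidden node.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([0.2, 0.8])            # inputs x1, x2

# Assumed layout: row i holds the weights leaving input node i,
# column j holds the weights entering hidden node j.
W = np.array([[1.0, 0.7, 0.34],     # w11, w12, w13
              [0.5, 0.8, 0.6]])     # w21, w22, w23
b = np.array([0.4, 0.56, 0.64])     # b1, b2, b3

a = x @ W + b    # all three weighted sums at once
z = sigmoid(a)   # sigmoid applied element-wise

print(a)
print(z)
```

The three entries of `a` and `z` correspond to the per-node values computed by hand above, up to floating-point rounding.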