June 16, 2014

# Picturing the output of a neural net

Some time ago, during a training session, a colleague asked me what a surface plot of a two-input neural net would look like. That is, if you have two inputs $x_1$ and $x_2$ and plot the output $y$ as a surface, what do you get? We thought it might look like a set of waterfalls stacked on each other.

### Tip

For this post I’m going to use draw.io, Wolfram Alpha and some JavaScript. Check other tools in the toolbox.

Since neural nets are often considered a black-box solution, I thought it would be interesting to check this. If you can picture the output for a simple case, it makes things less mysterious in the more general case.

Let’s look at a neural net with a single hidden layer of two nodes and a sigmoid activation function. In other words, something that looks like this. (If you need to catch up on the theory, please see this brilliant book by David MacKay that’s free to view online.)

Neural net with a single hidden layer

Drawn using the lovely draw.io

###### Output for a single node

We can break the task down a little by doing a surface plot of a single node, say node 1. We are using the sigmoid activation function so what we are plotting is the application of the sigmoid function to what is essentially a function describing a plane.

The activation (i.e. the plane) is $\alpha_1 = w_4x_1+w_3x_2$

Applying the sigmoid function we get

$y_1=\frac{1}{1+e^{-\alpha_1}}$

Another good tool I use now and then when testing out some maths is Wolfram Alpha. I’m going to pretend that weights $w_4$ and $w_3$ have been fitted as 0.1 and 0.2, so the equation I input is  plot y=1/(1+exp(-(0.1*x1+0.2*x2)))

Surface plot using the sigmoid function

What we see is a sort of waterfall shape angled in the direction of the plane. This is what we’d expect: as the alpha values get higher and lower, the output of the sigmoid function tends to one and zero respectively.
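The same check can be sketched numerically in Python. This is only a minimal sketch, assuming the pretend weights above (0.1 on $x_1$ and 0.2 on $x_2$); the function names `sigmoid` and `node1` are illustrative and not part of any library.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: squashes any real activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def node1(x1, x2, w4=0.1, w3=0.2):
    """Output of hidden node 1: sigmoid applied to the plane w4*x1 + w3*x2.

    The default weights are the pretend fitted values, not trained ones.
    """
    alpha1 = w4 * x1 + w3 * x2
    return sigmoid(alpha1)

# Walk along the diagonal x1 = x2: the output climbs from near 0 to near 1.
for x in (-50, -10, 0, 10, 50):
    print(x, round(node1(x, x), 4))
```

Any point where the plane is zero, such as the origin, sits exactly halfway up the waterfall at 0.5.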

###### Output for multiple nodes

Let’s now look at the output for the whole network. The outputs from nodes 1 and 2 feed into node 3, creating a new activation

$\alpha_3 = w_1y_1+w_2y_2$

This is just a weighted sum of the outputs of nodes 1 and 2, and so should look, as predicted, like one waterfall stacked on another (sort of). We can check this in Wolfram Alpha by submitting the following (picking, say, w1=0.8, w2=0.7, w5=0.3, w6=-0.2, so that node 2 has the activation $\alpha_2 = w_6x_1+w_5x_2$)  plot y=0.8/(1+exp(-(0.1*x1+0.2*x2)))+0.7/(1+exp(-(-0.2*x1+0.3*x2)))

Activation for node 3

Adding the final sigmoid function to the activation of node 3 doesn’t change things much. The sigmoid is monotonic, so the shape of the surface is preserved; it just puts things back on the scale from zero to one.

plot y=1/(1+exp(-(0.8/(1+exp(-(0.1*x1+0.2*x2)))+0.7/(1+exp(-(-0.2*x1+0.3*x2))))))
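The nested expression is easier to read as a forward pass. Here is a minimal Python sketch, assuming the example weights picked above and no bias terms (none appear in the equations); the name `network` is illustrative only.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: squashes any real activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def network(x1, x2, w1=0.8, w2=0.7, w3=0.2, w4=0.1, w5=0.3, w6=-0.2):
    """Forward pass of the two-input, two-hidden-node net.

    Returns (alpha3, y): the activation of node 3 and the final output.
    Default weights are the example values from the text, not trained ones.
    """
    y1 = sigmoid(w4 * x1 + w3 * x2)   # hidden node 1
    y2 = sigmoid(w6 * x1 + w5 * x2)   # hidden node 2
    alpha3 = w1 * y1 + w2 * y2        # weighted sum feeding node 3
    return alpha3, sigmoid(alpha3)

alpha3, y = network(0.0, 0.0)
print(alpha3, y)
```

At the origin both hidden nodes output 0.5, so the activation is 0.8 × 0.5 + 0.7 × 0.5 = 0.75. With these particular weights the activation stays between 0 and 1.5, so the final output never leaves the band from 0.5 to roughly 0.82.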

###### Exploring changes in the weights

It would be nice now to see what happens when we change the weights for the inputs. To do this I’ve used the JavaScript library javascript-surface-plot. The sliders only explore weights in the range from 0.1 to 1, but this is enough to give us an idea of the range of shapes that the output can take. We can see that the two-node hidden layer has the potential to divide the input space into roughly four zones of activation, which gives us much more complexity compared to, say, logistic regression. If we imagine the effect of stacking more of these waterfall shapes, then we see the potential of additional nodes in the hidden layer.

Interactive surface plot with sliders for the weights w1 to w6
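The four zones can also be seen without the sliders by probing the activation of node 3 far from the origin, where each hidden node is either almost fully on or almost fully off. This is a rough sketch, assuming the example weights from earlier in the post (w1=0.8, w2=0.7, w3=0.2, w4=0.1, w5=0.3, w6=-0.2) rather than the slider ranges; the probe points and the name `alpha3` are my own illustrative choices.

```python
import math

def sigmoid(a):
    """Logistic sigmoid: squashes any real activation into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-a))

def alpha3(x1, x2, w1=0.8, w2=0.7, w3=0.2, w4=0.1, w5=0.3, w6=-0.2):
    """Activation of node 3: weighted sum of the two hidden-node outputs."""
    y1 = sigmoid(w4 * x1 + w3 * x2)
    y2 = sigmoid(w6 * x1 + w5 * x2)
    return w1 * y1 + w2 * y2

# One probe point deep inside each zone (hypothetical choices).
probes = {
    "both nodes off": (0, -100),
    "node 1 on only": (100, 0),
    "node 2 on only": (-100, 0),
    "both nodes on": (0, 100),
}
for label, (x1, x2) in probes.items():
    print(f"{label}: alpha3 = {alpha3(x1, x2):.2f}")
```

The four plateaus sit at roughly 0, w1, w2 and w1 + w2, which is the stacked-waterfall picture in numbers. A single logistic regression has only the two plateaus at 0 and 1.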