The term deep learning refers to training neural networks, sometimes very large ones. So what exactly is a neural network? Here we build some basic intuition by walking through a two-layer neural network.
Neural Network Representation
Here we talk about neural networks using a graphical representation. We will start by focusing on the case of neural networks with what is called a single hidden layer.
Here is a picture of a neural network. Let’s understand the different parts of this picture and give them some names. As you can see, we have the input features X1, X2, and X3 stacked vertically. This is called the input layer of the neural network, and it contains the inputs to the network.
Next comes another layer of circles, which is called the hidden layer of the neural network. The final layer consists of a single node and is called the output layer. This layer is responsible for generating the predicted value Y hat. A neural network is trained with supervised learning, so the training set contains the values of the input X as well as the target output Y.
The term hidden layer refers to the fact that the true values for the nodes in the middle are not observed in the training set. You can see the inputs and the outputs in the training set, but you cannot see what the values of the middle nodes should be. That’s why it’s called a hidden layer.
Working of Two-Layer Neural Network
Let’s introduce a bit more notation. Previously we used the vector X to denote the input features. An alternative notation for the input features is a[0], that is, a with a superscript 0 in square brackets. The letter a stands for activation, and it refers to the values that each layer of the neural network passes on to the subsequent layer. The input layer passes the value X on to the hidden layer, so we call X the activation of the input layer, a[0]. The next layer is the hidden layer, which generates its own set of activations, written a[1]. In particular, the first node of the hidden layer generates a value a[1]_1, the second node generates a[1]_2, and so on.
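Written out as column vectors, the notation above looks roughly as follows. This is only a sketch of the convention: the four components of the hidden-layer activation assume the four hidden nodes drawn in the picture, and the subscript indexes the node within the layer.

```latex
a^{[0]} = x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix},
\qquad
a^{[1]} = \begin{bmatrix} a^{[1]}_1 \\ a^{[1]}_2 \\ a^{[1]}_3 \\ a^{[1]}_4 \end{bmatrix}
```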
So a[1] is a four-dimensional vector. In Python code it is a 4 by 1 matrix, that is, a column vector with 4 entries. It is four-dimensional because in this case we have 4 hidden nodes in the hidden layer. Finally, the output layer generates some value a[2], which is just a real number, and Y hat takes on the value of a[2]. This is analogous to logistic regression, where we have Y hat equals a. In logistic regression we only had that one output layer.
The network you have seen here is called a two-layer neural network. The reason is that when we count layers in a neural network, we don’t count the input layer, so the hidden layer is layer one and the output layer is layer two. In our notational convention, we call the input layer layer 0. So technically there are three layers in this network: an input layer, a hidden layer, and an output layer. But by convention in deep learning, we refer to this network as a two-layer neural network, because we don’t count the input layer as an official layer.
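To make the layer counting and the activation sizes concrete, here is a minimal sketch of one pass through such a network in plain Python. Several things in it are assumptions for illustration and do not come from the text above: the sigmoid activation (borrowed from the logistic regression analogy), the small random weights, the input values, and the helper names `layer_forward` and `random_matrix`. Plain lists stand in for column vectors so that the example runs without extra libraries.

```python
import math
import random


def sigmoid(z):
    """Logistic function, assumed here as the activation of every node."""
    return 1.0 / (1.0 + math.exp(-z))


def layer_forward(weights, biases, a_prev):
    """Return one layer's activations given the previous layer's activations.

    weights has one row per node in this layer and one column per input value.
    """
    activations = []
    for row, bias in zip(weights, biases):
        z = sum(w * a for w, a in zip(row, a_prev)) + bias
        activations.append(sigmoid(z))
    return activations


def random_matrix(rows, cols, rng):
    """Small random weights; purely illustrative, not trained values."""
    return [[rng.gauss(0.0, 0.01) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)

n_x, n_h, n_y = 3, 4, 1  # input features, hidden nodes, output nodes

# Only two layers carry parameters, which is why the network counts as two-layer.
W1, b1 = random_matrix(n_h, n_x, rng), [0.0] * n_h  # layer 1 (hidden)
W2, b2 = random_matrix(n_y, n_h, rng), [0.0] * n_y  # layer 2 (output)

a0 = [0.5, -1.2, 3.0]           # a[0] = X, made-up input values
a1 = layer_forward(W1, b1, a0)  # a[1], four hidden activations
a2 = layer_forward(W2, b2, a1)  # a[2], a single real number
y_hat = a2[0]                   # Y hat takes on the value of a[2]

print(len(a0), len(a1), len(a2))  # prints: 3 4 1
```

The input layer has no weights of its own and only supplies a[0], which fits the convention of calling it layer 0 and not counting it.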