The Perceptron
The complexity of Neural Networks comes from the interaction of much simpler components working together, as happens in most complex systems. In the case of Neural Networks, each of these components is called a ‘neuron’ or ‘perceptron’.
The perceptron is often presented as an analogy to a biological neuron, and it is the basic processing unit that we will find within a neural network. Similarly to a biological neuron, it has input connections through which it receives external stimuli, the input values. The perceptron performs an internal calculation on them and generates an output value. What we can see here is that ‘perceptron’ is just a cool name for a mathematical function.
This internal calculation is a weighted sum of the input values. The weighting given to each input comes from the weight assigned to its input connection. This means that each input value will have more or less importance depending on the weight it has been assigned. These weights are the parameters of our model, and they are the values that will have to be tuned so our model can learn.
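As a minimal sketch of this calculation (using NumPy, with input values and weights chosen purely for illustration, not learned), the weighted sum looks like this:

```python
import numpy as np

# Illustrative input values (external stimuli) and weights, one per connection.
x = np.array([0.5, -1.2, 3.0])   # input values
w = np.array([0.8,  0.1, -0.4])  # weights of the input connections

# The perceptron's internal calculation: a weighted sum of the inputs.
weighted_sum = np.dot(w, x)
print(weighted_sum)  # 0.8*0.5 + 0.1*(-1.2) + (-0.4)*3.0 = -0.92
```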
You may have noticed that a perceptron is very similar to a linear regression model: we have some input values that define a line or a hyperplane whose slope we can vary by changing the parameters. In a linear regressor we also have an independent term that indicates the intercept with the ‘Y’ axis. The perceptron has this term too. It is called the ‘bias’, and it is modeled as another input connection to the neuron whose value is fixed to ‘1’, so we can manipulate it by changing the bias value.
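A small sketch of this equivalence, reusing the illustrative values from above and an arbitrary bias of 0.2: writing the bias as a separate term gives the same result as treating it as the weight of an extra input fixed to 1.

```python
import numpy as np

x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8,  0.1, -0.4])
b = 0.2  # bias (illustrative value)

# Bias written as a separate term...
z_separate = np.dot(w, x) + b

# ...or modeled as an extra input connection fixed to 1,
# with the bias acting as that connection's weight.
x_with_one = np.append(x, 1.0)
w_with_bias = np.append(w, b)
z_extra_input = np.dot(w_with_bias, x_with_one)

print(np.isclose(z_separate, z_extra_input))  # True: both formulations agree
```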
As we can imagine, Neural Networks are built as the interconnection of several perceptrons. However, there is a mathematical result that proves that the composition of linear transformations is itself a linear transformation. Unfortunately, we cannot build cool applications like voice recognition systems or self-driving cars with linear transformations alone. There is one last component missing in our perceptron, and it is what will allow us to create non-linear functions.
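We can see this collapse numerically with a quick sketch (random weight matrices, purely for illustration): stacking two linear layers with no activation gives exactly the same result as a single linear layer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" of purely linear transformations (random illustrative weights).
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(2, 4))

x = rng.normal(size=3)

# Passing the input through both layers...
two_layers = W2 @ (W1 @ x)

# ...equals passing it through one collapsed linear layer.
one_layer = (W2 @ W1) @ x

print(np.allclose(two_layers, one_layer))  # True: stacked linear maps stay linear
```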
Activation functions are the components that complete our neuron. We apply this function to the output of the linear part of the neuron, and it allows us to introduce non-linearities into our model. There are several activation functions, and the choice depends on the problem we want to solve, but we will go through this in coming articles.
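Putting the pieces together, here is a minimal sketch of a complete neuron. The sigmoid is used only as one common example of an activation function; it is an assumption for illustration, not the only choice.

```python
import numpy as np

def sigmoid(z):
    # One common activation function; the right choice depends on the problem.
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(x, w, b, activation=sigmoid):
    # Weighted sum of the inputs plus bias, passed through the activation.
    z = np.dot(w, x) + b
    return activation(z)

# Illustrative values only.
x = np.array([0.5, -1.2, 3.0])
w = np.array([0.8,  0.1, -0.4])
b = 0.2

print(perceptron(x, w, b))  # output of a single neuron
```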
With a single perceptron we can solve linearly separable problems, but it has been proved that these are the only kind of problems a single neuron can solve. However, by interconnecting several perceptrons we can overcome this limitation and work with non-linearly separable data.
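As a small illustration of this limitation (with hand-picked weights, not learned ones, and a step activation chosen for simplicity): a single neuron can implement logical AND, which is linearly separable, but no choice of weights and bias lets it implement XOR.

```python
import numpy as np

def step(z):
    # Simple threshold activation used here for illustration.
    return (z > 0).astype(int)

def perceptron(x, w, b):
    return step(np.dot(x, w) + b)

inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])

# Hand-picked weights implementing logical AND, a linearly separable problem.
w_and, b_and = np.array([1.0, 1.0]), -1.5
print(perceptron(inputs, w_and, b_and))  # [0 0 0 1]

# XOR ([0 1 1 0]) is not linearly separable: no single w and b reproduce it,
# which is why interconnecting several perceptrons becomes necessary.
```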