Neural networks have received a lot of hype in recent years, and for good reason. With a basic understanding of the theory behind them, we can create technology that solves complex problems with human, and sometimes superhuman, capabilities. Whether it be advanced signal processing, object detection, intelligent decision making, or time series analysis, neural networks are a great way to add intelligence to your projects. Before I begin explaining the details of a neural network, let me tell a short story that should give some inspiration as to why artificial neural networks were created in the first place. If you just want to jump straight into the details of neural networks, skip to the “Modeling the biological neuron” section.

## The tale of the investor, the mathematician, and the neuroscientist

Say there is a real estate investor who has data on thousands of properties around the world. This data includes each property’s location, local crime rate, square footage, and population. He also has the price that each property was sold for. He wants to find the relationship between the data for each property and the price the property was sold for. If he finds this relationship, he can apply it to a new house that isn’t in his data and estimate the price it will sell for. Then he can easily find houses that are selling for less than he could resell them for.

Okay, so those are the aspirations of a real estate investor. Unfortunately for him, he was never any good at math. So, with riches on his mind, he finds a mathematician to solve his problem. Let’s look at this problem from the mathematician’s perspective.

To the mathematician, we have a series of inputs and outputs of some function. The inputs are the property’s location, local crime rate, square footage, and population. The output is the price the property is sold at. This function *is* the “relationship” that the investor was referring to, but what the mathematician sees is:
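In notation (the symbols here are my own, but this is the standard way to write it), the mathematician is looking for some unknown function $f$ such that:

```latex
\text{price} = f(\text{location},\ \text{crime rate},\ \text{square footage},\ \text{population})
```

All he knows are thousands of input/output pairs; the shape of $f$ itself is a mystery.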

As the mathematician works tirelessly to find this function, the investor grows impatient. After a couple of days, he loses his patience and becomes business partners with a more experienced investor instead of waiting on the mathematician. This more experienced investor is so good at what he does that he can show up to the property and give a great estimate of what it will sell for without even using grade school math.

The mathematician, embarrassed, gives up on manually finding this function. Defeated, he asks himself, “How is this problem so difficult to solve using mathematics, but so intuitive to the experienced investor?” Unable to answer this question, he finds a neuroscientist in the hope of understanding how the investor’s brain solves the problem.

The neuroscientist, unable to explain exactly how the investor is estimating prices so well, teaches the mathematician how the neurons in the brain work. The mathematician then models these neurons mathematically. After years and years of hard work, this mathematician finally works out how to find the relationship between any two related quantities, including the details of a property and the price it will sell at. He then uses this information to make a hundred million dollars and lives happily ever after.

## Modeling the biological neuron

As we have seen from the story above, if we want to capture behavior that is intuitive to humans but incredibly complicated to model mathematically, we must look at how the human brain does it.

The human brain is made of billions of neurons. A simplification of a neuron is this: many signals of varying magnitude enter the nucleus, and *if the total incoming signal is strong enough* (it’s greater than some threshold that is specific to this neuron) the neuron will fire and *output* a signal. This outputted signal is one of the inputs to another neuron, and the process continues. See the figure below to visualize this:

So how can we model this mathematically? The neuron receives many inputs of varying magnitude, so the total signal received can be modeled as:
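Writing the $i$-th input signal as $x_i$ and its magnitude (its *weight*) as $w_i$ — my notation, but a standard convention — the total incoming signal is just a weighted sum:

```latex
\text{total input} = w_1 x_1 + w_2 x_2 + \dots + w_n x_n = \sum_{i=1}^{n} w_i x_i
```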

Okay, now if the total incoming signal is greater than this neuron’s threshold, the neuron fires. Or, mathematically:
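Using $b$ for the neuron’s threshold (again my notation), the firing rule is a step function:

```latex
\text{output} =
\begin{cases}
1 & \text{if } \sum_{i=1}^{n} w_i x_i > b \\[4pt]
0 & \text{otherwise}
\end{cases}
```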

Before we call this a complete model of the neuron, we must address one thing: it is beneficial to get rid of the “if” statement in the above equation. We do this by passing the left-hand side of the equation into what is called the *sigmoid function*. Let’s look at a graph of what we currently have versus the graph of a sigmoid function.

As you can see, the sigmoid function behaves almost exactly like the step function we had before. The only difference is that we don’t have to deal with that ugly if statement. The function is also differentiable, which is useful for reasons I will explain shortly. Note that we could use any differentiable function here, though some work better than others.
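As a quick sketch of the comparison (assuming the standard logistic sigmoid, $\sigma(z) = 1/(1+e^{-z})$), we can put both functions side by side in Python:

```python
import math

def step(z, threshold=0.0):
    """The 'if' version: fire (output 1) only when the input exceeds the threshold."""
    return 1.0 if z > threshold else 0.0

def sigmoid(z):
    """A smooth, differentiable approximation of the step function."""
    return 1.0 / (1.0 + math.exp(-z))

# Far from zero the two nearly agree; near zero the sigmoid transitions smoothly.
for z in [-6, -1, 0, 1, 6]:
    print(z, step(z), round(sigmoid(z), 3))
```

Notice that the sigmoid never actually reaches 0 or 1; it only gets arbitrarily close, which is exactly what lets it stay differentiable everywhere.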

Below is our final mathematical representation of a neuron.
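Putting the pieces together (keeping my notation from before, with the threshold $b$ subtracted inside the sigmoid so that the output crosses $0.5$ right where the old step function fired):

```latex
\text{output} = \sigma\!\left(\sum_{i=1}^{n} w_i x_i - b\right),
\qquad
\sigma(z) = \frac{1}{1 + e^{-z}}
```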

We can now update our picture of the neuron to fit our mathematical model:
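To make the model concrete, here is a minimal artificial neuron in Python. It is a sketch of the model described above, not production code, and the input values, weights, and bias are made up purely for illustration:

```python
import math

def sigmoid(z):
    """Logistic sigmoid: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs, shifted by the bias,
    then passed through the sigmoid."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(total - bias)

# Hypothetical example: three input signals entering one neuron.
output = neuron(inputs=[0.5, 0.8, 0.2], weights=[1.0, -2.0, 3.0], bias=0.1)
print(output)  # a value between 0 and 1: the neuron's "firing strength"
```

Chaining many of these together — feeding one neuron’s output into the inputs of others — is what gives us a neural network.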