Activation functions are the main computational core behind artificial intelligence, especially neural networks. In this article we will overview some of them, giving each a short introduction and a clear example of its usual use case.
Binary step function
The binary step function, also called the "Heaviside step function", represents a signal that switches on once the input crosses a specific threshold. It is mostly used in single-perceptron neural networks to separate two linearly separable classes. There is a caveat to using the binary step function in a neural network, however: its derivative is 0 everywhere (and undefined at the threshold itself), so gradient descent sees no rate of change and cannot update the weights.
Below is a Python implementation of the binary step function.
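Since the original snippet is not shown here, the following is a minimal NumPy sketch of the binary step function; the `threshold` parameter is an assumption added for illustration.

```python
import numpy as np

def binary_step(x, threshold=0.0):
    """Return 1 where x >= threshold, else 0 (elementwise)."""
    return np.where(x >= threshold, 1, 0)

# The output only ever takes the two values {0, 1}.
print(binary_step(np.array([-3.0, -0.5, 0.0, 2.0])))
```

To visualize the step shape, one could plot `binary_step(x)` over `np.linspace(-5, 5, 200)` with matplotlib.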
Linear Activation Function
The linear activation function takes an input from the range (-inf, +inf) and produces an output in (-inf, +inf), which is a little better than the binary step function, which is stuck at {0, 1}. It shares the same issue as all linear functions, though: its derivative is a constant, which makes backward propagation useless for updating weights in any input-dependent way. One more problem with the linear activation is that it makes stacking layers pointless; a composition of linear layers is still a single linear function, so the last layer behaves just like the first.
Here is a snippet of code for the linear activation and its derivative.
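As the original snippet is missing, here is a minimal NumPy sketch; the slope parameter `a` is an illustrative assumption. The derivative function makes the text's point concrete: the gradient is the same constant for every input.

```python
import numpy as np

def linear(x, a=1.0):
    """Linear activation: f(x) = a * x."""
    return a * x

def linear_derivative(x, a=1.0):
    """The derivative is the constant a, regardless of x."""
    return np.full_like(np.asarray(x, dtype=float), a)

print(linear(np.array([-2.0, 0.0, 3.0])))
print(linear_derivative(np.array([-2.0, 0.0, 3.0])))
```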
Sigmoid activation function
Coming to our first non-linear activation function, and one of the most commonly used for several reasons: the sigmoid has the same general shape as the Heaviside step function, but its smoothness prevents the output from jumping abruptly between 0 and 1. This makes the sigmoid a good fit for binary classification, since its output can be read as a clear prediction between 0 and 1.
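A minimal NumPy sketch of the sigmoid, with its derivative included to contrast with the step function: the gradient is smooth and non-zero everywhere, so backpropagation can use it.

```python
import numpy as np

def sigmoid(x):
    """Sigmoid: squashes any real input into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_derivative(x):
    """d/dx sigmoid(x) = sigmoid(x) * (1 - sigmoid(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

# The midpoint of the curve is at x = 0, where the output is exactly 0.5.
print(sigmoid(0.0))
```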
Tanh activation function
The hyperbolic tangent is closely related to the sigmoid function and shares all of its advantages for gradient descent. But the tanh activation function has a secret weapon against strongly negative values: its zero-centered shape, which maps inputs to (-1, 1) so that negative inputs produce genuinely negative outputs instead of being squashed toward zero.
Rectified Linear Unit (ReLU)
The rectified linear unit, or ReLU for short, is an activation function that lets a neural network converge much more quickly than the sigmoid or tanh. Although it looks like a linear function, it has a trick for the negative range: it zeroes out negative inputs, which gives the function a useful non-linearity while keeping a cheap derivative. The ReLU can suffer from the "dying ReLU" problem, however, where a neuron gets stuck outputting zero for all inputs in the zero-or-negative range.
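A minimal NumPy sketch of ReLU; the leaky variant is included as one commonly used mitigation for the dying-ReLU problem mentioned above (the `alpha` slope is an illustrative assumption).

```python
import numpy as np

def relu(x):
    """ReLU: f(x) = max(0, x), applied elementwise."""
    return np.maximum(0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: small slope alpha on the negative side,
    so the gradient never vanishes entirely."""
    return np.where(x >= 0, x, alpha * x)

print(relu(np.array([-2.0, 0.0, 3.0])))
print(leaky_relu(np.array([-2.0, 0.0, 3.0])))
```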
Softmax activation function
The softmax activation function is the best fit for the output layer, thanks to its ability to perform one-vs-all classification between the output classes in a multi-class problem: it turns the raw scores into a probability distribution that sums to 1.