Gradient descent is an iterative optimization algorithm for finding a local minimum of a function. It works by taking repeated steps proportional to the negative of the gradient (i.e., in the opposite direction of the gradient) of the function at the current point.
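The update rule described above can be sketched in a few lines. This is a minimal illustration, not a production implementation: the function `f(w) = (w - 3)^2`, the learning rate, and the starting point are all illustrative choices.

```python
# Gradient descent on the convex function f(w) = (w - 3)^2,
# whose minimum is at w = 3.

def grad(w):
    # Analytic gradient of f(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def gradient_descent(w0, lr=0.1, steps=100):
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)  # step in the opposite direction of the gradient
    return w

w_min = gradient_descent(w0=0.0)
print(round(w_min, 4))  # approaches the minimum at w = 3
```

Each iteration moves `w` against the gradient; because the step is proportional to the gradient's magnitude, the steps naturally shrink as `w` approaches the minimum.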

## Loss Function

Gradient descent is performed by taking small steps from a randomly initialized point on the loss function J(w), eventually reaching its minimum.

*Note: Here, J(w) denotes the loss (cost) function of the parameters w; the letter J is simply the conventional symbol for the objective being minimized.*

We assume that the loss function is convex (bowl-shaped). This lets us treat the computation of its minimum as a convex optimization problem (the details are beyond the scope of this writeup).

Let us see what the loss function looks like for two parameters x and y. Here the vector v is a linear combination of the basis vectors x and y.
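A two-parameter version of the same procedure can be sketched as follows. The bowl-shaped loss `J(v) = x^2 + 2*y^2` and the starting point are illustrative assumptions; the parameter vector v = (x, y) is updated as a whole.

```python
import numpy as np

# Illustrative convex (bowl-shaped) loss over two parameters, v = (x, y).
def J(v):
    x, y = v
    return x**2 + 2.0 * y**2

def grad_J(v):
    # Gradient vector (dJ/dx, dJ/dy)
    x, y = v
    return np.array([2.0 * x, 4.0 * y])

v = np.array([4.0, -3.0])  # an arbitrary starting point
lr = 0.1
for _ in range(200):
    v = v - lr * grad_J(v)  # step toward the bottom of the bowl at (0, 0)

print(np.round(v, 4), round(float(J(v)), 6))
```

Because the loss is convex, every descent step decreases J(v) (for a small enough learning rate), and the iterates converge to the unique minimum at the origin.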