This article will help you understand another unsupervised ML algorithm to solve clustering problems. Let’s begin.

k-means is one of the simplest unsupervised learning algorithms for the well-known clustering problem. It is an **iterative** algorithm that tries to partition the dataset into a pre-defined number ‘k’ of distinct, non-overlapping subgroups (clusters). The main idea is to define k centers, one for each cluster. These centers should be placed carefully, because different starting locations lead to different results. A good choice is therefore to place them as far away from each other as possible. The next step is to take each point in the data set and associate it with the nearest center.
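To make the assignment step concrete, here is a minimal sketch in plain Python. The function name `assign_to_nearest`, the sample points, and the two hand-picked starting centers are illustrative assumptions, and Euclidean distance is assumed as the measure of "nearest".

```python
import math

def assign_to_nearest(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean distance)."""
    labels = []
    for p in points:
        distances = [math.dist(p, c) for c in centers]
        # Ties go to the center with the lowest index.
        labels.append(distances.index(min(distances)))
    return labels

# Hypothetical toy data: two visually separate groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
# Assumption: k = 2 centers chosen by hand, far away from each other.
centers = [(1.0, 1.0), (9.0, 9.0)]

labels = assign_to_nearest(points, centers)
print(labels)  # [0, 0, 1, 1]
```

Each label is simply the position of the closest center in the `centers` list, so points sharing a label belong to the same cluster.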

When no point is pending, the first step is complete and an initial grouping has been made. At this point, we need to recalculate k new centroids as the barycenters (means) of the clusters resulting from the previous step.
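The centroid update can be sketched the same way. The helper name `update_centers` and the toy data are assumptions for illustration. Raising an error for an empty cluster is only one possible policy, since the text does not say how that case should be handled.

```python
def update_centers(points, labels, k):
    """Recompute each center as the mean (barycenter) of the points assigned to it."""
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, label in zip(points, labels):
        counts[label] += 1
        for d in range(dim):
            sums[label][d] += p[d]
    if 0 in counts:
        # Assumption: treat an empty cluster as an error in this sketch.
        raise ValueError("a cluster has no points assigned to it")
    return [tuple(s / counts[j] for s in sums[j]) for j in range(k)]

# Hypothetical toy data with an existing assignment of points to 2 clusters.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
labels = [0, 0, 1, 1]

new_centers = update_centers(points, labels, k=2)
print(new_centers)  # [(1.25, 1.5), (8.5, 8.75)]
```

Each new center is the coordinate-wise average of its cluster, so it generally moves away from the initial guess toward the middle of the points assigned to it.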

After we have these k new centroids, each data point has to be reassigned to its nearest new center. These two steps are repeated in a loop. As a result of this loop, the k centers change their location step by step until no more changes occur; in other words, the centers do not move anymore. Finally, this algorithm aims at minimizing an objective function known as the squared error function, given by: