One of the most popular machine learning courses is Andrew Ng’s Machine Learning course on Coursera, offered by Stanford University. I tried a few other machine learning courses before, but I think he is the best at breaking the concepts into pieces and making them very understandable.
But I think there is one problem: all the assignments and instructions are in Matlab. I am a Python user and did not want to learn Matlab. So, I learned the concepts from the lectures and developed all the algorithms in Python.
I explained all the algorithms in my own way (as simply as I could) and demonstrated the development of almost all of them in separate articles before. I thought I should summarise them all on one page so that it is easier for anyone who wants to follow along. Sometimes a little help goes a long way.
If you want to take Andrew Ng’s Machine Learning course, you can audit the complete course for free as many times as you want.
Let’s dive in!
Linear regression is the most basic machine learning algorithm. It is based on the very basic straight-line formula we all learned in school:
Y = AX + B
Remember? If not, no problem. This is a very simple formula. Here is the complete article that explains how this simple formula can be used to make predictions.
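To make the idea concrete, here is a minimal sketch (not the code from the article) of fitting Y = AX + B with gradient descent on the mean squared error. The function name `fit_line`, the learning rate, the number of epochs, and the tiny dataset are all assumptions chosen for illustration.

```python
import numpy as np

def fit_line(x, y, lr=0.05, epochs=2000):
    """Fit y ~ a*x + b by gradient descent on the mean squared error."""
    a, b = 0.0, 0.0
    n = len(x)
    for _ in range(epochs):
        error = (a * x + b) - y               # prediction minus target
        a -= lr * (2 / n) * np.dot(error, x)  # gradient with respect to a
        b -= lr * (2 / n) * error.sum()       # gradient with respect to b
    return a, b

# Made-up data generated from Y = 2X + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

a, b = fit_line(x, y)
print(a, b)  # should land very close to 2 and 1
```

Once A and B are learned, a prediction for a new X is just `a * x_new + b`.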
The article above works only on datasets with a single variable. But in real life, most datasets have multiple variables. Using the same simple formula, you can develop the algorithm with multiple variables:
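As a rough sketch of how the same formula extends to several variables (an illustration, not the article's code), the inputs can be stacked into a matrix with an extra column of ones for the intercept, so that every prediction is a single dot product. The name `fit_linear`, the hyperparameters, and the data below are assumptions.

```python
import numpy as np

def fit_linear(X, y, lr=0.1, epochs=5000):
    """Gradient descent for y ~ theta0 + theta1*x1 + theta2*x2 + ..."""
    X_b = np.column_stack([np.ones(len(X)), X])  # column of ones = intercept
    theta = np.zeros(X_b.shape[1])
    for _ in range(epochs):
        gradient = X_b.T @ (X_b @ theta - y) / len(y)
        theta -= lr * gradient
    return theta

# Made-up data generated from y = 1 + 2*x1 + 3*x2
X = np.array([[1.0, 2.0], [2.0, 0.0], [3.0, 1.0], [0.0, 3.0], [2.0, 2.0]])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]

theta = fit_linear(X, y)
print(theta)  # should be close to [1, 2, 3]
```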
Polynomial regression is also a sister of linear regression. But it can capture the relationship between the input variables and the output variable more precisely when that relationship is not linear:
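One way to picture this, as a sketch under my own assumptions: polynomial regression is linear regression on extra columns holding powers of the input. For brevity this example solves for the coefficients with NumPy's least-squares solver `np.linalg.lstsq` instead of gradient descent. The helper `polynomial_features` and the data are made up for illustration.

```python
import numpy as np

def polynomial_features(x, degree):
    """Build the columns [1, x, x^2, ..., x^degree]."""
    return np.column_stack([x ** d for d in range(degree + 1)])

# Made-up curved data generated from y = 1 - 3x + 0.5x^2
x = np.linspace(-2.0, 2.0, 9)
y = 1.0 - 3.0 * x + 0.5 * x ** 2

X_poly = polynomial_features(x, degree=2)
# Least-squares fit; swapped in here for gradient descent to keep it short
coeffs, *_ = np.linalg.lstsq(X_poly, y, rcond=None)
print(coeffs)  # should be close to [1, -3, 0.5]
```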
Logistic regression builds on linear regression. It also uses the same simple formula of a straight line. This is a widely used, powerful, and popular machine learning algorithm. It is used to predict a categorical variable. The following article explains the development of logistic regression step by step for binary classification:
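As a minimal sketch of the idea (not the article's code), the straight-line output is passed through a sigmoid so that it becomes a probability between 0 and 1, and the class is chosen by thresholding at 0.5. The function names, hyperparameters, and one-feature dataset are assumptions for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Gradient descent on the logistic (cross-entropy) loss."""
    X_b = np.column_stack([np.ones(len(X)), X])  # add intercept column
    theta = np.zeros(X_b.shape[1])
    for _ in range(epochs):
        gradient = X_b.T @ (sigmoid(X_b @ theta) - y) / len(y)
        theta -= lr * gradient
    return theta

def predict(theta, X):
    """Return 1 where the predicted probability is at least 0.5, else 0."""
    X_b = np.column_stack([np.ones(len(X)), X])
    return (sigmoid(X_b @ theta) >= 0.5).astype(int)

# Made-up data: small values belong to class 0, large values to class 1
X = np.array([[0.5], [1.0], [1.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 1, 1, 1])

theta = fit_logistic(X, y)
print(predict(theta, np.array([[0.0], [5.0]])))
```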
Based on the concept of binary classification, it is possible to develop a logistic regression for multiclass classification. At the same time, Python has some optimization functions that make the calculation a lot faster. In the following article, I used both methods to perform a multiclass classification task on a digit recognition dataset:
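A common way to do this is one-vs-all: train one binary classifier per class, then pick the class whose classifier is most confident. The sketch below assumes that approach and uses plain gradient descent on made-up 2-D points rather than the digit dataset. A library optimizer such as `scipy.optimize.minimize` could replace the inner loop. All names and hyperparameters here are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_one_vs_all(X, y, n_classes, lr=0.2, epochs=5000):
    """Train one binary logistic classifier per class."""
    X_b = np.column_stack([np.ones(len(X)), X])
    thetas = np.zeros((n_classes, X_b.shape[1]))
    for k in range(n_classes):
        target = (y == k).astype(float)  # 1 for class k, 0 for the rest
        for _ in range(epochs):
            gradient = X_b.T @ (sigmoid(X_b @ thetas[k]) - target) / len(y)
            thetas[k] -= lr * gradient
    return thetas

def predict_one_vs_all(thetas, X):
    """Pick the class whose classifier gives the highest probability."""
    X_b = np.column_stack([np.ones(len(X)), X])
    return np.argmax(sigmoid(X_b @ thetas.T), axis=1)

# Made-up data: three well-separated groups of points
X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 0.0],
              [4.0, 0.0], [4.5, 0.5], [5.0, 0.0],
              [0.0, 4.0], [0.5, 4.5], [0.0, 5.0]])
y = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])

thetas = fit_one_vs_all(X, y, n_classes=3)
print(predict_one_vs_all(thetas, X))
```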
Neural networks have been getting more and more popular. If you are reading this article, I guess you have heard of them.
A neural network works much faster and much more efficiently on more complex datasets. This one also involves the same formula of a straight line, but the development of the algorithm is a bit more complicated than the previous ones. If you are taking Andrew Ng’s course, you probably know the concepts already. Otherwise, I tried to break down the concepts as much as I could. Hopefully, it is helpful:
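To show how the straight-line formula is reused layer by layer, here is a small sketch of forward propagation only. Training (backpropagation) is not shown. The weights are hand-picked, following the classic XNOR example, so that the network outputs 1 when both inputs are equal. The function name `forward` is an assumption.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights):
    """Forward propagation: each layer is a weighted sum plus a sigmoid."""
    activation = np.asarray(x, dtype=float)
    for W in weights:
        # Prepend 1.0 so the first column of W acts as the bias term
        activation = sigmoid(W @ np.concatenate(([1.0], activation)))
    return activation

# Hand-picked weights (not learned)
W1 = np.array([[-30.0, 20.0, 20.0],    # hidden unit 1: x1 AND x2
               [10.0, -20.0, -20.0]])  # hidden unit 2: (NOT x1) AND (NOT x2)
W2 = np.array([[-10.0, 20.0, 20.0]])   # output: unit 1 OR unit 2

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [round(float(forward(x, [W1, W2])[0])) for x in inputs]
print(outputs)  # [1, 0, 0, 1]
```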
What if you spend all that time developing an algorithm and then it does not work the way you wanted? How do you fix it? You first need to figure out where the problem is. Is your algorithm faulty, do you need more data to train the model, or do you need more features? So many questions, right? But if you do not figure out the problem first and just keep moving in some direction, you may waste a lot of time unnecessarily. Here is how you may find the problem:
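One common starting point, sketched here as a rough heuristic rather than the article's method, is to compare the error on the training set with the error on a held-out validation set. A high training error suggests the model is too simple (more features may help), while a low training error with a much higher validation error suggests overfitting (more data may help). The function name and both thresholds below are hypothetical and would need tuning for a real problem.

```python
def diagnose(train_error, val_error, acceptable_error=0.1, gap_tolerance=0.05):
    """Rough diagnosis from training and validation error (both thresholds are hypothetical)."""
    if train_error > acceptable_error:
        # The model cannot even fit the data it was trained on
        return "high bias: try more features or a more complex model"
    if val_error - train_error > gap_tolerance:
        # The model fits the training data but does not generalise
        return "high variance: try more data or stronger regularization"
    return "ok"

print(diagnose(0.30, 0.32))
print(diagnose(0.02, 0.25))
print(diagnose(0.05, 0.07))
```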
On the other hand, a heavily skewed dataset is another type of challenge. Suppose you are working on a classification problem where 95% of the cases are positive and only 5% are negative. If you simply predict positive for every case, you are 95% correct. So if a machine learning algorithm turns out to be 90% accurate, it is still not useful, right? Without any machine learning algorithm, you can already predict with 95% accuracy. Here are some ideas for dealing with this type of situation:
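The 95% example can be worked through with precision, recall, and F1 score, which are commonly used instead of accuracy on skewed data. This is a sketch with made-up labels. The function name `precision_recall_f1` is an assumption, and the metrics are computed for the rare (negative) class because that is where the "always positive" guess fails.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 score, treating `label` as the class of interest."""
    tp = np.sum((y_pred == label) & (y_true == label))
    fp = np.sum((y_pred == label) & (y_true != label))
    fn = np.sum((y_pred != label) & (y_true == label))
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1

# 95 positive cases, 5 negative cases, and a "model" that always says positive
y_true = np.array([1] * 95 + [0] * 5)
y_pred = np.ones(100, dtype=int)

accuracy = np.mean(y_true == y_pred)
rare_class = precision_recall_f1(y_true, y_pred, label=0)
print(accuracy)    # 0.95
print(rare_class)  # precision, recall and F1 are all zero
```

The accuracy looks good, but the recall of zero on the rare class shows that not a single negative case was found.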
K-means clustering is one of the oldest and most popular unsupervised learning algorithms. It does not make predictions like the previous algorithms. It forms clusters based on the similarities in the data, which is more about understanding the current data effectively. Then, whenever the algorithm sees new data, it decides which cluster the data belongs to based on its characteristics. This algorithm has other uses as well. It can be used for the dimensionality reduction of images.
Why do we need dimensionality reduction of an image?
Think about when we need to feed a lot of images to an algorithm to train an image classification model. Very high-resolution images can be too heavy, and the training process can be too slow. In that case, a lower-dimensional picture will do the job in less time. This is just one example. You can probably imagine many other uses for the same reason.
This article is a complete tutorial on how to develop a K-means clustering algorithm and how to use that algorithm for dimensionality reduction of an image:
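As a minimal sketch of the clustering loop (not the tutorial's code), K-means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The function name, the explicit starting centroids, and the data are assumptions. For an image, the rows of `X` would presumably be pixel colours, and each pixel would be replaced by its centroid.

```python
import numpy as np

def k_means(X, initial_centroids, iterations=10):
    """Alternate between assigning points and updating centroids."""
    centroids = np.array(initial_centroids, dtype=float)
    for _ in range(iterations):
        # Distance from every point to every centroid, shape (n_points, k)
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = np.argmin(distances, axis=1)
        for k in range(len(centroids)):
            if np.any(labels == k):  # leave an empty cluster where it is
                centroids[k] = X[labels == k].mean(axis=0)
    return centroids, labels

# Made-up data: two obvious groups
X = np.array([[1.0, 1.0], [1.5, 2.0], [1.0, 1.5],
              [8.0, 8.0], [9.0, 8.5], [8.5, 9.0]])

# Start from two of the data points (chosen by hand to keep the run repeatable)
centroids, labels = k_means(X, initial_centroids=X[[0, 3]])
print(labels)  # [0 0 0 1 1 1]
```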
Anomaly detection is another core machine learning task. It is used in credit card fraud detection, in detecting faulty manufacturing, and even in detecting rare diseases or cancer cells. It can be done using the Gaussian distribution (or normal distribution) method, or even more simply with a probability formula. Here is a complete step-by-step guide for developing an anomaly detection algorithm using the Gaussian distribution concepts:
If you need a refresher on the Gaussian distribution, please check this one:
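As a rough sketch of the Gaussian approach (not the guide's code): estimate the mean and variance of each feature from normal examples, multiply the per-feature densities, and flag any point whose density falls below a threshold. The function names, the data, and the threshold `epsilon` are assumptions. In practice the threshold is usually picked with a labelled validation set.

```python
import numpy as np

def fit_gaussian(X):
    """Estimate the mean and variance of each feature."""
    return X.mean(axis=0), X.var(axis=0)

def density(X, mu, var):
    """Product of independent per-feature Gaussian densities."""
    p = np.exp(-((X - mu) ** 2) / (2 * var)) / np.sqrt(2 * np.pi * var)
    return p.prod(axis=1)

# Made-up "normal" examples with two features
X_train = np.array([[10.0, 5.0], [10.5, 5.2], [9.5, 4.8],
                    [10.2, 5.1], [9.8, 4.9]])
mu, var = fit_gaussian(X_train)

epsilon = 1e-3  # hypothetical threshold
X_new = np.array([[10.1, 5.0],   # looks like the training data
                  [20.0, 1.0]])  # far away from it
is_anomaly = density(X_new, mu, var) < epsilon
print(is_anomaly)  # [False  True]
```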
Recommendation systems are everywhere. If you buy something on Amazon, it recommends more products you may like. YouTube recommends videos you may like, and Facebook recommends people you may know.
Andrew Ng’s course teaches how to develop a recommender system using the same formula we used in linear regression. Here is the step by step process of developing a movie recommendation algorithm:
Conclusion
Hopefully, this article will help some people get started with machine learning. The best way to learn is by doing. As you may have noticed, most of the algorithms are based on a very simple, basic formula. I often see the notion that machine learning or artificial intelligence requires very heavy programming knowledge and very difficult math. That’s not always true. With simple code and basic math and stats knowledge, you can go a long way. At the same time, keep improving your programming skills so that you can do more complex tasks.
If you are interested in machine learning, just take some time and start working on it.