Machine learning involves the use of machine learning algorithms and models. For beginners, this can be confusing because “machine learning algorithm” is often used interchangeably with “machine learning model.” Are they the same thing or something different?
An “algorithm” in machine learning is a procedure that is run on data to create a machine learning “model.” Machine learning algorithms perform “pattern recognition.” Algorithms “learn” from data, or are “fit” on a dataset.
There are many machine learning algorithms. For example, we have algorithms for classification, such as k-nearest neighbors. We have algorithms for regression, such as linear regression, and we have algorithms for clustering, such as k-means.
Examples of machine learning algorithms:
- Linear Regression
- Logistic Regression
- Decision Tree
- Artificial Neural Network
- k-Nearest Neighbors
You can think of a machine learning algorithm like any other algorithm in computer science. For example, some other types of algorithms you might be familiar with include bubble sort for sorting data and best-first for searching.
As such, machine learning algorithms have a number of properties:
- Machine learning algorithms can be described using math and pseudocode.
- The efficiency of machine learning algorithms can be analyzed and described.
- Machine learning algorithms can be implemented with any one of a range of modern programming languages.
Academics can devise entirely new machine learning algorithms and machine learning practitioners can use standard machine learning algorithms on their projects. This is just like other areas of computer science where academics can devise entirely new sorting algorithms, and programmers can use the standard sorting algorithms in their applications.
A “model” in machine learning is the output of a machine learning algorithm run on data. A model represents what was learned by a machine learning algorithm. The model is the “thing” that is saved after running a machine learning algorithm on training data and represents the rules, numbers, and any other algorithm-specific data structures required to make predictions.
Some examples might make this clearer:
- The linear regression algorithm results in a model comprised of a vector of coefficients with specific values.
- The decision tree algorithm results in a model comprised of a tree of if-then statements with specific values.
- The neural network / backpropagation / gradient descent algorithms together result in a model comprised of a graph structure with vectors or matrices of weights with specific values.
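To make the decision tree case concrete, here is a minimal sketch of what such a model might look like once it has been learned. The features, thresholds, and labels are hypothetical values chosen for illustration, and the nested-dictionary layout is just one assumed way to store a tree, not how any particular library does it.

```python
# Model data: a tree of if-then rules with specific (hypothetical) values.
# Each internal node tests one input feature against a threshold.
tree = {
    "feature": 0, "threshold": 2.5,
    "below": {"prediction": "small"},
    "above": {
        "feature": 1, "threshold": 1.0,
        "below": {"prediction": "medium"},
        "above": {"prediction": "large"},
    },
}


def predict_tree(node, row):
    """Walk the tree from the root until a leaf with a prediction is reached."""
    while "prediction" not in node:
        value = row[node["feature"]]
        branch = "below" if value <= node["threshold"] else "above"
        node = node[branch]
    return node["prediction"]


label = predict_tree(tree, [3.0, 0.5])
```

The dictionary is what would be saved after training; the short function is only needed to use it.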
The idea of a machine learning model is more challenging for a beginner because there is no clear analogy with the output of other algorithms in computer science.
For example, the sorted list output of a sorting algorithm is not really a model.
The best analogy is to think of the machine learning model as a “program.”
The machine learning model “program” is comprised of both data and a procedure for using the data to make a prediction. For example, consider the linear regression algorithm and resulting model. The model is comprised of a vector of coefficients (data) that are multiplied and summed with a row of new data taken as input in order to make a prediction (prediction procedure).
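The “data plus procedure” view of a linear regression model can be sketched in a few lines. The coefficient values below are hypothetical stand-ins for what a learning algorithm would find, and the separate intercept term is an assumption of this sketch (it could equally be treated as one more coefficient).

```python
from typing import Sequence

# Model data: hypothetical values, as if produced by a learning algorithm.
INTERCEPT = 0.5
COEFFICIENTS = [2.0, -1.0]


def predict(row: Sequence[float]) -> float:
    """Prediction procedure: multiply each input by its coefficient and sum."""
    if len(row) != len(COEFFICIENTS):
        raise ValueError("row must have one value per coefficient")
    return INTERCEPT + sum(c * x for c, x in zip(COEFFICIENTS, row))


# 0.5 + (2.0 * 3.0) + (-1.0 * 4.0)
prediction = predict([3.0, 4.0])
```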
We save the data for the machine learning model for later use. We often use the prediction procedure for the machine learning model provided by a machine learning library. Sometimes we may implement the prediction procedure ourselves as part of our application. This is often straightforward to do given that most prediction procedures are quite simple.
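As a rough illustration, the sketch below saves model data and then uses it with a hand-written prediction procedure. It assumes the model data is a small dictionary of hypothetical values and uses JSON as the storage format; a real project would more likely use whatever format its machine learning library provides.

```python
import json

# Hypothetical model data for a linear regression model.
model_data = {"intercept": 1.0, "coefficients": [0.5, 0.25]}

# Save: serialize the model data (this string could be written to a file).
saved = json.dumps(model_data)

# Later, possibly in a different application: load the model data back.
restored = json.loads(saved)


def predict(model, row):
    """Our own prediction procedure: weighted sum of the inputs plus intercept."""
    weighted = sum(c * x for c, x in zip(model["coefficients"], row))
    return model["intercept"] + weighted


prediction = predict(restored, [2.0, 4.0])
```

Only the numbers are stored; the prediction procedure lives in the application code.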
So now we are familiar with a machine learning “algorithm” vs. a machine learning “model.” Specifically, an algorithm is run on data to create a model.
- Machine Learning Algorithm + Training Data => Machine Learning Model
We also understand that a model is comprised of both data and a procedure for how to use the data to make a prediction on new data. You can think of the procedure as a prediction algorithm if you like.
- Machine Learning Model == Model Data + Prediction Algorithm
This division is very helpful in understanding a wide range of algorithms.
For example, most machine learning algorithms do nearly all of their work in the learning “algorithm,” and the “prediction algorithm” does very little. Typically, the algorithm is some sort of optimization procedure that minimizes the error of the model (data + prediction algorithm) on the training dataset. The linear regression algorithm is a good example. It performs an optimization process (or is solved analytically using linear algebra) to find a set of coefficients that minimize the sum of squared errors on the training dataset.
- Algorithm: Find set of coefficients that minimize error on training dataset.
- Model Data: Vector of coefficients.
- Prediction Algorithm: Multiply and sum coefficients with input row.
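The three parts listed above can be sketched for the simplest case of a single input variable, where the least squares solution has a well-known closed form. The function names and the tiny dataset are made up for this illustration, and the intercept is an added assumption of the sketch.

```python
def fit_linear_regression(xs, ys):
    """Algorithm: find the intercept and slope that minimize squared error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope  # Model data: the learned coefficients.


def predict(model, x):
    """Prediction algorithm: multiply and sum coefficients with the input."""
    intercept, slope = model
    return intercept + slope * x


# Made-up training data that follows y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_linear_regression(xs, ys)
prediction = predict(model, 5.0)
```

Nearly all of the code, and all of the work, sits in the fitting function; the prediction step is a single line.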
Some algorithms are trivial or even do nothing, and all of the work is in the model or prediction algorithm. The k-nearest neighbors algorithm has no “algorithm” other than saving the entire training dataset. The model data, therefore, is the entire training dataset and all of the work is in the prediction algorithm, i.e. how a new row of data interacts with the saved training dataset to make a prediction.
- Algorithm: Save training data.
- Model Data: Entire training dataset.
- Prediction Algorithm: Find k most similar rows and average their target variable (for regression) or take their most common class (for classification).
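The same breakdown can be sketched for k-nearest neighbors used for regression. The function names, the use of Euclidean distance as the similarity measure, and the tiny dataset are assumptions made for this illustration.

```python
import math


def fit_knn(rows, targets):
    """Algorithm: nothing to optimize, just keep a copy of the training data."""
    return [(tuple(row), target) for row, target in zip(rows, targets)]


def euclidean_distance(a, b):
    """Similarity measure assumed for this sketch."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_knn(model, new_row, k=3):
    """Prediction algorithm: average the targets of the k closest stored rows."""
    by_distance = sorted(
        model, key=lambda pair: euclidean_distance(pair[0], new_row)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up training data.
rows = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (10.0, 10.0)]
targets = [10.0, 20.0, 30.0, 100.0]

model = fit_knn(rows, targets)  # Model data: the entire training dataset.
prediction = predict_knn(model, (2.0, 2.0), k=3)
```

Here the balance is reversed compared with linear regression: fitting is a single line, and the search and averaging happen every time a prediction is made.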