## Taking the Naive Approach to Build a Spam Classifier

Have you ever wondered how your email service provider classifies a mail as spam or not spam almost immediately after it arrives? Or how the recommendations on e-commerce platforms change so quickly in response to real-time user actions? These are some of the real-life scenarios where the Naïve Bayes classifier is put into action.

Naive Bayes is a supervised classification algorithm used primarily for binary and multi-class classification problems, though with some modifications it can also be applied to regression. It is one of the simplest algorithms for classification, and it works especially well when the dataset has few data points.

In this article, we will first look at the mathematical concepts behind Naïve Bayes, then examine the different types of Bayes classifiers, and once we have a gist of what the Naïve Bayes classifier actually is, we will build our very own classifier. Also, do read my previous articles for an overview of other classification algorithms.

Naïve Bayes is one of the simplest and most widely employed supervised classification algorithms. It is an intuitive classifier based on the principles of Bayes' theorem, named after Reverend Thomas Bayes, a statistician.

The Naive Bayes model is easy to build and is particularly useful for smaller data sets. Despite its simplicity, Naive Bayes can outperform far more intricate predictive models on many tasks, owing to its speed and accuracy. The algorithm performs exceptionally well on text data projects such as sentiment analysis, spam detection and document categorization. The three main types of Naive Bayes algorithms are:

- **Gaussian Naive Bayes:** commonly used when features follow a Gaussian (normal) distribution. It requires calculating the mean and standard deviation of each feature for every class.
- **Multinomial Naive Bayes:** used for multinomially distributed data. It is suitable for classification with discrete features, such as word counts.
- **Bernoulli Naive Bayes:** used for multivariate Bernoulli distributions. It requires the data points to be treated as binary-valued feature vectors.
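The three variants above can be sketched side by side. This is a minimal illustration, assuming scikit-learn is installed; the tiny data sets (hypothetical measurements and word counts) are made up purely for demonstration.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# Gaussian NB: continuous features (hypothetical height/weight pairs)
X_cont = np.array([[1.8, 70.0], [1.6, 55.0], [1.9, 85.0], [1.5, 50.0]])
y_cont = np.array([1, 0, 1, 0])
g_pred = GaussianNB().fit(X_cont, y_cont).predict([[1.7, 60.0]])

# Multinomial NB: discrete counts (e.g. per-document word frequencies)
X_counts = np.array([[3, 0, 1], [0, 2, 0], [2, 1, 0], [0, 3, 1]])
y_counts = np.array([1, 0, 1, 0])
m_pred = MultinomialNB().fit(X_counts, y_counts).predict([[1, 0, 2]])

# Bernoulli NB: the same documents reduced to word present / absent
X_bin = (X_counts > 0).astype(int)
b_pred = BernoulliNB().fit(X_bin, y_counts).predict([[1, 0, 1]])

print(g_pred, m_pred, b_pred)
```

Note that the only difference in code is the choice of estimator; the decision of which to use comes down to whether your features are continuous values, counts, or binary indicators.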

Another variant of Naïve Bayes is Complement Naïve Bayes (CNB), which tends to work better than its counterpart when the classes in the training set are imbalanced. The Complement Naive Bayes classifier addresses a weakness of the standard Naive Bayes classifier by estimating its parameters from the data in all classes except the one being evaluated.
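As a quick sketch of the imbalanced case, assuming scikit-learn: below, the made-up word-count data contains four "ham" documents but only a single "spam" one, yet CNB can still classify a spam-like test document correctly because its per-class statistics are estimated from the (larger) complement of each class.

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB

# Hypothetical word-count data: 4 "ham" docs (class 0), 1 "spam" doc (class 1)
X = np.array([[2, 1, 0], [3, 0, 0], [1, 2, 0], [2, 0, 1], [0, 0, 4]])
y = np.array([0, 0, 0, 0, 1])

cnb = ComplementNB().fit(X, y)
# A test document dominated by the "spam" word (third column)
pred = cnb.predict([[0, 1, 3]])
print(pred)
```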

A Naive Bayes classifier can be trained faster than its sister algorithms, and it also makes faster predictions. Moreover, it can be updated with new training data without having to rebuild the model from scratch.
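The incremental-update property can be sketched with scikit-learn's `partial_fit`, which adds a new batch's counts to the model's existing statistics instead of retraining from scratch (the batches here are invented word-count vectors):

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

clf = MultinomialNB()
classes = np.array([0, 1])  # all classes must be declared on the first call

# First batch of word-count vectors
X1 = np.array([[2, 0, 1], [0, 3, 0]])
y1 = np.array([1, 0])
clf.partial_fit(X1, y1, classes=classes)

# Later, a new batch arrives: update the counts, no full rebuild
X2 = np.array([[1, 0, 2], [0, 1, 0]])
y2 = np.array([1, 0])
clf.partial_fit(X2, y2)

stream_pred = clf.predict([[3, 0, 0]])
print(stream_pred)
```

This works because the model's parameters are just class-conditional counts, which are trivially additive across batches.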

As mentioned earlier, this algorithm is based on Bayes' theorem, which describes the probability of an event based on prior knowledge of conditions that are in some way related to it. The equation for **Bayes' Theorem** is of the form: