The Bayesian view of probability crashed into my life when I got into college, and it always confused me; I never really grasped the whole of it. I was only ever comfortable with the frequentist view of the world, and as I studied Bayes' theorem and Bayesian linear regression, I found them very difficult to fully comprehend. This post goes through the struggles I've had in understanding the Bayesian point of view and clears up some vague ideas about it. I've also implemented the whole process in Google Colab for anyone who wants to follow along in practice.

## Frequentist Probability vs Bayesian Probability

I think explaining the frequentist view with a coin toss or a dice roll only makes it more confusing. Frequentist probability is just a simple calculation of how frequently an event happened over many trials.

Bayesian probability, however, expresses how confident we are that a certain event will happen, given the evidence. In other words, this confidence is the flip side of the event's uncertainty: knowing how confident we can be about an event implies knowing how uncertain it is.

Let's say there's a 50% chance of heads on a coin toss. A frequentist would say that out of 100 tosses, about 50 will come up heads. A Bayesian would instead say that our confidence in getting heads on a single toss is 50%.
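The frequentist reading can be sketched in a few lines: toss a fair coin many times and report the observed frequency of heads. The seed and toss count here are arbitrary choices for illustration.

```python
import random

random.seed(0)

# Simulate 100 tosses of a fair coin.
tosses = [random.random() < 0.5 for _ in range(100)]
heads = sum(tosses)

# The frequentist probability estimate is simply the observed frequency.
print(f"heads: {heads}/100 -> estimated P(head) = {heads / 100:.2f}")
```

With more tosses, this frequency settles ever closer to 0.5, which is exactly what the frequentist means by "the probability of heads".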

Instead of understanding Bayes' theorem as a simple extension of conditional probability, I think it's easier to read it as relating our confidence in an event before and after something (the evidence) happened. The uncertainty about an event therefore decreases as we observe it more and more. The hypothesis, in turn, can be interpreted as our belief about what we are trying to observe.
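The "uncertainty decreases with more evidence" idea can be made concrete with a standard Beta-Bernoulli update for the coin: the prior Beta(1, 1) is uniform, each observed head increments alpha and each tail increments beta, and the posterior variance measures our remaining uncertainty. This is a minimal sketch of Bayesian updating, not the post's Colab code.

```python
# Conjugate Beta-Bernoulli update: prior Beta(alpha, beta),
# observe some heads and tails, get posterior Beta(alpha', beta').
def update(alpha, beta, heads, tails):
    return alpha + heads, beta + tails

def posterior_stats(alpha, beta):
    # Mean and variance of a Beta(alpha, beta) distribution.
    mean = alpha / (alpha + beta)
    var = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1))
    return mean, var

# Start from a uniform prior Beta(1, 1) and feed in more and more
# fair-coin evidence (half heads, half tails).
for n in (10, 100, 1000):
    a, b = update(1, 1, n // 2, n // 2)
    mean, var = posterior_stats(a, b)
    print(f"after {n} tosses: mean={mean:.3f}, variance={var:.6f}")
```

The posterior mean stays at 0.5, but the variance shrinks as evidence accumulates: our belief about the coin becomes more and more certain.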

The linear regression we are familiar with is just a simple fitting of the coefficients of an equation to minimize the residual error against the target. It is the same concept as finding the optimal weight vector that fits the predicted data to the target data in machine learning.
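As a point of reference before going Bayesian, the familiar least-squares fit looks like this; the slope, intercept, and noise level below are made-up values for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: y = 2x + 1, observed with a little Gaussian noise.
x = rng.uniform(0, 10, size=50)
y = 2.0 * x + 1.0 + rng.normal(0, 0.5, size=50)

# Ordinary least squares: pick the weight vector (slope, intercept)
# that minimizes the residual error between prediction and target.
X = np.column_stack([x, np.ones_like(x)])
w, b = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"fitted slope={w:.2f}, intercept={b:.2f}")
```

The fitted slope and intercept land close to the true values of 2 and 1; the frequentist answer is this single point estimate, whereas the Bayesian treatment will instead place a distribution over the weights.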