Why might you want to turn Bayesian?

13 May 2020

Especially if you want to make sense of Covid-19

Are you sitting at home dreaming of a beautiful holiday destination? Sunny beach? Majestic fjord? Covid-19 has made it very difficult to plan anything. Will all restrictions be lifted on 1st September? Nobody can answer that. Even your personal opinion keeps changing every day. With every new report of falling case numbers your confidence grows, and every piece of troubling news increases the doubt.

Now, what would you do if I asked you to calculate the probability of restrictions being lifted on 1st September? Many would try to apply a (frequentist) technique that they are familiar with. For example, you could try to use logistic regression to estimate the odds… I wouldn’t do that if I were you! Tomorrow this result will be outdated, and you would have to re-run your analysis from scratch.

Our mind is flexible and updates our perception as soon as new information is encountered and processed. Yet most data professionals are more comfortable with techniques that require a completely counterintuitive way of thinking, and even after many years of practice many have trouble interpreting their results (Goodman, 1992). Personally, when evaluating everyday situations I don’t think in terms of power and p-values. I can say with confidence that I am a Bayesian thinker, and I am guessing that you are too. Bayesian statistics, despite often being overlooked, has numerous applications in data-driven decision-making and is more widespread than many imagine. Even if you continue using Frequentist methods, it is good to understand how the two compare.

 

Bayesian vs Frequentist

What is the difference between Bayesian and Frequentist statistics? The big difference between the two approaches is that the Frequentist approach gives a snapshot of a situation. The moment a decision needs to be taken, a Frequentist re-evaluates all the relevant information from scratch, as if no previous analysis of the situation had taken place. Bayesians, on the other hand, update the analysis made before by combining the new information with prior knowledge. Exactly what your brain does after reading the news.

A true Frequentist would instead simply add the new data point to the set of information he already has on the topic. Then a full re-run of the analysis would follow. Such an operation is computationally taxing, and as the number of data points grows it becomes more and more difficult to do full re-runs. Just imagine yourself sitting after every week’s press conference trying to make sense not only of the current information, but of everything you have heard combined (including Trump’s sarcastic comment about bleach at a news conference on 24th April 2020 [1]).
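To make the contrast concrete, here is a minimal sketch in Python. The data and the model (a simple Beta-Binomial for the proportion of “good news” days) are my own illustrative assumptions, not something from the article: the Frequentist estimate is recomputed over the full history every day, while the Bayesian posterior is updated using only the newest point.

```python
# Illustrative contrast only: invented data, and a simple Beta-Binomial model
# chosen for this sketch. Each day records whether the case count dropped (1) or not (0).
observations = [1, 0, 1, 1, 1, 0, 1, 1]

def frequentist_estimate(all_data):
    # Re-computed from the full history every single day.
    return sum(all_data) / len(all_data)

def bayesian_update(alpha, beta, new_point):
    # Conjugate Beta update: only the newest observation is needed.
    return alpha + new_point, beta + (1 - new_point)

alpha, beta = 1.0, 1.0   # flat Beta(1, 1) prior: no opinion yet
history = []
for day, x in enumerate(observations, start=1):
    history.append(x)
    freq = frequentist_estimate(history)            # O(n) work per day
    alpha, beta = bayesian_update(alpha, beta, x)   # O(1) work per day
    posterior_mean = alpha / (alpha + beta)
    print(f"day {day}: frequentist {freq:.2f}, Bayesian posterior mean {posterior_mean:.2f}")
```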

By now you are probably wondering: what exactly allows the Bayesian approach to be so much more efficient?

Bayesian statistics is founded on Bayes’ theorem, which relates prior knowledge of the situation to the newly available information and, by combining the two, produces a new understanding of reality. In the language of the Bayesian framework:

 

posterior probability = (likelihood of the data × prior probability) / probability of the data

 

Imagine we read a new report saying that the number of new patients in intensive care is 10 (much lower than it was before). This is our new data point. If we want to evaluate its influence on the probability of restrictions being lifted, we still need to include the likelihood of the data given the prior and the prior itself. The prior probability is what we already know about the situation, which in this case is our expectation before reading the news. The likelihood of the data given the prior is the probability of encountering this new data point based on our current understanding of the situation. For example, we are unlikely to hear that there are only 10 patients today if there were 1000 yesterday.

Moreover, the Bayesian approach does not suffer from cognitive biases the way we all do. Our brain simply does not have the capacity to evaluate perfectly how likely we are to encounter a piece of information, so it relies on many heuristics, among them the availability heuristic (Tversky and Kahneman, 1973), which causes you to give a higher weight to new information. Thus, you might get excited when yesterday’s number of patients in intensive care was 1000 and today it is 10, but a Bayesian will also evaluate how likely this information is.

Thus, using Bayes’ theorem and our new data point we arrive at the following formula:

P(restrictions lifted | 10 new intensive-care patients) = P(10 new intensive-care patients | restrictions lifted) × P(restrictions lifted) / P(10 new intensive-care patients)

Here we easily incorporated the new information, while a Frequentist would have to re-run a calculation over all the previous data points plus the new one. Even though the answers should be similar, in this case the Bayesian approach is clearly more intuitive.
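To see the update with actual numbers, here is a small sketch. The prior and the likelihoods below are invented purely for illustration; they are not estimates of anything real:

```python
# Bayes' theorem applied to the example above, with invented numbers.
# H = "restrictions are lifted on 1st September"
# D = "only 10 new intensive-care patients reported today"
prior_H = 0.30           # our belief before reading the report (assumed)
p_D_given_H = 0.60       # such a report is quite likely if things are improving enough (assumed)
p_D_given_not_H = 0.10   # ...and unlikely if they are not (assumed)

# Probability of seeing this report at all (the denominator).
p_D = p_D_given_H * prior_H + p_D_given_not_H * (1 - prior_H)

# Posterior: our updated belief after the report.
posterior_H = p_D_given_H * prior_H / p_D
print(f"P(restrictions lifted | today's report) = {posterior_H:.2f}")  # 0.72
```

Tomorrow, this posterior simply becomes the new prior, and only tomorrow’s report needs to be processed.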

[1] Even though the media gave this a lot of attention, it is in fact just a single data point.

 

Should we forget the frequentist approach?

Undoubtedly, the Bayesian approach has many advantages, and the number of fans has been growing ever since Bayes’ theorem was published in the 18th century. However, I would not recommend charging ahead and disregarding your prior knowledge of Frequentist statistics. After all, then you wouldn’t be a true Bayesian! Both approaches can be used, depending on the circumstances. When we already have information about the situation, like the number of patients that have required intensive care, Bayesian is by far the superior approach. However, when you don’t know anything about the topic you want to analyse, both approaches can be valid [2]. Unsurprisingly, many choose Frequentist statistics in such cases, since they are more familiar with them. This is why tests of means and proportions, which are so widespread in data analysis, are predominantly done using Frequentist statistics. The same is true for ordinary and multiple regression.

Both approaches can be used when prior information is scarce. However, when we do have prior beliefs, a Bayesian model is the better choice. This is why stock traders and meteorologists are able to react so quickly to new information – they use Bayesian predictions.

[2] In Bayesian statistics this is done using non-informative (zero-information) priors.
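As a rough sketch of footnote [2]: with no prior knowledge a Bayesian can start from a flat Beta(1, 1) prior, which is uniform over all possible proportions. The counts and the “informed” prior below are invented for illustration:

```python
# Flat ("zero-information") prior versus a strong, informed prior.
# The data and the informed prior are invented for illustration.
successes, failures = 7, 3               # e.g. 7 "good" outcomes in 10 trials

flat_alpha, flat_beta = 1, 1             # uniform Beta(1, 1): we know nothing yet
informed_alpha, informed_beta = 20, 80   # strong prior belief around 0.2 (assumed)

flat_mean = (flat_alpha + successes) / (flat_alpha + flat_beta + successes + failures)
informed_mean = (informed_alpha + successes) / (informed_alpha + informed_beta + successes + failures)

print(f"flat prior posterior mean:     {flat_mean:.2f}")      # ~0.67, close to the frequentist 7/10
print(f"informed prior posterior mean: {informed_mean:.2f}")  # ~0.25, pulled towards the prior
```

With a flat prior the Bayesian answer lands essentially where a Frequentist estimate would, which is why both approaches are valid when prior information is scarce.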

 

Summary

Bayesian statistics is indeed very widespread in its applications and is rather intuitive for the human mind to understand and interpret. In many cases we do have prior information about the subject we need to predict. Perhaps you want to estimate how many marketable consumers you will have in a given year; then the Bayesian approach might be the answer. Even better, it does not suffer from the same cognitive biases that our mind does, and thus does not give the highest weight to the latest available information. Therefore, when used appropriately, Bayesian statistics can help you make data-driven decisions again, again and again.

 

Citations

  • Goodman, S. N. (1992). A comment on replication, p-values and evidence. Statistics in Medicine, 11(7), 875-879.
  • Tversky, A. and Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5(2), 207-232.
