Bayes' Theorem Definition and Examples

How to Use Bayes' Theorem to Find Conditional Probability

Bayes' Theorem is presented in neon lights at the offices of Autonomy in Cambridge.

Bayes' theorem is a mathematical equation used in probability and statistics to calculate conditional probability. In other words, it is used to calculate the probability of an event based on its association with another event. The theorem is also known as Bayes' law or Bayes' rule.


Richard Price was Bayes' literary executor. While we know what Price looked like, no verified portrait of Bayes survives.

Bayes' theorem is named for English minister and statistician Reverend Thomas Bayes, who formulated the equation in his work "An Essay Towards Solving a Problem in the Doctrine of Chances." After Bayes' death, the manuscript was edited and corrected by Richard Price prior to publication in 1763. It would be more accurate to refer to the theorem as the Bayes-Price rule, as Price's contribution was significant. The modern formulation of the equation was devised in 1774 by French mathematician Pierre-Simon Laplace, who was unaware of Bayes' work. Laplace is recognized as the mathematician responsible for the development of Bayesian probability.

Formula for Bayes' Theorem

One practical application of Bayes' theorem is determining whether it's better to call or fold in poker.

There are several different ways to write the formula for Bayes' theorem. The most common form is:

P(A ∣ B) = P(B ∣ A)P(A) / P(B)

where A and B are two events and P(B) ≠ 0

P(A ∣ B) is the conditional probability of event A occurring given that B is true.

P(B ∣ A) is the conditional probability of event B occurring given that A is true.

P(A) and P(B) are the probabilities of A and B each occurring on its own, without regard to the other event (the marginal probabilities).
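The formula follows from the definition of conditional probability. A short derivation, written in LaTeX and assuming both P(A) and P(B) are nonzero, where A ∩ B denotes the event that A and B both occur:

```latex
\begin{align}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad
P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
P(A \cap B) &= P(B \mid A)\,P(A) \\
P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{align}
```

The second line rearranges the definition of P(B ∣ A), and the third substitutes it into the definition of P(A ∣ B).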


Bayes' theorem can be used to calculate the chance of one condition based on the chance of another condition.

You might wish to find a person's probability of having rheumatoid arthritis if they have hay fever. In this example, "having hay fever" is the test for rheumatoid arthritis (the event).

  • A would be the event "patient has rheumatoid arthritis." Data indicates 10 percent of patients in a clinic have this type of arthritis. P(A) = 0.10
  • B is the test "patient has hay fever." Data indicates 5 percent of patients in a clinic have hay fever. P(B) = 0.05
  • The clinic's records also show that of the patients with rheumatoid arthritis, 7 percent have hay fever. In other words, the probability that a patient has hay fever, given they have rheumatoid arthritis, is 7 percent. P(B ∣ A) = 0.07

Plugging these values into the theorem:

P(A ∣ B) = (0.07 * 0.10) / (0.05) = 0.14

So, if a patient has hay fever, their chance of having rheumatoid arthritis is 14 percent. It's unlikely a random patient with hay fever has rheumatoid arthritis.
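The same calculation can be checked in a few lines of Python. This is a minimal sketch: the function name bayes_posterior and the variable names are illustrative choices, not part of any library.

```python
def bayes_posterior(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Return P(A | B) = P(B | A) * P(A) / P(B)."""
    if p_b == 0:
        # The theorem is only defined when P(B) is nonzero.
        raise ValueError("P(B) must be nonzero")
    return p_b_given_a * p_a / p_b


# Clinic figures from the example above.
p_arthritis = 0.10                  # P(A)
p_hay_fever = 0.05                  # P(B)
p_hay_fever_given_arthritis = 0.07  # P(B | A)

p_arthritis_given_hay_fever = bayes_posterior(
    p_hay_fever_given_arthritis, p_arthritis, p_hay_fever
)
print(f"P(A | B) = {p_arthritis_given_hay_fever:.2f}")  # prints P(A | B) = 0.14
```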

Sensitivity and Specificity

Bayes' theorem drug test tree diagram. U represents the event where a person is a user while + is the event a person tests positive.

Bayes' theorem elegantly demonstrates the effect of false positives and false negatives in medical tests.

  • Sensitivity is the true positive rate. It measures the proportion of actual positives that the test correctly identifies. For example, in a pregnancy test, it would be the percentage of pregnant women who get a positive test result. A sensitive test rarely misses a "positive."
  • Specificity is the true negative rate. It measures the proportion of actual negatives that the test correctly identifies. For example, in a pregnancy test, it would be the percentage of women who are not pregnant who get a negative test result. A specific test rarely registers a false positive.

A perfect test would be 100 percent sensitive and specific. In reality, tests have a minimum error called the Bayes error rate.
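As a rough illustration, both rates can be computed from the four possible outcomes of a test, taking sensitivity as the share of actual positives that test positive and specificity as the share of actual negatives that test negative. The counts below are made up for the sketch and do not come from any real study.

```python
# Hypothetical counts from checking a test against known outcomes.
true_positives = 90    # condition present, test positive
false_negatives = 10   # condition present, test negative
true_negatives = 950   # condition absent, test negative
false_positives = 50   # condition absent, test positive

# Sensitivity: share of people with the condition who test positive.
sensitivity = true_positives / (true_positives + false_negatives)

# Specificity: share of people without the condition who test negative.
specificity = true_negatives / (true_negatives + false_positives)

print(f"sensitivity = {sensitivity:.0%}")  # prints sensitivity = 90%
print(f"specificity = {specificity:.0%}")  # prints specificity = 95%
```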

For example, consider a drug test that is 99 percent sensitive and 99 percent specific. If half a percent (0.5 percent) of people use a drug, what is the probability a random person with a positive test actually is a user?

P(A ∣ B) = P(B ∣ A)P(A) / P(B)

This may be rewritten as:

P(user ∣ +) = P(+ ∣ user)P(user) / P(+)

P(user ∣ +) = P(+ ∣ user)P(user) / [P(+ ∣ user)P(user) + P(+ ∣ non-user)P(non-user)]

P(user ∣ +) = (0.99 * 0.005) / (0.99 * 0.005 + 0.01 * 0.995)

P(user ∣ +) ≈ 33.2%

Only about 33 percent of the time would a random person with a positive test actually be a drug user. The conclusion is that even if a person tests positive for a drug, it is more likely they do not use the drug than that they do. In other words, the number of false positives is greater than the number of true positives.
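The drug test calculation can be reproduced with a short Python sketch. The variable names are illustrative, and the population of 100,000 people is an assumption added only to turn the probabilities into expected head counts.

```python
sensitivity = 0.99   # P(+ | user)
specificity = 0.99   # P(- | non-user)
prevalence = 0.005   # P(user)

p_pos_given_user = sensitivity
p_pos_given_non_user = 1 - specificity   # false positive rate
p_non_user = 1 - prevalence

# P(+) expands into the user and non-user cases (law of total probability).
p_positive = (p_pos_given_user * prevalence
              + p_pos_given_non_user * p_non_user)

# Bayes' theorem: P(user | +) = P(+ | user) P(user) / P(+)
p_user_given_positive = p_pos_given_user * prevalence / p_positive
print(f"P(user | +) = {p_user_given_positive:.1%}")  # prints P(user | +) = 33.2%

# The same result as expected counts in a hypothetical population.
population = 100_000
true_positives = population * prevalence * p_pos_given_user
false_positives = population * p_non_user * p_pos_given_non_user
print(f"true positives:  {true_positives:.0f}")   # prints 495
print(f"false positives: {false_positives:.0f}")  # prints 995
```

In this hypothetical population, roughly twice as many non-users as users test positive, which is why a positive result corresponds to only about a one-in-three chance of drug use.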

In real-world situations, a trade-off is usually made between sensitivity and specificity, depending on whether it's more important to not miss a positive result or whether it's better to not label a negative result as a positive.