Why Bayes? — Bayes Thinking Lab

Scenario: CBT study · n = 40 · Outcome: BDI reduction (points) · Therapy vs. Control · β̂ = 4.80, SE = 2.10

Act 1: Same Data, Different Questions

What am I seeing?

The posterior distribution of β: how plausible each effect size is, given prior knowledge and the data. The blue area to the right of the threshold is the posterior probability that β exceeds that threshold.

What should I do?

Drag the threshold with the mouse and ask yourself: above which value of β is the effect clinically meaningful? Try 2.2 or 5.0 BDI points, for example; the threshold is freely adjustable.

How do I interpret this?

The percentage is P(β > threshold | data, prior), a direct probability statement about the parameter. Trying several thresholds is no problem: the posterior does not change, only the query does.
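For reference, the query can be written as a tail area of the posterior density. The closed form on the right holds only under an assumption: that the posterior is normal with mean μ_post and standard deviation σ_post (symbols introduced here for illustration, not taken from the page). Φ is the standard normal CDF and c is the threshold.

```latex
P(\beta > c \mid \text{data}, \text{prior})
  = \int_{c}^{\infty} p(\beta \mid \text{data}, \text{prior}) \, d\beta
  = 1 - \Phi\!\left( \frac{c - \mu_{\text{post}}}{\sigma_{\text{post}}} \right)
```

Moving the threshold changes only c; the density under the integral stays the same.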

Why can't frequentism do this?

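A 95% CI is not a probability statement about β: the 95% describes the long-run coverage of the procedure, not the probability that β lies in this particular interval. P(β > 3) therefore cannot be read off it directly. Testing the same dataset against both β > 3 and β > 2.2 is multiple testing, which inflates the error rate unless it is corrected for. In the Bayesian analysis this is not an issue: the posterior is fixed, and each threshold is just another query of it.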

Frequentist

β̂ = 4.80 SE = 2.10

95% CI: [0.68, 8.92]

t(38) = 2.29 p = .023

✗ P(β > 3) = ? no direct answer

✗ P(β > 0) = ? no direct answer

✗ P(2 < β < 7) = ? no direct answer
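The interval in the box above can be reproduced from the estimate and its standard error. This is a minimal sketch, assuming a normal (z) critical value of about 1.96; that assumption matches the reported bounds, whereas a t critical value with 38 degrees of freedom would give a slightly wider interval. The variable names are illustrative.

```python
from statistics import NormalDist

beta_hat, se = 4.80, 2.10  # estimate and standard error from the scenario

# Assumption: normal critical value for a two-sided 95% interval (about 1.96).
z_crit = NormalDist().inv_cdf(0.975)

ci_low = beta_hat - z_crit * se
ci_high = beta_hat + z_crit * se
test_stat = beta_hat / se  # estimate divided by its standard error

print(f"95% CI: [{ci_low:.2f}, {ci_high:.2f}]")
print(f"estimate / SE = {test_stat:.2f}")
```

None of these quantities is a probability for β itself, which is why the three questions in the box have no direct frequentist answer.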

Bayes: Posterior Distribution (drag threshold with mouse)

[Interactive plot: posterior density of β with a draggable threshold. Readout: P(β > 3.0).]

Is the effect clinically meaningful (≥ 3.0 BDI points)?

✗ Frequentist: not directly; p = .023 only says "significant"

✓ P(β > 3.0): read directly from the posterior

What is the probability the effect is positive at all?

✗ p < .05 does not answer this

✓ P(β > 0): read directly from the posterior

Is the effect in the practically relevant range [2, 7]?

✗ The CI [0.68, 8.92] is wider than this range and assigns no probability to it

✓ P(2 < β < 7): read directly from the posterior
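As a rough check on the three questions above, the probabilities can be computed in closed form. This sketch assumes a flat prior, so that the posterior is normal with mean 4.80 and SD 2.10. The page does not state which prior Act 1 uses, so its live readouts may differ from these numbers.

```python
from statistics import NormalDist

# Assumption: flat prior, so the posterior is N(4.80, 2.10^2).
posterior = NormalDist(mu=4.80, sigma=2.10)

# Each question is a different query against the same posterior object.
p_meaningful = 1.0 - posterior.cdf(3.0)                # P(beta > 3.0)
p_positive = 1.0 - posterior.cdf(0.0)                  # P(beta > 0)
p_in_range = posterior.cdf(7.0) - posterior.cdf(2.0)   # P(2 < beta < 7)

print(f"P(beta > 3.0)   = {p_meaningful:.3f}")
print(f"P(beta > 0)     = {p_positive:.3f}")
print(f"P(2 < beta < 7) = {p_in_range:.3f}")
```

Under this assumption the three values come out at roughly 0.80, 0.99 and 0.76.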

Act 2: Prior as Cumulative Knowledge

What am I seeing?

Three curves: the prior (green, dashed: knowledge before the data), the likelihood (orange, dotted: what the data alone say) and the posterior (blue, filled: the combined result). Below them is a comparison of the widths of the 95% intervals.

What should I do?

Switch between the non-informative and the informative prior. The likelihood (the data) stays identical; only the prior changes, and with it the posterior.

How do I interpret this?

With an informative prior (10 prior studies: μ = 5, σ = 0.6), the posterior narrows considerably: the 95% interval shrinks from 8.24 to 2.26 points. Same data, more precise posterior: prior knowledge formally reduces uncertainty.
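The narrowing can be reproduced with a conjugate normal update, in which precisions (inverse variances) add and the posterior mean is a precision-weighted average. This is a sketch under the assumption that both the prior and the likelihood are treated as normal; the function names are illustrative and not taken from the page.

```python
from math import sqrt
from statistics import NormalDist

def normal_update(prior_mean, prior_sd, est, se):
    """Conjugate normal update: precisions add, means are precision-weighted."""
    prior_prec = 1 / prior_sd**2
    data_prec = 1 / se**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * est) / post_prec
    return post_mean, sqrt(1 / post_prec)

def central_interval(mean, sd, level=0.95):
    """Central interval of a normal distribution at the given level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * sd, mean + z * sd

# Informative prior (mu = 5, sigma = 0.6) combined with the study (4.80, SE 2.10).
post_mean, post_sd = normal_update(5.0, 0.6, 4.80, 2.10)
low, high = central_interval(post_mean, post_sd)

# Assumption: with a flat prior the posterior equals the normalised likelihood.
flat_low, flat_high = central_interval(4.80, 2.10)

reduction = 1 - (high - low) / (flat_high - flat_low)
print(f"informative: [{low:.2f}, {high:.2f}], width {high - low:.2f}")
print(f"flat prior:  [{flat_low:.2f}, {flat_high:.2f}], width {flat_high - flat_low:.2f}")
print(f"narrower by {reduction:.0%}")
```

The result should land close to the informative-prior interval reported on this page; small differences in the last digit can come from rounding.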

Why is this useful?

A standard frequentist analysis has no formal place for prior knowledge. Bayes formalises cumulative learning: today's posterior is tomorrow's prior. Scientific progress becomes an accumulated state of knowledge.
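"Today's posterior is tomorrow's prior" can be sketched with two successive normal updates. Study 1 is the scenario from this page; study 2 is hypothetical, with an estimate and SE invented purely for illustration. The sketch assumes normal priors and likelihoods and independent studies.

```python
from math import sqrt, isclose

def normal_update(prior_mean, prior_sd, est, se):
    """Conjugate normal update: precisions add, means are precision-weighted."""
    prior_prec = 1 / prior_sd**2
    data_prec = 1 / se**2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + data_prec * est) / post_prec
    return post_mean, sqrt(1 / post_prec)

prior = (5.0, 0.6)       # informative prior from the page
study_1 = (4.80, 2.10)   # the CBT study from the scenario
study_2 = (5.50, 1.80)   # HYPOTHETICAL follow-up study, not from the page

after_1 = normal_update(*prior, *study_1)
after_2 = normal_update(*after_1, *study_2)  # posterior of study 1 used as prior

# Updating in the reverse order reaches the same final state of knowledge.
after_2_reversed = normal_update(*normal_update(*prior, *study_2), *study_1)

print(f"after study 1: mean {after_1[0]:.2f}, SD {after_1[1]:.3f}")
print(f"after study 2: mean {after_2[0]:.2f}, SD {after_2[1]:.3f}")
```

Each update shrinks the posterior SD, and the order of the studies does not matter for the final result.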

Note: The likelihood (orange) is identical for every prior. The y-axis scales to the tallest curve, so the likelihood curve looks smaller whenever the prior or posterior is narrower and therefore taller.
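The scaling effect can be quantified from the peak heights, since a normal density peaks at 1 / (σ√(2π)). This sketch assumes all three curves are drawn as normalised normal densities on a shared y-axis, which the page implies but does not state.

```python
from math import sqrt
from statistics import NormalDist

likelihood = NormalDist(4.80, 2.10)  # data alone
prior = NormalDist(5.0, 0.6)         # informative prior

# Conjugate normal update to get the posterior (assumption: normal model).
post_prec = 1 / prior.stdev**2 + 1 / likelihood.stdev**2
post_mean = (prior.mean / prior.stdev**2
             + likelihood.mean / likelihood.stdev**2) / post_prec
posterior = NormalDist(post_mean, sqrt(1 / post_prec))

# Peak height of each curve is its density evaluated at its own mean.
curves = {"likelihood": likelihood, "prior": prior, "posterior": posterior}
peaks = {name: dist.pdf(dist.mean) for name, dist in curves.items()}

# Height of each curve relative to the tallest one, i.e. the plot's full height.
tallest = max(peaks.values())
relative_height = {name: peak / tallest for name, peak in peaks.items()}

for name, share in relative_height.items():
    print(f"{name}: {share:.0%} of plot height")
```

With the informative prior, the unchanged likelihood fills only about a quarter of the plot height, which is why it looks flatter.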

Frequentist — 95% CI

[0.68, 8.92]

Width: 8.24 points

Bayes — non-informative prior

[0.68, 8.92]

Width: 8.24 points

Bayes — informative prior (10 studies)

[3.86, 6.12]

Width: 2.26 points ↓ 73% narrower

→ Frequentism has no formal mechanism for prior knowledge. With an informative prior, derived here from 10 prior studies, the posterior becomes substantially more precise: same data, narrower interval (2.26 instead of 8.24 points). That is cumulative learning in science.