Too vague (e.g. Uniform): you make no assumptions, but the sampler must explore a huge space, so it is slow and prone to divergences.
Too tight: you impede learning from the data. The posterior ≈ the prior, regardless of the data.
Weakly informative (the golden middle ground): roughly excludes impossible values, but leaves ample room for learning from the data.
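The effect of prior width can be made concrete with a closed-form case. The sketch below assumes a Normal mean with known data SD (the conjugate Normal-Normal update), which is a simplification chosen for illustration and not something brms does internally. The function name and the numbers (20 observations, prior centered at 300) are hypothetical.

```python
def normal_posterior(prior_mean, prior_sd, data_mean, data_sd, n):
    """Posterior mean and SD of a Normal mean with known data SD (conjugate update)."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / data_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean + data_precision * data_mean)
    return post_mean, post_var ** 0.5


# Hypothetical data: 20 observations with mean 400 and SD 80.
# The prior is deliberately centered away from the data, at 300.
for label, prior_sd in [("too tight", 1.0),
                        ("weakly informative", 100.0),
                        ("very vague", 1e6)]:
    mean, sd = normal_posterior(300.0, prior_sd, 400.0, 80.0, 20)
    print(f"{label:>18}: posterior mean = {mean:.1f}, posterior SD = {sd:.1f}")
```

With the tight prior the posterior mean stays near 300 despite the data; with the other two it moves close to 400. The closed form hides the sampling cost of very vague priors, which only shows up when a sampler has to explore the space.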
Positive parameters (σ, τ, λ): Exponential, Half-Normal, Half-t, Gamma, Log-Normal
Proportions / probabilities: Beta
Correlation matrices: LKJ
Rule of thumb: use Student-t(3,0,σ) as a robust alternative to Normal; its heavier tails leave room for occasional large parameter values.
· Intercept α: Normal(0, 2.5)
· Slope β: Normal(0, 1)
· Dispersion σ: Exponential(1) or Half-Normal(0,1)
For raw scales you need to rescale:
σ_prior ≈ (2 to 3) × SD(y) / SD(x)
Example reaction times (M ≈ 400 ms, SD ≈ 80 ms):
α: Normal(400, 100), β per SD(x): Normal(0, 160), i.e. 2 × SD(y)
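The rescaling rule is simple arithmetic, sketched here as a small helper. The function name `slope_prior_sd` and the raw predictor SD of 4 in the last call are hypothetical; only the rule and the reaction-time numbers come from the text above.

```python
def slope_prior_sd(sd_y, sd_x=1.0, multiplier=2.0):
    """Prior SD for a slope on the raw scale: multiplier * SD(y) / SD(x).

    sd_x=1.0 corresponds to a standardized predictor ("per SD(x)").
    The rule of thumb uses a multiplier between 2 and 3.
    """
    if sd_y <= 0 or sd_x <= 0:
        raise ValueError("standard deviations must be positive")
    return multiplier * sd_y / sd_x


# Reaction-time example: SD(y) is about 80 ms, predictor standardized.
print(slope_prior_sd(80.0))                  # multiplier 2, per SD(x)
print(slope_prior_sd(80.0, multiplier=3.0))  # upper end of the rule of thumb
print(slope_prior_sd(80.0, sd_x=4.0))        # hypothetical raw predictor with SD 4
```

The first call reproduces the Normal(0, 160) slope prior from the example.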
σ as Normal(0,x) in brms: this is correct. brms automatically applies lb=0 (truncation), so no special handling and no half_normal() is needed.
ν for Student-t too small: ν=1 equals Cauchy (no E[x]), ν=2 has no finite variance. Recommended: ν ≥ 3, typically 3–7 for robustness.
Forgetting a Beta prior for probabilities: the zi, zoi, and coi parameters in brms live on [0,1], so they take a Beta prior. Beta(1,1) is Uniform; Beta(2,8) says that zero-inflation is mostly small.
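What a Beta prior implies can be checked directly. This is a minimal sketch using only the standard library; the helper names are hypothetical, and the CDF uses the binomial-sum identity, which holds only for positive integer shape parameters.

```python
from math import comb


def beta_mean(a, b):
    """Mean of Beta(a, b)."""
    return a / (a + b)


def beta_cdf_int(x, a, b):
    """P(X <= x) for X ~ Beta(a, b), valid for positive integer a and b only.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(a, n + 1))


print(beta_mean(2, 8))          # prior mean of the zero-inflation probability
print(beta_cdf_int(0.5, 2, 8))  # prior probability that it is below 0.5
print(beta_cdf_int(0.5, 1, 1))  # same question under the Uniform prior
```

Beta(2,8) has mean 0.2 and puts 502/512 ≈ 0.98 of its mass below 0.5, whereas Beta(1,1) puts exactly half there.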
| Distribution | brms syntax | E[σ] | Heavy tails | When? |
|---|---|---|---|---|
| Exponential(1) | exponential(1) | 1.0 | ✗ | Weakly informative; simple one-parameter choice |
| Half-Normal(1) | normal(0,1) | 0.80 | ✗ | When σ < 2 is expected |
| Half-Student-t(3,0,1) | student_t(3,0,1) | ~1.10 | ✓ | Compromise; good default choice (brms uses this family for σ by default) |
| Half-Cauchy(1) | cauchy(0,1) | ∞ | ✓✓ | Very vague; large σ values possible |
| Gamma(2, 0.1) | gamma(2,0.1) | 20 | ✗ | brms default for ν in Student-t |
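The prior means in the E[σ] column follow from standard closed forms, sketched below with the standard library only. The function names are hypothetical, and Exponential and Gamma are parameterized by rate, as in the brms/Stan syntax shown in the table.

```python
from math import gamma, inf, pi, sqrt


def exponential_mean(rate):
    """Mean of Exponential(rate)."""
    return 1.0 / rate


def half_normal_mean(scale):
    """Mean of a Normal(0, scale) truncated at zero."""
    return scale * sqrt(2.0 / pi)


def half_student_t_mean(nu, scale):
    """Mean of a Student-t(nu, 0, scale) truncated at zero; infinite for nu <= 1."""
    if nu <= 1:
        return inf
    return (scale * 2.0 * sqrt(nu) * gamma((nu + 1) / 2)
            / ((nu - 1) * sqrt(pi) * gamma(nu / 2)))


def gamma_mean(shape, rate):
    """Mean of Gamma(shape, rate)."""
    return shape / rate


print(exponential_mean(1.0))
print(half_normal_mean(1.0))
print(half_student_t_mean(3, 1.0))
print(half_student_t_mean(1, 1.0))  # half-Cauchy: no finite mean
print(gamma_mean(2.0, 0.1))
```

For ν = 3 and scale 1 the half-Student-t formula reduces to 2√3/π ≈ 1.10, and the half-Normal mean is √(2/π) ≈ 0.80.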