Three Scenarios
OLS regression assumes normally distributed errors (ε).
Three scenarios from body image research show what happens when you force an LM anyway,
and how a GLM with an appropriate distribution resolves the situation.
① Therapy outcome (CBT) → 0/1 → Bernoulli
② Body checking → count data → Poisson
③ Reaction time (Stroop) → positively skewed → Gamma
Problem
① LM → Normal distribution (wrong)
↓
② GLM → appropriate distribution (right)
↓
③ Fit comparison: AIC & residuals
[Interactive panels: "LM with Normal distribution" (flagged as problematic) and "GLM with appropriate distribution" (flagged as better), each showing log-likelihood and AIC; "Residual distribution & AIC comparison" showing ΔAIC (LM − GLM).]
RMSE is always comparable because it is in y-units; smaller = better.
Link Function
What the link function does, visually
Linear predictor η = a + b·x
Value range: −∞ to +∞
↓
E[y] = g⁻¹(η)
Value range: restricted
Explore further: These three scenarios are the conceptual entry point.
The next tool in Section I shows how GLMs correctly model the conditional distributions,
for each family and each x-value separately.
→ Conditional distributions in the GLM · → GLM in 3D · → GLMM interactive
Concepts
What you learn here
What happens when you choose the wrong likelihood?
The LM minimises the residual sum of squares (RSS), which is equivalent to maximum-likelihood estimation under
a normal distribution assumption. When this assumption is violated, the
estimates can still be computed, but:
• Predictions can fall outside the valid range (e.g. p < 0 or p > 1)
• The log-likelihood is suboptimal → AIC/BIC worse
• Residuals are systematically non-normal
• Inference (confidence intervals, tests) is biased
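The first consequence in the list can be made concrete with a small sketch. The ten data points are made up for illustration (they are not data from the scenarios), and `fit_ols` is a hypothetical helper that computes the closed-form least-squares line. Fitting a straight line to 0/1 outcomes gives fitted "probabilities" below 0 and above 1 at the extremes of x.

```python
# Illustrative only: made-up binary outcomes (0 = no improvement, 1 = improvement)
# against a z-scored predictor. Not data from the scenarios.
x = [-4, -3, -2, -1, -1, 1, 1, 2, 3, 4]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def fit_ols(x, y):
    """Closed-form simple linear regression: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


a, b = fit_ols(x, y)
predictions = [a + b * xi for xi in x]

# The LM treats y as unbounded, so fitted values can leave [0, 1].
out_of_range = [(xi, round(p, 3)) for xi, p in zip(x, predictions) if p < 0 or p > 1]
print("intercept:", round(a, 3), "slope:", round(b, 3))
print("fitted values outside [0, 1]:", out_of_range)
```

With these toy data the fitted values at the smallest and largest x leave the unit interval, which is the failure mode a Bernoulli GLM with a logit link avoids by construction.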
What is a link function, and why do you need one?
The problem: The linear predictor η = a + b·x can take any value
(−∞ to +∞). But many outcome variables have a restricted range:
probabilities lie in (0,1), count rates must be positive.
The solution: A link function transforms the expected value into a range where linear modelling makes sense:
Logit link (Bernoulli):
log(p/(1−p)) = η
The logit maps p ∈ (0,1) to (−∞,+∞). Inversely: every η gives a p = 1/(1+e^(−η)) ∈ (0,1). This is the S-curve in the plot.
Log link (Poisson, Gamma):
log(λ) = η
The logarithm maps λ > 0 to (−∞,+∞). Inversely: λ = e^η is always positive. This is the exponential curve in the plot.
Identity link (Normal):
E[y] = η
No transformation: the LM is a GLM special case with normal distribution and identity link.
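The three links and their inverses can be sketched in a few lines. The function names and the coefficients a and b are illustrative and not taken from any library or from the scenarios. `inv_logit` is written in a numerically stable form so that large |η| does not overflow.

```python
import math


def logit(p):
    """Logit link: maps p in (0, 1) to the whole real line."""
    return math.log(p / (1.0 - p))


def inv_logit(eta):
    """Inverse logit: maps eta to a probability, avoiding overflow for large |eta|."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def inv_log(eta):
    """Inverse log link: lambda = e^eta is always positive."""
    return math.exp(eta)


def inv_identity(eta):
    """Identity link: E[y] = eta, no transformation."""
    return eta


# Hypothetical coefficients for eta = a + b*x, chosen only for illustration.
a, b = 0.5, 1.2
for x in (-3.0, 0.0, 3.0):
    eta = a + b * x
    print(f"x={x:+.1f}  eta={eta:+.2f}  "
          f"p={inv_logit(eta):.3f}  lambda={inv_log(eta):.3f}  mean={inv_identity(eta):+.2f}")
```

Whatever value η takes, the inverse logit stays between 0 and 1 and the inverse log stays positive, whereas the identity passes η through unchanged.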
How to choose the right family, with examples
Look at the nature of the outcome variable:
Bernoulli (Scenario 1): Did someone achieve a clinically significant improvement in body image disorder after CBT? (0=no, 1=yes). Predictor: therapeutic alliance (WAI-S, z-transformed). Generally: binary outcomes, diagnoses, decisions.
Poisson (Scenario 2): How many times per day does someone inspect negatively rated body parts in the mirror? (Body checking: 0, 1, 2, …). Generally: count data, event frequencies. Note: with overdispersion → Negative Binomial.
Gamma (Scenario 3): How long (ms) does someone take in an emotional Stroop test with body-related words? Generally: reaction times, waiting times, costs (always positive, right-skewed).
Normal/LM: z-standardised questionnaire scores, IQ, continuous symmetric measures with no hard natural zero boundary.
Check residuals via a posterior predictive check, and use AIC/BIC to compare alternative families on the same data.
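The overdispersion note for the Poisson family can be checked roughly before fitting. A Poisson variable has variance equal to its mean, so a variance-to-mean ratio well above 1 hints at overdispersion. The counts below are invented for illustration, and the 1.5 cut-off is an arbitrary assumption rather than a formal test. This raw ratio is only a first look: once predictors are in the model, dispersion should be judged from the residuals of the fitted model.

```python
import statistics

# Invented daily body-checking counts, for illustration only.
counts = [0, 1, 1, 2, 2, 3, 3, 4, 12, 15]

mean_count = statistics.mean(counts)
var_count = statistics.variance(counts)  # sample variance (n - 1 denominator)

# Under a Poisson distribution the variance equals the mean, so the ratio is about 1.
dispersion_ratio = var_count / mean_count

print(f"mean = {mean_count:.2f}, variance = {var_count:.2f}, ratio = {dispersion_ratio:.2f}")
if dispersion_ratio > 1.5:  # arbitrary illustrative cut-off, not a formal test
    print("Variance clearly exceeds the mean: consider a Negative Binomial family.")
else:
    print("No strong sign of overdispersion in the raw counts.")
```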
ΔAIC as a decision guide
AIC penalises poor fit and number of parameters: AIC = −2ℓ̂ + 2k (ℓ̂ = maximised log-likelihood, k = number of estimated parameters)
Rule of thumb for ΔAIC = AIC(LM) − AIC(GLM):
• ΔAIC < 2: barely any difference
• ΔAIC 2–6: moderate advantage for GLM
• ΔAIC > 10: strong advantage, LM clearly worse
Here you see the ΔAIC live. With binary data and count data the advantage of the GLM is typically large; with continuous, symmetric data the LM may be sufficient.
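The AIC formula and the ΔAIC comparison can be worked through by hand in a minimal sketch. It rests on several assumptions: both models are intercept-only, so the maximum-likelihood estimates have closed forms; the counts are made up for illustration; and the residual variance is counted as a parameter of the Normal model (k = 2). As in the comparison described above, the Normal density and the Poisson probability mass are evaluated on the same y values.

```python
import math

# Made-up count data, for illustration only.
y = [0, 0, 0, 1, 1, 1, 1, 2, 2, 3, 4, 6]
n = len(y)
mean_y = sum(y) / n


def aic(log_lik, k):
    """AIC = -2 * maximised log-likelihood + 2 * number of estimated parameters."""
    return -2.0 * log_lik + 2.0 * k


# Normal model (intercept-only LM): ML estimates are the mean and the
# variance with denominator n. Two parameters: mu and sigma^2.
sigma2 = sum((yi - mean_y) ** 2 for yi in y) / n
ll_lm = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
aic_lm = aic(ll_lm, k=2)

# Poisson model (intercept-only GLM): the ML estimate of lambda is the mean.
# One parameter: lambda. lgamma(y + 1) equals log(y!).
ll_glm = sum(yi * math.log(mean_y) - mean_y - math.lgamma(yi + 1) for yi in y)
aic_glm = aic(ll_glm, k=1)

delta_aic = aic_lm - aic_glm
print(f"AIC (LM)  = {aic_lm:.2f}")
print(f"AIC (GLM) = {aic_glm:.2f}")
print(f"delta AIC (LM - GLM) = {delta_aic:.2f}")
```

For these toy counts ΔAIC comes out positive, that is, in favour of the Poisson model. With a predictor in the model the same comparison applies, but the GLM estimates then need an iterative fit.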