Step 1: Build DAG
DAG Canvas
Golem Builder: Add variables in the sidebar, then click nodes to draw edges.
Normal
Exposure (X)
Outcome (Y)
Adjustment Set
Collider
Latent (U)
Causal Analysis
Effect of interest:
MUST control
MUST NOT control
CAN control
All Paths
        Every statement A ⊥⊥ B | Z must be empirically testable — it states that A and B are conditionally independent when controlling for Z (e.g. via partial correlation or chi-square test). If the data disagree, the DAG is misspecified.
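The app's code generators target R, but the testable-implication idea can be sketched in a few lines of Python: regress A and B on Z, then correlate the residuals (a partial correlation). The variable names, coefficients, and seed below are arbitrary choices for illustration, not part of the app.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
Z = rng.normal(size=n)
A = 0.8 * Z + rng.normal(size=n)   # A depends only on Z
B = 0.8 * Z + rng.normal(size=n)   # B depends only on Z, so A ⊥⊥ B | Z holds

def partial_corr(a, b, z):
    """Correlation of the residuals of a and b after regressing each on z."""
    D = np.column_stack([np.ones_like(z), z])
    ra = a - D @ np.linalg.lstsq(D, a, rcond=None)[0]
    rb = b - D @ np.linalg.lstsq(D, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

print(np.corrcoef(A, B)[0, 1])   # clearly nonzero: marginal association via Z
print(partial_corr(A, B, Z))     # near zero: the implication A ⊥⊥ B | Z survives
```

If the partial correlation were clearly nonzero in real data, the DAG that implies A ⊥⊥ B | Z would be misspecified.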
        At least 3 nodes required.
        Specify a regression coefficient (β) for each edge and a distribution for each root variable (without parents). Then click one of the code generators.
        Edge Weights (β)
From → To | β | ×W (Moderator) | β_int
        Root Node Distributions
Variable | Distribution | Mean / p | SD / –
        Simulation Settings
        💡 Try it yourself — make bias visible:
        Generate the simulated data, then change the model formula and observe what happens:
 · Omit the confounder (remove it from the brms/glmmTMB code) → the estimate of β(X) becomes biased (confounding bias)
 · Include the collider (add it to the formula) → opens a spurious path between X and Y (Berkson's bias)
 · Control for the mediator when estimating the total effect → underestimates β(X) because the indirect path is blocked
        The true β is in the simulation code — compare it with the estimate from the fitted model!
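The first exercise can be sketched outside the app as well. This is a minimal Python illustration (the app itself generates R code for brms/glmmTMB): simulate a fork Z → X, Z → Y together with X → Y at a known β, then fit once without and once with the confounder. All coefficients and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
beta_true = 0.5
Z = rng.normal(size=n)                             # confounder (fork)
X = 1.0 * Z + rng.normal(size=n)                   # Z -> X
Y = beta_true * X + 1.0 * Z + rng.normal(size=n)   # X -> Y and Z -> Y

def ols(y, *cols):
    """OLS coefficients; index 0 is the intercept."""
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_naive = ols(Y, X)[1]      # confounder omitted: biased away from 0.5
beta_adj   = ols(Y, X, Z)[1]   # confounder adjusted: recovers ~0.5
print(beta_naive, beta_adj)
```

Comparing both estimates to the known beta_true makes the confounding bias directly visible.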
        Concepts
        DAG
        What is a DAG?
        A Directed Acyclic Graph encodes causal assumptions explicitly. Arrows show the direction of causal influences – no path may loop back to itself. DAGs force us to make our assumptions explicit before building models (McElreath 2020, Ch. 5–6).
        Confounder
        Confounder (Fork)
        A common cause Z of X and Y creates a spurious correlation between X and Y. The structure Z → X and Z → Y is called a Fork. Without controlling for Z, the estimator for X → Y is biased. The backdoor criterion requires including Z in the regression.
        Collider
        Collider
        When X → C ← Y, C is a Collider. The path X – C – Y is normally blocked. When we condition on C (control or restrict it), we open this path and create an artificial association between X and Y – Berkson's Bias. Colliders must NOT be controlled.
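Berkson's bias is easy to demonstrate by simulation. In this hedged Python sketch (variable names and the seed are arbitrary), X and Y are generated independently, C is their common effect, and conditioning on C manufactures an association out of nothing:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
X = rng.normal(size=n)               # X and Y share no cause
Y = rng.normal(size=n)
C = X + Y + rng.normal(size=n)       # X -> C <- Y: C is a collider

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

print(ols(Y, X)[1])      # near zero: no association between X and Y
print(ols(Y, X, C)[1])   # clearly negative: conditioning on C opens the path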
        Mediator
        Mediator (Chain)
        In a chain X → M → Y, M is a Mediator. Controlling for M blocks the causal path from X to Y – we would underestimate the total causal effect. Mediators are usually not controlled unless one explicitly wants to estimate the direct effect.
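The chain can be simulated the same way. A minimal Python sketch (coefficients and seed are arbitrary): with X → M → Y, the total effect is the product of the two path coefficients, and controlling for M makes it vanish:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
X = rng.normal(size=n)
M = 0.8 * X + rng.normal(size=n)   # X -> M
Y = 0.6 * M + rng.normal(size=n)   # M -> Y; total effect = 0.8 * 0.6 = 0.48

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_total  = ols(Y, X)[1]      # ~0.48: total causal effect
beta_direct = ols(Y, X, M)[1]   # ~0: controlling for M blocks the only path
print(beta_total, beta_direct)
```

Here the direct effect truly is zero, so controlling for M wipes out the estimate entirely; with a direct X → Y edge it would instead shrink toward the direct effect.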
        Backdoor
        Backdoor Criterion (Pearl)
        An adjustment set Z satisfies the backdoor criterion for X → Y if Z (1) blocks all backdoor paths from X to Y and (2) contains no descendants of X. A backdoor path begins with an arrow pointing into X. This criterion is the formal basis for valid causal estimation from observational data.
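Both conditions can be checked mechanically. The following is a compact Python sketch of a backdoor-criterion checker for a small hard-coded DAG (fork Z, direct edge X → Y, and a collider C); the path-blocking rules are the standard d-separation rules for chains, forks, and colliders. The DAG and function names are illustrative, not part of the app.

```python
edges = {("Z", "X"), ("Z", "Y"), ("X", "Y"), ("X", "C"), ("Y", "C")}

def descendants(v):
    out, stack = set(), [v]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in out:
                out.add(b); stack.append(b)
    return out

def undirected_paths(src, dst):
    def step(path):
        for a, b in edges:
            nxt = b if a == path[-1] else a if b == path[-1] else None
            if nxt is None or nxt in path:
                continue
            if nxt == dst:
                yield path + [nxt]
            else:
                yield from step(path + [nxt])
    yield from step([src])

def path_blocked(path, S):
    for i in range(1, len(path) - 1):
        v = path[i]
        is_collider = (path[i-1], v) in edges and (path[i+1], v) in edges
        if is_collider:
            if not ({v} | descendants(v)) & S:
                return True   # collider blocks unless it (or a descendant) is in S
        elif v in S:
            return True       # chain/fork is blocked by conditioning on v
    return False

def satisfies_backdoor(X, Y, S):
    if S & ({X} | descendants(X)):
        return False          # condition (2): S contains no descendants of X
    backdoor = [p for p in undirected_paths(X, Y) if (p[1], X) in edges]
    return all(path_blocked(p, S) for p in backdoor)

print(satisfies_backdoor("X", "Y", set()))       # False: X <- Z -> Y is open
print(satisfies_backdoor("X", "Y", {"Z"}))       # True: {Z} is a valid set
print(satisfies_backdoor("X", "Y", {"Z", "C"}))  # False: C is a descendant of X
```

In practice the R package dagitty performs exactly this kind of query (adjustmentSets), but the sketch shows that the criterion itself is only a few path rules.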
        Simulation
        Simulated Data as Gold Standard
When we write the data-generating process ourselves, we know the true causal effects. This lets us check whether our model recovers the correct values: bias from conditioning on colliders or from omitting confounders becomes visible. Simulation is the most important tool for understanding causal models.
        Instrumental Variables (IV)

        An instrumental variable Z allows causal identification even when an unmeasured confounder U exists. Z must satisfy three conditions: (1) Relevance: Z→X (Z influences the exposure), (2) Exclusion: Z has no direct effect on Y other than through X, (3) Independence: Z shares no common causes with Y (Z is independent of U).

        Frequentist: Two-Stage Least Squares (2SLS) via ivreg(Y ~ X | Z). Bayesian: joint model of both equations with correlated residuals — rescor = TRUE estimates ρ (confounding strength of U) directly as a posterior.
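ivreg is R (AER package), but 2SLS itself is just two regressions and can be sketched by hand. In this hedged Python illustration (coefficients and seed are arbitrary), U confounds X and Y, yet the instrument recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
beta_true = 0.5
U  = rng.normal(size=n)                        # unmeasured confounder
Zi = rng.normal(size=n)                        # instrument: relevant, excluded, independent of U
X  = 1.0 * Zi + 1.0 * U + rng.normal(size=n)
Y  = beta_true * X + 1.0 * U + rng.normal(size=n)

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_ols = ols(Y, X)[1]                                   # biased by U
X_hat = np.column_stack([np.ones(n), Zi]) @ ols(X, Zi)    # stage 1: X predicted from Zi
beta_2sls = ols(Y, X_hat)[1]                              # stage 2: ~0.5
print(beta_ols, beta_2sls)
```

Stage 1 keeps only the variation in X that comes through the instrument, which is uncontaminated by U; stage 2 then yields a consistent estimate (note that for correct standard errors one would use a dedicated IV routine rather than this naive two-step).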

        Angrist & Pischke (2009). Mostly Harmless Econometrics. Princeton UP. — McElreath (2020). Statistical Rethinking, Ch. 15.

        Front Door Criterion (Pearl)

        When U confounds X and Y, but a mediator M exists with (1) M on all paths X→Y, (2) no backdoor X→M, (3) all backdoors M→Y blocked by X — then the causal effect is identifiable without measuring U.

Front-door formula: P(Y | do(X)) = Σ_m P(M=m | X) · Σ_x' P(Y | M=m, X=x') · P(X=x'). In the linear case: β_FD = β_XM × β_MY, where β_MY is estimated controlling for X (X blocks the backdoor path M ← X ← U → Y).

        Bayesian: multivariate brms model with rescor = FALSE yields the joint posterior of β_XM and β_MY — their product is the total causal effect as a full posterior distribution.
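The linear front-door product can be verified by simulation. This is a Python sketch (the app's generators target brms; coefficients and seed here are arbitrary): U confounds X and Y, the naive regression is badly biased, yet β_XM × β_MY recovers the true effect without ever using U.

```python
import numpy as np

rng = np.random.default_rng(13)
n = 20_000
U = rng.normal(size=n)                     # unmeasured confounder of X and Y
X = 1.0 * U + rng.normal(size=n)
M = 0.7 * X + rng.normal(size=n)           # X -> M, and U has no effect on M
Y = 0.5 * M + 1.0 * U + rng.normal(size=n)
# true causal effect of X on Y: 0.7 * 0.5 = 0.35 (entirely via M)

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_naive = ols(Y, X)[1]       # confounded by U: far from 0.35
beta_XM = ols(M, X)[1]          # ~0.7: no backdoor X -> M
beta_MY = ols(Y, M, X)[1]       # ~0.5: coefficient of M, controlling for X
beta_FD = beta_XM * beta_MY     # ~0.35: front-door estimate
print(beta_naive, beta_FD)
```

Controlling for X in the M → Y regression is exactly the step the text describes: it blocks M ← X ← U → Y, so β_MY is unconfounded even though U is never measured.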

        Pearl (2009). Causality, Ch. 3.3. Cambridge UP. — Pearl, Glymour & Jewell (2016). Causal Inference in Statistics: A Primer. Wiley.

        Golem Builder — Help