Step 1: Build DAG
DAG Canvas
Golem Builder: Add variables in the sidebar, then click nodes to draw edges.
Normal
Exposure (X)
Outcome (Y)
Adjustment Set
Collider
Latent (U)
Causal Analysis
Effect of interest:
MUST control
MUST NOT control
CAN control
All Paths
        Every statement A ⊥⊥ B | Z must be empirically testable — it states that A and B are conditionally independent when controlling for Z (e.g. via partial correlation or chi-square test). If the data disagree, the DAG is misspecified.
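The app's code generators target R, but the testable-implication idea can be sketched in a few lines of Python: regress A and B on Z, then correlate the residuals (a partial correlation). The variable names, coefficients, and seed below are arbitrary choices for illustration, not part of the app.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000
Z = rng.normal(size=n)
A = 0.8 * Z + rng.normal(size=n)   # A depends only on Z
B = 0.8 * Z + rng.normal(size=n)   # B depends only on Z, so A ⊥⊥ B | Z holds

def partial_corr(a, b, z):
    """Correlation of the residuals of a and b after regressing each on z."""
    D = np.column_stack([np.ones_like(z), z])
    ra = a - D @ np.linalg.lstsq(D, a, rcond=None)[0]
    rb = b - D @ np.linalg.lstsq(D, b, rcond=None)[0]
    return np.corrcoef(ra, rb)[0, 1]

print(np.corrcoef(A, B)[0, 1])   # clearly nonzero: marginal association via Z
print(partial_corr(A, B, Z))     # near zero: the implication A ⊥⊥ B | Z survives
```

If the partial correlation were clearly nonzero in real data, the DAG that implies A ⊥⊥ B | Z would be misspecified.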
        At least 3 nodes required.
        Specify a regression coefficient (β) for each edge and a distribution for each root variable (without parents). Then click one of the code generators.
        Edge Weights (β)
From → To | β | ×W (Moderator) | β_int
        Root Node Distributions
Variable | Distribution | Mean / p | SD / –
        Simulation Settings
        💡 Try it yourself — make bias visible:
        Generate the simulated data, then change the model formula and observe what happens:
 · Omit the confounder (remove it from the brms/glmmTMB code) → the estimate of β(X) becomes biased (confounding bias)
 · Include the collider (add it to the formula) → opens a spurious path between X and Y (Berkson's bias)
 · Control for the mediator when estimating the total effect → underestimates β(X) because the indirect path is blocked
        The true β is in the simulation code — compare it with the estimate from the fitted model!
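The first exercise can be sketched outside the app as well. This is a minimal Python illustration (the app itself generates R code for brms/glmmTMB): simulate a fork Z → X, Z → Y together with X → Y at a known β, then fit once without and once with the confounder. All coefficients and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 10_000
beta_true = 0.5
Z = rng.normal(size=n)                             # confounder (fork)
X = 1.0 * Z + rng.normal(size=n)                   # Z -> X
Y = beta_true * X + 1.0 * Z + rng.normal(size=n)   # X -> Y and Z -> Y

def ols(y, *cols):
    """OLS coefficients; index 0 is the intercept."""
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_naive = ols(Y, X)[1]      # confounder omitted: biased away from 0.5
beta_adj   = ols(Y, X, Z)[1]   # confounder adjusted: recovers ~0.5
print(beta_naive, beta_adj)
```

Comparing both estimates to the known beta_true makes the confounding bias directly visible.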
        Concepts
        DAG
        What is a DAG?
        A Directed Acyclic Graph encodes causal assumptions explicitly. Arrows show the direction of causal influences – no path may loop back to itself. DAGs force us to make our assumptions explicit before building models (McElreath 2020, Ch. 5–6).
        Confounder
        Confounder (Fork)
        A common cause Z of X and Y creates a spurious correlation between X and Y. The structure Z → X and Z → Y is called a Fork. Without controlling for Z, the estimator for X → Y is biased. The backdoor criterion requires including Z in the regression.
        Collider
        Collider
        When X → C ← Y, C is a Collider. The path X – C – Y is normally blocked. When we condition on C (control or restrict it), we open this path and create an artificial association between X and Y – Berkson's Bias. Colliders must NOT be controlled.
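Berkson's bias is easy to demonstrate by simulation. In this hedged Python sketch (variable names and the seed are arbitrary), X and Y are generated independently, C is their common effect, and conditioning on C manufactures an association out of nothing:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
X = rng.normal(size=n)               # X and Y share no cause
Y = rng.normal(size=n)
C = X + Y + rng.normal(size=n)       # X -> C <- Y: C is a collider

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

print(ols(Y, X)[1])      # near zero: no association between X and Y
print(ols(Y, X, C)[1])   # clearly negative: conditioning on C opens the path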
        Mediator
        Mediator (Chain)
        In a chain X → M → Y, M is a Mediator. Controlling for M blocks the causal path from X to Y – we would underestimate the total causal effect. Mediators are usually not controlled unless one explicitly wants to estimate the direct effect.
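The chain can be simulated the same way. A minimal Python sketch (coefficients and seed are arbitrary): with X → M → Y, the total effect is the product of the two path coefficients, and controlling for M makes it vanish:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10_000
X = rng.normal(size=n)
M = 0.8 * X + rng.normal(size=n)   # X -> M
Y = 0.6 * M + rng.normal(size=n)   # M -> Y; total effect = 0.8 * 0.6 = 0.48

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_total  = ols(Y, X)[1]      # ~0.48: total causal effect
beta_direct = ols(Y, X, M)[1]   # ~0: controlling for M blocks the only path
print(beta_total, beta_direct)
```

Here the direct effect truly is zero, so controlling for M wipes out the estimate entirely; with a direct X → Y edge it would instead shrink toward the direct effect.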
        Backdoor
        Backdoor Criterion (Pearl)
        An adjustment set Z satisfies the backdoor criterion for X → Y if Z (1) blocks all backdoor paths from X to Y and (2) contains no descendants of X. A backdoor path begins with an arrow pointing into X. This criterion is the formal basis for valid causal estimation from observational data.
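Both conditions can be checked mechanically. The following is a compact Python sketch of a backdoor-criterion checker for a small hard-coded DAG (fork Z, direct edge X → Y, and a collider C); the path-blocking rules are the standard d-separation rules for chains, forks, and colliders. The DAG and function names are illustrative, not part of the app.

```python
edges = {("Z", "X"), ("Z", "Y"), ("X", "Y"), ("X", "C"), ("Y", "C")}

def descendants(v):
    out, stack = set(), [v]
    while stack:
        u = stack.pop()
        for a, b in edges:
            if a == u and b not in out:
                out.add(b); stack.append(b)
    return out

def undirected_paths(src, dst):
    def step(path):
        for a, b in edges:
            nxt = b if a == path[-1] else a if b == path[-1] else None
            if nxt is None or nxt in path:
                continue
            if nxt == dst:
                yield path + [nxt]
            else:
                yield from step(path + [nxt])
    yield from step([src])

def path_blocked(path, S):
    for i in range(1, len(path) - 1):
        v = path[i]
        is_collider = (path[i-1], v) in edges and (path[i+1], v) in edges
        if is_collider:
            if not ({v} | descendants(v)) & S:
                return True   # collider blocks unless it (or a descendant) is in S
        elif v in S:
            return True       # chain/fork is blocked by conditioning on v
    return False

def satisfies_backdoor(X, Y, S):
    if S & ({X} | descendants(X)):
        return False          # condition (2): S contains no descendants of X
    backdoor = [p for p in undirected_paths(X, Y) if (p[1], X) in edges]
    return all(path_blocked(p, S) for p in backdoor)

print(satisfies_backdoor("X", "Y", set()))       # False: X <- Z -> Y is open
print(satisfies_backdoor("X", "Y", {"Z"}))       # True: {Z} is a valid set
print(satisfies_backdoor("X", "Y", {"Z", "C"}))  # False: C is a descendant of X
```

In practice the R package dagitty performs exactly this kind of query (adjustmentSets), but the sketch shows that the criterion itself is only a few path rules.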
        Simulation
        Simulated Data as Gold Standard
When we write the data-generating process ourselves, we know the true causal effects. This lets us check whether our model recovers the correct values: bias from conditioning on colliders or from omitting confounders becomes visible. Simulation is the most important tool for understanding causal models.
        Instrumental Variables (IV)

        An instrumental variable Z allows causal identification even when an unmeasured confounder U exists. Z must satisfy three conditions: (1) Relevance: Z→X (Z influences the exposure), (2) Exclusion: Z has no direct effect on Y other than through X, (3) Independence: Z shares no common causes with Y (Z is independent of U).

        Frequentist: Two-Stage Least Squares (2SLS) via ivreg(Y ~ X | Z). Bayesian: joint model of both equations with correlated residuals — rescor = TRUE estimates ρ (confounding strength of U) directly as a posterior.
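ivreg is R (AER package), but 2SLS itself is just two regressions and can be sketched by hand. In this hedged Python illustration (coefficients and seed are arbitrary), U confounds X and Y, yet the instrument recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(11)
n = 20_000
beta_true = 0.5
U  = rng.normal(size=n)                        # unmeasured confounder
Zi = rng.normal(size=n)                        # instrument: relevant, excluded, independent of U
X  = 1.0 * Zi + 1.0 * U + rng.normal(size=n)
Y  = beta_true * X + 1.0 * U + rng.normal(size=n)

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_ols = ols(Y, X)[1]                                   # biased by U
X_hat = np.column_stack([np.ones(n), Zi]) @ ols(X, Zi)    # stage 1: X predicted from Zi
beta_2sls = ols(Y, X_hat)[1]                              # stage 2: ~0.5
print(beta_ols, beta_2sls)
```

Stage 1 keeps only the variation in X that comes through the instrument, which is uncontaminated by U; stage 2 then yields a consistent estimate (note that for correct standard errors one would use a dedicated IV routine rather than this naive two-step).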

        Angrist & Pischke (2009). Mostly Harmless Econometrics. Princeton UP. — McElreath (2020). Statistical Rethinking, Ch. 15.

        Front Door Criterion (Pearl)

        When U confounds X and Y, but a mediator M exists with (1) M on all paths X→Y, (2) no backdoor X→M, (3) all backdoors M→Y blocked by X — then the causal effect is identifiable without measuring U.

Front-door formula: P(Y | do(X)) = Σ_m P(M=m | X) · Σ_x' P(Y | M=m, X=x') · P(X=x'). In the linear case: β_FD = β_XM × β_MY, where β_MY is estimated controlling for X (X blocks the backdoor path M ← X ← U → Y).

        Bayesian: multivariate brms model with rescor = FALSE yields the joint posterior of β_XM and β_MY — their product is the total causal effect as a full posterior distribution.
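The linear front-door product can be verified by simulation. This is a Python sketch (the app's generators target brms; coefficients and seed here are arbitrary): U confounds X and Y, the naive regression is badly biased, yet β_XM × β_MY recovers the true effect without ever using U.

```python
import numpy as np

rng = np.random.default_rng(13)
n = 20_000
U = rng.normal(size=n)                     # unmeasured confounder of X and Y
X = 1.0 * U + rng.normal(size=n)
M = 0.7 * X + rng.normal(size=n)           # X -> M, and U has no effect on M
Y = 0.5 * M + 1.0 * U + rng.normal(size=n)
# true causal effect of X on Y: 0.7 * 0.5 = 0.35 (entirely via M)

def ols(y, *cols):
    D = np.column_stack([np.ones(len(y)), *cols])
    return np.linalg.lstsq(D, y, rcond=None)[0]

beta_naive = ols(Y, X)[1]       # confounded by U: far from 0.35
beta_XM = ols(M, X)[1]          # ~0.7: no backdoor X -> M
beta_MY = ols(Y, M, X)[1]       # ~0.5: coefficient of M, controlling for X
beta_FD = beta_XM * beta_MY     # ~0.35: front-door estimate
print(beta_naive, beta_FD)
```

Controlling for X in the M → Y regression is exactly the step the text describes: it blocks M ← X ← U → Y, so β_MY is unconfounded even though U is never measured.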

        Pearl (2009). Causality, Ch. 3.3. Cambridge UP. — Pearl, Glymour & Jewell (2016). Causal Inference in Statistics: A Primer. Wiley.

        Golem Builder — Help