A Bayesian Additive Regression Tree Model for Learning Conditional Average Treatment Effects in Regression Discontinuity Designs

Hedibert F. Lopes; P. Richard Hahn; Rafael Alcantara

arxiv: 2503.00326 · v2 · pith:E3OZNILRnew · submitted 2025-03-01 · 📊 stat.ME · stat.ML

Pith reviewed 2026-05-23 01:44 UTC · model grok-4.3

keywords: regression discontinuity design, conditional average treatment effect, Bayesian additive regression trees, causal inference, quasi-experimental methods

The pith

A Bayesian additive regression tree variant estimates conditional average treatment effects near the cutoff in regression discontinuity designs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper introduces a Bayesian model for estimating conditional average treatment effects in regression discontinuity designs. It modifies Bayesian additive regression trees to include linear regressions at the leaves on the running variable, a treatment indicator, and their interaction. The approach adaptively partitions the covariate space to identify regions with differing slopes on the running variable. This provides Bayesian inference on treatment effects without requiring a pre-specified basis expansion, unlike recent frequentist methods, and addresses gaps in earlier Bayesian approaches that do not readily support conditional effects.

Core claim

The model is a variant of a Bayesian additive regression tree with linear leaf-level regressions on the running variable and a treatment dummy (and their interaction). The model adaptively partitions covariate space into regions where the slope on the running variable appreciably differs, providing interpretable Bayesian inference on conditional average treatment effects near the cutoff.

What carries the argument

Bayesian additive regression tree variant with linear leaf regressions on the running variable, treatment dummy, and interaction term, enabling adaptive partitioning of covariate space based on slope differences.
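
One way to write down the structure described above is sketched here. The notation is assumed for illustration and is not taken from the paper: $x$ is the running variable, $c$ the cutoff, $z$ the treatment dummy (taken here to be $\mathbf{1}\{x \ge c\}$, as in a sharp design), $w$ the covariates, $m$ the number of trees, and $\ell_j(w)$ the leaf of tree $j$ that contains $w$.

```latex
% Illustrative notation only; trees are assumed to split on the covariates w.
\[
\mathbb{E}[Y \mid x, w]
  = \sum_{j=1}^{m} \Big[ \beta^{(j)}_{0,\ell_j(w)}
      + \beta^{(j)}_{1,\ell_j(w)} (x - c)
      + \beta^{(j)}_{2,\ell_j(w)} \, z
      + \beta^{(j)}_{3,\ell_j(w)} (x - c)\, z \Big]
\]
% With the running variable centered at the cutoff, the jump at x = c is the
% sum of the treatment-dummy coefficients of the leaves that contain w.
\[
\tau(w)
  = \lim_{x \downarrow c} \mathbb{E}[Y \mid x, w]
  - \lim_{x \uparrow c} \mathbb{E}[Y \mid x, w]
  = \sum_{j=1}^{m} \beta^{(j)}_{2,\ell_j(w)}
\]
```

Under this reading, heterogeneity in $\tau(w)$ comes entirely from which leaves $w$ falls into, while the dependence on $x$ inside a leaf is restricted to two straight lines, one on each side of the cutoff.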

If this is right

Yields interpretable Bayesian posterior inference on how treatment effects vary across covariate regions near the cutoff.
Removes the requirement for users to specify a known basis expansion for the model.
Allows the tree structure to automatically detect where the running variable slope changes appreciably.
Supports uncertainty quantification through the Bayesian framework for the conditional effects.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The adaptive partitioning may offer advantages in high-dimensional covariate settings where fixed basis expansions become impractical.
This structure could be tested for robustness by comparing against parametric RDD models on datasets with known discontinuities.
Extensions might combine the leaf models with other discontinuity designs to handle multiple running variables.

Load-bearing premise

The BART-based partitioning and linear leaf models accurately capture the conditional treatment effects without introducing bias from the adaptive partitioning process or the linear form of the leaf regressions.

What would settle it

A simulation or real-data analysis where the true conditional treatment effect varies nonlinearly with covariates in a manner not representable by the linear leaf models, or where the adaptive partitions fail to isolate slope changes, would show systematic bias in the estimated effects near the cutoff.
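
A minimal sketch of such a check follows, under simplifying assumptions that are not the paper's: the covariate partition is taken as known (two regions, A and B), each region is fit by ordinary least squares rather than by a BART posterior, the cutoff is at 0, and the data are noiseless so the result is deterministic. The function name `leaf_jump` and the two data-generating curves are hypothetical.

```python
import numpy as np

def leaf_jump(x, y):
    """Fit y on [1, x, z, x*z] by OLS, with z = 1{x >= 0}; return the z coefficient.

    With the cutoff at 0, the z coefficient is the fitted jump at the cutoff.
    """
    z = (x >= 0).astype(float)
    design = np.column_stack([np.ones_like(x), x, z, x * z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]

# Running variable on an even grid around the cutoff (assumed at 0).
x = np.linspace(-1.0, 1.0, 2001)
z = (x >= 0).astype(float)

# Region A: conditional mean is linear in x on both sides, true jump 1.0.
y_a = 0.5 * x + 1.0 * z
# Region B: conditional mean is cubic in x, true jump 2.0.
y_b = x**3 + 2.0 * z

est_a = leaf_jump(x, y_a)
est_b = leaf_jump(x, y_b)

print(f"region A: true jump 1.0, linear-leaf estimate {est_a:.3f}")
print(f"region B: true jump 2.0, linear-leaf estimate {est_b:.3f}")
```

In this toy setup the linear region recovers its jump, while the cubic region is noticeably underestimated even though the partition is exactly right, which is the failure mode described above.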

Original abstract

This paper develops a performant Bayesian approach to conditional average treatment effect (CATE) estimation in regression discontinuity designs (RDD), an increasingly prevalent form of quasi-experiment that facilitates causal inference. Earlier Bayesian approaches do not easily accommodate CATE estimation while recent frequentist approaches to this problem assume a known basis expansion, a steep model specification requirement that our approach avoids. The new model is a variant of a Bayesian additive regression tree (BART) model with linear leaf-level regressions on the running variable and a treatment dummy (and their interaction). The model adaptively partitions covariate space into regions where the slope on the running variable appreciably differs, providing interpretable Bayesian inference on conditional average treatment effects near the cutoff.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a BART variant for CATE in RDD that partitions on covariates and fits linear leaves on the running variable, but the linearity restriction inside leaves is a real vulnerability for bias at the cutoff.

The core idea is a BART model that splits on covariates and then runs a linear regression on the running variable, treatment dummy, and interaction inside each leaf. This lets the model adaptively find regions where the slope on the running variable changes and produces Bayesian posterior draws for the CATE at the cutoff. It is new in the sense that it applies this structure to RDD without forcing the user to pick a basis expansion up front, which is the main limitation the authors flag in recent frequentist work. The adaptive partitioning is a reasonable way to get heterogeneity without a fully nonparametric treatment of the running variable itself. The Bayesian framing also gives a direct route to uncertainty quantification, which is useful in applied settings.

The main weakness is exactly the one the stress-test note flags: identification of the jump at the cutoff rests on the conditional expectation being approximately linear in the running variable on each side of the cutoff inside every leaf. If that fails, even perfect covariate partitions will produce biased intercept differences. The abstract does not show how the splitting criterion guards against this or whether the authors run diagnostics for residual curvature. Without those checks, the method could overstate precision in regions where the running-variable relationship bends.

The paper is aimed at applied researchers who already use BART and want CATE estimates in RDD contexts, especially in social science data where covariate heterogeneity matters. It is coherent on its own terms and engages the relevant literature, so it deserves a serious referee even if the linearity assumption needs more scrutiny in revision.

Referee Report

1 major / 1 minor

Summary. The paper develops a Bayesian additive regression tree (BART) variant for estimating conditional average treatment effects (CATE) in regression discontinuity designs (RDD). The model augments standard BART with linear leaf-level regressions on the running variable, a treatment indicator, and their interaction. It adaptively partitions the covariate space into regions where the slope on the running variable differs appreciably, yielding interpretable posterior inference on CATE near the cutoff without requiring a pre-specified basis expansion.

Significance. If the central claims hold after addressing the functional-form issue below, the method would supply a flexible Bayesian alternative to existing frequentist CATE estimators for RDD that avoids strong parametric assumptions on the basis while retaining the interpretability of local linear fits. The adaptive partitioning could be useful in applications where treatment-effect heterogeneity is driven by observed covariates.

major comments (1)

§3.1 (leaf model specification): the model places linear regressions on the running variable (plus treatment and interaction) inside each BART leaf defined by covariate partitions. Identification of the discontinuity at the cutoff therefore requires that the conditional expectation be approximately linear in the running variable on each side, within every leaf. If the true functions exhibit curvature in the running variable inside a leaf, the global linear fit will generally produce biased estimates of the intercept difference evaluated at the cutoff, even if the covariate partitioning is perfect. The adaptive splitting criterion does not enforce or correct for this functional-form restriction.
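
The size of this bias can be worked out in a stylized case. The setup is an assumption made for illustration, not something stated in the paper: the cutoff is at 0, the running variable is uniform on $[-h, h]$ within the leaf, the true conditional mean is quadratic on each side, and the leaf is fit by population least squares (priors ignored).

```latex
% Assumed setup: m_{\pm}(x) = \alpha_{\pm} + \beta_{\pm} x + \gamma_{\pm} x^{2},
% true jump \tau = \alpha_{+} - \alpha_{-}, x ~ Uniform[-h, h], cutoff at 0.
%
% Best linear fit to x^2 on [0, h]:
%   slope     = Cov(x, x^2) / Var(x) = (h^3 / 12) / (h^2 / 12) = h
%   intercept = E[x^2] - h E[x]      = h^2 / 3 - h^2 / 2       = -h^2 / 6
% The same intercept, -h^2 / 6, arises on [-h, 0] (with slope -h).
\[
\hat{\alpha}_{\pm} = \alpha_{\pm} - \frac{\gamma_{\pm} h^{2}}{6},
\qquad
\hat{\tau} - \tau
  = \big(\hat{\alpha}_{+} - \hat{\alpha}_{-}\big) - \big(\alpha_{+} - \alpha_{-}\big)
  = -\frac{(\gamma_{+} - \gamma_{-})\, h^{2}}{6}
\]
```

In this stylized case the bias vanishes when the curvature is the same on both sides of the cutoff and otherwise grows with the square of the window half-width $h$, regardless of how well the covariates are partitioned.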

minor comments (1)

The abstract would be strengthened by a one-sentence summary of the simulation design and main empirical findings.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive report and the opportunity to clarify our approach. We address the single major comment below.

Referee: §3.1 (leaf model specification): the model places linear regressions on the running variable (plus treatment and interaction) inside each BART leaf defined by covariate partitions. Identification of the discontinuity at the cutoff therefore requires that the conditional expectation be approximately linear in the running variable on each side, within every leaf. If the true functions exhibit curvature in the running variable inside a leaf, the global linear fit will generally produce biased estimates of the intercept difference evaluated at the cutoff, even if the covariate partitioning is perfect. The adaptive splitting criterion does not enforce or correct for this functional-form restriction.

Authors: We agree that the model imposes an approximate linearity assumption in the running variable within each leaf, and that curvature within a leaf can bias the estimated discontinuity even with perfect covariate partitioning. This is an inherent feature of any local-linear RDD estimator (parametric or nonparametric) that does not explicitly model higher-order terms in the running variable. Our adaptive splitting on covariates is intended to localize the linear approximation to regions where the slope on the running variable is relatively stable, analogous to how a small bandwidth in classical local-linear RDD improves the approximation. However, the referee is correct that the current splitting criterion does not explicitly detect or penalize residual nonlinearity in the running variable. In the revision we will (i) add an explicit discussion of this modeling assumption and its relation to standard local-linear RDD practice, (ii) include guidance on diagnostics (e.g., posterior predictive checks within leaves), and (iii) report additional simulation results that vary the degree of within-leaf curvature to quantify sensitivity.

Revision: partial
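
The bandwidth analogy and the proposed curvature-sensitivity simulation can be illustrated with a small sweep. This is a sketch under assumptions that are not the authors' design: a single leaf fit by ordinary least squares instead of the BART posterior, cutoff at 0, noiseless data on an even grid, and quadratic curvature on the treated side only. The names `leaf_jump` and `jump_bias` and all parameter values are hypothetical.

```python
import numpy as np

def leaf_jump(x, y):
    """OLS fit of y on [1, x, z, x*z] with z = 1{x >= 0}; returns the fitted jump."""
    z = (x >= 0).astype(float)
    design = np.column_stack([np.ones_like(x), x, z, x * z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[2]

def jump_bias(h, gamma_plus, gamma_minus=0.0, tau=1.0, n=4001):
    """Bias of the linear-leaf jump when the truth is quadratic on each side.

    h is the half-width of the window around the cutoff; gamma_plus and
    gamma_minus are the quadratic coefficients right and left of the cutoff.
    """
    x = np.linspace(-h, h, n)
    z = (x >= 0).astype(float)
    curvature = np.where(x >= 0, gamma_plus, gamma_minus) * x**2
    y = tau * z + curvature
    return leaf_jump(x, y) - tau

widths = [1.0, 0.5, 0.25]
gammas = [0.0, 1.5, 3.0]

# Sweep window width at fixed curvature, then curvature at fixed width.
biases_by_width = [jump_bias(h, gamma_plus=3.0) for h in widths]
biases_by_gamma = [jump_bias(1.0, gamma_plus=g) for g in gammas]

for h, b in zip(widths, biases_by_width):
    print(f"h = {h:<5} gamma_plus = 3.0  bias = {b:+.4f}")
for g, b in zip(gammas, biases_by_gamma):
    print(f"h = 1.0   gamma_plus = {g:<4} bias = {b:+.4f}")
```

In this setup the bias is roughly proportional to the one-sided curvature and shrinks by about a factor of four each time the window half-width is halved. Note the limit of the analogy: halving a bandwidth shrinks the window in the running variable, whereas covariate splits shrink the covariate region, so they reduce this bias only insofar as they isolate subgroups whose running-variable relationship is closer to linear.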

Circularity Check

0 steps flagged

No circularity: model extends established BART with new leaf structure

The paper introduces a BART variant with linear leaf regressions on the running variable, treatment indicator, and interaction. This is presented as a direct modeling choice to enable CATE estimation in RDD without requiring a pre-specified basis expansion. No derivation step reduces to a fitted parameter renamed as prediction, no self-citation is invoked as a uniqueness theorem, and the adaptive partitioning is justified by the standard BART splitting mechanism rather than by construction from the target CATE quantity. The approach is self-contained against external benchmarks (standard BART literature) and does not rely on any load-bearing self-referential step.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract does not specify any free parameters, axioms, or invented entities; insufficient information available.
