Valid Inference when Testing Violations of Parallel Trends for Difference-in-Differences

Christopher Harshaw; Jonas M. Mikhaeil

arxiv: 2510.26470 · v3 · submitted 2025-10-30 · 📊 stat.ME · econ.EM

Pith reviewed 2026-05-18 03:17 UTC · model grok-4.3

keywords: difference-in-differences, parallel trends, preliminary tests, causal inference, confidence intervals, valid inference, conditional extrapolation

The pith

Researchers can obtain valid confidence intervals for causal effects in difference-in-differences after passing a test for parallel trends.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper tackles the known problems with preliminary tests for the parallel trends assumption in difference-in-differences designs, where standard approaches often produce biased estimates and undercovering intervals. The authors introduce simple preliminary tests paired with new confidence intervals that maintain valid coverage when the test is passed. These tools rest on a conditional extrapolation assumption that connects observable pre-treatment trend violations to the unobservable post-treatment violation. If this assumption and mild separation conditions hold, the test becomes consistent against relevant alternatives and the intervals achieve correct conditional coverage. The methods are demonstrated on synthetic data plus applications to public service recentralization in Vietnam and right-to-carry laws in Virginia.

Core claim

Under mild separation conditions and the conditional extrapolation assumption, the proposed preliminary test for parallel trends is consistent and the associated confidence intervals for the causal effect attain valid coverage conditional on the test being passed.

What carries the argument

The conditional extrapolation assumption, which formally links the unidentified post-treatment violation of parallel trends to the identified pre-treatment violations and thereby justifies post-test inference.
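
Further down, this page quotes the paper's Assumption 3 as "If S_pre <= M, then S_post <= S_pre." Typeset, it might read as below. The page does not define the symbols, so the reading here is an assumption: S_pre and S_post are taken to measure the size of the pre- and post-treatment violations of parallel trends, and M is taken to be a threshold chosen by the researcher.

```latex
\[
  S_{\mathrm{pre}} \le M
  \quad\Longrightarrow\quad
  S_{\mathrm{post}} \le S_{\mathrm{pre}} .
\]
```

Read this way, the assumption restricts the post-treatment violation only when the pre-treatment violation is small, and says nothing about the post-treatment period otherwise.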

If this is right

The preliminary test gains power against alternatives with sizable pre-treatment trend differences.
Standard undercoverage and bias problems from existing pretest procedures are avoided when the test passes.
Applied researchers can use the intervals on datasets such as Vietnam public services and Virginia right-to-carry laws while preserving conditional validity.
The approach formalizes the implicit reasoning researchers already use when checking pre-trends before reporting difference-in-differences estimates.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar pretest procedures could be adapted to other causal identification strategies that rely on trend or slope assumptions.
Empirical papers might report both the test statistic and the adjusted intervals as standard practice to improve transparency.
Further work could examine how sensitive the coverage is to small departures from the conditional extrapolation assumption.

Load-bearing premise

The post-treatment violation of parallel trends bears a specific relationship to the pre-treatment violations that can be extrapolated from the observed data.

What would settle it

A simulation study or real-data example in which post-treatment trend violations depart substantially from the pattern implied by pre-treatment violations, causing the proposed confidence intervals to exhibit incorrect coverage rates after the test is passed.
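
A check of this kind could be sketched as a small Monte Carlo experiment. The sketch below is not the paper's procedure: the normal model, the threshold pre-test, the bias-widened interval, and every name and parameter value (`conditional_coverage`, `delta_pre`, `delta_post`, `threshold`) are hypothetical choices made for illustration. It estimates coverage among draws that pass the pre-test, once with a post-treatment violation no larger than the pre-treatment one and once with a much larger one.

```python
import random
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # two-sided 95% normal critical value


def conditional_coverage(delta_pre, delta_post, tau=1.0, se=0.1, se_pre=0.1,
                         threshold=0.3, reps=20000, seed=0):
    """Coverage of a bias-widened interval among draws that pass the pre-test.

    Hypothetical setup (an assumption, not taken from the paper):
    pre_hat ~ N(delta_pre, se_pre^2) estimates the pre-treatment violation,
    tau_hat ~ N(tau + delta_post, se^2) is the DID estimate, which is biased
    by the post-treatment violation delta_post.
    """
    rng = random.Random(seed)
    passed = covered = 0
    for _ in range(reps):
        pre_hat = rng.gauss(delta_pre, se_pre)
        if abs(pre_hat) > threshold:
            continue  # pre-test failed, so no interval is reported
        passed += 1
        tau_hat = rng.gauss(tau + delta_post, se)
        # Upper confidence bound on the pre-violation, reused as a bias bound.
        bias_bound = abs(pre_hat) + Z * se_pre
        half_width = bias_bound + Z * se
        covered += abs(tau_hat - tau) <= half_width
    return covered / passed


cov_holds = conditional_coverage(delta_pre=0.1, delta_post=0.1)
cov_fails = conditional_coverage(delta_pre=0.1, delta_post=1.0)
print(f"post <= pre: {cov_holds:.3f}; post >> pre: {cov_fails:.3f}")
```

In this stylized setup the interval is conservative when the post-treatment violation is bounded by the pre-treatment one, and coverage collapses when it is ten times larger. The paper's own intervals may behave differently.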

Original abstract

The difference-in-differences (DID) research design is a key identification strategy which allows researchers to estimate causal effects under the parallel trends assumption. While the parallel trends assumption is counterfactual and cannot be tested directly, researchers often examine pre-treatment periods to check whether the time trends are parallel before treatment is administered. A recent literature has shown that existing preliminary tests have adverse effects on conventional statistical methods for estimation and inference, including low power, bias, and undercoverage. In this paper, we describe simple preliminary tests and corresponding confidence intervals for the causal effect which overcome these issues. Under mild separation conditions, the preliminary test is shown to be consistent and the confidence intervals for the causal effect have valid coverage conditional on passing the test. Our results hold under what we refer to as the conditional extrapolation assumption, which posits a relationship between the unidentified post-treatment violation of parallel trends and the identified pre-treatment violations. We view the conditional extrapolation assumption as one formalization of the assumption which is implicitly held when conducting a preliminary test for parallel trends. To illustrate the performance of the proposed methods, we use synthetic data as well as data on recentralization of public services in Vietnam and right-to-carry laws in Virginia.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

This paper gives conditional confidence intervals for DID effects after pre-testing parallel trends, but only under a new assumption linking pre- and post-treatment violations.

This paper gives researchers a way to run preliminary tests for parallel trends in difference-in-differences and still get confidence intervals with valid coverage conditional on passing those tests. The authors achieve this by introducing what they call the conditional extrapolation assumption, which links the pre-treatment violations you can see to the post-treatment ones you cannot. The work addresses a real issue that has come up in the DID literature lately, where pre-testing can distort standard inference through low power and undercoverage. The authors show their test is consistent under mild separation conditions and derive the conditional coverage for the causal effect intervals. They illustrate with synthetic data and two empirical cases from Vietnam and Virginia, which helps ground the theory.

Where it is softer is on that extrapolation assumption. The coverage holds only if the specific relationship between pre and post violations is true, and without it the pre-test information does not carry over to the post-treatment bias. The paper presents the assumption as a formal version of what people already assume when they pre-test, but there is no apparent robustness analysis for cases where it is only approximately true. That is a limitation that stands out.

This is the sort of paper that applied researchers using DID with pre-trend checks would find useful, especially those looking for more defensible inference procedures. Methodologists in causal inference might also engage with the formal setup. It has enough substance and engagement with prior work to deserve a serious referee. I would send it for peer review. The problem it targets is important enough that the proposal merits expert feedback, even with the assumption as a point to probe.

Referee Report

2 major / 2 minor

Summary. The paper proposes simple preliminary tests for violations of the parallel trends assumption in difference-in-differences designs together with corresponding confidence intervals for the causal effect. These procedures are designed to deliver a consistent test and valid coverage for the causal-effect interval conditional on passing the test. The results are derived under mild separation conditions and a new conditional extrapolation assumption that links the unidentified post-treatment parallel-trends violation to the identified pre-treatment violations. The authors view the latter assumption as a formalization of the implicit belief underlying conventional pre-testing practice. The methods are illustrated on synthetic data and on two empirical applications (recentralization of public services in Vietnam and right-to-carry laws in Virginia).

Significance. If the central claims hold, the contribution would be substantial for applied work that routinely employs pre-tests for parallel trends. By supplying procedures whose coverage is valid conditional on passing the test, the paper directly addresses documented problems of bias and undercoverage that arise from conventional pre-testing. The explicit statement of the conditional extrapolation assumption and the provision of both theoretical guarantees and empirical illustrations are positive features.

major comments (2)

[Main theoretical results (conditional extrapolation assumption and coverage theorem)] The coverage guarantee for the causal-effect confidence interval (stated in the abstract and derived in the main theoretical section) is obtained only under the conditional extrapolation assumption. No sensitivity analysis, local robustness result, or bound on coverage distortion under approximate violations of this assumption is provided; without such analysis the practical usefulness of the conditional coverage statement remains unclear.
[Consistency result for the preliminary test] The mild separation conditions invoked for consistency of the preliminary test are not illustrated with a concrete numerical example in which separation holds yet the extrapolation assumption fails, leaving the interaction between the two sets of conditions opaque.

minor comments (2)

[Abstract and Introduction] The abstract and introduction would benefit from a short statement of the precise functional form imposed by the conditional extrapolation assumption (e.g., linear, constant, or other) rather than a purely verbal description.
[Empirical illustrations] In the empirical applications, the tables reporting pre-treatment test statistics and post-treatment confidence intervals should include the exact sample sizes and the value of the separation parameter used in the simulations for comparability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We address each major comment below and outline the revisions we plan to make to strengthen the manuscript.

Referee: [Main theoretical results (conditional extrapolation assumption and coverage theorem)] The coverage guarantee for the causal-effect confidence interval (stated in the abstract and derived in the main theoretical section) is obtained only under the conditional extrapolation assumption. No sensitivity analysis, local robustness result, or bound on coverage distortion under approximate violations of this assumption is provided; without such analysis the practical usefulness of the conditional coverage statement remains unclear.

Authors: We appreciate the referee's point regarding the scope of the coverage guarantee. The conditional extrapolation assumption is explicitly stated as the key condition under which we obtain valid coverage for the causal effect conditional on passing the preliminary test; we present it as a transparent formalization of the implicit belief that motivates pre-testing in practice. The paper's primary contribution is to deliver consistent testing and conditionally valid inference under this assumption, thereby addressing the documented problems with conventional pre-testing. While we do not provide a full sensitivity analysis in the current version, we agree that bounds on coverage distortion under approximate violations would increase practical usefulness. In the revised manuscript we will add a dedicated subsection with a local robustness result and numerical illustrations of coverage under mild departures from the assumption.
revision: yes

Referee: [Consistency result for the preliminary test] The mild separation conditions invoked for consistency of the preliminary test are not illustrated with a concrete numerical example in which separation holds yet the extrapolation assumption fails, leaving the interaction between the two sets of conditions opaque.

Authors: We thank the referee for this observation. The separation conditions ensure consistency of the preliminary test (i.e., the test rejects with probability approaching one whenever a sufficiently separated violation of parallel trends is present in the pre-treatment periods), while the conditional extrapolation assumption is used separately to guarantee coverage of the post-test confidence interval. These are logically distinct: consistency of the test does not require the extrapolation assumption, but valid conditional coverage does. To clarify the interaction, we will add a concrete numerical example (in the simulation section or an appendix) in which the separation condition holds yet the extrapolation assumption is violated, showing that the test remains consistent while coverage of the causal-effect interval fails.
revision: yes
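
The consistency half of this distinction can be illustrated with a toy simulation. The sketch below is hypothetical throughout: the threshold rule, the normal model for the pre-trend estimate, and the names `rejection_rate`, `delta_pre`, and `threshold` are assumptions made for illustration and are not the paper's test statistic. It uses only pre-treatment quantities, so no extrapolation assumption enters, and it shows the rejection rate rising toward one as the sample size grows when the true pre-treatment violation sits a fixed gap above the threshold.

```python
import random


def rejection_rate(n, delta_pre=0.5, threshold=0.3, sigma=1.0,
                   reps=5000, seed=0):
    """Share of simulated samples in which a threshold test flags a violation.

    Hypothetical model: the pre-trend estimate is N(delta_pre, sigma^2 / n),
    and the test flags a violation when its absolute value exceeds threshold.
    """
    rng = random.Random(seed)
    sd = sigma / n ** 0.5  # standard error shrinks at the usual root-n rate
    flagged = sum(abs(rng.gauss(delta_pre, sd)) > threshold
                  for _ in range(reps))
    return flagged / reps


# The true violation (0.5) is separated from the threshold (0.3) by 0.2.
rates = {n: rejection_rate(n) for n in (25, 100, 400)}
for n, rate in rates.items():
    print(n, round(rate, 3))
```

Because the gap between the violation and the threshold is fixed while the standard error shrinks, the flagged share increases with `n`. Whether an interval reported after passing covers the effect is a separate question that this sketch does not address.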

Circularity Check

0 steps flagged

No significant circularity; derivation relies on explicit new assumption rather than self-referential reduction

The paper's central results (consistency of the preliminary test and valid conditional coverage of the causal-effect confidence intervals) are derived under an explicitly introduced conditional extrapolation assumption that relates unidentified post-treatment parallel-trends violations to identified pre-treatment violations, together with mild separation conditions. This assumption is presented as a formalization of implicit practitioner beliefs rather than derived from prior results or fitted quantities within the paper. No steps reduce by construction to inputs (e.g., no fitted parameter renamed as a prediction, no load-bearing self-citation behind the uniqueness or validity claim, and no ansatz smuggled in via self-reference). The approach therefore extends standard DID inference theory in a self-contained manner conditional on the stated assumption.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on one key domain assumption that enables conditional validity; no free parameters or new entities are described.

axioms (1)

Domain assumption: the conditional extrapolation assumption, relating the unidentified post-treatment violation of parallel trends to the identified pre-treatment violations.
This assumption is invoked to justify valid coverage of the confidence intervals conditional on passing the preliminary test.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: Assumption 3 (Conditional Extrapolation). If S_pre <= M, then S_post <= S_pre.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: Theorem 5.1 ... conditionally valid on the results of the preliminary test under the well-separated null

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.