On the Impossibility of Specification Testing of Interference Models Based on Exposure Mappings

Chao Gao; Christopher Harshaw; Fredrik Sävje; Yitan Wang

arxiv: 2605.09726 · v2 · pith:SJZMNSRNnew · submitted 2026-05-10 · 🧮 math.ST · stat.ME · stat.TH


Pith reviewed 2026-05-12 02:56 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH

keywords specification testing · exposure mappings · interference models · causal inference · Type I error · Type II error · impossibility result · randomized experiments


The pith

Any specification test for an exposure mapping model has worst-case Type I and Type II error rates that sum to one when it must have power against larger models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that specification tests for interference models based on exposure mappings cannot simultaneously control Type I error and achieve meaningful power against incorrect specifications. When a test is required to detect deviations from a given exposure mapping toward any strictly larger mapping, its worst-case Type I error rate plus worst-case Type II error rate equals one for every sample size. This bound holds even when outcomes are uniformly bounded and the alternatives are maximally separated from the null under randomized experiments. The result implies that, in the worst case, such tests perform no better than a procedure that discards the data and rejects the null hypothesis at random with a fixed probability, such as 1/2. The authors complement the impossibility result by constructing a uniformly consistent test that distinguishes the no-interference model from a network linear-in-means model.
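The benchmark procedure mentioned above can be illustrated with a minimal simulation. The sketch below assumes two made-up outcome distributions on [0, 1] as stand-ins for a null and an alternative (they are not taken from the paper) and a hypothetical `coin_flip_test` that ignores its input. Because the test never looks at the data, its estimated Type I and Type II error rates should sum to roughly one regardless of how far apart the two distributions are.

```python
import random

def coin_flip_test(data, rng, reject_prob=0.5):
    """Data-independent test: ignores `data` and rejects with probability reject_prob."""
    return rng.random() < reject_prob

def rejection_rate(draw_data, n_reps, rng, reject_prob=0.5):
    """Monte Carlo estimate of the probability that the test rejects."""
    rejections = sum(
        coin_flip_test(draw_data(rng), rng, reject_prob) for _ in range(n_reps)
    )
    return rejections / n_reps

# Hypothetical outcome distributions on [0, 1]; illustrative assumptions only.
def draw_null(rng):
    return [rng.uniform(0.0, 0.5) for _ in range(20)]

def draw_alternative(rng):
    return [rng.uniform(0.5, 1.0) for _ in range(20)]

rng = random.Random(0)
type_1 = rejection_rate(draw_null, 20_000, rng)               # rejecting a true null
type_2 = 1.0 - rejection_rate(draw_alternative, 20_000, rng)  # failing to reject
print(f"Type I ~ {type_1:.3f}, Type II ~ {type_2:.3f}, sum ~ {type_1 + type_2:.3f}")
```

Changing `reject_prob` shifts error between the two types but leaves the sum near one, which is why any fixed rejection probability serves as the benchmark.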

Core claim

For any specification test of a given exposure mapping model that is required to have power against a strictly larger exposure mapping model, the supremum Type I error plus the supremum Type II error equals one. The result applies to all finite sample sizes, to outcomes taking values in [0,1], and to alternatives that differ from the null in the most extreme way permitted by the exposure mapping framework.
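In symbols, and using notation that is assumed here rather than taken from the paper: write $\phi$ for a test that maps the observed data to a rejection probability in $[0,1]$, $\mathcal{M}_0$ for the set of data distributions consistent with the null exposure mapping model, and $\mathcal{M}_1$ for the set of alternatives under the strictly larger model. The claim as summarized above then reads:

```latex
\[
  \underbrace{\sup_{P \in \mathcal{M}_0} \mathbb{E}_P[\phi]}_{\text{worst-case Type I error}}
  \;+\;
  \underbrace{\sup_{P \in \mathcal{M}_1} \mathbb{E}_P[1 - \phi]}_{\text{worst-case Type II error}}
  \;=\; 1 .
\]
```

For a data-independent test $\phi \equiv p$ the two terms are $p$ and $1 - p$, so such a test already attains this value without using any data.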

What carries the argument

An exposure mapping, which assigns each unit one of several discrete exposure levels determined by the full vector of treatment assignments across all units, together with the partial order on exposure mappings in which one mapping is larger than another when it induces a finer partition of units into exposure groups.
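A small sketch may make these two objects concrete. It assumes a hypothetical four-unit path graph and two illustrative mappings (`own_treatment` and `own_and_any_neighbor`), none of which come from the paper. The helper `refines` encodes one possible reading of the partial order: for every unit, two assignment vectors that share an exposure under the finer mapping must also share one under the coarser mapping.

```python
from itertools import product

# Hypothetical 4-unit path graph 0 - 1 - 2 - 3 (illustrative, not from the paper).
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def own_treatment(i, z):
    """Coarse mapping (no interference): exposure is the unit's own treatment."""
    return (z[i],)

def own_and_any_neighbor(i, z):
    """Finer mapping: own treatment plus whether any neighbor is treated."""
    return (z[i], int(any(z[j] for j in NEIGHBORS[i])))

def refines(fine, coarse, n_units):
    """True if, for every unit, equal `fine` exposures imply equal `coarse` exposures."""
    assignments = list(product([0, 1], repeat=n_units))
    for i in range(n_units):
        coarse_of = {}  # fine exposure level -> coarse exposure level seen so far
        for z in assignments:
            seen = coarse_of.setdefault(fine(i, z), coarse(i, z))
            if seen != coarse(i, z):
                return False
    return True

n = len(NEIGHBORS)
print(refines(own_and_any_neighbor, own_treatment, n))  # True
print(refines(own_treatment, own_and_any_neighbor, n))  # False
```

Under this reading, the neighbor-aware mapping is strictly larger than the no-interference mapping: it refines it, but not the other way around.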

If this is right

Specification tests that do not restrict the alternative class cannot reliably detect misspecified interference models.
Useful tests require the analyst to commit in advance to a narrow pair of models rather than testing against all larger exposure mappings.
A uniformly consistent test exists for the specific case of distinguishing no interference from a network linear-in-means model.
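The pair of models in the last item can be illustrated with a noise-free toy example. The functional form below (baseline plus direct effect plus a coefficient times the share of treated neighbors) is a common textbook version of a linear-in-means model and is an assumption here; the paper's exact specification and its test are not reproduced. The graph, the coefficients, and the function name are hypothetical.

```python
# Hypothetical 4-unit path graph 0 - 1 - 2 - 3 (illustrative only).
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}

def linear_in_means_outcome(i, z, alpha=0.2, beta=0.3, gamma=0.4):
    """Noise-free outcome: baseline + direct effect + spillover from neighbors.

    With alpha, beta, gamma >= 0 and alpha + beta + gamma <= 1 the outcome
    stays in [0, 1]. Setting gamma = 0 recovers the no-interference model.
    """
    neighbor_mean = sum(z[j] for j in NEIGHBORS[i]) / len(NEIGHBORS[i])
    return alpha + beta * z[i] + gamma * neighbor_mean

z_a = (1, 0, 0, 0)
z_b = (1, 1, 0, 0)  # unit 0 keeps its treatment, but its neighbor is now treated

no_interference = [linear_in_means_outcome(0, z, gamma=0.0) for z in (z_a, z_b)]
with_spillover = [linear_in_means_outcome(0, z, gamma=0.4) for z in (z_a, z_b)]
print(no_interference)  # identical outcomes for unit 0 under both assignments
print(with_spillover)   # outcome for unit 0 rises when its neighbor is treated
```

In this parametrization the alternative differs from the null only through the single coefficient `gamma`, which is the kind of narrow, pre-specified restriction the list above refers to.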

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The impossibility suggests that data-driven selection among interference models will generally require domain knowledge to limit the candidate models under consideration.
Similar limits may apply when testing other structured causal models that are defined by partitions or groupings induced by the treatment assignment.
Researchers could explore the weakest additional assumptions on the alternative class that restore the possibility of informative tests with controlled errors.

Load-bearing premise

The test must be required to have power against every possible larger exposure mapping model, with no further restrictions placed on how those alternatives differ from the null model.

What would settle it

A concrete test statistic and rejection threshold for which there exists at least one exposure mapping null model and one strictly larger alternative model such that the worst-case Type I error plus the worst-case Type II error is strictly less than one.

Original abstract

Researchers use interference models based on exposure mappings to facilitate estimation of causal effects in randomized experiments with interference. To test the veracity of such models, researchers can use specification tests that aim to detect departures from the stipulated model. However, existing tests suffer from poor power and are often unable to detect important model violations. The main result in this paper is to show that the specification testing problem for exposure mapping models is inherently difficult, and the poor power of existing tests is inescapable. In particular, the worst-case Type I and Type II error rates must sum to one for any specification test of such models, ruling out the existence of a uniformly consistent test. This is the worst-case overall error rate achieved by a naive test that discards all data and arbitrarily rejects the null at random; the testing problem is in this sense impossible. This negative result holds true for all exposure mappings, all sample sizes, for uniformly bounded outcomes, and for alternatives that are maximally separated from the null. While some tests can detect some type of departures from the null model, there will always be relevant departures from the null that are undetectable. Informative specification tests must therefore restrict the alternative model against which they seek to attain power for, beyond the restrictions imposed by the exposure mappings alone. We illustrate this by providing a uniformly consistent test for differentiating no-interference from a network-linear-in-means model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper proves a clean finite-sample impossibility result for general exposure-mapping specification tests and shows that restricting the alternative class makes informative testing possible again.


The core result is that no test of an exposure mapping null can control Type I error while having nontrivial power against all strictly larger exposure mappings. Worst-case Type I error plus worst-case Type II error equals 1 for every finite sample size, even with bounded outcomes and a randomized design. The bound is tight because a random-guess test attains it. The paper then gives one workable escape: a uniformly consistent test that separates no-interference from the network linear-in-means model. That positive part is useful because it shows the kind of restriction on the alternative that makes testing feasible.

The impossibility argument looks internally consistent and does not rely on circularity or fitted parameters; it is derived directly from the definitions of the exposure mappings and the error rates. The math is finite-sample and does not appear to hide asymptotic loopholes. The main limitation is that the result forces researchers to commit to narrow alternative classes in advance. In practice, many analysts will not know which larger mapping to test against, so the negative finding may push people toward pre-specified parametric families rather than broad specification checks. The citation pattern is light and focused on the relevant interference literature.

This paper is aimed at statisticians and econometricians who design tests for interference in experiments. It is worth a serious referee because the impossibility is sharp, the positive example is constructive, and the practical implication is clear even if it narrows the scope of testing.

Referee Report

0 major / 2 minor

Summary. The paper establishes a finite-sample impossibility result for specification tests of exposure mapping models in randomized experiments with interference. Any test that controls Type I error under a given exposure mapping null while attaining power against the class of all strictly larger exposure mappings has worst-case Type I error (supremum over null distributions) plus worst-case Type II error (supremum over the larger class) equal to exactly 1. This holds for every finite sample size n, uniformly bounded outcomes, and randomized designs, and is tight because the random-guess test achieves equality. The authors note that informative tests therefore require further restrictions on the alternative and supply one such example: a uniformly consistent test for no-interference versus a network linear-in-means model.

Significance. If the central impossibility result holds, it has clear significance for causal inference under interference: it shows that unrestricted power against larger exposure mappings cannot improve upon random guessing in the worst case, forcing researchers to impose additional structure on alternatives (as the authors do in their positive result). The finite-sample character, the minimal assumptions (bounded outcomes and randomized designs), and the explicit tightness example are strengths. The work directly informs the design of specification tests for network experiments that use exposure mappings.

minor comments (2)

Abstract: the phrase 'alternatives that are maximally separated from the null' is used without an immediate cross-reference to the precise definition employed in the main impossibility theorem; adding a parenthetical pointer would improve readability.
Introduction: the formal definition of an exposure mapping and the partial order used to define 'strictly larger' models appear only after several paragraphs of motivation; moving a concise statement of these objects earlier would help readers follow the subsequent impossibility claim.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript and recommendation for minor revision. The referee's summary accurately captures both the impossibility result and the constructive example we provide.

Circularity Check

0 steps flagged

No significant circularity; impossibility result derived directly from error-rate definitions


The central impossibility result follows from the definitions of Type I error (supremum rejection probability under the null exposure mapping) and Type II error (supremum acceptance probability under any strictly larger exposure mapping), together with the requirement that the test must control the former while attaining power against the unrestricted larger class. For any test, adversarial distributions can be constructed (under bounded outcomes and randomized designs) such that the sum of these worst-case errors equals 1; the random-guess test achieves equality, showing the bound is tight. This argument uses only the model definitions and finite-sample error-rate suprema; it does not rely on fitted parameters, self-citations, or any ansatz imported from prior work. The subsequent positive result (uniformly consistent test for no-interference versus network linear-in-means) is constructed explicitly under an additional restriction on the alternative and likewise contains no self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 3 axioms · 0 invented entities

The central claim rests on standard causal inference assumptions including randomized treatment assignment and uniformly bounded outcomes. No free parameters or new entities are introduced; the result is an impossibility derived from the exposure mapping framework itself.

axioms (3)

domain assumption: Outcomes are uniformly bounded. Explicitly stated as holding for the impossibility result in the abstract.
domain assumption: The experiment is a randomized experiment. Core setup for defining causal effects and exposure mappings.
domain assumption: Interference is captured by exposure mappings. The models under test are defined via exposure mappings.

pith-pipeline@v0.9.0 · 5561 in / 1383 out tokens · 48541 ms · 2026-05-12T02:56:16.149563+00:00 · methodology
