Correcting nonignorable nonresponse bias in turnout estimation using callback data

Kendrick Qijun Li; Naiwen Ying; Wang Miao; Xinyu Li; Xu Shi

arxiv: 2504.14169 · v4 · submitted 2025-04-19 · 📊 stat.ME

Correcting nonignorable nonresponse bias in turnout estimation using callback data

Xinyu Li , Naiwen Ying , Kendrick Qijun Li , Xu Shi , Wang Miao This is my paper

Pith reviewed 2026-05-22 18:39 UTC · model grok-4.3

classification 📊 stat.ME

keywords nonresponse biasturnout estimationcallback datanonignorable missingnesselection surveysstableness of resistanceANES non-response study

0 comments

The pith

Callback data and a stability assumption make true voter turnout identifiable from biased election surveys.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how records of repeated contact attempts in surveys can correct overestimation of turnout caused by nonignorable nonresponse, where non-voters are less likely to participate. It introduces the stableness of resistance assumption, which holds that the effect of the missing outcome on response chance stays constant across the first two calls, and combines this with census covariates to achieve identifiability. This approach matters because conventional surveys distort views of election participation and public opinion when hard-to-reach groups differ systematically in behavior. The resulting estimates align closely with official turnout figures and document lower voting rates among more reluctant respondents.

Core claim

Under the stableness of resistance assumption, which states that the impact of the missing outcome on the response propensity is stable in the first two call attempts, and by integrating with covariate information from the census data, the proposed methods establish identifiability and develop estimation methods for turnout that produce estimates very close to the official turnout while capturing the trend of declining willingness to vote as response reluctance increases.

What carries the argument

The stableness of resistance assumption, which states that the impact of the missing outcome on the response propensity is stable in the first two call attempts, serving as the key restriction that allows identifiability when paired with census covariates.

If this is right

Election survey turnout estimates can be adjusted to match official results more closely than standard methods allow.
The pattern of lower voting willingness among increasingly reluctant respondents becomes measurable and reportable.
Nonignorable nonresponse in political surveys can be addressed using callback records that are already collected in many studies.
Combining callback sequences with external census data provides a route to identifiability without requiring full parametric models of the missingness process.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same stability restriction on response behavior could be tested or adapted for other survey outcomes such as candidate support or policy attitudes.
Polling firms might embed routine analysis of callback patterns into post-election reporting to reduce bias in published figures.
Extending the framework to later calls or to panel surveys could reveal whether the stability holds beyond the initial two attempts.

Load-bearing premise

The effect of the unobserved turnout outcome on the probability of responding remains the same between the first and second contact attempts.

What would settle it

A direct comparison showing that the estimated link between non-voting and response probability changes markedly from the first call to the second call, or that the adjusted turnout estimates diverge substantially from official figures without reproducing the observed reluctance gradient.

read the original abstract

Overestimation of turnout has long been an issue in election surveys, with nonresponse bias or voter overrepresentation identified as major sources of bias. However, adjusting for nonignorable nonresponse bias is substantially challenging. Based on the ANES Non-Response Follow-Up study concerning the 2020 U.S. presidential election, we investigate the role of callback data, that is, records of contact attempts in the survey course, in adjusting for nonresponse bias in the estimation of turnout. We propose a stableness of resistance assumption to account for nonignorable missingness in the outcome, which states that the impact of the missing outcome on the response propensity is stable in the first two call attempts. Under this assumption and by integrating with covariate information from the census data, we establish identifiability and develop estimation methods for turnout. Our methods produce estimates very close to the official turnout and successfully capture the trend of declining willingness to vote as response reluctance increases. This work highlights the importance of adjusting for nonignorable nonresponse bias and demonstrates the potential of widely available callback data for political surveys.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper uses a new stableness of resistance assumption plus census covariates to identify turnout from ANES callback data and produces estimates close to official figures, but the assumption carries the main identification burden without direct tests.

read the letter

This paper gives a way to adjust turnout estimates for nonignorable nonresponse by treating callback records from the ANES 2020 follow-up study. The authors introduce a stableness of resistance assumption that keeps the effect of the latent turnout outcome on response propensity the same across the first two calls. With census covariates added in, they claim this setup delivers point identification and estimators that track official turnout while showing lower voting rates among people who require more contact attempts.

Referee Report

2 major / 2 minor

Summary. The manuscript proposes methods to correct for nonignorable nonresponse bias in estimating voter turnout by leveraging callback data from the ANES Non-Response Follow-Up study on the 2020 U.S. presidential election. It introduces a 'stableness of resistance' assumption stating that the effect of the latent turnout indicator on response propensity is constant across the first two call attempts. Combined with census covariates, this assumption is claimed to deliver identifiability of the turnout probability, and the resulting estimates are reported to track official turnout figures closely while capturing a decline in voting willingness as response reluctance increases.

Significance. If the identifying assumption holds and the estimation procedures are robust, the work provides a concrete demonstration of how routinely collected callback records can be used to address nonignorable missingness in election surveys, a setting where turnout overestimation is a long-standing problem. The explicit use of external census covariates to anchor estimates is a practical strength that could be extended to other survey contexts with similar call-history data.

major comments (2)

[§2] §2 (or the section defining the model): The stableness of resistance assumption is presented as the key restriction that, together with census covariates, yields point identification of the turnout probability. However, the manuscript contains no direct test, falsification check, or sensitivity analysis of this assumption in the ANES callback data. Because the assumption is the sole source of identification for the nonignorable component, its violation would render the bias correction unidentified; a concrete sensitivity exercise (e.g., allowing the effect to differ by a small amount between calls) is needed to assess robustness.
[§4] §4 or §5 (estimation and results): The claim that the proposed estimators produce turnout estimates 'very close' to official figures is central to the empirical contribution, yet the manuscript does not report quantitative measures of agreement (e.g., absolute or relative differences, confidence intervals around the estimates, or comparisons against estimators that do not impose the stability restriction). Without these, it is difficult to judge whether the agreement is substantively meaningful or an artifact of the identifying assumption.

minor comments (2)

[§2] The notation for response propensity and the latent outcome indicator should be introduced with a single consistent symbol set early in the model section to avoid later ambiguity when the stability restriction is imposed.
[Abstract] The abstract states that the methods 'successfully capture the trend of declining willingness to vote as response reluctance increases,' but the corresponding figure or table is not referenced in the text; adding an explicit cross-reference would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive report. We address each major comment below and will revise the manuscript to incorporate additional analyses that strengthen the presentation of the identifying assumption and the empirical results.

read point-by-point responses

Referee: [§2] §2 (or the section defining the model): The stableness of resistance assumption is presented as the key restriction that, together with census covariates, yields point identification of the turnout probability. However, the manuscript contains no direct test, falsification check, or sensitivity analysis of this assumption in the ANES callback data. Because the assumption is the sole source of identification for the nonignorable component, its violation would render the bias correction unidentified; a concrete sensitivity exercise (e.g., allowing the effect to differ by a small amount between calls) is needed to assess robustness.

Authors: We agree that the stableness of resistance assumption is the central identifying restriction and that its robustness should be examined explicitly. In the revised manuscript we will add a sensitivity analysis that perturbs the assumption by allowing the effect of the latent turnout indicator on response propensity to differ by a small fixed amount between the first and second calls. We will report the resulting range of turnout estimates and discuss how sensitive the conclusions are to modest violations of the stability restriction. revision: yes
Referee: [§4] §4 or §5 (estimation and results): The claim that the proposed estimators produce turnout estimates 'very close' to official figures is central to the empirical contribution, yet the manuscript does not report quantitative measures of agreement (e.g., absolute or relative differences, confidence intervals around the estimates, or comparisons against estimators that do not impose the stability restriction). Without these, it is difficult to judge whether the agreement is substantively meaningful or an artifact of the identifying assumption.

Authors: We accept that quantitative measures of agreement would improve the clarity of the empirical section. The revision will include absolute and relative differences between our turnout estimates and the official figures, together with bootstrap confidence intervals for the proposed estimators. We will also add a comparison with estimators that do not impose the stability restriction (or that use alternative identifying assumptions) so that readers can assess the contribution of the stableness-of-resistance condition. revision: yes

Circularity Check

0 steps flagged

No significant circularity; identifiability rests on external assumption plus census covariates

full rationale

The paper introduces the stableness of resistance assumption as a modeling restriction on how the latent turnout indicator affects response propensity across the first two call attempts. Identifiability is then asserted under this assumption together with integration of external census covariates. No equations or steps in the abstract or described derivation reduce a claimed result to a fitted parameter or self-citation by construction; the central identification burden is carried by the stated assumption rather than by internal data fitting that loops back on itself. Validation against official turnout figures is presented as an external check, not as part of the identification argument. This structure is self-contained and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim depends on the newly proposed stableness of resistance assumption for identifiability and on external census covariates for estimation; no other free parameters or invented entities are mentioned in the abstract.

axioms (1)

domain assumption Stableness of resistance assumption: the impact of the missing outcome on the response propensity is stable in the first two call attempts.
This assumption is invoked to establish identifiability of turnout under nonignorable nonresponse.

pith-pipeline@v0.9.0 · 5729 in / 1297 out tokens · 46518 ms · 2026-05-22T18:39:56.585167+00:00 · methodology

Correcting nonignorable nonresponse bias in turnout estimation using callback data

Core claim

What carries the argument

If this is right

Where Pith is reading between the lines

Load-bearing premise

What would settle it

discussion (0)