arxiv: 2604.23866 · v1 · submitted 2026-04-26 · 📊 stat.ME

A Review of Methods and Practices for Missing Data in Sequential Multiple Assignment Randomized Trials (SMARTs): An Ancillary Study of a Scoping Review

Pith reviewed 2026-05-08 05:36 UTC · model grok-4.3

classification 📊 stat.ME
keywords: missing data · SMART · sequential multiple assignment randomized trials · attrition · missing at random · adaptive clinical trials · sensitivity analysis

The pith

Only a handful of statistical methods exist for handling missing data in SMARTs, and a substantial gap separates that methodology from how published trials actually handle missingness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reviews existing statistical methods for missing data in sequential multiple assignment randomized trials, or SMARTs, and checks how published SMARTs deal with missingness in practice. It identifies just seven methodological papers, most assuming data are missing at random and only one covering all SMART-specific types of missingness. Across thirty published SMARTs, attrition reached a median of 18 percent, with mixed-model approaches being the most common way to handle it and few studies pre-specifying sensitivity analyses. A reader would care because missing data threatens the validity of inferences in these adaptive trials that re-randomize participants based on responses.
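To make the threat concrete, the following is a toy sketch, not a method from the paper. It assumes a single intermediate response indicator (standing in for the response that would trigger re-randomization), an outcome that depends on it, and dropout that is more likely among non-responders. All numbers, names, and the inverse-probability-weighting correction are illustrative assumptions.

```python
import random

def simulate(n=20000, seed=0):
    """Toy data: outcome depends on response; dropout depends on response (MAR)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        responder = rng.random() < 0.5                 # hypothetical 50% response rate
        y = 1.0 + (2.0 if responder else 0.0) + rng.gauss(0.0, 1.0)
        p_obs = 0.9 if responder else 0.5              # non-responders drop out more often
        observed = rng.random() < p_obs
        rows.append((responder, y, observed))
    return rows

def complete_case_mean(rows):
    """Mean outcome among participants whose outcome was observed."""
    ys = [y for _, y, obs in rows if obs]
    return sum(ys) / len(ys)

def ipw_mean(rows):
    """Weighted mean using estimated P(observed | response status)."""
    p_hat = {}
    for status in (True, False):
        flags = [obs for r, _, obs in rows if r == status]
        p_hat[status] = sum(flags) / len(flags)
    num = sum(y / p_hat[r] for r, y, obs in rows if obs)
    den = sum(1.0 / p_hat[r] for r, _, obs in rows if obs)
    return num / den

rows = simulate()
cc = complete_case_mean(rows)
ipw = ipw_mean(rows)
print(f"true mean 2.00, complete-case {cc:.2f}, weighted {ipw:.2f}")
```

In this setup the complete-case mean over-represents responders and drifts above the true mean of 2.0, while weighting by the estimated observation probability pulls it back. A real SMART analysis would also have to account for the stage-specific randomization probabilities, which this sketch omits.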

Core claim

Seven methodological papers were identified for missing data in SMARTs; nearly all assume missing at random, and only one addresses the full set of SMART-specific missingness types. In 30 published SMARTs, the median overall attrition was 18.1 percent. Methods for addressing missing data were described in 80 percent of manuscripts, with mixed-model methods most common at 30 percent. Among 14 studies with paired protocols, sensitivity analyses were pre-specified in only 2.
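Because the protocol comparison rests on only 14 studies, the 14 percent figure carries wide uncertainty. The sketch below computes a 95% Wilson score interval for 2 of 14. The interval is our illustration and is not reported in the paper; the function name and the choice of the Wilson method are assumptions.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

share = 2 / 14                      # pre-specified sensitivity analyses, from the paper
low, high = wilson_interval(2, 14)
print(f"{share:.1%} (95% Wilson interval {low:.1%} to {high:.1%})")
```

The interval treats the 14 studies as a simple random sample of SMARTs with paired protocols, which the review does not claim, so it is best read as a rough indication of how imprecise a proportion from 14 studies is.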

What carries the argument

A narrative review of statistical methods for missing data in SMARTs paired with secondary extraction of attrition rates, handling methods, and analysis plans from 30 published SMARTs identified in a prior scoping review.

If this is right

  • Development of additional methods is needed to handle the full range of missingness patterns unique to SMART designs.
  • Published SMARTs should more consistently report and apply appropriate missing data techniques rather than defaulting to general mixed models.
  • Protocols for SMARTs ought to pre-specify sensitivity analyses for missing data to strengthen the robustness of findings.
  • Improved alignment between methods and practice would enhance the reliability of conclusions from adaptive treatment trials.
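One simple form a pre-specified sensitivity analysis could take is a delta adjustment: assume the unobserved outcomes differ from the observed ones by a shift `delta` and report how the overall mean moves. The sketch below is a deliberately simplified illustration, not a method from the paper. It ignores covariates and the SMART stage structure; the observed mean of 10.0 is hypothetical, and only the 18.1% missing fraction is taken from the reported median attrition.

```python
def delta_adjusted_mean(observed_mean, missing_fraction, delta):
    """Overall mean if missing outcomes average observed_mean + delta.

    (1 - m) * ybar + m * (ybar + delta) simplifies to ybar + m * delta.
    """
    return observed_mean + missing_fraction * delta

observed_mean = 10.0      # hypothetical observed outcome mean
missing_fraction = 0.181  # median overall attrition reported in the paper

for delta in (-2.0, -1.0, 0.0, 1.0):
    adjusted = delta_adjusted_mean(observed_mean, missing_fraction, delta)
    print(f"delta={delta:+.1f} -> adjusted mean {adjusted:.3f}")
```

Setting `delta` to zero recovers the observed mean; scanning a range of values shows how far the missing outcomes would have to depart from the observed ones before a conclusion changes.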

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Addressing this gap could lead to more efficient use of resources in trials for personalized medicine where treatments adapt over time.
  • Future reviews might benefit from including gray literature or ongoing trials to better gauge the full extent of the methodology-practice divide.
  • Software implementations of SMART-specific missing data methods would help close the gap by making advanced techniques accessible to trial analysts.

Load-bearing premise

The small set of seven methodological papers and the thirty SMARTs extracted from the scoping review accurately represent all available methods and current practices in the field.

What would settle it

Discovery of substantially more than seven methodological papers on missing data handling in SMARTs or evidence that most published SMARTs employ advanced techniques specifically designed for their sequential structure.

Original abstract

Background: Missing data poses an acute threat to sequential multiple assignment randomized trial (SMART) analyses because of the sequential treatment structure and response-dependent re-randomization.

Objectives: This study aimed to (1) review the current statistical methods for handling missing data in SMARTs, and (2) characterize how missing data is reported and handled in published SMARTs.

Methods: We conducted a narrative review of statistical methods developed for missing data in SMARTs. Additionally, we conducted a pre-specified secondary extraction of a previously published scoping review of SMARTs focused on missing data. Extraction captured attrition rates, methods for handling missingness, and planned versus performed missing data analyses.

Results: Seven methodological papers were identified; nearly all assume missing at random (MAR), and only one addresses the full set of SMART-specific missingness types. Across 30 published SMARTs, median overall attrition was 18.1% (range 0.6%-56.5%). Methods used to address missing data were described in 80% of the manuscripts; mixed-model methods were most common (30%). Among 14 studies with paired protocols, sensitivity analyses were pre-specified in 2 (14%).

Conclusions: SMART-specific methodology for missing data is limited, and a substantial gap exists between available methodology and current SMART practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it. The pith above is the substance; this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript presents a narrative review of statistical methods for missing data in sequential multiple assignment randomized trials (SMARTs) and a pre-specified secondary extraction from a prior scoping review of 30 published SMARTs. It identifies seven methodological papers (nearly all assuming MAR, with only one addressing the full set of SMART-specific missingness types), reports a median overall attrition of 18.1% (range 0.6%-56.5%), notes that missing-data methods were described in 80% of manuscripts (mixed models most common at 30%), and finds sensitivity analyses pre-specified in only 2 of 14 studies with protocols. The authors conclude that SMART-specific methodology is limited and a substantial gap exists between available methods and current practice.

Significance. If the counts and characterizations hold, the work is significant for highlighting an important gap in tailored methods for missing data in SMARTs, a design central to adaptive interventions. Documenting low rates of pre-specified sensitivity analyses and the predominance of general mixed-model approaches provides a concrete basis for improving reporting standards and prioritizing new methodological development. The pre-specified secondary extraction from an existing scoping review is a strength that supports reproducibility and focus on practice.

minor comments (2)
  1. [Methods] The narrative review search strategy is described only at a high level. Specifying the databases, date range, and exact search terms used to identify the seven methodological papers would let readers better evaluate completeness and replicability.
  2. [Results] For the 30 SMARTs, attrition rates and method descriptions are summarized only in aggregate. A table or breakdown of attrition by trial stage, or by whether missingness was related to treatment response, would strengthen the characterization of SMART-specific issues.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of the manuscript, recognition of its significance in highlighting the gap between SMART-specific missing data methods and current practice, and recommendation for minor revision. The pre-specified secondary extraction from the prior scoping review is indeed a strength for reproducibility.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a purely descriptive narrative review plus pre-specified secondary extraction from a prior scoping review. It reports counts of seven methodological papers (nearly all MAR-focused) and characterizes missing-data handling across 30 published SMARTs via attrition rates, methods used, and protocol comparisons. No equations, derivations, predictions, fitted parameters, or first-principles claims exist; the central finding of limited SMART-specific methodology and a gap with practice follows directly from these literature counts and extracted descriptions without any self-definitional reduction, fitted-input renaming, or load-bearing self-citation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

As a review paper there are no mathematical derivations, free parameters, or invented entities. Claims rest on the assumption that the literature search and secondary extraction are complete and representative.

pith-pipeline@v0.9.0 · 5577 in / 968 out tokens · 30300 ms · 2026-05-08T05:36:18.586186+00:00 · methodology

