A simple strategy for valid inference in target trial emulations
Pith reviewed 2026-05-07 11:49 UTC · model grok-4.3
The pith
Sample splitting preserves standard coverage guarantees in target trial emulations even after data-informed protocol choices.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A sample splitting procedure addresses concerns about selective choices and invalid statistical inference in target trial emulations. In the initial split, investigators explore the data to define a target trial protocol. When these choices are made, the target trial protocol is implemented on the second split. Although the investigators made data-informed choices to select the target trial protocol, the inference has the usual coverage guarantees.
What carries the argument
Sample splitting carries the validity argument: it separates data exploration for protocol definition from confirmatory analysis on an independent hold-out split.
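The splitting step can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `split_sample`, the 50/50 default, and the choice to split by unit identifier are all assumptions, since the abstract does not specify split proportions or data layout.

```python
import random

def split_sample(unit_ids, explore_fraction=0.5, seed=0):
    """Randomly partition unit identifiers into exploration and hold-out sets.

    Splitting by unit (for example, by patient) rather than by row keeps all
    records of one unit in the same split. The 0.5 default is an assumption.
    """
    ids = sorted(set(unit_ids))          # deduplicate and fix the order
    rng = random.Random(seed)            # seeded so the partition is reproducible
    rng.shuffle(ids)
    cut = int(len(ids) * explore_fraction)
    return set(ids[:cut]), set(ids[cut:])

# Hypothetical cohort of 1000 units identified by integers 0..999.
explore_ids, holdout_ids = split_sample(range(1000), explore_fraction=0.5, seed=42)
```

Protocol development would then touch only records whose identifier is in `explore_ids`; records in `holdout_ids` stay unopened until the protocol is final.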
If this is right
- Observational data can support more flexible, realistic target trial protocols without invalidating the final causal estimates.
- The procedure allows investigators to learn which target trials the data can support before committing to analysis.
- Standard coverage properties apply directly to the final estimates as long as the protocol remains fixed after the exploration split.
- The method aligns with existing practice in clinical trials where pilot data inform but do not contaminate the confirmatory analysis.
Where Pith is reading between the lines
- The same splitting logic could apply to other iterative selection problems in observational research, such as variable selection before final modeling.
- Study designs with larger sample sizes would be needed in practice to retain power after splitting while still enabling protocol exploration.
- Transparent reporting of the exploration split and chosen protocol could become standard to document the data-informed steps.
- Extensions to survival outcomes or time-varying treatments would require only minor adjustments to the splitting step.
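The power cost of splitting mentioned above can be made concrete with a back-of-the-envelope calculation. It assumes an estimator whose standard error scales as $\sigma/\sqrt{n}$ and a fraction $\pi$ of the $n$ units spent on exploration; both symbols are introduced here for illustration and do not come from the paper.

```latex
\[
\mathrm{SE}_{\text{hold-out}}
  = \frac{\sigma}{\sqrt{(1-\pi)\,n}}
  = \frac{1}{\sqrt{1-\pi}} \cdot \frac{\sigma}{\sqrt{n}},
\qquad
\pi = \tfrac{1}{2} \;\Rightarrow\; \frac{1}{\sqrt{1-\pi}} = \sqrt{2} \approx 1.41 .
\]
```

Under these assumptions, an even split widens confidence intervals by about 41 percent, and matching the full-sample precision would take roughly twice as many units.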
Load-bearing premise
The two data splits are independent, and once the protocol is fixed on the first split, no further data-dependent decisions are made when analyzing the second split.
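One way to make the "fixed after exploration" premise auditable is to freeze the protocol and record a fingerprint before the hold-out split is opened. The sketch below is an assumption about how this could be done, not something the paper describes; the `Protocol` fields and their example values are hypothetical placeholders.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

@dataclass(frozen=True)              # frozen: attributes cannot be reassigned
class Protocol:
    eligibility: str
    treatment_strategies: tuple
    outcome: str
    analysis_plan: str

def fingerprint(protocol):
    """Return a SHA-256 hex digest of the protocol's canonical JSON form."""
    payload = json.dumps(asdict(protocol), sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Hypothetical protocol chosen on the exploration split (placeholder values).
protocol = Protocol(
    eligibility="age >= 50",
    treatment_strategies=("initiate", "do not initiate"),
    outcome="5-year mortality",
    analysis_plan="risk difference with 95% Wald interval",
)
locked = fingerprint(protocol)       # record this before opening the hold-out split
```

Recomputing the fingerprint at analysis time and comparing it with `locked` gives a simple check that the protocol applied to the second split is the one chosen on the first.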
What would settle it
A simulation in which the nominal coverage probability fails to hold for confidence intervals computed on the second split after protocol selection on the first split would falsify the claim.
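A toy version of such a simulation is sketched below. It is an illustrative assumption, not the authors' design: candidate protocols are abstracted as independent groups whose true effect is zero, the "selection" picks the group with the largest estimated mean on the exploration data, and a normal-approximation interval is used.

```python
import random
import statistics

def covers_zero(sample, z=1.96):
    """Check whether a normal-approximation 95% interval for the mean contains 0."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - z * se <= 0.0 <= m + z * se

def simulate(reps=2000, n_candidates=5, n=50, seed=1):
    """Compare interval coverage with and without sample splitting.

    Every candidate has true mean 0, so an interval "covers" when it contains 0.
    """
    rng = random.Random(seed)
    naive_hits = split_hits = 0
    for _ in range(reps):
        explore = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_candidates)]
        holdout = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_candidates)]
        # Data-informed choice: the candidate that looks best on the exploration data.
        chosen = max(range(n_candidates), key=lambda k: statistics.fmean(explore[k]))
        naive_hits += covers_zero(explore[chosen])   # same data reused for inference
        split_hits += covers_zero(holdout[chosen])   # independent hold-out data
    return naive_hits / reps, split_hits / reps

naive_coverage, split_coverage = simulate()
```

In this setup one would expect `split_coverage` to sit close to the nominal 0.95 and `naive_coverage` to fall below it, because reusing the exploration data rewards the selected candidate's chance fluctuations. A hold-out coverage clearly below nominal in a well-specified version of this check would count against the claim.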
Original abstract
Target trial emulation has improved comparative effectiveness research by making the causal question, assumptions, and analysis plan explicit. However, target trial protocols are usually developed iteratively. After examining the data, investigators revise the protocol to reflect which target trials the observational data can realistically support. While this iterative procedure is part of normal scientific practice, it raises concerns about selective choices and invalid statistical inference. A simple procedure can address these concerns. This procedure is based on sample splitting. In the initial split, investigators explore the data to define a target trial protocol. When these choices are made, the target trial protocol is implemented on the second split. Although the investigators made data-informed choices to select the target trial protocol, the inference has the usual coverage guarantees. The procedure is created to mirror how trialists move from pilot studies to a phase 3 trial. First, they use data from pilots and early-phase trials to learn and decide on a final protocol. Then they implement this protocol and analyze a new set of data in a phase 3 trial.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a sample-splitting procedure for target trial emulations: randomly partition the observational data into two independent splits; use the first split to iteratively explore the data and finalize a complete target trial protocol (eligibility criteria, treatment strategies, outcome definitions, and analysis plan); then apply this fixed protocol without further data-dependent modifications to the second split to obtain point estimates and inference. The central claim is that the resulting inference on the second split retains the usual coverage guarantees (e.g., valid confidence intervals) despite the data-informed protocol choices, by direct analogy to the transition from pilot/phase II studies to a pre-specified phase III trial.
Significance. If the argument holds, the procedure offers a practical, low-overhead solution to a recurring methodological concern in comparative effectiveness research: how to permit the iterative, data-driven refinement of target trial protocols that is standard in practice while preserving frequentist validity. By leveraging only the independence of random splits and pre-commitment to a fixed protocol on the hold-out set, the approach requires no new estimators or assumptions beyond those already used in the chosen analysis. It could be adopted immediately in many observational studies and may help reduce selective reporting bias without sacrificing the flexibility investigators need when observational data cannot support every conceivable target trial.
major comments (2)
- The coverage claim rests on the second split being analyzed exactly as if the protocol had been pre-specified, with no further data-dependent decisions. The manuscript should explicitly state (perhaps in a dedicated subsection or appendix) the precise regularity conditions required for this conditional validity, including correct model specification for the chosen estimator and strict independence of the two splits; without this, readers cannot verify whether the guarantee survives common practical complications such as missing data handling or subgroup definitions that might inadvertently depend on the hold-out set.
- The paper does not appear to contain a formal derivation or proof of the coverage result. While the logic follows from standard properties of independent samples, a short proof sketch (or reference to the relevant theorem on sample splitting) would make the central claim load-bearing and falsifiable rather than intuitive.
minor comments (3)
- The abstract and introduction repeatedly use the phrase 'usual coverage guarantees' without defining what estimator or inferential procedure is assumed on the second split; a single clarifying sentence would remove ambiguity.
- Consider adding a small numerical illustration or flowchart showing the sequence of steps (split, protocol finalization, analysis) to aid readers unfamiliar with target trial emulation.
- The manuscript would benefit from a brief discussion of how the procedure interacts with existing guidelines (e.g., Hernán & Robins target trial framework) and whether any modifications to those guidelines are implied.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our manuscript and for the constructive comments that will strengthen the presentation. We address each major comment below.
- Referee: The coverage claim rests on the second split being analyzed exactly as if the protocol had been pre-specified, with no further data-dependent decisions. The manuscript should explicitly state (perhaps in a dedicated subsection or appendix) the precise regularity conditions required for this conditional validity, including correct model specification for the chosen estimator and strict independence of the two splits; without this, readers cannot verify whether the guarantee survives common practical complications such as missing data handling or subgroup definitions that might inadvertently depend on the hold-out set.
  Authors: We agree that a dedicated statement of the regularity conditions will improve clarity. In the revision we will add a new subsection (Section 3.3) that lists the precise conditions: (i) the two splits are formed by random partitioning and are therefore independent; (ii) every element of the target-trial protocol (including eligibility criteria, treatment strategies, outcome definitions, the full analysis plan, model specification, missing-data handling, and any subgroup definitions) is finalized on the first split and then applied verbatim to the second split with no further data-dependent modifications; (iii) the estimator chosen for the second split satisfies its standard regularity conditions (correct specification for parametric models, or the usual assumptions for semiparametric or non-parametric estimators). We will also note that practical complications such as missing data are accommodated by pre-specifying the imputation or complete-case procedure inside the protocol on the first split, so that the conditional validity is preserved.
  Revision: yes
- Referee: The paper does not appear to contain a formal derivation or proof of the coverage result. While the logic follows from standard properties of independent samples, a short proof sketch (or reference to the relevant theorem on sample splitting) would make the central claim load-bearing and falsifiable rather than intuitive.
  Authors: We accept the suggestion. Although the result is a direct consequence of the independence of the splits and the pre-commitment to a fixed protocol, we will insert a short proof sketch in a new appendix. The sketch shows that, conditional on the protocol selected from the first split, the second split constitutes an independent sample to which a non-random protocol is applied; therefore the usual coverage properties of the estimator hold conditionally and hence unconditionally. We will also cite standard results on sample splitting for post-selection inference (e.g., the relevant theorems in the literature on cross-validation and data-driven model selection).
  Revision: yes
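The conditional-to-unconditional step described in the rebuttal can be written out as follows. The notation is assumed here for illustration and is not taken from the paper: $D_1$ and $D_2$ are the two splits, $s$ is the selection rule, $\theta_p$ is the estimand defined by protocol $p$, and $C_p(D_2)$ is the interval computed on the hold-out split.

```latex
\begin{align*}
&\text{Let } D_1 \perp D_2 \text{ be the two splits and } \hat{p} = s(D_1) \text{ the selected protocol.} \\
&\text{Assume that for every fixed protocol } p:\quad
  \Pr\{\theta_p \in C_p(D_2)\} \ge 1 - \alpha. \\
&\text{Then, by independence,}\quad
  \Pr\{\theta_{\hat{p}} \in C_{\hat{p}}(D_2) \mid D_1\}
  = \Pr\{\theta_p \in C_p(D_2)\}\big|_{p = \hat{p}} \ge 1 - \alpha, \\
&\text{and averaging over } D_1:\quad
  \Pr\{\theta_{\hat{p}} \in C_{\hat{p}}(D_2)\}
  = \mathbb{E}\big[\Pr\{\theta_{\hat{p}} \in C_{\hat{p}}(D_2) \mid D_1\}\big] \ge 1 - \alpha.
\end{align*}
```

Two caveats apply to this sketch. The estimand $\theta_{\hat{p}}$ is itself data-dependent, because it is defined by the selected protocol. When the per-protocol guarantee is only asymptotic, some uniformity over the candidate protocols may be needed for the last step.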
Circularity Check
No significant circularity
The paper's central argument rests on a sample-splitting procedure: the target trial protocol (eligibility, treatment strategies, outcome definitions, and analysis plan) is developed on one data split, then fixed and applied to an independent second split. The claim that inference on the second split retains standard coverage guarantees follows from the independence of the splits and the absence of further data-dependent decisions, which is a direct application of standard properties of independent samples and pre-specified analyses. No equations, fitted parameters, or derivations are presented that reduce to their own inputs by construction; there are no self-citations invoked as load-bearing uniqueness theorems, no ansatzes smuggled via prior work, and no renaming of known results as novel derivations. The procedure is explicitly analogized to the pilot-to-phase-3 trial transition, which is externally justified and does not create internal circularity.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The two data splits are independent and identically distributed samples from the same population.
Reference graph
Works this paper leans on
- [1] Mats J. Stensrud. A simple strategy for valid inference in target trial emulations. Institute of Mathematics and Chair of Biostatistics, École polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland.
- [2] Leamer EE. Let’s Take the Con Out of Econometrics. American Economic Review. 1983;73(1):31–43.
  Rosenbaum PR, Rubin DB. The central role of the propensity score in observational studies for causal effects. Biometrika. 1983;70(1):41–55.
  Simmons JP, Nelson LD, Simonsohn U. False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis All...