Incorporating estimands into meta-analyses of clinical trials
Pith reviewed 2026-05-18 05:53 UTC · model grok-4.3
The pith
Different target estimands at the meta-analytical level make explicit the intercurrent event strategies driving heterogeneity in clinical trial results.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By specifying different target estimands at the meta-analytical level, the source of heterogeneity due to the intercurrent event strategy becomes explicit, allowing clearer interpretation of differences in results and enhancing the applicability of the pooled estimates to healthcare decision-making.
What carries the argument
The estimand framework, specifically its strategies for intercurrent events such as treatment policy versus hypothetical strategies, applied consistently at the meta-analytical level to distinguish contributions to heterogeneity.
Load-bearing premise
Different estimand strategies for intercurrent events can be consistently defined and applied across trials in a meta-analysis even without subject-level data from the individual studies.
What would settle it
A re-analysis of the same network meta-analysis dataset using subject-level data that shows the treatment policy and hypothetical strategies cannot be applied uniformly across trials, resulting in inconsistent or uninterpretable differences in the pooled estimates.
Original abstract
The estimand framework is increasingly established to pose research questions in confirmatory clinical trials. In evidence synthesis, the uptake of estimands has been modest, and the PICO (Population, Intervention, Comparator, Outcome) framework is more often applied. While PICOs and estimands have overlapping elements, the estimand framework explicitly considers different strategies for intercurrent events. We propose a pragmatic framework for the use of estimands in meta-analyses of clinical trials, highlighting the value of estimands to systematically identify and mitigate key sources of quantitative heterogeneity, and to enhance the applicability or external validity of pooled estimates. Focus is placed on the role of strategies for intercurrent events, within the specific context of meta-analyses for health technology assessment. We apply the estimand framework to a network meta-analysis of clinical trials, comparing the efficacy of semaglutide versus dulaglutide in type 2 diabetes. We explore the impact of a treatment policy strategy for treatment discontinuation or initiation of rescue medication versus a hypothetical strategy for the corresponding intercurrent events. The specification of different target estimands at the meta-analytical level allows us to be explicit about the source of heterogeneity, the intercurrent event strategy, driving any potential differences in results. We advocate for the integration of estimands into the planning of meta-analyses, while acknowledging that potential challenges exist in the absence of subject-level data. Estimands can complement PICOs to strengthen communication between stakeholders about what evidence syntheses seek to demonstrate, and to ensure that the generated evidence is maximally relevant to healthcare decision-makers.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a pragmatic framework for incorporating the estimand framework into meta-analyses of clinical trials, with emphasis on using different strategies for intercurrent events (e.g., treatment policy versus hypothetical) to make sources of quantitative heterogeneity explicit and to improve the external validity of pooled estimates for health technology assessment. It illustrates the approach via a network meta-analysis comparing semaglutide versus dulaglutide in type 2 diabetes, exploring how choice of intercurrent-event strategy affects results, and advocates integrating estimands alongside PICO to strengthen communication with decision-makers while noting challenges when subject-level data are unavailable.
Significance. If the framework can be operationalized reproducibly from aggregate data, it would help meta-analysts and HTA bodies be more transparent about which target population and intercurrent-event handling a synthesis is actually estimating, thereby clarifying whether observed heterogeneity is substantive or artifactual. The semaglutide–dulaglutide example is a timely case given the importance of discontinuation and rescue in diabetes trials, but the manuscript provides only a high-level description of the impact exploration.
major comments (2)
- [Application to semaglutide vs dulaglutide NMA] The central claim—that specifying different target estimands at the meta-analytic level isolates the intercurrent-event strategy as the driver of heterogeneity—requires that hypothetical strategies can be consistently defined and estimated from published summary statistics alone. The manuscript does not supply an explicit, reproducible mapping (e.g., the conditional outcome distributions or modeling assumptions needed to impute outcomes under a hypothetical strategy of no discontinuation/rescue) from the aggregate data used in the NMA.
- [Abstract and application section] The abstract and application section describe exploration of treatment-policy versus hypothetical strategies but report no quantitative results, confidence intervals, or sensitivity analyses showing how the pooled estimates or heterogeneity metrics actually change under each strategy. Without these, it is not possible to evaluate whether the estimand choice materially affects conclusions or merely restates the same data under different labels.
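The kind of side-by-side reporting requested in the second major comment can be sketched as follows. This is a minimal illustration, not the paper's network meta-analysis: it swaps in a simple pairwise DerSimonian-Laird random-effects model, the function name `dersimonian_laird` is an assumption, and every effect size and standard error below is a hypothetical placeholder rather than a value from the paper or from any trial.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Pool study-level effects with a DerSimonian-Laird random-effects model."""
    k = len(effects)
    if k < 2 or k != len(std_errors):
        raise ValueError("need at least two studies with matching standard errors")

    # Inverse-variance (fixed-effect) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the method-of-moments estimate of tau^2.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c)

    # I^2: percentage of variability attributed to between-study heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights, pooled estimate, and Wald 95% interval.
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    sum_w_re = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum_w_re
    se_pooled = math.sqrt(1.0 / sum_w_re)
    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se_pooled,
        "ci_high": pooled + 1.96 * se_pooled,
        "tau2": tau2,
        "i2": i2,
        "q": q,
    }


# HYPOTHETICAL inputs: mean differences and standard errors for three
# made-up trials, reported once per intercurrent-event strategy.
HYPOTHETICAL_INPUTS = {
    "treatment policy": ([-0.25, -0.40, -0.10], [0.08, 0.10, 0.09]),
    "hypothetical": ([-0.35, -0.45, -0.30], [0.09, 0.11, 0.10]),
}

results = {
    label: dersimonian_laird(effects, ses)
    for label, (effects, ses) in HYPOTHETICAL_INPUTS.items()
}

for label, r in results.items():
    print(
        f"{label}: pooled={r['pooled']:.3f} "
        f"[{r['ci_low']:.3f}, {r['ci_high']:.3f}], "
        f"tau2={r['tau2']:.4f}, I2={r['i2']:.1f}%"
    )
```

Running the same pooling routine once per estimand, on inputs that each correspond to a single intercurrent-event strategy, yields the pooled estimate, interval, and heterogeneity metrics in a directly comparable form. Obtaining the hypothetical-strategy inputs themselves still depends on the assumptions or subject-level data discussed in the first major comment.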
minor comments (2)
- [Methods] Clarify in the methods whether the NMA was re-run under each estimand or whether the same summary statistics were simply re-interpreted; the current description leaves this ambiguous.
- [Application section] Add a table or figure that tabulates the specific intercurrent-event strategies applied to each trial in the network, including any trial-specific assumptions required for the hypothetical arm.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the scope and limitations of our proposed framework. We agree that operationalizing estimands from aggregate data requires care and that the application section is primarily illustrative. Below we respond point by point to the major comments.
Referee: [Application to semaglutide vs dulaglutide NMA] The central claim—that specifying different target estimands at the meta-analytic level isolates the intercurrent-event strategy as the driver of heterogeneity—requires that hypothetical strategies can be consistently defined and estimated from published summary statistics alone. The manuscript does not supply an explicit, reproducible mapping (e.g., the conditional outcome distributions or modeling assumptions needed to impute outcomes under a hypothetical strategy of no discontinuation/rescue) from the aggregate data used in the NMA.
Authors: We agree that a fully reproducible numerical implementation of a hypothetical strategy from aggregate data alone would require explicit conditional distributions or modeling assumptions that are not provided in the published summaries. Our manuscript is deliberately pragmatic and acknowledges this limitation in the absence of subject-level data. The central claim is not that hypothetical strategies can be estimated without assumptions, but that explicitly specifying the estimand at the meta-analytic level makes the intercurrent-event strategy (and any associated assumptions) transparent as a potential source of heterogeneity. We will revise the application section to include a more detailed description of the types of assumptions (e.g., no outcome effect of discontinuation, or use of external discontinuation rates) that would be needed for a hypothetical strategy, while reiterating that full estimation typically requires individual participant data (IPD).
revision: yes
Referee: [Abstract and application section] The abstract and application section describe exploration of treatment-policy versus hypothetical strategies but report no quantitative results, confidence intervals, or sensitivity analyses showing how the pooled estimates or heterogeneity metrics actually change under each strategy. Without these, it is not possible to evaluate whether the estimand choice materially affects conclusions or merely restates the same data under different labels.
Authors: The application is intended to demonstrate how the framework isolates the intercurrent-event strategy conceptually rather than to deliver a new quantitative re-analysis of the NMA. Because the source trials and existing NMA report treatment-policy results, shifting to hypothetical strategies would require either IPD or additional assumptions beyond the scope of the current illustrative example. We will revise the abstract and application section to clarify this intent and to note that, in practice, sensitivity analyses could be conducted when IPD are available. We will also add a brief discussion of expected directional impacts based on known discontinuation patterns in type 2 diabetes trials, without claiming new numerical estimates.
revision: partial
Circularity Check
No circularity: pragmatic proposal drawing on established estimand concepts
The paper proposes a framework for incorporating estimands into meta-analyses of clinical trials, emphasizing explicit handling of intercurrent event strategies to identify heterogeneity sources. No mathematical derivation chain, equations, or first-principles results are presented that reduce outputs to inputs by construction. The central claim relies on established prior literature (e.g., ICH E9(R1) estimand framework) rather than self-citations that bear the load or fitted parameters renamed as predictions. The semaglutide-dulaglutide NMA application is presented as an illustrative example independent of the proposal itself. This is a conceptual and pragmatic contribution, self-contained against external benchmarks with no evidence of self-definitional, fitted-input, or ansatz-smuggling patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Standard meta-analysis assumptions hold, including that trial-level data can be aligned to common estimand definitions.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
Relation between the paper passage and the cited Recognition theorem: unclear.
Paper passage: "The specification of different target estimands at the meta-analytical level allows us to be explicit about the source of heterogeneity, the intercurrent event strategy, driving any potential differences in results."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.