Marginal and conditional summary measures: transportability and compatibility across studies
Pith reviewed 2026-05-19 02:49 UTC · model grok-4.3
The pith
Marginal and conditional summary measures do not generally coincide, so their naive pooling in evidence synthesis produces bias.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Marginal and conditional summary measures do not generally coincide, have different interpretations and correspond to different decision questions. While these aspects have primarily been recognized for non-collapsible summary measures, they are equally problematic for some collapsible measures in the presence of effect modification. The paper clarifies the interpretation and properties of several marginal and conditional summary measures, considering different types of outcomes and hypothetical outcome-generating mechanisms, describes implications of the choice of summary measure for transportability, and illustrates existing summary measure incompatibility issues in the context of evidence synthesis.
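As a rough illustration of the non-coincidence for a non-collapsible measure, the sketch below computes the marginal and conditional odds ratios under a hypothetical logistic outcome mechanism with one binary covariate and no treatment-covariate interaction. The mechanism, the coefficient values and the function names are assumptions made for this example and are not taken from the paper.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    return p / (1.0 - p)


def marginal_and_conditional_or(b0, b_t, b_x, prevalence):
    """Return (marginal OR, conditional OR) for the hypothetical model
    logit P(Y=1 | T, X) = b0 + b_t*T + b_x*X, with binary X and
    P(X=1) = prevalence. Treatment is assumed independent of X."""
    # Without an interaction term the conditional OR is the same in both strata.
    conditional_or = math.exp(b_t)
    weights = ((0, 1.0 - prevalence), (1, prevalence))
    # Population-level risk under each treatment: average stratum risks over X.
    risk = {
        t: sum(w * expit(b0 + b_t * t + b_x * x) for x, w in weights)
        for t in (0, 1)
    }
    marginal_or = odds(risk[1]) / odds(risk[0])
    return marginal_or, conditional_or


# Illustrative (assumed) coefficients: X is strongly prognostic.
m_or, c_or = marginal_and_conditional_or(b0=-1.0, b_t=1.0, b_x=2.0, prevalence=0.5)
print(f"marginal OR = {m_or:.3f}, conditional OR = {c_or:.3f}")
```

In this setup the marginal odds ratio is closer to the null than the conditional one even though X does not modify the effect on the logit scale; setting `b_x=0` makes the two agree.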
What carries the argument
The distinction between marginal and conditional summary measures and the role of effect modification by covariates in altering population-level treatment effects.
If this is right
- Covariates not conventionally labeled effect modifiers can still modify population-level treatment effects.
- Naive pooling of incompatible summary measures across studies introduces bias in evidence synthesis.
- Methods for indirect treatment comparisons must align the type of summary measure used in each study.
- Full individual patient data access simplifies checking and enforcing compatibility of summary measures.
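The first point can be made concrete with a small calculation. The sketch below assumes a hypothetical logistic mechanism in which a binary covariate X is purely prognostic on the logit scale (no treatment-covariate interaction), and shows that the marginal risk difference, a collapsible measure, still depends on the prevalence of X in the population. All coefficient values and function names are illustrative assumptions.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def stratum_risk_difference(b0, b_t, b_x, x):
    """Risk difference within the stratum X = x under the assumed model
    logit P(Y=1 | T, X) = b0 + b_t*T + b_x*X."""
    return expit(b0 + b_t + b_x * x) - expit(b0 + b_x * x)


def marginal_risk_difference(b0, b_t, b_x, prevalence):
    """Risk difference in a population with P(X=1) = prevalence.
    Collapsibility: it is the prevalence-weighted average of stratum RDs."""
    return ((1.0 - prevalence) * stratum_risk_difference(b0, b_t, b_x, 0)
            + prevalence * stratum_risk_difference(b0, b_t, b_x, 1))


# Same mechanism, two hypothetical populations differing only in P(X=1).
rd_low = marginal_risk_difference(b0=-1.0, b_t=1.0, b_x=2.0, prevalence=0.2)
rd_high = marginal_risk_difference(b0=-1.0, b_t=1.0, b_x=2.0, prevalence=0.8)
print(f"RD with P(X=1)=0.2: {rd_low:.4f}")
print(f"RD with P(X=1)=0.8: {rd_high:.4f}")
```

Because the stratum-specific risk differences are unequal, a risk difference estimated in one population does not transport to the other without adjusting for X, even though X would not be called an effect modifier on the odds ratio scale.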
Where Pith is reading between the lines
- Meta-analyses and network meta-analyses should explicitly state the target population and the marginal-versus-conditional choice before pooling.
- Adjustment methods for indirect comparisons could incorporate automated checks for measure compatibility before producing a pooled result.
- The same logic applies to any setting that transports causal effects between populations, such as external-control-arm studies.
Load-bearing premise
Outcome-generating mechanisms include effect modification by covariates that change population-level treatment effects across different studies or transportability scenarios.
What would settle it
Generate data under a known effect-modification mechanism, compute a marginal summary from one population and a conditional summary from another, pool them without alignment, and check whether the combined estimate deviates from the true target-population effect; the deviation should vanish when only compatible measures are pooled.
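A minimal large-sample version of this check is sketched below. Instead of simulating individual patients it uses exact expectations under a hypothetical logistic mechanism with a treatment-covariate interaction. The assumptions are loud ones: a single binary covariate, made-up coefficients, equal-weight pooling on the log odds ratio scale, study A drawn from the target population, and the "conditional summary" from study B taken to be the treatment coefficient at X = 0. None of these choices come from the paper.

```python
import math


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def log_odds(p):
    return math.log(p / (1.0 - p))


def marginal_log_or(b0, b_t, b_x, b_tx, prevalence):
    """Marginal log odds ratio in a population with P(X=1) = prevalence under
    logit P(Y=1 | T, X) = b0 + b_t*T + b_x*X + b_tx*T*X (assumed mechanism)."""
    weights = ((0, 1.0 - prevalence), (1, prevalence))
    risk = [
        sum(w * expit(b0 + b_t * t + b_x * x + b_tx * t * x) for x, w in weights)
        for t in (0, 1)
    ]
    return log_odds(risk[1]) - log_odds(risk[0])


def pooling_deviations(b0=-1.0, b_t=1.0, b_x=2.0, b_tx=-0.5,
                       p_target=0.5, p_study_b=0.9):
    """Deviation of each pooled estimate from the true target-population
    marginal log odds ratio."""
    truth = marginal_log_or(b0, b_t, b_x, b_tx, p_target)
    study_a_marginal = truth       # study A is assumed to sample the target population
    study_b_conditional = b_t      # assumed conditional summary: coefficient at X = 0
    study_b_marginal_own = marginal_log_or(b0, b_t, b_x, b_tx, p_study_b)
    study_b_marginal_target = marginal_log_or(b0, b_t, b_x, b_tx, p_target)

    def pool(a, b):
        return 0.5 * (a + b)       # equal weights, for illustration only

    return {
        "naive": pool(study_a_marginal, study_b_conditional) - truth,
        "untransported": pool(study_a_marginal, study_b_marginal_own) - truth,
        "compatible": pool(study_a_marginal, study_b_marginal_target) - truth,
    }


for label, deviation in pooling_deviations().items():
    print(f"{label:>14}: deviation = {deviation:+.4f}")
```

Under these assumptions the deviation is nonzero when a marginal and a conditional summary are mixed, and also when study B's marginal summary is left in its own population, but it vanishes once both inputs are marginal summaries for the target population. A finite-sample simulation would add sampling noise around the same pattern.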
Original abstract
Marginal and conditional summary measures do not generally coincide, have different interpretations and correspond to different decision questions. While these aspects have primarily been recognized for non-collapsible summary measures, they are equally problematic for some collapsible measures in the presence of effect modification. We clarify the interpretation and properties of several marginal and conditional summary measures, considering different types of outcomes and hypothetical outcome-generating mechanisms. We describe implications of the choice of summary measure for transportability, highlighting that covariates not conventionally described as effect modifiers can modify population-level treatment effects. Finally, we illustrate existing summary measure incompatibility issues in the context of evidence synthesis, using the case of covariate adjustment methods for indirect treatment comparisons. Because marginal and conditional summary measures do not generally coincide, their naïve pooling in evidence synthesis can produce bias. Almost invariably, care is needed to ensure that evidence synthesis methods are combining compatible summary measures, and this may be easier to accomplish with full access to individual patient data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that marginal and conditional summary measures generally do not coincide, have distinct interpretations, and correspond to different decision questions. This distinction, long recognized for non-collapsible measures, also applies to some collapsible measures under effect modification. The authors clarify properties across outcome types and outcome-generating mechanisms, discuss implications for transportability (including that covariates not conventionally described as effect modifiers can modify population-level effects), and illustrate incompatibility problems in evidence synthesis via covariate-adjusted indirect treatment comparisons, concluding that naive pooling of incompatible measures can induce bias.
Significance. If the clarifications hold, the work has moderate significance for statistical methodology in evidence synthesis and transportability. It reinforces standard results on collapsibility and effect modification with a focused illustration in indirect comparisons, which could help practitioners avoid bias when combining studies. The emphasis on compatible summary measures and the value of individual patient data is a practical contribution, though it largely synthesizes existing statistical properties rather than introducing novel derivations.
minor comments (3)
- [Abstract] Abstract: the claim that the issues are 'equally problematic for some collapsible measures' would be strengthened by briefly naming one such measure (e.g., risk difference) and the specific transportability scenario in which the population-level effect is modified.
- [Evidence synthesis illustration] Illustration of indirect comparisons: the description of bias from naïve pooling is qualitative; adding a small numerical example or sensitivity calculation showing the magnitude of incompatibility under differing covariate distributions would make the practical warning more concrete.
- [Throughout] Notation: ensure consistent use of symbols for marginal vs. conditional quantities throughout; a short table summarizing the interpretations for each outcome type would improve readability.
Simulated Author's Rebuttal
We thank the referee for their thoughtful summary of our manuscript and for recommending minor revision. The review accurately captures the paper's focus on the non-coincidence of marginal and conditional summary measures, their implications for transportability, and the risks of naive pooling in evidence synthesis. We are pleased that the practical relevance for indirect treatment comparisons and the emphasis on compatible measures and individual patient data were noted.
Circularity Check
No significant circularity; claims rest on standard collapsibility results
full rationale
The paper's core arguments derive from established statistical properties of marginal vs. conditional measures under effect modification and differing covariate distributions, using standard g-computation logic. These are not reduced to self-definitions, fitted inputs renamed as predictions, or load-bearing self-citations within the manuscript. Minor self-citations to prior methodological work exist but are not central or unverified; the derivations remain self-contained against external benchmarks like collapsibility theory.
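The g-computation (standardization) logic mentioned here can be sketched in a few lines: stratum-specific risks are averaged over a target covariate distribution to obtain marginal risks, from which marginal summary measures follow. The counts, the target distribution and the function names below are hypothetical and serve only to illustrate the calculation.

```python
def standardized_risk(counts, target_dist, treatment):
    """Nonparametric g-computation: average the observed stratum risks for
    `treatment` over the target covariate distribution.
    counts maps (treatment, x) -> (events, n); target_dist maps x -> P(X=x)."""
    total = 0.0
    for x, weight in target_dist.items():
        events, n = counts[(treatment, x)]
        total += weight * events / n
    return total


def odds(p):
    return p / (1.0 - p)


# Hypothetical two-stratum trial data: (treatment, x) -> (events, n).
counts = {
    (0, 0): (20, 100), (1, 0): (40, 100),
    (0, 1): (60, 100), (1, 1): (80, 100),
}
target_dist = {0: 0.5, 1: 0.5}  # assumed target-population covariate distribution

risk_control = standardized_risk(counts, target_dist, treatment=0)
risk_treated = standardized_risk(counts, target_dist, treatment=1)

marginal_rd = risk_treated - risk_control
marginal_or = odds(risk_treated) / odds(risk_control)
# Stratum-specific (conditional) odds ratios for comparison.
stratum_or = {
    x: odds(counts[(1, x)][0] / counts[(1, x)][1])
       / odds(counts[(0, x)][0] / counts[(0, x)][1])
    for x in target_dist
}
print(f"marginal RD = {marginal_rd:.3f}, marginal OR = {marginal_or:.3f}")
print("stratum ORs:", {x: round(v, 3) for x, v in stratum_or.items()})
```

With these made-up counts both stratum odds ratios are equal to each other, yet the standardized marginal odds ratio is smaller, which is the collapsibility result the rationale refers to.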
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Outcome-generating mechanisms include effect modification by covariates that can alter population-level treatment effects.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Marginal and conditional summary measures do not generally coincide... naïve pooling in evidence synthesis can produce bias."
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction [unclear]
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "covariates not conventionally described as effect modifiers can modify population-level treatment effects"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
- Incorporating estimands into meta-analyses of clinical trials
  A framework is proposed to integrate estimands into meta-analyses of clinical trials to identify sources of heterogeneity from intercurrent event strategies and improve the external validity of pooled estimates for he...
- Propensity Score Weighting to Ensure Balance in Key Subgroups or Strata: A Practical Guide
  A guide to stratified propensity score weighting for balancing key clinical subgroups in observational studies of treatment effects.