A cautious use of auxiliary outcomes for decision-making in randomized clinical trials

Lorenzo Trippa; Massimiliano Russo; Steffen Ventz

arxiv: 2501.04187 · v3 · submitted 2025-01-07 · 📊 stat.AP

Massimiliano Russo, Steffen Ventz, Lorenzo Trippa

Pith reviewed 2026-05-23 05:58 UTC · model grok-4.3

keywords: clinical trials, auxiliary outcomes, Bayesian decision theory, type I error control, interim analysis, randomized trials, efficiency gains, multiple endpoints

The pith

A Bayesian framework lets randomized trials use auxiliary outcomes like response rates for decisions while exactly controlling type I error without assuming how they relate to the primary outcome.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a decision-theoretic method that lets trial investigators bring faster auxiliary data, such as treatment response or biomarker readings, into both interim stopping rules and final conclusions alongside slower primary endpoints like survival. Existing practice usually discards the auxiliary data because any mismatch with the primary endpoint could push error rates above their nominal levels. The new rules are built so that type I error and other frequentist properties remain controlled for every possible joint distribution of the two outcomes, with no modeling of their dependence required. This opens the door to shorter or smaller trials when auxiliary measurements arrive early, while preserving the guarantees regulators and statisticians demand. The authors supply algorithms that achieve the control and demonstrate measurable efficiency gains under standard evaluation criteria.

Core claim

We develop a Bayesian decision-theoretic framework that uses both primary and auxiliary outcomes for interim and final decision-making. The framework allows investigators to control standard frequentist operating characteristics, such as the type I error rate, and can be used with auxiliary outcomes from emerging technologies, such as circulating tumor assays. False positive rates and other frequentist operating characteristics are rigorously controlled without any assumption about the concordance between primary and auxiliary outcomes. Algorithms implement the approach and show that incorporating auxiliary information can lead to relevant efficiency gains.

What carries the argument

Bayesian decision-theoretic rules that construct interim and final decisions to enforce exact or approximate control of type I error and related operating characteristics for arbitrary joint distributions of primary and auxiliary outcomes.

If this is right

Efficiency gains become available in trial duration or sample size when auxiliary data arrive earlier than the primary endpoint.
The same control holds when auxiliary data come from new technologies such as circulating tumor assays.
Algorithms exist that produce the required decision rules while preserving the frequentist guarantees.
No modeling step that links the auxiliary outcome to the primary outcome is needed to keep error rates in check.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The separation of decision rules from any dependence model could let the same machinery apply to other pairs of delayed and fast endpoints in non-oncology settings.
Regulatory acceptance might hinge on whether the algorithms can be pre-specified and verified by simulation for each concrete trial design.
If the control holds in finite samples, the method could reduce the ethical cost of continuing trials after strong auxiliary signals appear.
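The point about verification by simulation can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the decision rule, the cutoffs `c_low`, `c_high`, and `aux_cut`, and the bivariate-normal model for the test statistics are all assumptions introduced here. The primary statistic has mean zero (the null), while the auxiliary statistic may carry its own effect `aux_shift`.

```python
import math
import random


def simulate_type_one_error(rho, aux_shift, n_sims=100_000, seed=0,
                            c_low=1.645, c_high=2.326, aux_cut=1.0):
    """Monte Carlo rejection rate of a hypothetical pre-specified rule.

    Assumed rule (not from the paper): reject if z_p > c_high, or if
    z_p > c_low and the auxiliary statistic z_a exceeds aux_cut.
    z_p is standard normal (null holds for the primary outcome);
    z_a has mean aux_shift and correlation rho with z_p.
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        z_p = rng.gauss(0.0, 1.0)
        noise = rng.gauss(0.0, 1.0)
        z_a = aux_shift + rho * z_p + math.sqrt(1.0 - rho ** 2) * noise
        if z_p > c_high or (z_p > c_low and z_a > aux_cut):
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    # Scan a small grid of null scenarios for the pre-specified rule.
    for rho in (0.0, 0.5):
        for aux_shift in (-2.0, 0.0, 2.0):
            rate = simulate_type_one_error(rho, aux_shift)
            print(f"rho={rho:+.1f} aux_shift={aux_shift:+.1f} rate={rate:.4f}")
```

A grid like this only covers the scenarios it lists, so it supports a claim of control for those scenarios rather than for every joint distribution.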

Load-bearing premise

That decision rules and algorithms exist which maintain the target type I error rate for every possible joint distribution of the primary and auxiliary outcomes.

What would settle it

A concrete joint distribution of primary and auxiliary outcomes, together with a data-generating process under the null, for which the proposed decision procedure yields a type I error rate above the nominal level.
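One way to write this criterion down, using notation that does not appear in the source and is assumed here: let $\delta(Y_p, Y_a) \in \{0, 1\}$ be the rejection indicator, $\mathcal{P}_0$ the set of joint distributions of the primary outcome $Y_p$ and auxiliary outcome $Y_a$ under which the null holds for the primary, and $\alpha$ the nominal level.

```latex
\exists\, P \in \mathcal{P}_0
\quad \text{such that} \quad
P\bigl(\delta(Y_p, Y_a) = 1\bigr) > \alpha ,
\qquad \text{equivalently} \qquad
\sup_{P \in \mathcal{P}_0} P\bigl(\delta(Y_p, Y_a) = 1\bigr) > \alpha .
```

Exhibiting a single such $P$ would refute uniform control; it would not bear on control under one fixed joint distribution.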

Original abstract

Clinical trials often collect data on multiple outcomes, such as overall survival (OS), progression-free survival (PFS), and response to treatment (RT). In most cases, however, study designs only use primary outcome data for interim and final decision-making. In several disease settings, clinically relevant outcomes, for example OS, become available years after patient enrollment. Moreover, the effects of experimental treatments on OS might be less pronounced compared to auxiliary outcomes such as RT. We develop a Bayesian decision-theoretic framework that uses both primary and auxiliary outcomes for interim and final decision-making. The framework allows investigators to control standard frequentist operating characteristics, such as the type I error rate and can be used with auxiliary outcomes from emerging technologies, such as circulating tumor assays. False positive rates and other frequentist operating characteristics are rigorously controlled without any assumption about the concordance between primary and auxiliary outcomes. We discuss algorithms to implement this decision-theoretic approach and show that incorporating auxiliary information into interim and final decision-making can lead to relevant efficiency gains according to established and interpretable metrics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper claims a Bayesian framework for using auxiliary outcomes in trial decisions with type I error control for arbitrary joints and no concordance assumptions, but the stress-test logic suggests this forces the auxiliaries to be ignored.

The main thing to know is that this paper offers a decision-theoretic way to fold auxiliary outcomes into interim and final trial decisions while claiming exact frequentist type I error control that holds for any joint distribution of primary and auxiliary data. The abstract stresses that no concordance assumption is needed and that efficiency gains are possible when the primary endpoint arrives late. That framing is the novel piece relative to standard surrogate or adaptive design work.

The motivation is practical and clear: trials often have quicker data on response or new assays that could help without waiting years for survival. The paper does a decent job laying out why ignoring that information feels wasteful.

The soft spot sits at the center of the claim. The stress-test argument holds up on the given text: if rejection can depend on the auxiliary value for a fixed primary outcome under the null, an adversary can always pick the conditional distribution to make rejection probability 1. The only rules that keep the supremum over all joints at or below alpha are those whose rejection indicator is a function of the primary alone. That directly undercuts the assertion that both outcomes are used while control is maintained without assumptions. No equations, algorithms, or simulations appear in the abstract to show a workaround, so the central guarantee looks difficult to deliver as stated.

This is aimed at trial statisticians who work on adaptive or multi-outcome designs. A reader already thinking about delayed endpoints might get some value from the setup even if the guarantee needs revision. It deserves a serious referee to check the actual construction against the stress-test concern rather than a desk reject.

Referee Report

1 major / 1 minor

Summary. The manuscript develops a Bayesian decision-theoretic framework for incorporating both primary and auxiliary outcomes into interim and final decision-making in randomized clinical trials. It asserts that the framework controls frequentist operating characteristics such as type I error rate (and others) for arbitrary joint distributions of the outcomes, without requiring any concordance assumptions between primary and auxiliary, while also yielding efficiency gains; algorithms for implementation are discussed.

Significance. If the central claim of exact or approximate frequentist control without concordance assumptions were to hold while still allowing nontrivial use of auxiliary data, the result would be significant for clinical trial design, particularly in settings where auxiliary outcomes (e.g., from circulating tumor assays) become available earlier than the primary. The paper references established metrics for efficiency gains and applicability to emerging technologies.

major comments (1)

[Abstract] The assertion that 'False positive rates and other frequentist operating characteristics are rigorously controlled without any assumption about the concordance between primary and auxiliary outcomes' cannot hold for nontrivial use of auxiliary data. For any decision rule whose rejection indicator depends on the realized auxiliary value at a fixed primary outcome y_p in the null support, an adversary can always choose a conditional distribution of the auxiliary given y_p such that P(reject | y_p) = 1, driving the type I error above the nominal level while preserving the null marginal on the primary. The only rules achieving sup_P(type I error) ≤ α over all joints are those that are deterministic functions of the primary outcome alone. This directly undermines the framework's stated ability to use auxiliary outcomes while maintaining rigorous control.
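A small worked example may make the adversarial argument concrete. It is a sketch under assumptions that are not in the paper: a standard-normal primary test statistic under the null, a binary auxiliary outcome, a hypothetical rule with cutoffs `C_LOW` and `C_HIGH`, and a nominal level of 0.025. The rule rejects outright when the primary statistic exceeds `C_HIGH`, and rejects in the band between the two cutoffs only when the auxiliary is positive.

```python
import math

# Hypothetical cutoffs for the primary statistic (assumptions for illustration).
C_LOW, C_HIGH = 1.645, 2.326
ALPHA = 0.025  # assumed nominal one-sided level


def normal_sf(z):
    """Upper-tail probability of a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def type_one_error(prob_aux_positive_in_band):
    """Exact rejection probability under the null on the primary.

    prob_aux_positive_in_band is P(A = 1 | C_LOW < Z_p <= C_HIGH), the only
    feature of the joint distribution that this rule's error rate depends on.
    """
    always_reject = normal_sf(C_HIGH)
    band = normal_sf(C_LOW) - normal_sf(C_HIGH)
    return always_reject + band * prob_aux_positive_in_band


# Designer's working assumption: the auxiliary is positive 37.5% of the time
# in the band, which puts the rule close to the nominal level.
print("assumed joint:    ", type_one_error(0.375))
# Adversarial joint: the auxiliary is always positive in the band. The null
# marginal of the primary statistic is unchanged, yet the error rate rises.
print("adversarial joint:", type_one_error(1.0))
```

In this toy setting the error rate under the adversarial joint equals the probability that the primary statistic exceeds `C_LOW`, which is above the assumed nominal level.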

minor comments (1)

The abstract states that 'We discuss algorithms to implement this decision-theoretic approach and show that incorporating auxiliary information... can lead to relevant efficiency gains' but the provided text contains no explicit algorithm descriptions, pseudocode, or simulation results that would allow verification of the claimed control or gains.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and for raising this important point about the scope of our frequentist control claims. We address the major comment below.

Referee: [Abstract] The assertion that 'False positive rates and other frequentist operating characteristics are rigorously controlled without any assumption about the concordance between primary and auxiliary outcomes' cannot hold for nontrivial use of auxiliary data. For any decision rule whose rejection indicator depends on the realized auxiliary value at a fixed primary outcome y_p in the null support, an adversary can always choose a conditional distribution of the auxiliary given y_p such that P(reject | y_p) = 1, driving the type I error above the nominal level while preserving the null marginal on the primary. The only rules achieving sup_P(type I error) ≤ α over all joints are those that are deterministic functions of the primary outcome alone. This directly undermines the framework's stated ability to use auxiliary outcomes while maintaining rigorous control.

Authors: We agree with the referee that the argument is correct: uniform control of the type I error (i.e., sup over all joints with fixed null marginal on the primary) cannot be achieved by any nontrivial dependence on the auxiliary outcome. Our intended meaning of 'without any assumption about the concordance' was that the procedure does not require the auxiliary and primary to be positively associated or to have concordant treatment effects; the control was meant to hold under a fixed but arbitrary joint distribution that is modeled or estimated without imposing concordance. However, the abstract phrasing is imprecise and can be read as claiming distribution-free control over arbitrary joints, which the referee's counterexample shows is impossible. We will revise the abstract, the description of the framework, and related sections to clarify that frequentist control holds for a given joint distribution (without requiring concordance within that distribution) and that the guarantee is not uniform over all possible joints. This revision will be incorporated in the next version of the manuscript.

revision: yes
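The distinction drawn in this response, control under one fixed joint distribution versus control uniform over all joints, can be illustrated with a short calculation. The setup is an assumption made here and is not taken from the paper: a combined statistic that weights standardized primary and auxiliary statistics, modeled as bivariate normal with correlation `rho`, with the primary mean at zero under the null and an optional auxiliary-only effect `aux_shift`.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def combined_sd(weight, rho):
    """Standard deviation of T = weight * Z_p + (1 - weight) * Z_a."""
    return math.sqrt(weight ** 2 + (1 - weight) ** 2
                     + 2 * weight * (1 - weight) * rho)


def calibrate_threshold(weight, assumed_rho, alpha):
    """Cutoff for T giving level alpha under the assumed joint distribution
    (both statistics mean zero, correlation assumed_rho)."""
    return STD_NORMAL.inv_cdf(1 - alpha) * combined_sd(weight, assumed_rho)


def type_one_error(threshold, weight, true_rho, aux_shift):
    """P(T > threshold) when the primary mean is zero (null holds) and the
    auxiliary statistic has mean aux_shift and correlation true_rho."""
    mean_t = (1 - weight) * aux_shift
    z = (threshold - mean_t) / combined_sd(weight, true_rho)
    return 1 - STD_NORMAL.cdf(z)


threshold = calibrate_threshold(weight=0.7, assumed_rho=0.5, alpha=0.025)
# Under the joint used for calibration, the level is met by construction.
print(type_one_error(threshold, weight=0.7, true_rho=0.5, aux_shift=0.0))
# Under a discordant joint (auxiliary effect, no primary effect), it is not.
print(type_one_error(threshold, weight=0.7, true_rho=0.5, aux_shift=1.0))
```

The calibration step is what ties the guarantee to the assumed joint distribution; changing `true_rho` or `aux_shift` away from the calibration values changes the realized error rate.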

Circularity Check

0 steps flagged

No significant circularity; control claimed for arbitrary joints without reduction to fit or self-citation

The paper's central claim is that a Bayesian decision-theoretic framework controls type I error and other frequentist characteristics for any joint distribution of primary and auxiliary outcomes, with no assumptions on concordance. The provided abstract and reader's summary contain no equations, fitted parameters, or self-citations that reduce this control to a post-hoc definition or construction. No patterns of self-definitional claims, fitted inputs called predictions, or ansatz smuggling appear. The derivation is presented as holding by the framework's design across arbitrary joints, making it self-contained against external benchmarks. A potential logical tension with the skeptic attack (that non-trivial auxiliary dependence would violate the supremum bound) concerns correctness rather than circularity in the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review provides no explicit model equations, priors, or distributional assumptions; free parameters, axioms, and invented entities cannot be enumerated.
