Finite-Sample Analysis of Elimination in Active Hypothesis Testing

Hoang Ngoc Nguyen; Ivan Ruchkin; Jie Xu; Ziyuan Lin

arxiv: 2605.01039 · v1 · submitted 2026-05-01 · 💻 cs.LG

Finite-Sample Analysis of Elimination in Active Hypothesis Testing

Ziyuan Lin , Hoang Ngoc Nguyen , Jie Xu , Ivan Ruchkin This is my paper

Pith reviewed 2026-05-09 19:33 UTC · model grok-4.3

classification 💻 cs.LG

keywords active hypothesis testingeliminationfinite-sample analysisstopping timeTrack-and-Stopfixed-confidencesequential testing

0 comments

The pith

Elimination in active hypothesis testing shortens finite-sample stopping times via tighter bounds on reduced sets.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper studies the effect of hypothesis elimination on stopping times in fixed-confidence active hypothesis testing. It augments the Track-and-Stop algorithm so that champion-specific active-opponent sets are progressively pruned and sensing effort is reallocated to surviving alternatives. The analysis derives a non-asymptotic upper bound on expected stopping time in which the benefit of elimination appears in the non-leading term through improved tracking and concentration constants on the smaller hypothesis set. An aggressiveness parameter is added to adjust the elimination threshold and trade off stopping speed against guarantee strength. Experiments on synthetic Gaussian problems confirm the predicted finite-sample improvements.

Core claim

The paper establishes that augmenting Track-and-Stop with progressive pruning of champion-specific active-opponent sets produces a non-asymptotic upper bound on expected stopping time, with the finite-sample gain from elimination appearing in the non-leading term because of tighter tracking and concentration constants on the reduced hypothesis set.

What carries the argument

Elimination-augmented Track-and-Stop algorithm that prunes champion-specific active-opponent sets and reallocates sensing effort while preserving the fixed-confidence guarantee.

If this is right

Expected stopping times decrease in finite samples without altering the leading asymptotic term.
Sensing resources concentrate on fewer hypotheses, producing tighter concentration rates.
The aggressiveness parameter lets users trade faster elimination for a modestly weaker confidence guarantee.
The method applies directly to safety-critical sequential testing where sample budgets are limited.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar pruning of opponent sets could shorten sample requirements in other adaptive testing or bandit procedures.
The relative benefit of elimination may grow in settings with larger or more overlapping hypothesis classes.
The approach could combine with other adaptive sampling rules if matching concentration inequalities are available.

Load-bearing premise

Pruning active-opponent sets for the current champion preserves the overall fixed-confidence guarantee without introducing bias into the concentration bounds.

What would settle it

Repeated trials on the synthetic Gaussian instances in which the observed expected stopping time exceeds the derived non-asymptotic upper bound or shows no reduction relative to the non-elimination baseline.

Figures

Figures reproduced from arXiv: 2605.01039 by Hoang Ngoc Nguyen, Ivan Ruchkin, Jie Xu, Ziyuan Lin.

**Figure 1.** Figure 1: Elimination Mechanism. Upper: the active-opponent set Gt(h ∗) shrinks as opponents are eliminated (×); Lower: the information rate (orange) starts below D0 during warm-up (red), then rises after each elimination, approaching DK. and the correctness guarantee relaxes to Pr(hˆ(τ ) ̸= h ∗ ) ≤ celimδ α , where Ce1,σ, Ce2,σ, Ce3, celim > 0 are chain-dependent constants independent of δ. Remark 2 (Speed–safety … view at source ↗

**Figure 3.** Figure 3: Relaxed-confidence trade-off in the skewed setting: stopping time [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 2.** Figure 2: Exact-confidence stopping-time comparison versus [PITH_FULL_IMAGE:figures/full_fig_p007_2.png] view at source ↗

**Figure 4.** Figure 4: Mechanism diagnostics with grey global elimination line. First: [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

read the original abstract

A fixed-confidence, finite-sample problem of active hypothesis testing arises in many safety-critical applications. Situated in the context of sequential hypothesis testing, this paper studies the effect of hypothesis elimination on the stopping time. We introduce an elimination-augmented Track-and-Stop algorithm, in which champion-specific active-opponent sets are progressively pruned, and sensing effort is reallocated toward the surviving alternatives. Our analysis derives a non-asymptotic upper bound on the expected stopping time. The gain in finite-sample from elimination appears on the scale of the non-leading term, resulting from tighter tracking and concentration constants on the reduced hypothesis set. Furthermore, we introduce an aggressiveness parameter to modulate the trade-off between faster elimination and weaker confidence guarantee. An experimental study on synthetic Gaussian instances confirms the theoretical predictions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper adds elimination to Track-and-Stop and derives a non-asymptotic bound where the finite-sample gain shows up only in the lower-order term.

read the letter

This paper takes the Track-and-Stop algorithm for fixed-confidence active hypothesis testing and layers on progressive elimination of non-competitive hypotheses. It reallocates sensing effort to the surviving set and gives a non-asymptotic upper bound on expected stopping time. The claimed improvement from elimination appears in the non-leading term through tighter tracking and concentration constants on the smaller hypothesis set. They also add an aggressiveness parameter that trades faster pruning against a weaker guarantee, and they run experiments on synthetic Gaussian instances that line up with the predictions.

Referee Report

2 major / 2 minor

Summary. The paper analyzes the impact of hypothesis elimination in fixed-confidence active hypothesis testing. It introduces an elimination-augmented Track-and-Stop algorithm that progressively prunes champion-specific active-opponent sets and reallocates sensing effort to surviving hypotheses. The central result is a non-asymptotic upper bound on the expected stopping time, with the finite-sample gain from elimination appearing only in the non-leading term via tighter tracking and concentration constants on the reduced set. An aggressiveness parameter modulates the elimination threshold to trade off speed against the confidence guarantee. Experiments on synthetic Gaussian instances are used to confirm the theoretical predictions.

Significance. If the non-asymptotic bound holds after properly handling data-dependent pruning, the work provides a concrete finite-sample characterization of elimination benefits in active testing, which is relevant for safety-critical sequential decision problems. The observation that gains appear only in lower-order terms is precise and useful for understanding when elimination is worthwhile. The aggressiveness parameter offers a tunable knob, and the synthetic experiments provide initial validation. However, the modest scale of the improvement and the need for rigorous control of selection bias limit the immediate practical impact.

major comments (2)

[analysis of the non-asymptotic bound] The non-asymptotic upper bound on expected stopping time relies on tracking and concentration arguments applied to the reduced hypothesis set after pruning. Because elimination decisions are made on the same observation sequence used for the concentration inequalities, the surviving-set bounds are data-dependent. Standard sub-Gaussian or KL tail bounds do not automatically remain valid; an additional union-bound, martingale correction, or explicit accounting for the selection bias induced by the aggressiveness threshold appears necessary to preserve the exact fixed-confidence guarantee. The manuscript should clarify how this dependence is controlled in the derivation of the bound.
[derivation of the upper bound] The claim that the finite-sample gain appears only on the scale of the non-leading term is load-bearing for the paper's message about the value of elimination. This requires that the leading term (typically governed by the KL divergence or information ratio to the true hypothesis) remains unchanged while only the constants in the tracking error or concentration terms improve. The proof sketch should explicitly separate these terms and show that pruning does not alter the leading coefficient.

minor comments (2)

[algorithm description] The aggressiveness parameter is introduced to modulate the elimination threshold, but its precise effect on the confidence guarantee (e.g., whether it introduces a multiplicative factor or additive slack) should be stated more explicitly, perhaps with a dedicated proposition.
[experiments] The experimental section uses synthetic Gaussian instances; it would be helpful to report the exact parameter settings (means, variances, number of hypotheses) and to include a plot or table showing the empirical stopping time versus the theoretical bound for different aggressiveness values.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. The comments help clarify the presentation of the non-asymptotic analysis. We address each major comment below and will incorporate the suggested clarifications in the revised version.

read point-by-point responses

Referee: [analysis of the non-asymptotic bound] The non-asymptotic upper bound on expected stopping time relies on tracking and concentration arguments applied to the reduced hypothesis set after pruning. Because elimination decisions are made on the same observation sequence used for the concentration inequalities, the surviving-set bounds are data-dependent. Standard sub-Gaussian or KL tail bounds do not automatically remain valid; an additional union-bound, martingale correction, or explicit accounting for the selection bias induced by the aggressiveness threshold appears necessary to preserve the exact fixed-confidence guarantee. The manuscript should clarify how this dependence is controlled in the derivation of the bound.

Authors: We appreciate the referee's emphasis on rigorously handling the data dependence induced by pruning. In the proof of the main non-asymptotic bound (Theorem 1), we control this via a combination of a union bound over possible elimination epochs and a martingale concentration inequality applied to the stopped process. The aggressiveness parameter is chosen so that each elimination decision occurs only after the empirical log-likelihood ratio exceeds a threshold that permits a peeling argument to bound the selection bias; this ensures the fixed-confidence guarantee is preserved exactly as stated. We will add a dedicated remark in Section 3 and an expanded appendix subsection that spells out this martingale correction and the union-bound accounting. revision: yes
Referee: [derivation of the upper bound] The claim that the finite-sample gain appears only on the scale of the non-leading term is load-bearing for the paper's message about the value of elimination. This requires that the leading term (typically governed by the KL divergence or information ratio to the true hypothesis) remains unchanged while only the constants in the tracking error or concentration terms improve. The proof sketch should explicitly separate these terms and show that pruning does not alter the leading coefficient.

Authors: We agree that an explicit decomposition is essential to substantiate the claim. In our derivation, the leading term of the upper bound on expected stopping time is governed by the information ratio (or sum of KL divergences) between the true hypothesis and the worst-case alternative, evaluated on the full initial hypothesis set; this coefficient is invariant to subsequent pruning because the initial tracking phase and the asymptotic rate are determined before any eliminations occur. Elimination only improves the multiplicative constants in the tracking-error and concentration terms that appear in the lower-order additive terms. We will revise the proof sketch in Section 4 to include an explicit separation of the bound into its leading and non-leading components, with a short lemma showing invariance of the leading coefficient to the aggressiveness parameter. revision: yes

Circularity Check

0 steps flagged

No circularity: non-asymptotic bound derived from standard tracking and concentration inequalities on pruned sets

full rationale

The paper's central result is a non-asymptotic upper bound on expected stopping time for an elimination-augmented Track-and-Stop algorithm. The derivation applies standard concentration and tracking arguments to the reduced hypothesis set after progressive pruning of champion-specific active-opponent sets. The finite-sample gain appears only in non-leading terms via tighter constants, and the aggressiveness parameter simply modulates the elimination threshold. No step reduces by construction to a fitted input, self-definition, or load-bearing self-citation chain; the analysis remains self-contained against external sub-Gaussian/KL tail bounds and does not invoke uniqueness theorems or ansatzes from prior author work as the sole justification. This matches the default expectation for papers whose core claims rest on established martingale or concentration tools.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The analysis rests on standard concentration inequalities and tracking arguments for sequential hypothesis testing; the aggressiveness parameter is introduced as a tunable element without independent empirical grounding beyond the synthetic experiments.

free parameters (1)

aggressiveness parameter
Introduced to modulate the trade-off between faster elimination and weaker confidence guarantee; its value is chosen by the user rather than derived.

axioms (1)

domain assumption Fixed-confidence sequential hypothesis testing admits progressive elimination of alternatives while preserving overall guarantees via adjusted tracking sets.
Invoked when defining champion-specific active-opponent sets and reallocating sensing effort.

pith-pipeline@v0.9.0 · 5430 in / 1239 out tokens · 26415 ms · 2026-05-09T19:33:09.647280+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages

[1]

Wald, Sequential tests of statistical hypotheses.The Annals of Mathematical Statistics, vol

A. Wald, Sequential tests of statistical hypotheses.The Annals of Mathematical Statistics, vol. 16, no. 2, pp. 117–186, 1945

work page 1945
[2]

Armitage, Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis.Journal of the Royal Statistical Society

P. Armitage, Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis.Journal of the Royal Statistical Society. Series B (Methodological), vol. 12, no. 1, pp. 137–144, 1950

work page 1950
[3]

Teixeira, I

A. Teixeira, I. Shames, H. Sandberg, and K. H. Johansson, Distributed fault detection and isolation resilient to network model uncertainties. IEEE Transactions on Cybernetics, vol. 44, no. 11, pp. 2024–2037, 2014

work page 2024
[4]

N. A. Goodman, P. R. Venkata, and M. A. Neifeld, Adaptive waveform design and sequential hypothesis testing for target recognition with active sensors,IEEE Journal of Selected Topics in Signal Processing, vol. 1, no. 1, pp. 105–113, Jun. 2007

work page 2007
[5]

Cohen and Q

K. Cohen and Q. Zhao, Active hypothesis testing for anomaly detec- tion.IEEE Transactions on Information Theory, vol. 61, no. 3, pp. 1432–1450, 2015

work page 2015
[6]

Huang, K

B. Huang, K. Cohen, and Q. Zhao, Active anomaly detection in heterogeneous processes.IEEE Transactions on Information Theory, vol. 65, no. 4, pp. 2284–2301, 2019

work page 2019
[7]

Chernoff, Sequential design of experiments.The Annals of Math- ematical Statistics, vol

H. Chernoff, Sequential design of experiments.The Annals of Math- ematical Statistics, vol. 30, no. 3, pp. 755–770, 1959

work page 1959
[8]

Naghshvar and T

M. Naghshvar and T. Javidi, Active sequential hypothesis testing.The Annals of Statistics, vol. 41, no. 6, pp. 2703–2738, 2013

work page 2013
[9]

Nitinawarat, G

S. Nitinawarat, G. K. Atia, and V . V . Veeravalli, Controlled sensing for multihypothesis testing.IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2451–2464, 2013

work page 2013
[10]

G. R. Prabhu, S. Bhashyam, A. Gopalan, and R. Sundaresan, Se- quential multi-hypothesis testing in multi-armed bandit problems: An approach for asymptotic optimality.IEEE Transactions on Information Theory, vol. 68, no. 7, pp. 4790–4817, 2022

work page 2022
[11]

Kaufmann, O

E. Kaufmann, O. Capp ´e, and A. Garivier, On the complexity of best- arm identification in multi-armed bandit models.Journal of Machine Learning Research, vol. 17, no. 1, pp. 1–42, 2016

work page 2016
[12]

Garivier and E

A. Garivier and E. Kaufmann, Optimal best arm identification with fixed confidence. inProceedings of the 29th Conference on Learning Theory,JMLR Workshop and Conference Proceedings, vol. 49, pp. 1–30, 2016

work page 2016
[13]

Tirinzoni and R

A. Tirinzoni and R. Degenne, On elimination strategies for bandit fixed-confidence identification.Advances in Neural Information Pro- cessing Systems (NeurIPS), vol. 35, 2022

work page 2022
[14]

Vershinin, A

G. Vershinin, A. Cohen, and O. Gurewitz, Multi-stage active sequential hypothesis testing with clustered hypotheses. inProceedings of the IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, pp. 1–6, 2025

work page 2025
[15]

K. Gan, S. Jia, and A. Li, Greedy approximation algorithms for active sequential hypothesis testing.Advances in Neural Information Processing Systems (NeurIPS), vol. 34, pp. 5012–5024, 2021

work page 2021
[16]

Gurevich, K

A. Gurevich, K. Cohen, and Q. Zhao, Sequential anomaly detection under a nonlinear system cost.IEEE Transactions on Signal Process- ing, vol. 67, no. 14, pp. 3689–3703, 2019

work page 2019
[17]

T. M. Cover and J. A. Thomas,Elements of Information Theory. USA: Wiley-Interscience, 2006

work page 2006
[18]

Kaufmann and W

E. Kaufmann and W. M. Koolen, Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals.Journal of Machine Learning Research, vol. 22, no. 246, pp. 1–44, 2021. APPENDIX A. Experimental Design and Results We consider Gaussian observation models with unit vari- ance. For each actiona∈ Aand hypothesish∈ H, Ot |(A t =a, h ∗ ...

work page 2021
[19]

Two tables are provided: average stopping time in table I and error rate in table II

Results of Experiment I:δ-Confidence stopping-time comparison:We report the raw results across the three environments for varying confidence parameterδ. Two tables are provided: average stopping time in table I and error rate in table II. TABLE I AVERAGE STOPPING TIME FOR DIFFERENT ENVIRONMENTS ANDδ (EXPERIMENTI). Environment δ Greedy TaS StopElim FullEli...

work page
[20]

We record both the stopping time and the empirical terminal error rate and put the results into table III, IV, and V

Results of Experiment II: Relaxed-Confidence Time- Error Trade-off:In this experiment, we fixδ= 0.1and vary the elimination parameterα∈ {0.2,0.4,0.6,0.8,1.0}under the all three environments. We record both the stopping time and the empirical terminal error rate and put the results into table III, IV, and V. TABLE III STOPPING TIME FOR DIFFERENTα(EXPERIMEN...

work page 1959
[21]

Proof.Since ˆh(t)maximizes the likelihood,Pt i=1 logp Ai(Oi | ˆh(t))≥ Pt i=1 logp Ai(Oi |h ∗), which givesZ t(h∗, ˆh(t))≤0

Basic ingredients and proof inputs: Lemma 1(Pairwise domination of the MLE champion).If ˆh(t)̸=h ∗, then there existsg̸=h ∗ such thatZ t(h∗, g)≤0. Proof.Since ˆh(t)maximizes the likelihood,Pt i=1 logp Ai(Oi | ˆh(t))≥ Pt i=1 logp Ai(Oi |h ∗), which givesZ t(h∗, ˆh(t))≤0. Lemma 2(Forced exploration; Garivier & Kaufmann, Lemma 7 [12]).Under the C-Tracking se...

work page
[22]

In particular,D k ≥ Dk−1 for everyk= 1,

Basic geometric facts: Lemma 6(Set monotonicity).For everyS ′ ⊆S⊆ H \{h ∗} and everyw∈∆(A),f S′(w)≥f S(w). In particular,D k ≥ Dk−1 for everyk= 1, . . . , K−2. Proof.SinceS ′ ⊆S, the minimization definingf S′(w)is over a smaller set thanf S(w), sof S′(w)≥f S(w). Applying withS ′ =S k,S=S k−1,w=w (k−1) givesD k = fSk(w(k))≥f Sk(w(k−1))≥f Sk−1(w(k−1)) =D k−...

work page
[23]

There exist constants Bch, Cch >0such that Pr(Tch > u)≤B ch u e−Cch √u,∀u≥1

Champion stabilization: Lemma 9(Champion stabilization tail).RecallT ch := inf{s≥1 : ˆh(r) =h ∗ for allr≥s}. There exist constants Bch, Cch >0such that Pr(Tch > u)≤B ch u e−Cch √u,∀u≥1. Proof.Fixg̸=h ∗. Fora∈ A, let ρa,g := Z p pa(x|h ∗)p a(x|g)dx, c a,g :=−logρ a,g ≥0. By identifiability, for everyg̸=h ∗ there exists at least one asuch thatc a,g >0. Cond...

work page
[24]

IfT ch ≤T 1 ≤ · · · ≤T K−1, the champion stabilizes first, eliminations occur in order, and the argument is the most direct

work page
[25]

IfT r < Tch ≤T K−1 for somer, then one or more elim- inations occur before the champion has permanently stabilized. This is not a problem on the chain eventE: the active set still shrinks correctly, and the temporary mismatch caused by a wrong champion is captured by the champion-induced deficitΓ ch(t). The proof strategy is to show that for every suffici...

work page
[26]

Fixr∈ {1,

Information lower bounds and recursive surrogate cer- tificates: a) Stage-rdrift lower bound.: Proposition 4(Average-target lower bound before stager). Fixr∈ {1, . . . , K−1},t≥3. SupposeT r−1 ≤s < T r with mch(t)≤s≤t(by conventionT 0 := 0, so forr= 1the condition reduces tom ch(t)≤s < T 1). OnE ∩E ch t , fSr−1(¯u(s))≥Dr−1 − Πr−1+Dr−1mch(t) s , whereΠ r−1...

work page
[27]

We thus construct a deterministic certificate directly boundingT K−1

Eventual exhaustion certificate and expectation bound: Since the algorithm stops when the active-opponent set of the champion becomes empty, the stopping time coincides with the final elimination time:τ=T K−1 onE. We thus construct a deterministic certificate directly boundingT K−1. Definition 10(Deterministic eventual exhaustion certificate). Define Hσ(t...

work page
[28]

In particular, forα= 1, the leading term isL/D 0 (matching Theorem 1 in Section IV); forα <1, the leading term is αL/D0 (matching Proposition 3 in Section IV)

Explicit bounds and fixed-chain theorem: Theorem 3(Fixed-chain expectation bound).For everyσ∈ S, everyα∈(0,1], and everyδ∈(0,1), E[τ1 Eσ]≤Pr(E σ) αL D0 + b D0 log(e+L) +C 1,σ p Llog(e+L) +C 2,σ log2(e+L) + eC3,σ. In particular, forα= 1, the leading term isL/D 0 (matching Theorem 1 in Section IV); forα <1, the leading term is αL/D0 (matching Proposition 3 ...

work page
[29]

To pass from one chain to the whole correct- elimination event, it is therefore enough to sum over the disjoint chain events

Aggregation over chains:At this point, the fixed-chain analysis is complete: Theorem 3 already provides, for each admissible chainσ∈S, an upper bound on the contribution E[τ1 Eσ]. To pass from one chain to the whole correct- elimination event, it is therefore enough to sum over the disjoint chain events. This corollary is the appendix-level statement of P...

work page
[30]

guess-and-verify

Control of the complement term:In this section we bound Rcomp(δ) :=E[τ1 E ccorr], the contribution toE[τ]from sample paths on which the algorithm’s elimination sequence does not correspond to any valid chainσ∈S. SinceE c corr has no chain structure, the chain-specific tools of Section VI-B.5 no longer apply, and all constants below carry the subscript “co...

work page
[31]

Specializingα= 1recovers Theorem 1 of the main text; specializingα < 1recovers Corollary 1

Full expectation bound:We now combine the bound onE[τ1 Ecorr](Corollary 2, Section B.5) with the bound on the complementR comp(δ)(Proposition 10, Section B.6) to obtain the full expectation bound onE[τ]. Specializingα= 1recovers Theorem 1 of the main text; specializingα < 1recovers Corollary 1. Since Proposition 9 holds for all α∈(0,1]andδ∈(0,1), andR com...

work page

[1] [1]

Wald, Sequential tests of statistical hypotheses.The Annals of Mathematical Statistics, vol

A. Wald, Sequential tests of statistical hypotheses.The Annals of Mathematical Statistics, vol. 16, no. 2, pp. 117–186, 1945

work page 1945

[2] [2]

Armitage, Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis.Journal of the Royal Statistical Society

P. Armitage, Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis.Journal of the Royal Statistical Society. Series B (Methodological), vol. 12, no. 1, pp. 137–144, 1950

work page 1950

[3] [3]

Teixeira, I

A. Teixeira, I. Shames, H. Sandberg, and K. H. Johansson, Distributed fault detection and isolation resilient to network model uncertainties. IEEE Transactions on Cybernetics, vol. 44, no. 11, pp. 2024–2037, 2014

work page 2024

[4] [4]

N. A. Goodman, P. R. Venkata, and M. A. Neifeld, Adaptive waveform design and sequential hypothesis testing for target recognition with active sensors,IEEE Journal of Selected Topics in Signal Processing, vol. 1, no. 1, pp. 105–113, Jun. 2007

work page 2007

[5] [5]

Cohen and Q

K. Cohen and Q. Zhao, Active hypothesis testing for anomaly detec- tion.IEEE Transactions on Information Theory, vol. 61, no. 3, pp. 1432–1450, 2015

work page 2015

[6] [6]

Huang, K

B. Huang, K. Cohen, and Q. Zhao, Active anomaly detection in heterogeneous processes.IEEE Transactions on Information Theory, vol. 65, no. 4, pp. 2284–2301, 2019

work page 2019

[7] [7]

Chernoff, Sequential design of experiments.The Annals of Math- ematical Statistics, vol

H. Chernoff, Sequential design of experiments.The Annals of Math- ematical Statistics, vol. 30, no. 3, pp. 755–770, 1959

work page 1959

[8] [8]

Naghshvar and T

M. Naghshvar and T. Javidi, Active sequential hypothesis testing.The Annals of Statistics, vol. 41, no. 6, pp. 2703–2738, 2013

work page 2013

[9] [9]

Nitinawarat, G

S. Nitinawarat, G. K. Atia, and V . V . Veeravalli, Controlled sensing for multihypothesis testing.IEEE Transactions on Automatic Control, vol. 58, no. 10, pp. 2451–2464, 2013

work page 2013

[10] [10]

G. R. Prabhu, S. Bhashyam, A. Gopalan, and R. Sundaresan, Se- quential multi-hypothesis testing in multi-armed bandit problems: An approach for asymptotic optimality.IEEE Transactions on Information Theory, vol. 68, no. 7, pp. 4790–4817, 2022

work page 2022

[11] [11]

Kaufmann, O

E. Kaufmann, O. Capp ´e, and A. Garivier, On the complexity of best- arm identification in multi-armed bandit models.Journal of Machine Learning Research, vol. 17, no. 1, pp. 1–42, 2016

work page 2016

[12] [12]

Garivier and E

A. Garivier and E. Kaufmann, Optimal best arm identification with fixed confidence. inProceedings of the 29th Conference on Learning Theory,JMLR Workshop and Conference Proceedings, vol. 49, pp. 1–30, 2016

work page 2016

[13] [13]

Tirinzoni and R

A. Tirinzoni and R. Degenne, On elimination strategies for bandit fixed-confidence identification.Advances in Neural Information Pro- cessing Systems (NeurIPS), vol. 35, 2022

work page 2022

[14] [14]

Vershinin, A

G. Vershinin, A. Cohen, and O. Gurewitz, Multi-stage active sequential hypothesis testing with clustered hypotheses. inProceedings of the IEEE International Symposium on Information Theory (ISIT), Ann Arbor, MI, USA, pp. 1–6, 2025

work page 2025

[15] [15]

K. Gan, S. Jia, and A. Li, Greedy approximation algorithms for active sequential hypothesis testing.Advances in Neural Information Processing Systems (NeurIPS), vol. 34, pp. 5012–5024, 2021

work page 2021

[16] [16]

Gurevich, K

A. Gurevich, K. Cohen, and Q. Zhao, Sequential anomaly detection under a nonlinear system cost.IEEE Transactions on Signal Process- ing, vol. 67, no. 14, pp. 3689–3703, 2019

work page 2019

[17] [17]

T. M. Cover and J. A. Thomas,Elements of Information Theory. USA: Wiley-Interscience, 2006

work page 2006

[18] [18]

Kaufmann and W

E. Kaufmann and W. M. Koolen, Mixture Martingales Revisited with Applications to Sequential Tests and Confidence Intervals.Journal of Machine Learning Research, vol. 22, no. 246, pp. 1–44, 2021. APPENDIX A. Experimental Design and Results We consider Gaussian observation models with unit vari- ance. For each actiona∈ Aand hypothesish∈ H, Ot |(A t =a, h ∗ ...

work page 2021

[19] [19]

Two tables are provided: average stopping time in table I and error rate in table II

Results of Experiment I:δ-Confidence stopping-time comparison:We report the raw results across the three environments for varying confidence parameterδ. Two tables are provided: average stopping time in table I and error rate in table II. TABLE I AVERAGE STOPPING TIME FOR DIFFERENT ENVIRONMENTS ANDδ (EXPERIMENTI). Environment δ Greedy TaS StopElim FullEli...

work page

[20] [20]

We record both the stopping time and the empirical terminal error rate and put the results into table III, IV, and V

Results of Experiment II: Relaxed-Confidence Time- Error Trade-off:In this experiment, we fixδ= 0.1and vary the elimination parameterα∈ {0.2,0.4,0.6,0.8,1.0}under the all three environments. We record both the stopping time and the empirical terminal error rate and put the results into table III, IV, and V. TABLE III STOPPING TIME FOR DIFFERENTα(EXPERIMEN...

work page 1959

[21] [21]

Proof.Since ˆh(t)maximizes the likelihood,Pt i=1 logp Ai(Oi | ˆh(t))≥ Pt i=1 logp Ai(Oi |h ∗), which givesZ t(h∗, ˆh(t))≤0

Basic ingredients and proof inputs: Lemma 1(Pairwise domination of the MLE champion).If ˆh(t)̸=h ∗, then there existsg̸=h ∗ such thatZ t(h∗, g)≤0. Proof.Since ˆh(t)maximizes the likelihood,Pt i=1 logp Ai(Oi | ˆh(t))≥ Pt i=1 logp Ai(Oi |h ∗), which givesZ t(h∗, ˆh(t))≤0. Lemma 2(Forced exploration; Garivier & Kaufmann, Lemma 7 [12]).Under the C-Tracking se...

work page

[22] [22]

In particular,D k ≥ Dk−1 for everyk= 1,

Basic geometric facts: Lemma 6(Set monotonicity).For everyS ′ ⊆S⊆ H \{h ∗} and everyw∈∆(A),f S′(w)≥f S(w). In particular,D k ≥ Dk−1 for everyk= 1, . . . , K−2. Proof.SinceS ′ ⊆S, the minimization definingf S′(w)is over a smaller set thanf S(w), sof S′(w)≥f S(w). Applying withS ′ =S k,S=S k−1,w=w (k−1) givesD k = fSk(w(k))≥f Sk(w(k−1))≥f Sk−1(w(k−1)) =D k−...

work page

[23] [23]

There exist constants Bch, Cch >0such that Pr(Tch > u)≤B ch u e−Cch √u,∀u≥1

Champion stabilization: Lemma 9(Champion stabilization tail).RecallT ch := inf{s≥1 : ˆh(r) =h ∗ for allr≥s}. There exist constants Bch, Cch >0such that Pr(Tch > u)≤B ch u e−Cch √u,∀u≥1. Proof.Fixg̸=h ∗. Fora∈ A, let ρa,g := Z p pa(x|h ∗)p a(x|g)dx, c a,g :=−logρ a,g ≥0. By identifiability, for everyg̸=h ∗ there exists at least one asuch thatc a,g >0. Cond...

work page

[24] [24]

IfT ch ≤T 1 ≤ · · · ≤T K−1, the champion stabilizes first, eliminations occur in order, and the argument is the most direct

work page

[25] [25]

IfT r < Tch ≤T K−1 for somer, then one or more elim- inations occur before the champion has permanently stabilized. This is not a problem on the chain eventE: the active set still shrinks correctly, and the temporary mismatch caused by a wrong champion is captured by the champion-induced deficitΓ ch(t). The proof strategy is to show that for every suffici...

work page

[26] [26]

Fixr∈ {1,

Information lower bounds and recursive surrogate cer- tificates: a) Stage-rdrift lower bound.: Proposition 4(Average-target lower bound before stager). Fixr∈ {1, . . . , K−1},t≥3. SupposeT r−1 ≤s < T r with mch(t)≤s≤t(by conventionT 0 := 0, so forr= 1the condition reduces tom ch(t)≤s < T 1). OnE ∩E ch t , fSr−1(¯u(s))≥Dr−1 − Πr−1+Dr−1mch(t) s , whereΠ r−1...

work page

[27] [27]

We thus construct a deterministic certificate directly boundingT K−1

Eventual exhaustion certificate and expectation bound: Since the algorithm stops when the active-opponent set of the champion becomes empty, the stopping time coincides with the final elimination time:τ=T K−1 onE. We thus construct a deterministic certificate directly boundingT K−1. Definition 10(Deterministic eventual exhaustion certificate). Define Hσ(t...

work page

[28] [28]

In particular, forα= 1, the leading term isL/D 0 (matching Theorem 1 in Section IV); forα <1, the leading term is αL/D0 (matching Proposition 3 in Section IV)

Explicit bounds and fixed-chain theorem: Theorem 3(Fixed-chain expectation bound).For everyσ∈ S, everyα∈(0,1], and everyδ∈(0,1), E[τ1 Eσ]≤Pr(E σ) αL D0 + b D0 log(e+L) +C 1,σ p Llog(e+L) +C 2,σ log2(e+L) + eC3,σ. In particular, forα= 1, the leading term isL/D 0 (matching Theorem 1 in Section IV); forα <1, the leading term is αL/D0 (matching Proposition 3 ...

work page

[29] [29]

To pass from one chain to the whole correct- elimination event, it is therefore enough to sum over the disjoint chain events

Aggregation over chains:At this point, the fixed-chain analysis is complete: Theorem 3 already provides, for each admissible chainσ∈S, an upper bound on the contribution E[τ1 Eσ]. To pass from one chain to the whole correct- elimination event, it is therefore enough to sum over the disjoint chain events. This corollary is the appendix-level statement of P...

work page

[30] [30]

guess-and-verify

Control of the complement term:In this section we bound Rcomp(δ) :=E[τ1 E ccorr], the contribution toE[τ]from sample paths on which the algorithm’s elimination sequence does not correspond to any valid chainσ∈S. SinceE c corr has no chain structure, the chain-specific tools of Section VI-B.5 no longer apply, and all constants below carry the subscript “co...

work page

[31] [31]

Specializingα= 1recovers Theorem 1 of the main text; specializingα < 1recovers Corollary 1

Full expectation bound:We now combine the bound onE[τ1 Ecorr](Corollary 2, Section B.5) with the bound on the complementR comp(δ)(Proposition 10, Section B.6) to obtain the full expectation bound onE[τ]. Specializingα= 1recovers Theorem 1 of the main text; specializingα < 1recovers Corollary 1. Since Proposition 9 holds for all α∈(0,1]andδ∈(0,1), andR com...

work page