The multiply iterated law of the iterated logarithm: game-theoretic foundations of sequential detection boundaries

Akshay Balsubramani

arxiv: 2606.28324 · v1 · pith:2AEO5APXnew · submitted 2026-06-26 · 🧮 math.ST · stat.TH


Pith reviewed 2026-06-29 01:39 UTC · model grok-4.3


keywords: law of the iterated logarithm · sequential detection · anytime-valid inference · e-processes · game-theoretic probability · minimax boundaries · Jeffreys prior · Erdős-Kolmogorov test


The pith

The law of the iterated logarithm is the minimax boundary of the sequential detection game, achieved by the forced equalizer prior.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper recasts the standard recipe for anytime-valid confidence sequences and e-processes as a two-player game in which the Learner mixes exponential test statistics over a prior while Nature adaptively generates a mean-zero score process whose difficulty is priced by a cumulant-generating-function charge. It shows that the law of the iterated logarithm arises as the exact minimax boundary of this game rather than arbitrary slack. The optimal prior is the forced equalizer strategy—the unique law that makes every boundary-crossing time equally costly for Nature—and it produces the sharp first iterated-log correction with coefficient 3/2. This perspective identifies the equalizer as the Jeffreys prior on the scale-of-scales via the Erdős-Kolmogorov integral test and treats several existing constructions as instances of the same principle.
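The recipe described above (average exponential test statistics over a prior, then apply Ville's inequality) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's construction: the increments are Rademacher, the cumulant-generating-function charge is taken to be `log cosh`, and the prior is a hypothetical uniform grid on a few tilting scales rather than the equalizer prior. The names `mixture_wealth`, `crossing_fraction`, `LAMBDAS`, and `WEIGHTS` are made up for this sketch.

```python
import math
import random


def log_cosh(lam):
    # CGF of a Rademacher (+1/-1) increment: log E[exp(lam * X)] = log cosh(lam).
    return math.log(math.cosh(lam))


def mixture_wealth(increments, lambdas, weights):
    """Return the mixture wealth after each increment.

    W_t = sum_k weights[k] * exp(lambdas[k] * S_t - t * log_cosh(lambdas[k])),
    where S_t is the running sum of the increments.
    """
    s = 0.0
    path = []
    for t, x in enumerate(increments, start=1):
        s += x
        w = sum(p * math.exp(lam * s - t * log_cosh(lam))
                for lam, p in zip(lambdas, weights))
        path.append(w)
    return path


def crossing_fraction(n_paths, horizon, lambdas, weights, alpha, seed=0):
    """Fraction of simulated Rademacher walks whose wealth ever reaches 1/alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_paths):
        xs = [rng.choice((-1, 1)) for _ in range(horizon)]
        if max(mixture_wealth(xs, lambdas, weights)) >= 1.0 / alpha:
            hits += 1
    return hits / n_paths


# Hypothetical prior (an assumption of this sketch): uniform weights on a
# geometric grid of tilting scales. It is NOT the paper's equalizer prior.
LAMBDAS = [2.0 ** -k for k in range(1, 7)]
WEIGHTS = [1.0 / len(LAMBDAS)] * len(LAMBDAS)

if __name__ == "__main__":
    frac = crossing_fraction(200, 500, LAMBDAS, WEIGHTS, alpha=0.05)
    print(f"fraction of paths with wealth >= 1/alpha: {frac:.3f}")
```

Under these assumptions the wealth is a nonnegative martingale started at 1, so Ville's inequality bounds the probability of ever reaching 1/alpha by alpha. The choice of prior changes how wide the implied boundary is, which is the design question the paper addresses.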

Core claim

In the game-theoretic formulation of sequential detection, the law of the iterated logarithm is the minimax boundary. The optimal prior is the forced equalizer strategy, the unique law that makes every boundary-crossing time equally costly for Nature, and it yields the sharp first iterated-log correction in closed form, with coefficient 3/2 = 1 + 1/2. In the log-log scale the equalizer is exactly the Jeffreys prior on the scale-of-scales, selected by the Erdős-Kolmogorov integral test. The two-stage finite-time LIL proof, the Howard-Ramdas mixture and stitching constructions, and betting confidence sequences all read as instances of this equalizer principle.

What carries the argument

The forced equalizer strategy, the unique prior that equalizes Nature's crossing costs at every time via the pathwise Gibbs-variational identity.
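The paper's own identity is not reproduced on this page. As a hedged illustration of what a pathwise Gibbs-variational identity looks like, the standard Gibbs (Donsker-Varadhan) variational formula can be applied to a mixture wealth. The symbols below are notational assumptions of this sketch: π is the prior on the tilting scale λ, S_t the cumulative score, C_t(λ) the cumulative cumulant-generating-function charge, and q ranges over laws absolutely continuous with respect to π.

```latex
W_t \;=\; \int e^{\lambda S_t - C_t(\lambda)}\,\pi(d\lambda),
\qquad
\log W_t \;=\; \sup_{q \ll \pi}
\left\{ \int \bigl(\lambda S_t - C_t(\lambda)\bigr)\,q(d\lambda)
        \;-\; \mathrm{KL}(q \,\|\, \pi) \right\},
\qquad
q^\star(d\lambda) \;\propto\; e^{\lambda S_t - C_t(\lambda)}\,\pi(d\lambda).
```

When W_t is finite, the formula holds for each fixed value of S_t, with the supremum attained at the Gibbs law q⋆. No expectation over paths is involved, which is the sense in which such an identity is pathwise. Whether this matches the paper's statement exactly cannot be confirmed from this page.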

If this is right

The LIL is the minimax boundary of the game, not arbitrary combinatorial slack.
The optimal prior yields the sharp first iterated-log correction with coefficient exactly 3/2.
The equalizer is the Jeffreys prior on the scale-of-scales, selected by the Erdős-Kolmogorov integral test.
Existing constructions such as Howard-Ramdas mixtures, stitching, and betting sequences are instances of the equalizer principle.
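As a worked check of the 3/2 threshold, the partial integral quoted in the Figure 11 caption can be evaluated in closed form. The sketch assumes only the integrand given in that caption; the change of variables that produces it is not shown on this page.

```latex
\int_1^T u^{-(c-1/2)}\,du \;=\;
\begin{cases}
\dfrac{T^{3/2-c}-1}{3/2-c}, & c \neq 3/2,\\[1.5ex]
\log T, & c = 3/2,
\end{cases}
\qquad
\lim_{T\to\infty}\int_1^T u^{-(c-1/2)}\,du \;=\;
\begin{cases}
\infty, & c \le 3/2,\\[0.5ex]
\dfrac{1}{c-3/2}, & c > 3/2.
\end{cases}
```

The integral therefore diverges for every c up to and including 3/2 and converges for every c above it, which is consistent with the divergent and plateauing curves that caption describes.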

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Iterating the equalizer construction could produce closed-form expressions for higher-order iterated-logarithm boundaries.
The pathwise variational identity may supply new bounds in other sequential problems that currently rely on Ville's inequality.
Finite-sample Monte Carlo checks could confirm whether the Erdős threshold appears at the location predicted by the equalizer prior.

Load-bearing premise

Nature's difficulty is priced exactly by a cumulant-generating-function charge and the Learner's mixture wealth obeys a single pathwise Gibbs-variational identity that holds along every realized path with no expectation operator.

What would settle it

A direct computation or simulation under the proposed equalizer prior showing that crossing times do not incur equal cost to Nature or that the leading coefficient of the first iterated-log term is not exactly 3/2.

Figures

Figures reproduced from arXiv: 2606.28324 by Akshay Balsubramani.

**Figure 1.** The 3/2 correction is the boundary separating frequent crossings from negligible ones at finite horizons. Ensemble trajectories of n = 2,000 synthetic Rademacher walks of length 2 × 10^5, plotted against the classical LIL rate, the c = 3/2 corrected boundary, the Balsubramani finite-time boundary (α = 0.05), and the Howard–Ramdas mixture boundary. Numerical crossing statistics in …

**Figure 2.** Left: normalized boundary widths b(t)/√t across five decades, including the new Kaufmann-Koolen curve. Right: empirical boundary-crossing fractions on n = 2,000 walks at t_max = 2 × 10^5 with BCa-bootstrap 95% CIs (1000 resamples). The dashed line marks the α = 0.05 budget. KK and Balsubramani sit well below budget on both increment families; the Howard-Ramdas in-house implementation is at 1.000 for both, …

**Figure 3.** Left: normalized boundary widths b(t)/√t across five decades at α = 0.05, including the new Orabona-Jun universal-portfolio curve. Right: LIL-rate tightness ratios b(t)/√(2t log log t) asymptote to finite constants for all four boundaries; OJ's asymptote (≈ 1.72) is slightly below Balsubramani's. Dashed line: √(2t log log t) rate. the threshold α at which the partial Erdős integral ∫^T (√(h(t))/t) e^{−h(t)} dt trans…

**Figure 4.** Higher-order Erdős threshold sweep. Left: J_v(α; V) vs. α at five values of V = log log log T. Threshold transition is sharp at α = 1 (theory). Right: convergence/divergence trajectories J_v(α; V) vs. V at fixed α; α < 1 diverges, α > 1 saturates at V_min^{1−α}/(α − 1). Colors follow the Okabe-Ito categorical palette.

**Figure 5.** Matrix-LIL conjectural prior at d = 2. Left: radial integrand 1/(r log^2(1/r)) on r ∈ [10^−6, e^−2] (log-log axes). Center: direction-isotropy diagnostic I(u)/Ī across 36 unit vectors u ∈ S^1 at LIL boundary; CV = 8.49 × 10^−16 (exact at machine precision). Right: empirical-V histogram on n = 1,000 2D Gaussian walks; vertical red line at predicted V = 2d = 4; the 95th and 99th percentiles of the empirical…

**Figure 6.** Truncation-endpoint family a ∈ {1, 2, 3, 4, 5}. Left: equalizer CDFs F⋆_a(λ) = a/log(1/λ) on (0, e^−a]. Right: equalizer products F⋆_a(λ) log(1/λ) ≡ a, flat at the per-a game value (dotted reference lines). Colors follow the Okabe-Ito categorical palette. Failure modes. Both pre-registered failure modes were checked. (i) Floating-point accumulation in the shell sum at large a: the naive partial-sum re…

**Figure 7.** Two-stage composition diagnostics. Left: Stage 1 wealth ratio Y_t / e^{λ_0^2 t/2} approaches the asymptotic equalizer constant 1/2 as t → ∞ (closed-form residual (1/2) e^{−2 λ_0^2 t}). Center: Stage 2 log-wealth log W_t^{(2)} on the un-inflated √(2t log log t) boundary (bounded; orange) and the inflated Balsubramani-2014 boundary (growing; vermilion). The inflation is the slack absorbed by the Ville inequality. Right:…

**Figure 8.** Real-data CS comparison on the california_medinc benchmark (n = 5000, α = 0.05). Left: CS widths versus sample size, showing the WSR ≤ Bals ≤ HR-in-house ordering across the entire trajectory. Right: CS-bound trajectories for Balsubramani and WSR with the running-mean true mean 0.298 shown for reference; all bounds remain consistent with the true mean. Colors follow the Okabe-Ito categorical palette. Failu…

**Figure 9.** Bernstein-CGF Erdős threshold sweep. Left: J_c(α; T) at T = 10^60 across c ∈ {0, 0.1, 0.5, 1.0} (viridis sequential palette by c). The four curves are visually indistinguishable: the Bernstein correction at the LIL-saddle is negligible. Vertical red dashed line: α = 3/2 (Gaussian). Right: empirical threshold midpoints vs. c, flat at 1.475 across the family; linear fit α_thr(c) = 1.475 + 0 · c. Colors follow …

**Figure 10.** The product F⋆(λ) log(1/λ) ≡ 2 is the algebraic identity that pins down the LIL game value, and it holds across the entire equalizer family. Top-left: equalizer density 2/(λ log^2(1/λ)) on (0, e^−2] (log-log). Top-right: equalizer CDF F⋆(λ) = 2/log(1/λ). Bottom-left: the equalizer-product is constant at the predicted value 2 to machine precision across five tax functions (β ∈ {0, 0.5, 1, 1.5, 2}), con…

**Figure 11.** Partial Erdős integrals locate the sharp first iterated-log threshold at c = 3/2. Plot of ∫_1^T u^{−(c−1/2)} du for c ∈ {0.5, 1.0, 1.25, 1.45, 1.5, 1.55, 1.75, 2.0, 2.5, 3.0} and truncation T ∈ [10^3, 10^15]. Values below c = 3/2 grow without bound as T → ∞ (divergent); values above c = 3/2 plateau (convergent), bracketing the threshold in (1.50, 1.55] in agreement with Corollary 12.1. Symbolic factorization.…

**Figure 12.** Finite-dimensional minimax coincidence center. Left panels: for each prior-cardinality W over a near-degenerate family (K = 8), the sorted reverse-KL values KL(p⋆ ‖ π_w): actively pooled priors (filled, vermillion) all sit at the common minimax value R⋆ (dashed line); inactive priors (open, blue) lie strictly below. Right panel: histogram of the identity residuals across all 24 cells in log10 units; all bu…

**Figure 13.** Pareto-tailed regime. Left: the truncated-CGF saddle-prefactor deviation from the Gaussian value, on a log scale, against log t along the LIL boundary, per Pareto tail index; the deviation collapses toward 0, so the truncated process recovers the 3/2 threshold. Right: empirical crossing fraction of n = 2,000 Pareto random walks against the boundary parameter α, per tail index; the dashed line marks α = 3…

**Figure 14.** Faithful Howard–Ramdas boundary. Left: crossing fractions across the boundary families (Rademacher); the faithful Howard–Ramdas row (vermillion) sits well within the α = 0.05 budget (dashed), replacing the 100%-crossing approximation; classical LIL crosses on ≈ 79% of paths. Wilson intervals shown. Right: the asymptotic ratio b_Bals(t)/b_HR(t) vs t on a log scale, declining toward ≈ 0.90; both boundaries …
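The equalizer formulas quoted in the Figure 10 caption (CDF 2/log(1/λ) and density 2/(λ log^2(1/λ)) on (0, e^−2]) can be checked numerically. The sketch below is a small consistency check under those quoted formulas only; the function names and the midpoint-rule integrator are assumptions of this example, not code from the paper.

```python
import math

A = 2.0  # endpoint parameter; the Figure 10 caption uses the value 2


def equalizer_cdf(lam, a=A):
    """CDF quoted in the caption: F*(lam) = a / log(1/lam) on (0, e^-a]."""
    return a / math.log(1.0 / lam)


def equalizer_density(lam, a=A):
    """Derivative of the CDF: a / (lam * log(1/lam)^2)."""
    return a / (lam * math.log(1.0 / lam) ** 2)


def integrate_density(lo, hi, a=A, n=20000):
    """Integrate the density over [lo, hi] with a midpoint rule.

    The substitution s = log(1/lam) turns the density into a / s^2, which is
    smooth on the integration range and avoids the steep behaviour near 0.
    """
    s_lo, s_hi = math.log(1.0 / hi), math.log(1.0 / lo)
    h = (s_hi - s_lo) / n
    return sum(a / (s_lo + (i + 0.5) * h) ** 2 for i in range(n)) * h


if __name__ == "__main__":
    for lam in (1e-12, 1e-6, 1e-3, math.exp(-2.0)):
        product = equalizer_cdf(lam) * math.log(1.0 / lam)
        print(f"lam={lam:.3e}  F*(lam) * log(1/lam) = {product:.12f}")
    lo, hi = 1e-6, math.exp(-2.0)
    print("integral of density:", integrate_density(lo, hi))
    print("CDF difference:     ", equalizer_cdf(hi) - equalizer_cdf(lo))
```

The product is constant by construction of the CDF, so this check confirms only that the quoted density and CDF agree with each other; it does not test the paper's claim that this law is the minimax prior.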

Original abstract

Anytime-valid confidence sequences and e-processes are built almost universally from one recipe: average exponential test statistics over a prior on the tilting scale, then invoke Ville's inequality on the resulting nonnegative supermartingale. The mixing prior sets the width of the detection boundary and is usually chosen by hand. We recast the recipe as a two-player game with information as currency. A Learner commits to the prior; Nature adaptively produces a mean-zero score process whose difficulty is priced by a cumulant-generating-function charge. The Learner's mixture wealth obeys a single pathwise Gibbs-variational identity that holds along every realized path with no expectation operator; Ville's inequality, the equalizer condition, the GROW characterization, and the saddlepoint formula are all specializations of it. Three messages organize the rest. First, the law of the iterated logarithm (LIL) is the minimax boundary of this sequential-detection game, not arbitrary combinatorial slack. Second, the optimal prior is not a design choice but the forced equalizer strategy -- the unique law that makes every boundary-crossing time equally costly for Nature -- and it yields the sharp first iterated-log correction in closed form, with coefficient 3/2 = 1 + 1/2 (one for the Erdős baseline, one half for the Laplace envelope around the saddle). Third, in the log-log scale chart the equalizer is exactly the Jeffreys prior on the scale-of-scales. The Erdős-Kolmogorov integral test is the criterion that selects it. The two-stage finite-time LIL proof, the Howard-Ramdas mixture and stitching constructions, and betting confidence sequences all read as instances of this equalizer principle. A companion empirical evaluation confirms the central identities and locates the Erdős threshold at the predicted value.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The paper recasts prior mixing for e-processes as a game governed by a pathwise identity, which yields the LIL as the minimax boundary and the 3/2 coefficient from a Jeffreys equalizer prior.


The paper's core move is to treat the standard mixture-over-prior recipe for e-processes as a two-player game in which Nature's difficulty is charged via cumulant-generating functions and the Learner's wealth satisfies a pathwise Gibbs-variational identity with no outer expectation. From this single identity the author derives that the law of the iterated logarithm is the minimax boundary, that the optimal prior is the unique equalizer strategy, and that this prior is Jeffreys on the scale-of-scales, selected by the Erdős-Kolmogorov test, producing the explicit 3/2 coefficient in the first iterated-log correction.

What is actually new is the explicit reduction of the mixing choice to a forced equalizer in the game, together with the closed-form link to the Erdős-Kolmogorov criterion and the unification of Howard-Ramdas mixtures, stitching, and betting sequences as instances of the same principle. The abstract states these derivations cleanly and the connection to classical integral tests is a concrete bridge that prior work on e-processes does not make.

The load-bearing step is the pathwise identity. If it holds only in expectation or needs an auxiliary measure on paths, the minimax and equalizer claims do not follow directly and the 3/2 coefficient becomes an interesting but non-forced calculation. The companion empirical check is referenced but supplies no setup details or quantitative results in the abstract, so it cannot yet be used to confirm the identities. These are the main soft spots; they are central rather than peripheral.

The work is aimed at people already working on anytime-valid inference and sequential analysis. A reader comfortable with Ville's inequality and mixture constructions will see the value in the game framing and the specific coefficient. It is coherent enough on its own terms to merit referee time, even though the derivations will need careful checking.

Referee Report

3 major / 2 minor

Summary. The paper recasts the standard mixture construction of anytime-valid confidence sequences and e-processes as a two-player sequential-detection game in which Nature produces a mean-zero score process priced by a cumulant-generating-function charge and the Learner commits to a mixing prior. It asserts that the resulting mixture wealth satisfies a single pathwise Gibbs-variational identity that holds on every realized path without an expectation operator. From this identity the author derives that the law of the iterated logarithm is the minimax boundary of the game, that the optimal prior is the forced equalizer strategy yielding a closed-form first iterated-log correction of coefficient 3/2, and that this prior coincides with the Jeffreys prior on the scale-of-scales selected by the Erdős-Kolmogorov integral test. The two-stage LIL proof, Howard-Ramdas mixtures, and related constructions are presented as instances of the same equalizer principle, with a companion empirical study offered in support.

Significance. If the pathwise identity is rigorously established and the subsequent derivations are free of hidden measure-theoretic assumptions, the work supplies a game-theoretic justification that unifies several existing constructions of sequential boundaries and supplies an explicit, non-arbitrary choice of mixing prior. The explicit 3/2 coefficient and the identification with Jeffreys-on-scale-of-scales would constitute a concrete, falsifiable prediction for the location of the Erdős threshold.

major comments (3)

[paragraph beginning 'We recast the recipe as a two-player game'] The central claim that the Learner's mixture wealth obeys a pathwise Gibbs-variational identity holding along every realized path with no expectation operator is load-bearing for every subsequent minimax and equalizer statement. The manuscript must supply an explicit derivation (or reference to a numbered equation) showing that the identity is obtained without introducing an auxiliary probability measure on paths; otherwise the forced-equalizer condition and the saddlepoint formula do not follow pathwise.
[section on the equalizer strategy] The assertion that the optimal prior yields the sharp first iterated-log correction with coefficient exactly 3/2 = 1 + 1/2 must be accompanied by the explicit saddlepoint calculation that isolates the Erdős baseline term and the Laplace-envelope term; without this calculation the coefficient cannot be verified to arise solely from the game value rather than from auxiliary combinatorial arguments.
[abstract and companion empirical evaluation] The manuscript states that the companion empirical evaluation 'confirms the central identities and locates the Erdős threshold at the predicted value,' yet no tables, figures, or simulation protocol are described. Because the evaluation is offered as corroboration of the pathwise claims, its absence prevents assessment of whether the numerical results actually test the pathwise (rather than in-expectation) version of the identity.

minor comments (2)

Notation for the cumulant-generating-function charge should be introduced with an explicit equation number on first use to avoid ambiguity when the same symbol appears in the variational identity.
The LaTeX rendering of 'Erd\H{o}s' is correct, but the manuscript should name the test consistently (Erdős test vs. Erdős-Kolmogorov test) throughout the text and references.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and the detailed comments, which help us strengthen the presentation of the pathwise arguments. We respond to each major comment below.


Referee: The paragraph beginning 'We recast the recipe as a two-player game': the central claim that the Learner's mixture wealth obeys a pathwise Gibbs-variational identity holding along every realized path with no expectation operator is load-bearing for every subsequent minimax and equalizer statement. The manuscript must supply an explicit derivation (or reference to a numbered equation) showing that the identity is obtained without introducing an auxiliary probability measure on paths; otherwise the forced-equalizer condition and the saddlepoint formula do not follow pathwise.

Authors: The pathwise Gibbs-variational identity is derived directly in Section 3 from the definition of the mixture as the integral over the prior of the exponential of the cumulative score minus the cumulant charge. This substitution holds pathwise for any realized sequence of scores, without reference to any probability measure on the path space. The identity is stated as Equation (3.4). We will expand the derivation with intermediate steps to emphasize the absence of any measure-theoretic construction.

revision: yes

Referee: The section deriving the equalizer strategy and the 3/2 coefficient: the assertion that the optimal prior yields the sharp first iterated-log correction with coefficient exactly 3/2 = 1 + 1/2 must be accompanied by the explicit saddlepoint calculation that isolates the Erdős baseline term and the Laplace-envelope term; without this calculation the coefficient cannot be verified to arise solely from the game value rather than from auxiliary combinatorial arguments.

Authors: The explicit saddlepoint calculation is given in the proof of Theorem 4.2. It proceeds by equating the game value at the Erdős boundary (the integral test term) with the Laplace approximation of the mixture integral, yielding the additional factor of 1/2 from the Gaussian curvature at the saddle. We agree that the steps can be presented more explicitly and will include the full calculation in the revised version, possibly as a dedicated subsection.

revision: yes

Referee: The companion empirical evaluation referenced in the abstract: the manuscript states that it 'confirms the central identities and locates the Erdős threshold at the predicted value,' yet no tables, figures, or simulation protocol are described. Because the evaluation is offered as corroboration of the pathwise claims, its absence prevents assessment of whether the numerical results actually test the pathwise (rather than in-expectation) version of the identity.

Authors: Section 6 contains the description of the empirical study, including the protocol for simulating paths and verifying the identities. To make the pathwise nature explicit, we will add figures illustrating the behavior on individual paths and tables reporting the observed thresholds. This will clarify that the simulations are designed to check the deterministic identity.

revision: partial

Circularity Check

0 steps flagged

No circularity; derivation self-contained from game and pathwise identity


The paper defines a two-player game with Nature's difficulty priced by a cumulant-generating-function charge, then states that the Learner's mixture wealth obeys a pathwise Gibbs-variational identity holding along every realized path with no expectation operator. Ville's inequality, the equalizer condition, GROW, and the saddlepoint formula are presented as direct specializations of this identity. The LIL is then shown to be the minimax boundary, the optimal prior is the forced equalizer strategy (unique law making every boundary-crossing time equally costly), and the 3/2 coefficient and Jeffreys-on-scale-of-scales follow as consequences. None of these steps reduce the target LIL boundary or coefficient to a previously fitted constant, a self-citation chain, or an input by construction; the equalizer is derived from the game definition rather than presupposing the LIL result. The derivation is therefore self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on one central domain assumption (the pathwise Gibbs-variational identity) and introduces the Learner-Nature game as its main modeling device; no free parameters are fitted because the prior is derived from the equalizer condition.

axioms (1)

domain assumption: The mixture wealth obeys a single pathwise Gibbs-variational identity that holds along every realized path with no expectation operator.
Invoked immediately after the game is defined to recover Ville's inequality and all subsequent results.

invented entities (1)

Learner-Nature sequential-detection game with cumulant-generating-function charge · no independent evidence
purpose: To recast the mixing-prior recipe as a zero-sum game whose minimax solution yields the LIL boundary.
New modeling construct introduced in the abstract; no independent evidence outside the framework is supplied.



Reference graph

Works this paper leans on

33 extracted references · 18 canonical work pages

[1]

Sharp uniform martingale concentration: Bounds and applications

Akshay Balsubramani. Sharp uniform martingale concentration: Bounds and applications . PhD thesis, University of California, San Diego, 2014

2014
[2]

Sequential nonparametric testing with the law of the iterated logarithm

Akshay Balsubramani and Aaditya Ramdas. Sequential nonparametric testing with the law of the iterated logarithm. In Proceedings of the 32nd Conference on Uncertainty in Artificial Intelligence (UAI) , pages 42–51, 2016

2016
[3]

Sam Bowyer, Laurence Aitchison, and Desi R. Ivanova. Position: Don’t use the CLT in LLM evals with fewer than a few hundred datapoints. In Proceedings of the 42nd International Conference on Machine Learning , volume 267 of Proceedings of Machine Learning Research, pages 81143–81184. PMLR, 2025. doi: 10.48550/arXiv.2503.01747. URL https://proceedings.mlr....

work page doi:10.48550/arxiv.2503.01747 2025
[4]

Combining evidence across filtrations, 2024

Yo Joong Choe and Aaditya Ramdas. Combining evidence across filtrations, 2024. accepted at J. R. Stat. Soc. Ser. B. Title in original wiki bibvac stub was paraphrased; canonical title above

2024
[5]

D. A. Darling and Herbert Robbins. Iterated logarithm inequalities. Proceedings of the National Academy of Sciences , 57(5):1188–1192, 1967. doi: 10.1073/pnas.57.5.1188

work page doi:10.1073/pnas.57.5.1188 1967
[6]

de la Peña, Michael J

Victor H. de la Peña, Michael J. Klass, and Tze Leung Lai. Self-normalized processes: exponential inequalities, moment bounds and iterated logarithm laws. The Annals of Probability , 32(3A):1902–1933, 2004. doi: 10.1214/ 009117904000000397

1902
[7]

P. Erdős. On the law of the iterated logarithm. Annals of Mathematics, 43(3):419–436, 1942

1942
[8]

The law of the iterated logarithm for identically distributed random variables.Annals of Mathematics, 47(4):631–638, 1946

William Feller. The law of the iterated logarithm for identically distributed random variables.Annals of Mathematics, 47(4):631–638, 1946

1946
[9]

Foundations of Quantization for Probability Distributions, volume 1730 of Lecture Notes in Mathematics

Siegfried Graf and Harald Luschgy. Foundations of Quantization for Probability Distributions, volume 1730 of Lecture Notes in Mathematics. Springer, Berlin, Heidelberg, 2000. doi: 10.1007/BFb0103945. 64

work page doi:10.1007/bfb0103945 2000
[10]

Judith ter Schure and Peter Grünwald

Peter Grünwald, Rianne de Heide, and Wouter Koolen. Safe testing. Journal of the Royal Statistical Society Series B: Statistical Methodology, 86(5):1091–1128, 2024. doi: 10.1093/jrsssb/qkae011. arXiv:1906.07801. Duplicate cite-key of ‘heide2021’; both retained for backward compatibility with body citation sites established in earlier draft state

work page doi:10.1093/jrsssb/qkae011 2024
[11]

On the law of the iterated logarithm

Philip Hartman and Aurel Wintner. On the law of the iterated logarithm. American Journal of Mathematics , 63(1): 169–176, 1941. doi: 10.2307/2371826

work page doi:10.2307/2371826 1941
[12]

URLhttps://doi.org/10.1214/18-PS321

Steven R. Howard, Aaditya Ramdas, Jon McAuliffe, and Jasjeet Sekhon. Time-uniform Chernoff bounds via nonneg- ative supermartingales. Probability Surveys, 17:257–317, 2020. doi: 10.1214/18-PS321

work page doi:10.1214/18-ps321 2020
[13]

Howard, Aaditya Ramdas, Jon McAuliffe, and Jasjeet Sekhon

Steven R. Howard, Aaditya Ramdas, Jon McAuliffe, and Jasjeet Sekhon. Time-uniform, nonparametric, nonasymp- totic confidence sequences. The Annals of Statistics, 49(2):1055–1080, 2021. doi: 10.1214/20-AOS2002

work page doi:10.1214/20-aos2002 2021
[14]

Parameter-free online convex optimization with sub-exponential noise

Kwang-Sung Jun and Francesco Orabona. Parameter-free online convex optimization with sub-exponential noise. In Proceedings of the 32nd Conference on Learning Theory (COLT), volume 99 ofProceedings of Machine Learning Research, pages 1802–1823, 2019

2019
[15]

Emilie Kaufmann and Wouter M. Koolen. Mixture martingales revisited with applications to sequential tests and confidence intervals. Journal of Machine Learning Research, 22(246):1–44, 2021

2021
[16]

Khinchin

A. Khinchin. Über einen satz der wahrscheinlichkeitsrechnung. Fundamenta Mathematicae, 6:9–20, 1924. biblio- graphic record; original article published in Fundamenta Mathematicae volume 6 (1924)

1924
[17]

A. N. Kolmogorov. Über das gesetz des iterierten logarithmus. Mathematische Annalen, 101:126–135, 1929. doi: 10.1007/BF01454828. digital archive at https://gdz.sub.uni-goettingen.de/id/PPN235181684_0101 (Mathematische Annalen vol. 101)

work page doi:10.1007/bf01454828 1929
[18]

An approximation of partial sums of independent RV’s, and the sample DF

János Komlós, Péter Major, and Gábor Tusnády. An approximation of partial sums of independent RV’s, and the sample DF. I. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete , 32(1):111–131, 1975. doi: 10.1007/ BF00533093

1975
[19]

On confidence sequences

Tze Leung Lai. On confidence sequences. The Annals of Statistics, 4(2):265–280, 1976. doi: 10.1214/aos/1176343406

work page doi:10.1214/aos/1176343406 1976
[20]

Consol: Sequential prob- ability ratio testing to find consistent llm reasoning paths efficiently, 2025

Jaeyeon Lee, Guantong Qi, Matthew Brady Neeley, Zhandong Liu, and Hyun-Hwan Jeong. Consol: Sequential prob- ability ratio testing to find consistent llm reasoning paths efficiently, 2025. arXiv:2503.17587

arXiv 2025
[21]

Martingale methods for sequential estimation of convex functionals and diver- gences

Tudor Manole and Aaditya Ramdas. Martingale methods for sequential estimation of convex functionals and diver- gences. IEEE Transactions on Information Theory, 69(7):4641–4658, 2023. doi: 10.1109/TIT.2023.3250099

work page doi:10.1109/tit.2023.3250099 2023
[22]

Tight concentrations and confidence sequences from the regret of univer- sal portfolio

Francesco Orabona and Kwang-Sung Jun. Tight concentrations and confidence sequences from the regret of univer- sal portfolio. IEEE Transactions on Information Theory, 70(1):436–455, 2024. doi: 10.1109/TIT.2023.3330187

work page doi:10.1109/tit.2023.3330187 2024
[23]

Likelihood, replicability and Robbins’ confidence sequences

Luigi Pace and Alessandra Salvan. Likelihood, replicability and Robbins’ confidence sequences. International Statis- tical Review, 88(3):599–615, 2020. doi: 10.1111/insr.12355

work page doi:10.1111/insr.12355 2020
[24]

Game-Theoretic Statistics and Safe Anytime-Valid Inference.Statistical Science, 38(4):576 – 601, 2023

Aaditya Ramdas, Peter Grünwald, Vladimir Vovk, and Glenn Shafer. Game-theoretic statistics and safe anytime-valid inference. Statistical Science, 38(4):576–601, 2023. doi: 10.1214/23-STS894

work page doi:10.1214/23-sts894 2023
[25]

Boundary crossing probabilities for the Wiener process and sample sums

Herbert Robbins and David Siegmund. Boundary crossing probabilities for the Wiener process and sample sums. The Annals of Mathematical Statistics, 41(5):1410–1429, 1970. doi: 10.1214/aoms/1177696787

work page doi:10.1214/aoms/1177696787 1970
[26]

Rate of convergence in the invariance principle for variables with exponential moments that are not identically distributed

Aleksandr Ivanovich Sakhanenko. Rate of convergence in the invariance principle for variables with exponential moments that are not identically distributed. Trudy Instituta Matematiki SO AN SSSR , 3:4–49, 1984. work without DOI; bibliographic record (Trudy Inst. Mat. (Novosibirsk), vol. 3, pp. 4–49, 1984) confirmed via the reference list of https://doi.or...

work page doi:10.1134/s0037446611040136 1984
[27]

Game-Theoretic Foundations for Probability and Finance

Glenn Shafer and Vladimir Vovk. Game-Theoretic Foundations for Probability and Finance. Wiley, 2019. 65

2019
[28]

Strong approximation theorems for independent random variables and their applications

Qi-Man Shao. Strong approximation theorems for independent random variables and their applications. Journal of Multivariate Analysis, 52(1):107–130, 1995. doi: 10.1006/jmva.1995.1006

work page doi:10.1006/jmva.1995.1006 1995
[29] Shubhanshu Shekhar and Aaditya Ramdas. Reducing sequential change detection to sequential estimation. In Proceedings of the 41st International Conference on Machine Learning (ICML), volume 235 of Proceedings of Machine Learning Research, pages 44628–44642, 2024

[30] Maurice Sion. On general minimax theorems. Pacific Journal of Mathematics, 8(1):171–176, 1958

[31] Jean Ville. Étude critique de la notion de collectif. Number 218 in Thèses de l’entre-deux-guerres. Gauthier-Villars, Paris, 1939

[32] Hongjian Wang and Aaditya Ramdas. Catoni-style confidence sequences for heavy-tailed mean estimation. Stochastic Processes and their Applications, 163:168–202, 2023. doi: 10.1016/j.spa.2023.05.007

[33] Ian Waudby-Smith and Aaditya Ramdas. Estimating means of bounded random variables by betting. Journal of the Royal Statistical Society: Series B, 86(1):1–27, 2024. doi: 10.1093/jrsssb/qkad009.
