Eventually LIL Regret: Almost Sure lnln T Regret for a sub-Gaussian Mixture on Unbounded Data
Pith reviewed 2026-05-16 22:37 UTC · model grok-4.3
The pith
A sub-Gaussian mixture yields a deterministic regret bound of order ln ln V_T that holds eventually on almost every path for unbounded data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
On the Ville event E_0 of probability one the regret of the sub-Gaussian mixture is eventually bounded by a universal constant times ln ln V_T, where V_T is the cumulative variance process; for any fixed alpha the finite-time bound on E_alpha is ln squared of 1/alpha over V_T plus ln of 1/alpha plus ln ln V_T up to constants, and this bound is deterministic once the observed sequence lies inside the event.
What carries the argument
The Ville event E_alpha, the set of paths on which the mixture's cumulative betting wealth or sum process remains controlled by the variance process V_T.
If this is right
- Regret guarantees become deterministic once a path enters the high-probability Ville event.
- Unbounded observations are admissible because the bound depends on realized variance rather than a uniform bound.
- The same mixture simultaneously serves adversarial regret analysis and stochastic concentration arguments.
- Conditional bounds of this form supply a direct bridge between game-theoretic statistics and online learning.
Where Pith is reading between the lines
- Similar pathwise bounds may exist for other exponential or sub-exponential mixtures when an appropriate variance proxy is substituted for V_T.
- The ln ln V_T rate suggests that mixture-based algorithms could achieve almost-sure regret in non-stationary or drifting environments where variance grows slowly.
- One could test whether replacing the mixture with a different aggregation rule yields a tighter constant in front of the ln ln term on the same event.
Load-bearing premise
The observed sequence must lie inside the Ville event E_alpha.
What would settle it
A single infinite path inside E_0 on which the regret exceeds any fixed multiple of ln ln V_T for infinitely many times with V_T growing to infinity.
read the original abstract
We prove that a classic sub-Gaussian mixture proposed by Robbins in a stochastic setting actually satisfies a path-wise (deterministic) regret bound. For every path in a natural ``Ville event'' $\mathcal E_\alpha$, this regret till time $T$ is bounded by $\ln^2(1/\alpha)/V_T + \ln (1/\alpha) + \ln \ln V_T$ up to universal constants, where $V_T$ is a nonnegative, nondecreasing, cumulative variance process. (The bound reduces to $\ln(1/\alpha) + \ln \ln V_T$ if $V_T \geq \ln(1/\alpha)$.) If the data were stochastic, then one can show that $\mathcal E_\alpha$ has probability at least $1-\alpha$ under a wide class of distributions (eg: sub-Gaussian, symmetric, variance-bounded, etc.). In fact, we show that on the Ville event $\mathcal E_0$ of probability one, the regret on every path in $\mathcal E_0$ is eventually bounded by $\ln \ln V_T$ (up to constants). We explain how this work helps bridge the world of adversarial online learning (which usually deals with regret bounds for bounded data), with game-theoretic statistics (which can handle unbounded data, albeit using stochastic assumptions). In short, conditional regret bounds serve as a bridge between stochastic and adversarial betting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proves a pathwise (deterministic) regret bound for Robbins' classic sub-Gaussian mixture. On every trajectory in the Ville event E_alpha (which has probability at least 1-alpha under sub-Gaussian, symmetric or variance-bounded distributions), the regret up to time T satisfies R_T <= C (ln^2(1/alpha)/V_T + ln(1/alpha) + ln ln V_T) for a universal constant C, where V_T is the cumulative variance process. The bound simplifies to an eventually ln ln V_T form (up to constants) on the probability-1 event E_0. The work positions this as a bridge between adversarial online learning (bounded data) and game-theoretic statistics (unbounded data under stochastic assumptions).
Significance. If the derivation is correct, the result is significant: it supplies an almost-sure, pathwise regret bound that holds deterministically on a high-probability event without requiring bounded observations, thereby linking the two literatures. The reduction to an eventually ln ln V_T bound on E_0 is a strong asymptotic statement. The manuscript ships an explicit mixture construction and a deterministic argument that avoids hidden stochastic steps inside the pathwise claim.
major comments (1)
- The central derivation that converts the Robbins mixture into the stated pathwise bound on E_alpha is only sketched in the abstract; the explicit steps relating the mixture weights, the variance process V_T, and the regret on every path in E_0 must be written out in full (ideally with the key inequality labeled as a numbered display) so that the reduction to ln ln V_T can be verified directly.
minor comments (3)
- State the precise universal constants that appear in the O(ln ln V_T) bound and indicate whether they depend on the sub-Gaussian parameter.
- Clarify the exact definition of the cumulative variance process V_T for the unbounded-data case and confirm it is nondecreasing and adapted to the filtration used for the Ville event.
- Add a short remark on how the bound behaves when V_T remains small for long periods (i.e., before the ln ln V_T regime begins).
Simulated Author's Rebuttal
We thank the referee for the positive assessment and the recommendation of minor revision. The single major comment is addressed below.
read point-by-point responses
-
Referee: The central derivation that converts the Robbins mixture into the stated pathwise bound on E_alpha is only sketched in the abstract; the explicit steps relating the mixture weights, the variance process V_T, and the regret on every path in E_0 must be written out in full (ideally with the key inequality labeled as a numbered display) so that the reduction to ln ln V_T can be verified directly.
Authors: We agree that the derivation requires fuller exposition. In the revised manuscript we will insert a dedicated subsection (immediately following the statement of the main theorem) that begins from the explicit form of the Robbins mixture weights, applies the definition of the Ville event E_alpha to obtain a deterministic inequality relating the cumulative variance process V_T to the mixture, and derives the pathwise bound R_T <= C (ln^2(1/alpha)/V_T + ln(1/alpha) + ln ln V_T) via a sequence of labeled displays. The same subsection will then specialize the bound on the probability-one event E_0 to obtain the eventual ln ln V_T form. All steps will be deterministic and free of hidden stochastic arguments. revision: yes
Circularity Check
No significant circularity identified
full rationale
The paper derives a deterministic pathwise regret bound directly from the Robbins mixture construction, expressed in terms of the independently defined Ville event E_alpha and the cumulative variance process V_T. This bound holds on every trajectory inside E_alpha without reducing to fitted parameters, self-referential equations, or load-bearing self-citations. The separate claim that E_alpha has probability at least 1-alpha under sub-Gaussian or variance-bounded distributions is an external probabilistic statement and does not enter the deterministic derivation. No step in the provided chain equates a prediction to its own inputs by construction, so the result is self-contained.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The Ville event E_alpha has probability at least 1-alpha under sub-Gaussian, symmetric, or variance-bounded distributions
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We prove that a classic sub-Gaussian mixture proposed by Robbins... regret till time T is bounded by ln^2(1/alpha)/V_T + ln(1/alpha) + ln ln V_T... on the Ville event E_0... eventually bounded by ln ln V_T
-
IndisputableMonolith/Foundation/ArithmeticFromLogic.leanLogicNat recovery / embed_strictMono unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
on E_0... Rt <= ln ln V_T (1+o(1)) eventually... self-normalized SLLN and upper LIL
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
A trajectorial interpretation of Doob's martingale inequalities
B Acciaio, M Beiglb \"o ck, F Penkner, W Schachermayer, and J Temme. A trajectorial interpretation of Doob's martingale inequalities. The Annals of Applied Probability, pages 1494--1505, 2013
work page 2013
-
[2]
Optimal best-arm identification methods for tail-risk measures
Shubhada Agrawal, Wouter M Koolen, and Sandeep Juneja. Optimal best-arm identification methods for tail-risk measures. Advances in Neural Information Processing Systems, 34: 0 25578--25590, 2021
work page 2021
-
[3]
Martingale inequalities and deterministic counterparts
Mathias Beiglb \"o ck and Marcel Nutz. Martingale inequalities and deterministic counterparts. Electronic Journal of Probability, 19: 0 1--15, 2014
work page 2014
-
[4]
Pathwise versions of the B urkholder- D avis- G undy inequality
Mathias Beiglb \"o ck and Pietro Siorpaes. Pathwise versions of the B urkholder- D avis- G undy inequality. Bernoulli, pages 360--373, 2015
work page 2015
-
[5]
Confidence sequences for generalized linear models via regret analysis
Eugenio Clerico, Hamish Flynn, and Gergely Neu. Confidence sequences for generalized linear models via regret analysis. arXiv preprint arXiv:2504.16555, 2025
-
[6]
Thomas M Cover. Universal portfolios. Mathematical finance, 1 0 (1): 0 1--29, 1991
work page 1991
-
[7]
Universal portfolios with side information
Thomas M Cover and Erik Ordentlich. Universal portfolios with side information. IEEE Transactions on Information Theory, 42 0 (2): 0 348--363, 2002
work page 2002
-
[8]
Black-box reductions for parameter-free online learning in B anach spaces
Ashok Cutkosky and Francesco Orabona. Black-box reductions for parameter-free online learning in B anach spaces. In Conference On Learning Theory, pages 1493--1529. PMLR, 2018
work page 2018
-
[9]
Matrix-free preconditioning in online learning
Ashok Cutkosky and Tamas Sarlos. Matrix-free preconditioning in online learning. In International Conference on Machine Learning, pages 1455--1464. PMLR, 2019
work page 2019
-
[10]
Confidence sequences for mean, variance, and median
Donald A Darling and Herbert Robbins. Confidence sequences for mean, variance, and median. Proceedings of the National Academy of Sciences, 58 0 (1): 0 66--68, 1967 a
work page 1967
-
[11]
Iterated logarithm inequalities
Donald A Darling and Herbert Robbins. Iterated logarithm inequalities. Proceedings of the National Academy of Sciences, 57 0 (5): 0 1188--1192, 1967 b
work page 1967
-
[12]
Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws
Victor H de la Pena. Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws. The Annals of Probability, 32 0 (3A): 0 1902--1933, 2004
work page 1902
-
[13]
Self-normalized processes: Limit theory and Statistical Applications
Victor H de la Pena, Tze Leung Lai, and Qi-Man Shao. Self-normalized processes: Limit theory and Statistical Applications. Springer, 2009
work page 2009
-
[14]
The minimum description length principle
Peter D Gr \"u nwald. The minimum description length principle. MIT press, 2007
work page 2007
-
[15]
On pathwise counterparts of D oob’s maximal inequalities
AA Gushchin. On pathwise counterparts of D oob’s maximal inequalities. Proceedings of the Steklov Institute of Mathematics, 287 0 (1): 0 118--121, 2014
work page 2014
-
[16]
Time-uniform C hernoff bounds via nonnegative supermartingales
Steven R Howard, Aaditya Ramdas, Jon McAuliffe, and Jasjeet Sekhon. Time-uniform C hernoff bounds via nonnegative supermartingales. Probability Surveys, 2020
work page 2020
-
[17]
Time-uniform, nonparametric, nonasymptotic confidence sequences
Steven R Howard, Aaditya Ramdas, Jon Mcauliffe, and Jasjeet Sekhon. Time-uniform, nonparametric, nonasymptotic confidence sequences. The Annals of Statistics, 49 0 (2): 0 1055--1080, 2021
work page 2021
-
[18]
Emilie Kaufmann and Wouter M. Koolen. Mixture martingales revisited with applications to sequential tests and confidence intervals. Journal of Machine Learning Research, 22 0 (246): 0 1--44, 2021
work page 2021
-
[19]
A Modern Introduction to Online Learning
Francesco Orabona. A modern introduction to online learning. arXiv preprint arXiv:1912.13213, 2019
work page internal anchor Pith review Pith/arXiv arXiv 1912
-
[20]
Tight concentrations and confidence sequences from the regret of universal portfolio
Francesco Orabona and Kwang-Sung Jun. Tight concentrations and confidence sequences from the regret of universal portfolio. IEEE Transactions on Information Theory, 2023
work page 2023
-
[21]
Coin betting and parameter-free online learning
Francesco Orabona and D \'a vid P \'a l. Coin betting and parameter-free online learning. Advances in Neural Information Processing Systems, 29, 2016
work page 2016
-
[22]
On equivalence of martingale tail bounds and deterministic regret inequalities
Alexander Rakhlin and Karthik Sridharan. On equivalence of martingale tail bounds and deterministic regret inequalities. In Conference on Learning Theory, pages 1704--1722. PMLR, 2017
work page 2017
-
[23]
Hypothesis testing with e-values
Aaditya Ramdas and Ruodu Wang. Hypothesis testing with e-values. Foundations and Trends in Statistics, 2025
work page 2025
-
[24]
Game-theoretic statistics and safe anytime-valid inference
Aaditya Ramdas, Peter Gr \"u nwald, Vladimir Vovk, and Glenn Shafer. Game-theoretic statistics and safe anytime-valid inference. Statistical Science, 38 0 (4): 0 576--601, 2023
work page 2023
-
[25]
Fisher information and stochastic complexity
Jorma J Rissanen. Fisher information and stochastic complexity. IEEE Transactions on Information Theory, 42 0 (1): 0 40--47, 2002
work page 2002
-
[26]
Iterated logarithm inequalities and related statistical procedures
Herbert Robbins and David Siegmund. Iterated logarithm inequalities and related statistical procedures. Mathematics of the Decision Sciences, 2: 0 267--279, 1968
work page 1968
-
[27]
Boundary crossing probabilities for the wiener process and sample sums
Herbert Robbins and David Siegmund. Boundary crossing probabilities for the wiener process and sample sums. The Annals of Mathematical Statistics, pages 1410--1429, 1970
work page 1970
-
[28]
A composite generalization of ville’s martingale theorem using e-processes
Johannes Ruf, Martin Larsson, Wouter M Koolen, and Aaditya Ramdas. A composite generalization of ville’s martingale theorem using e-processes. Electronic Journal of Probability, 28: 0 1--21, 2023
work page 2023
-
[29]
Testing by betting: A strategy for statistical and scientific communication
Glenn Shafer. Testing by betting: A strategy for statistical and scientific communication. Journal of the Royal Statistical Society Series A: Statistics in Society, 184 0 (2): 0 407--431, 2021
work page 2021
-
[30]
Probability and finance: it's only a game!, volume 491
Glenn Shafer and Vladimir Vovk. Probability and finance: it's only a game!, volume 491. John Wiley & Sons, 2005
work page 2005
-
[31]
Game-theoretic foundations for probability and finance
Glenn Shafer and Vladimir Vovk. Game-theoretic foundations for probability and finance. John Wiley & Sons, 2019
work page 2019
-
[32]
Etude critique de la notion de collectif
Jean Ville. Etude critique de la notion de collectif. Gauthier-Villars Paris, 1939
work page 1939
-
[33]
Estimating means of bounded random variables by betting
Ian Waudby-Smith and Aaditya Ramdas. Estimating means of bounded random variables by betting. Journal of the Royal Statistical Society Series B: Statistical Methodology, 86 0 (1): 0 1--27, 2024
work page 2024
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.