pith. sign in

arxiv: 2512.12325 · v3 · submitted 2025-12-13 · 💻 cs.LG · math.ST· stat.ML· stat.TH

Eventually LIL Regret: Almost Sure lnln T Regret for a sub-Gaussian Mixture on Unbounded Data

Pith reviewed 2026-05-16 22:37 UTC · model grok-4.3

classification 💻 cs.LG math.STstat.MLstat.TH
keywords regret boundssub-Gaussian mixtureVille eventpathwise boundsonline learningalmost sure convergenceunbounded datastochastic processes
0
0 comments X

The pith

A sub-Gaussian mixture yields a deterministic regret bound of order ln ln V_T that holds eventually on almost every path for unbounded data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that Robbins' classic sub-Gaussian mixture satisfies a pathwise regret bound that depends only on a cumulative variance process V_T. On the Ville event E_alpha the regret up to time T is at most a constant times ln squared of 1/alpha over V_T plus ln of 1/alpha plus ln ln V_T; when V_T is already large the bound simplifies to ln of 1/alpha plus ln ln V_T. Because the event E_0 of probability one contains every path that satisfies the bound eventually, the mixture achieves almost-sure ln ln V_T regret without requiring the data to be bounded. The result supplies a deterministic guarantee that works even when observations are unbounded, thereby connecting adversarial online-learning regret analysis to game-theoretic statistics that tolerate unbounded sequences under mild distributional conditions.

Core claim

On the Ville event E_0 of probability one the regret of the sub-Gaussian mixture is eventually bounded by a universal constant times ln ln V_T, where V_T is the cumulative variance process; for any fixed alpha the finite-time bound on E_alpha is ln squared of 1/alpha over V_T plus ln of 1/alpha plus ln ln V_T up to constants, and this bound is deterministic once the observed sequence lies inside the event.

What carries the argument

The Ville event E_alpha, the set of paths on which the mixture's cumulative betting wealth or sum process remains controlled by the variance process V_T.

If this is right

  • Regret guarantees become deterministic once a path enters the high-probability Ville event.
  • Unbounded observations are admissible because the bound depends on realized variance rather than a uniform bound.
  • The same mixture simultaneously serves adversarial regret analysis and stochastic concentration arguments.
  • Conditional bounds of this form supply a direct bridge between game-theoretic statistics and online learning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar pathwise bounds may exist for other exponential or sub-exponential mixtures when an appropriate variance proxy is substituted for V_T.
  • The ln ln V_T rate suggests that mixture-based algorithms could achieve almost-sure regret in non-stationary or drifting environments where variance grows slowly.
  • One could test whether replacing the mixture with a different aggregation rule yields a tighter constant in front of the ln ln term on the same event.

Load-bearing premise

The observed sequence must lie inside the Ville event E_alpha.

What would settle it

A single infinite path inside E_0 on which the regret exceeds any fixed multiple of ln ln V_T for infinitely many times with V_T growing to infinity.

read the original abstract

We prove that a classic sub-Gaussian mixture proposed by Robbins in a stochastic setting actually satisfies a path-wise (deterministic) regret bound. For every path in a natural ``Ville event'' $\mathcal E_\alpha$, this regret till time $T$ is bounded by $\ln^2(1/\alpha)/V_T + \ln (1/\alpha) + \ln \ln V_T$ up to universal constants, where $V_T$ is a nonnegative, nondecreasing, cumulative variance process. (The bound reduces to $\ln(1/\alpha) + \ln \ln V_T$ if $V_T \geq \ln(1/\alpha)$.) If the data were stochastic, then one can show that $\mathcal E_\alpha$ has probability at least $1-\alpha$ under a wide class of distributions (eg: sub-Gaussian, symmetric, variance-bounded, etc.). In fact, we show that on the Ville event $\mathcal E_0$ of probability one, the regret on every path in $\mathcal E_0$ is eventually bounded by $\ln \ln V_T$ (up to constants). We explain how this work helps bridge the world of adversarial online learning (which usually deals with regret bounds for bounded data), with game-theoretic statistics (which can handle unbounded data, albeit using stochastic assumptions). In short, conditional regret bounds serve as a bridge between stochastic and adversarial betting.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The paper proves a pathwise (deterministic) regret bound for Robbins' classic sub-Gaussian mixture. On every trajectory in the Ville event E_alpha (which has probability at least 1-alpha under sub-Gaussian, symmetric or variance-bounded distributions), the regret up to time T satisfies R_T <= C (ln^2(1/alpha)/V_T + ln(1/alpha) + ln ln V_T) for a universal constant C, where V_T is the cumulative variance process. The bound simplifies to an eventually ln ln V_T form (up to constants) on the probability-1 event E_0. The work positions this as a bridge between adversarial online learning (bounded data) and game-theoretic statistics (unbounded data under stochastic assumptions).

Significance. If the derivation is correct, the result is significant: it supplies an almost-sure, pathwise regret bound that holds deterministically on a high-probability event without requiring bounded observations, thereby linking the two literatures. The reduction to an eventually ln ln V_T bound on E_0 is a strong asymptotic statement. The manuscript ships an explicit mixture construction and a deterministic argument that avoids hidden stochastic steps inside the pathwise claim.

major comments (1)
  1. The central derivation that converts the Robbins mixture into the stated pathwise bound on E_alpha is only sketched in the abstract; the explicit steps relating the mixture weights, the variance process V_T, and the regret on every path in E_0 must be written out in full (ideally with the key inequality labeled as a numbered display) so that the reduction to ln ln V_T can be verified directly.
minor comments (3)
  1. State the precise universal constants that appear in the O(ln ln V_T) bound and indicate whether they depend on the sub-Gaussian parameter.
  2. Clarify the exact definition of the cumulative variance process V_T for the unbounded-data case and confirm it is nondecreasing and adapted to the filtration used for the Ville event.
  3. Add a short remark on how the bound behaves when V_T remains small for long periods (i.e., before the ln ln V_T regime begins).

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment and the recommendation of minor revision. The single major comment is addressed below.

read point-by-point responses
  1. Referee: The central derivation that converts the Robbins mixture into the stated pathwise bound on E_alpha is only sketched in the abstract; the explicit steps relating the mixture weights, the variance process V_T, and the regret on every path in E_0 must be written out in full (ideally with the key inequality labeled as a numbered display) so that the reduction to ln ln V_T can be verified directly.

    Authors: We agree that the derivation requires fuller exposition. In the revised manuscript we will insert a dedicated subsection (immediately following the statement of the main theorem) that begins from the explicit form of the Robbins mixture weights, applies the definition of the Ville event E_alpha to obtain a deterministic inequality relating the cumulative variance process V_T to the mixture, and derives the pathwise bound R_T <= C (ln^2(1/alpha)/V_T + ln(1/alpha) + ln ln V_T) via a sequence of labeled displays. The same subsection will then specialize the bound on the probability-one event E_0 to obtain the eventual ln ln V_T form. All steps will be deterministic and free of hidden stochastic arguments. revision: yes

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper derives a deterministic pathwise regret bound directly from the Robbins mixture construction, expressed in terms of the independently defined Ville event E_alpha and the cumulative variance process V_T. This bound holds on every trajectory inside E_alpha without reducing to fitted parameters, self-referential equations, or load-bearing self-citations. The separate claim that E_alpha has probability at least 1-alpha under sub-Gaussian or variance-bounded distributions is an external probabilistic statement and does not enter the deterministic derivation. No step in the provided chain equates a prediction to its own inputs by construction, so the result is self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the existence and probability properties of the Ville event under standard distribution classes together with the definition of the sub-Gaussian mixture; no free parameters are fitted and no new entities are postulated.

axioms (1)
  • domain assumption The Ville event E_alpha has probability at least 1-alpha under sub-Gaussian, symmetric, or variance-bounded distributions
    Abstract states this holds for a wide class of distributions including sub-Gaussian ones

pith-pipeline@v0.9.0 · 5581 in / 1295 out tokens · 50565 ms · 2026-05-16T22:37:03.767202+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages · 1 internal anchor

  1. [1]

    A trajectorial interpretation of Doob's martingale inequalities

    B Acciaio, M Beiglb \"o ck, F Penkner, W Schachermayer, and J Temme. A trajectorial interpretation of Doob's martingale inequalities. The Annals of Applied Probability, pages 1494--1505, 2013

  2. [2]

    Optimal best-arm identification methods for tail-risk measures

    Shubhada Agrawal, Wouter M Koolen, and Sandeep Juneja. Optimal best-arm identification methods for tail-risk measures. Advances in Neural Information Processing Systems, 34: 0 25578--25590, 2021

  3. [3]

    Martingale inequalities and deterministic counterparts

    Mathias Beiglb \"o ck and Marcel Nutz. Martingale inequalities and deterministic counterparts. Electronic Journal of Probability, 19: 0 1--15, 2014

  4. [4]

    Pathwise versions of the B urkholder- D avis- G undy inequality

    Mathias Beiglb \"o ck and Pietro Siorpaes. Pathwise versions of the B urkholder- D avis- G undy inequality. Bernoulli, pages 360--373, 2015

  5. [5]

    Confidence sequences for generalized linear models via regret analysis

    Eugenio Clerico, Hamish Flynn, and Gergely Neu. Confidence sequences for generalized linear models via regret analysis. arXiv preprint arXiv:2504.16555, 2025

  6. [6]

    Universal portfolios

    Thomas M Cover. Universal portfolios. Mathematical finance, 1 0 (1): 0 1--29, 1991

  7. [7]

    Universal portfolios with side information

    Thomas M Cover and Erik Ordentlich. Universal portfolios with side information. IEEE Transactions on Information Theory, 42 0 (2): 0 348--363, 2002

  8. [8]

    Black-box reductions for parameter-free online learning in B anach spaces

    Ashok Cutkosky and Francesco Orabona. Black-box reductions for parameter-free online learning in B anach spaces. In Conference On Learning Theory, pages 1493--1529. PMLR, 2018

  9. [9]

    Matrix-free preconditioning in online learning

    Ashok Cutkosky and Tamas Sarlos. Matrix-free preconditioning in online learning. In International Conference on Machine Learning, pages 1455--1464. PMLR, 2019

  10. [10]

    Confidence sequences for mean, variance, and median

    Donald A Darling and Herbert Robbins. Confidence sequences for mean, variance, and median. Proceedings of the National Academy of Sciences, 58 0 (1): 0 66--68, 1967 a

  11. [11]

    Iterated logarithm inequalities

    Donald A Darling and Herbert Robbins. Iterated logarithm inequalities. Proceedings of the National Academy of Sciences, 57 0 (5): 0 1188--1192, 1967 b

  12. [12]

    Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws

    Victor H de la Pena. Self-normalized processes: Exponential inequalities, moment bounds and iterated logarithm laws. The Annals of Probability, 32 0 (3A): 0 1902--1933, 2004

  13. [13]

    Self-normalized processes: Limit theory and Statistical Applications

    Victor H de la Pena, Tze Leung Lai, and Qi-Man Shao. Self-normalized processes: Limit theory and Statistical Applications. Springer, 2009

  14. [14]

    The minimum description length principle

    Peter D Gr \"u nwald. The minimum description length principle. MIT press, 2007

  15. [15]

    On pathwise counterparts of D oob’s maximal inequalities

    AA Gushchin. On pathwise counterparts of D oob’s maximal inequalities. Proceedings of the Steklov Institute of Mathematics, 287 0 (1): 0 118--121, 2014

  16. [16]

    Time-uniform C hernoff bounds via nonnegative supermartingales

    Steven R Howard, Aaditya Ramdas, Jon McAuliffe, and Jasjeet Sekhon. Time-uniform C hernoff bounds via nonnegative supermartingales. Probability Surveys, 2020

  17. [17]

    Time-uniform, nonparametric, nonasymptotic confidence sequences

    Steven R Howard, Aaditya Ramdas, Jon Mcauliffe, and Jasjeet Sekhon. Time-uniform, nonparametric, nonasymptotic confidence sequences. The Annals of Statistics, 49 0 (2): 0 1055--1080, 2021

  18. [18]

    Emilie Kaufmann and Wouter M. Koolen. Mixture martingales revisited with applications to sequential tests and confidence intervals. Journal of Machine Learning Research, 22 0 (246): 0 1--44, 2021

  19. [19]

    A Modern Introduction to Online Learning

    Francesco Orabona. A modern introduction to online learning. arXiv preprint arXiv:1912.13213, 2019

  20. [20]

    Tight concentrations and confidence sequences from the regret of universal portfolio

    Francesco Orabona and Kwang-Sung Jun. Tight concentrations and confidence sequences from the regret of universal portfolio. IEEE Transactions on Information Theory, 2023

  21. [21]

    Coin betting and parameter-free online learning

    Francesco Orabona and D \'a vid P \'a l. Coin betting and parameter-free online learning. Advances in Neural Information Processing Systems, 29, 2016

  22. [22]

    On equivalence of martingale tail bounds and deterministic regret inequalities

    Alexander Rakhlin and Karthik Sridharan. On equivalence of martingale tail bounds and deterministic regret inequalities. In Conference on Learning Theory, pages 1704--1722. PMLR, 2017

  23. [23]

    Hypothesis testing with e-values

    Aaditya Ramdas and Ruodu Wang. Hypothesis testing with e-values. Foundations and Trends in Statistics, 2025

  24. [24]

    Game-theoretic statistics and safe anytime-valid inference

    Aaditya Ramdas, Peter Gr \"u nwald, Vladimir Vovk, and Glenn Shafer. Game-theoretic statistics and safe anytime-valid inference. Statistical Science, 38 0 (4): 0 576--601, 2023

  25. [25]

    Fisher information and stochastic complexity

    Jorma J Rissanen. Fisher information and stochastic complexity. IEEE Transactions on Information Theory, 42 0 (1): 0 40--47, 2002

  26. [26]

    Iterated logarithm inequalities and related statistical procedures

    Herbert Robbins and David Siegmund. Iterated logarithm inequalities and related statistical procedures. Mathematics of the Decision Sciences, 2: 0 267--279, 1968

  27. [27]

    Boundary crossing probabilities for the wiener process and sample sums

    Herbert Robbins and David Siegmund. Boundary crossing probabilities for the wiener process and sample sums. The Annals of Mathematical Statistics, pages 1410--1429, 1970

  28. [28]

    A composite generalization of ville’s martingale theorem using e-processes

    Johannes Ruf, Martin Larsson, Wouter M Koolen, and Aaditya Ramdas. A composite generalization of ville’s martingale theorem using e-processes. Electronic Journal of Probability, 28: 0 1--21, 2023

  29. [29]

    Testing by betting: A strategy for statistical and scientific communication

    Glenn Shafer. Testing by betting: A strategy for statistical and scientific communication. Journal of the Royal Statistical Society Series A: Statistics in Society, 184 0 (2): 0 407--431, 2021

  30. [30]

    Probability and finance: it's only a game!, volume 491

    Glenn Shafer and Vladimir Vovk. Probability and finance: it's only a game!, volume 491. John Wiley & Sons, 2005

  31. [31]

    Game-theoretic foundations for probability and finance

    Glenn Shafer and Vladimir Vovk. Game-theoretic foundations for probability and finance. John Wiley & Sons, 2019

  32. [32]

    Etude critique de la notion de collectif

    Jean Ville. Etude critique de la notion de collectif. Gauthier-Villars Paris, 1939

  33. [33]

    Estimating means of bounded random variables by betting

    Ian Waudby-Smith and Aaditya Ramdas. Estimating means of bounded random variables by betting. Journal of the Royal Statistical Society Series B: Statistical Methodology, 86 0 (1): 0 1--27, 2024