
arxiv: 2605.14692 · v2 · pith:G25Z54ICnew · submitted 2026-05-14 · 🧮 math.ST · stat.ME · stat.TH

Asymptotic Anytime-Valid Inference for U-statistics

Pith reviewed 2026-05-20 21:14 UTC · model grok-4.3

classification 🧮 math.ST · stat.ME · stat.TH
keywords U-statistics · anytime-valid inference · confidence sequences · nondegenerate case · degenerate case · Gaussian chaos · sequential analysis · time-uniform rates

The pith

Asymptotic anytime-valid confidence sequences for degree-two U-statistics achieve optimal time-uniform rates in both nondegenerate and degenerate regimes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops asymptotic anytime-valid confidence sequences for degree-two U-statistics that remain valid under continuous monitoring of data streams. In the nondegenerate case it applies Hoeffding's projection to reduce the problem to a time-uniform central limit theorem on the first-order terms while showing the remainder vanishes under mild moments, then supplies a leave-one-out jackknife estimator that makes the procedure fully data-driven. In the degenerate case it approximates the U-statistic by a centered quadratic Gaussian chaos and introduces the Spectrally Allocated Gaussian-chaos Excursion boundary together with consistent truncated-spectrum plug-in estimators. A reader would care because the resulting sequences attain the expected optimal widths of sqrt(log log n/n) and log log n/n respectively, enabling valid inference in sequential experiments without fixed-sample stopping rules.
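The two regimes can be made concrete with the standard Hoeffding decomposition of a degree-two U-statistic. The block below is a textbook statement, not a formula taken from the paper; the symbols $h$, $h_1$, $h_2$, $\theta$, $\lambda_k$, and $Z_k$ are notation assumed here for illustration.

```latex
% Degree-two U-statistic with symmetric kernel h and target \theta
U_n = \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h(X_i, X_j),
\qquad \theta = \mathbb{E}\, h(X_1, X_2).

% Hoeffding decomposition: first-order projection plus canonical remainder
U_n - \theta
  = \frac{2}{n} \sum_{i=1}^{n} h_1(X_i)
  + \binom{n}{2}^{-1} \sum_{1 \le i < j \le n} h_2(X_i, X_j),

h_1(x) = \mathbb{E}\, h(x, X_2) - \theta,
\qquad
h_2(x, y) = h(x, y) - h_1(x) - h_1(y) - \theta.

% Nondegenerate: Var(h_1(X_1)) > 0, the linear term dominates at scale n^{-1/2}.
% Degenerate: h_1 \equiv 0, and the classical fixed-n limit is a Gaussian chaos
n\,(U_n - \theta) \xrightarrow{d} \sum_{k \ge 1} \lambda_k \,(Z_k^2 - 1),
\qquad Z_k \overset{\text{iid}}{\sim} N(0, 1),
```

where the $\lambda_k$ are the eigenvalues of the integral operator induced by $h_2$. The different scalings ($n^{-1/2}$ versus $n^{-1}$) are consistent with the two width rates quoted above.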

Core claim

The central claim is that asymptotic anytime-valid confidence sequences can be constructed for degree-two U-statistics under continuous monitoring. In the nondegenerate regime Hoeffding's projection reduces the statistic to partial sums of the first-order projection whose time-uniform central limit theory yields the sequences once the canonical remainder is shown negligible under mild moment assumptions, with a leave-one-out jackknife supplying the variance estimator. In the degenerate regime the U-statistic is approximated by a centered quadratic Gaussian chaos rather than a simple Gaussian; the Spectrally Allocated Gaussian-chaos Excursion (SAGE) boundary is developed for this process and implemented through plug-in truncated spectrum estimators with consistency guarantees.

What carries the argument

The argument rests on Hoeffding's projection in the nondegenerate regime and on the Spectrally Allocated Gaussian-chaos Excursion (SAGE) boundary with truncated spectrum estimation in the degenerate regime; together these produce the time-uniform bounds.

If this is right

  • Common degree-two U-statistics such as the sample variance or Kendall's tau acquire anytime-valid inference procedures.
  • The procedures are fully data-driven: the variance (nondegenerate case) and the spectrum (degenerate case) are estimated from the observed data rather than supplied by the user.
  • The widths match the time-uniform optimal rates: sqrt(log log n/n) in the nondegenerate regime and log log n/n in the degenerate regime.
  • Several standard U-statistics fit directly inside the proposed framework with explicit implementations.
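As a concrete illustration of the objects in the list above, here is a minimal Python sketch that evaluates a degree-two U-statistic for a symmetric kernel and computes a generic leave-one-out jackknife variance estimate. The function names `u_statistic` and `jackknife_variance` are hypothetical, and the jackknife shown is the textbook form; the paper's exact estimator may differ.

```python
import numpy as np

def u_statistic(x, kernel):
    """Average of kernel(x_i, x_j) over all pairs i < j (symmetric kernel assumed)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.array(kernel(x[:, None], x[None, :]), dtype=float)  # n x n kernel matrix
    return k[np.triu_indices(n, k=1)].mean()

def jackknife_variance(x, kernel):
    """Textbook leave-one-out jackknife estimate of Var(U_n); requires n >= 3."""
    x = np.asarray(x, dtype=float)
    n = x.size
    k = np.array(kernel(x[:, None], x[None, :]), dtype=float)
    np.fill_diagonal(k, 0.0)          # only off-diagonal pairs enter U_n
    total = k.sum() / 2.0             # sum of the kernel over pairs i < j
    row = k.sum(axis=1)               # sum over j != i of kernel(x_i, x_j)
    # U-statistic recomputed with observation i removed, for every i at once
    loo = (total - row) / ((n - 1) * (n - 2) / 2.0)
    return (n - 1) / n * np.sum((loo - loo.mean()) ** 2)

# Example kernels: sample variance and Gini mean difference (GMD)
var_kernel = lambda a, b: 0.5 * (a - b) ** 2
gmd_kernel = lambda a, b: np.abs(a - b)

x = np.array([1.0, 2.0, 4.0, 7.0])
print(u_statistic(x, var_kernel))     # equals np.var(x, ddof=1)
print(jackknife_variance(x, gmd_kernel))
```

With the variance kernel the U-statistic coincides with the unbiased sample variance, which gives a quick sanity check. In the nondegenerate case, `n * jackknife_variance(x, kernel)` is the usual plug-in for the asymptotic variance of sqrt(n)(U_n − θ).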

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same projection-plus-chaos strategy could be tested on higher-degree U-statistics to see whether analogous boundaries appear.
  • The approach may connect to sequential estimation problems in online learning where data arrives continuously.
  • Real-world streaming datasets could be used to check whether the asymptotic widths translate to practical gains over fixed-sample methods.

Load-bearing premise

The canonical remainder is negligible under mild moment assumptions in the nondegenerate case, and the truncated spectrum estimator is consistent for the plug-in SAGE boundary in the degenerate case.

What would settle it

A numerical experiment in which the empirical coverage of the constructed sequences drops below the nominal level for large sample sizes in the degenerate regime would falsify the consistency claim for the truncated spectrum estimator.
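Such a check amounts to tracking cumulative miscoverage across simulated runs. The sketch below assumes the estimates and half-widths have already been computed for each run and time step; the function name `cumulative_miscoverage` and the array layout (runs by time steps) are assumptions made for illustration, and no particular boundary is implied.

```python
import numpy as np

def cumulative_miscoverage(estimates, half_widths, theta):
    """Fraction of runs whose interval has excluded theta at least once by each time.

    estimates, half_widths: arrays of shape (runs, T); theta: true parameter.
    """
    estimates = np.asarray(estimates, dtype=float)
    half_widths = np.asarray(half_widths, dtype=float)
    miss = np.abs(estimates - theta) > half_widths            # miss at time t
    ever_missed = np.logical_or.accumulate(miss, axis=1)      # miss at any s <= t
    return ever_missed.mean(axis=0)

# Toy input: run 0 misses at the second step only, run 1 never misses
est = [[0.0, 0.5, 0.0], [0.0, 0.0, 0.0]]
hw = [[0.3, 0.3, 0.3], [0.3, 0.3, 0.3]]
print(cumulative_miscoverage(est, hw, theta=0.0))
```

For an anytime-valid sequence at level α, this curve should stay at or below α (asymptotically) for all times, whereas pointwise intervals typically drift upward as monitoring continues.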

Figures

Figures reproduced from arXiv: 2605.14692 by Leheng Cai, Qirui Hu, Weijia Li.

Figure 1: Left: cumulative miscoverage rates of the proposed AsympCSs and the classical pointwise CIs for GMD under the standard Gaussian distribution. Middle: averaged half-widths of the proposed AsympCSs and the classical pointwise CIs. Right: a single sample path of the statistics Un for GMD alongside the three boundaries, where the black horizontal line indicates the true parameter. The horizontal axis in all th…
Figure 2: Sensitivity of AsympCS-LIL for GMD to the cold-start parameter
Figure 3: Left: the size and power comparison of the proposed sequential testing procedure using the SAGE boundaries and the classical test procedure for sequential two-sample test with MMD kernel statistics under standard Gaussian distribution. Dashed lines represent the power under H1 with δ = 0.3, while solid lines represent the size under H0 with δ = 0. Middle: the empirical power over mean shifts δ ∈ [0, 0.45] …
Original abstract

We study asymptotic anytime-valid confidence sequences for degree-two U-statistics under continuous monitoring. In the nondegenerate case, Hoeffding's projection reduces the problem to a time-uniform central limit theory for the partial sums of the first-order projection, while the canonical remainder is shown to be negligible under mild moment assumptions. A leave-one-out jackknife estimator then yields a fully data-driven procedure, leading to confidence sequences with asymptotic coverage guarantee for the parameter of interest. In the degenerate case, we show that the U-statistic is approximated by a centered quadratic Gaussian-chaos rather than by a simple Gaussian, which poses significant challenges for sequential inference. To address this issue, we novelly develop the Spectrally Allocated Gaussian-chaos Excursion (SAGE) boundary, and then provide plug-in implementations based on truncated spectrum estimation with consistency guarantees. The resulting widths can attain the expected time-uniform optimal rates: $\sqrt{\log\log n/n}$ in the nondegenerate regime and $\log\log n/n$ in the degenerate regime. Several widely used U-statistics are discussed within the proposed framework, and numerical experiments further support the validity of the derived theory.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript develops asymptotic anytime-valid confidence sequences for degree-two U-statistics under continuous monitoring. In the nondegenerate regime, Hoeffding projection reduces the problem to a time-uniform CLT on the first-order projection (with the canonical remainder shown negligible under mild moments), and a leave-one-out jackknife yields a fully data-driven procedure with asymptotic coverage. In the degenerate regime, the U-statistic is approximated by a centered quadratic Gaussian chaos; the authors introduce the Spectrally Allocated Gaussian-chaos Excursion (SAGE) boundary and implement it via truncated spectrum estimation with claimed consistency, attaining the rates √(log log n/n) (nondegenerate) and log log n/n (degenerate). Several standard U-statistics and numerical experiments are discussed.

Significance. If the central derivations hold, the work would extend anytime-valid inference to a broad class of U-statistics that arise in nonparametric estimation. The nondegenerate reduction leverages established tools while the SAGE boundary supplies a novel construction for the degenerate quadratic-chaos case, potentially enabling tight sequential intervals where Gaussian approximations fail. The claimed optimal rates align with known time-uniform lower bounds and would be a useful addition to the sequential-inference literature.

major comments (1)
  1. [degenerate case and SAGE boundary] Degenerate-regime analysis (abstract and corresponding section): the consistency claim for the truncated spectrum estimator used in the plug-in SAGE boundary is stated under mild moment assumptions, yet the argument appears to establish only pointwise consistency rather than uniform o(1) control over the entire monitoring horizon. Because the SAGE boundary must be plugged in with error vanishing uniformly to preserve the log log n/n rate, a fixed or non-adaptive truncation level risks either undercoverage or inflated widths; this step is load-bearing for the degenerate claim.
minor comments (2)
  1. [abstract] The abstract states that 'several widely used U-statistics are discussed within the proposed framework'; naming the specific examples (e.g., sample variance, Kendall's tau) already in the introduction would improve readability.
  2. [degenerate case] Notation for the truncation level in the spectrum estimator should be introduced with an explicit dependence on the horizon or on the realized eigenvalues to clarify how the bias-variance tradeoff is controlled.
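To make the role of the truncation level concrete, here is a minimal sketch of a generic truncated-spectrum plug-in: it takes the eigenvalues of the empirically centered kernel matrix divided by n and keeps the `m` largest in magnitude. The function names, the empirical centering step, and the explicit parameter `m` are assumptions for illustration; this is not the paper's estimator or its truncation rule.

```python
import numpy as np

def truncated_spectrum(x, kernel, m):
    """Top-m (by magnitude) eigenvalues of the centered kernel matrix, scaled by 1/n."""
    x = np.asarray(x, dtype=float)
    n = x.shape[0]
    k = np.array(kernel(x[:, None], x[None, :]), dtype=float)
    h = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    kc = h @ k @ h                             # empirical analogue of the canonical kernel
    eig = np.linalg.eigvalsh(kc / n)           # symmetric matrix, real eigenvalues
    order = np.argsort(-np.abs(eig))
    return eig[order[:m]]

def sample_chaos(lams, size, rng):
    """Draws from the centered quadratic chaos sum_k lam_k * (Z_k^2 - 1)."""
    lams = np.asarray(lams, dtype=float)
    z = rng.standard_normal((size, lams.size))
    return (z ** 2 - 1.0) @ lams

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
# Product kernel: degenerate when E[X] = 0, with a single eigenvalue E[X^2]
lam_hat = truncated_spectrum(x, lambda a, b: a * b, m=3)
draws = sample_chaos(lam_hat, size=1000, rng=rng)
```

For the product kernel the leading estimated eigenvalue equals the (biased) sample variance of `x` and the rest are numerically zero, which makes the example easy to verify. The referee's concern corresponds to how `m` should grow with the monitoring horizon, which this sketch leaves to the caller.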

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for raising this important point about the uniform consistency of the spectrum estimator in the degenerate regime. We will revise the paper to clarify and strengthen this aspect of the proof.

Point-by-point responses
  1. Referee: Degenerate-regime analysis (abstract and corresponding section): the consistency claim for the truncated spectrum estimator used in the plug-in SAGE boundary is stated under mild moment assumptions, yet the argument appears to establish only pointwise consistency rather than uniform o(1) control over the entire monitoring horizon. Because the SAGE boundary must be plugged in with error vanishing uniformly to preserve the log log n/n rate, a fixed or non-adaptive truncation level risks either undercoverage or inflated widths; this step is load-bearing for the degenerate claim.

    Authors: We appreciate the referee's comment and agree that uniform consistency is crucial for maintaining the asymptotic validity and optimal rate in the degenerate case. In the current manuscript, the consistency is derived using moment conditions that permit the application of uniform laws of large numbers or concentration bounds over the monitoring horizon. To address the concern explicitly, we will add a new lemma in the revised version that proves the truncated spectrum estimator converges uniformly in probability to the true spectrum over all n. This will be achieved by bounding the supremum of the estimation error using chaining arguments or maximal inequalities under the mild moment assumptions. Consequently, the plug-in error in the SAGE boundary will be uniformly o(1), ensuring the log log n/n rate is preserved without undercoverage or unnecessary inflation of widths. We will also specify how the truncation level is chosen (e.g., based on a data-driven criterion that works uniformly). We believe this revision will fully resolve the issue.

    revision: yes

Circularity Check

0 steps flagged

No circularity; derivation relies on standard projections plus independent boundary construction


The paper's nondegenerate case reduces the U-statistic via Hoeffding projection to a time-uniform CLT for the first-order projection plus a negligible canonical remainder under mild moments, then applies a leave-one-out jackknife estimator; these steps invoke external results rather than self-referential definitions. In the degenerate case the authors introduce a new Spectrally Allocated Gaussian-chaos Excursion (SAGE) boundary and prove consistency of a truncated spectrum plug-in estimator, with no indication that the boundary or its rate is obtained by fitting to the target quantity or by renaming an input. The overall widths are derived from these constructions rather than being forced by any fitted parameter or self-citation chain, rendering the argument self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The claims rest on standard projection techniques from prior literature plus new boundary construction and mild moment conditions; no explicit free parameters are fitted in the abstract description.

axioms (1)
  • domain assumption: Mild moment assumptions suffice for the canonical remainder to be negligible
    Invoked for the nondegenerate case reduction to time-uniform CLT.
invented entities (1)
  • SAGE boundary (no independent evidence)
    purpose: Time-uniform boundary for centered quadratic Gaussian-chaos processes arising in degenerate U-statistics
    Newly developed to handle the degenerate regime where standard Gaussian approximations fail.

pith-pipeline@v0.9.0 · 5734 in / 1318 out tokens · 67788 ms · 2026-05-20T21:14:15.144181+00:00 · methodology


