Null Measurability at the Symmetrization Interface in VC Learning
Pith reviewed 2026-05-08 03:53 UTC · model grok-4.3
The pith
For Borel-parameterized concept classes on Polish domains the one-sided ghost-gap event is analytic and measurable in every Borel measure completion.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For any Borel-parameterized concept class on a Polish domain, the bad event 'there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least ε/2' is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition.
What carries the argument
The analyticity of the one-sided ghost-gap bad event, which by Choquet capacitability guarantees measurability in the completion of any finite Borel measure.
If this is right
- The symmetrization argument yields PAC learnability from finite VC dimension under this weaker measurability condition in the realizable setting.
- The analytic/null-measurable level is closed under patching, fixed and countable interpolation, and fiber-product amalgamation.
- There exists a strict separation: some concept classes have null-measurable ghost-gap events that are not Borel.
- The results apply directly to the one-sided interface used by the standard symmetrization proof rather than to two-sided suprema.
Where Pith is reading between the lines
- The same analyticity argument may relax measurability assumptions in other uniform-convergence proofs that rely on symmetrization.
- The closure properties suggest that many naturally occurring classes built from simpler ones will inherit the weaker regularity.
- The Lean 4 formalization indicates the descriptive-set-theoretic steps can be verified mechanically for related theorems.
Load-bearing premise
The domain is Polish and the concept class is Borel-parameterized.
What would settle it
A Borel-parameterized concept class on a Polish domain whose ghost-gap bad event fails to be measurable in the completion of some finite Borel measure.
read the original abstract
Recent work revisiting measurability in the fundamental theorem of statistical learning imposes Borel measurability of ghost-gap suprema. We show that, at the one-sided ghost-gap interface actually used by the standard symmetrization proof, this requirement is stronger than necessary. For any Borel-parameterized concept class on a Polish domain, the bad event "there exists a hypothesis whose ghost empirical error exceeds its training empirical error by at least {\epsilon}/2" is analytic. By Choquet capacitability, it is therefore measurable in the completion of every finite Borel measure. We then construct a concept class whose bad event is null-measurable but not Borel, giving a strict separation from the Borel supremum condition. Finally, we prove closure under patching, fixed and countable interpolation, and fiber-product amalgamation, showing that the weaker regularity level is stable under natural concept-class constructors. In the realizable setting, where targets belong to the class and are measurable, these results weaken the measurability hypothesis needed by the symmetrization route from finite VC dimension to PAC learnability. The main results and the descriptive-set-theoretic infrastructure used by them are formalized in Lean 4.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper argues that the one-sided ghost-gap bad event arising in the standard symmetrization argument—i.e., the existence of a hypothesis whose ghost empirical error exceeds its training empirical error by at least ε/2—is an analytic set whenever the concept class is Borel-parameterized on a Polish domain. Choquet capacitability then yields measurability in the completion of every finite Borel measure. The authors supply an explicit separating construction of a class whose bad event is null-measurable but not Borel, establish closure of this regularity level under patching, fixed/countable interpolation and fiber-product amalgamation, and conclude that the symmetrization route from finite VC dimension to PAC learnability requires only this weaker condition in the realizable setting. All main results and the underlying descriptive-set-theoretic infrastructure are formalized in Lean 4.
Significance. If the claims hold, the work sharpens the measurability hypotheses in the fundamental theorem of statistical learning by showing that analyticity (rather than Borel measurability of the supremum) suffices at the symmetrization interface actually used by the proof. The explicit separation example demonstrates that the two regularity notions are strictly distinct, while the closure results indicate stability under natural class constructors. A notable strength is the Lean 4 formalization of both the analyticity argument and the supporting descriptive-set-theory lemmas, supplying machine-checked verification that increases in the technical correctness.
minor comments (3)
- [Abstract] Abstract: the list of closure operations (patching, fixed and countable interpolation, fiber-product amalgamation) is given without even a one-sentence gloss; a parenthetical reference to the section where each is defined would improve immediate readability.
- [Formalization section] §2 (or wherever the Lean formalization is described): a short table or enumerated list mapping the formalized theorems to the corresponding numbered statements in the paper would help readers who do not inspect the repository.
- [Introduction / Preliminaries] Notation for the ghost empirical error and the one-sided bad event is introduced gradually; an early, self-contained display equation collecting the definitions would make the analyticity claim easier to track.
Simulated Author's Rebuttal
We thank the referee for their positive summary of the paper, recognition of its significance in relaxing measurability requirements at the symmetrization interface, and the recommendation for minor revision. We are particularly pleased that the Lean 4 formalization is highlighted as increasing technical confidence. As the report lists no specific major comments under the MAJOR COMMENTS section, we have no point-by-point responses to major comments.
Circularity Check
No significant circularity; derivation is self-contained
full rationale
The paper derives analyticity of the one-sided ghost-gap bad event directly from the input assumption of Borel parameterization on a Polish domain, using standard descriptive-set-theoretic closure properties rather than any redefinition or self-referential construction. Choquet capacitability is invoked as an external theorem to obtain measurability in measure completions. The separation example is an explicit construction of a concept class, and closure under patching, interpolation, and amalgamation is proven by direct arguments. The Lean 4 formalization supplies independent machine-checked verification of the infrastructure and main claims. No steps reduce by construction to fitted parameters, self-citations, or ansatzes; the chain relies on external mathematical facts and formalization.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The domain is a Polish space
- domain assumption The concept class is Borel-parameterized
Reference graph
Works this paper leans on
- [1]
-
[2]
O. Bousquet, S. Hanneke, S. Moran, R. van Handel, and A. Yehudayoff. A theory of universal learning. InProceedings of the 53rd Annual ACM Symposium on Theory of Computing, pages 532–541, 2021
work page 2021
-
[3]
G. Choquet. Theory of capacities.Annales de l’Institut Fourier, 5:131–295, 1954
work page 1954
-
[4]
L. de Moura and S. Ullrich. The Lean 4 theorem prover and programming language. In Automated Deduction – CADE 28, pages 625–635, 2021
work page 2021
-
[5]
R. M. Dudley. A course on empirical processes. InÉcole d’Été de Probabilités de Saint-Flour XII–1982, volume 1097 ofLecture Notes in Mathematics, pages 1–142. Springer, 1984
work page 1982
-
[6]
R. M. Dudley and R. Durst. Empirical processes, Vapnik–Chervonenkis classes and Poisson processes.Probability and Mathematical Statistics, 1(2):109–115, 1981
work page 1981
-
[7]
Dhruv Gupta. Formal Learning Theory Kernel: a Lean 4 formalization of the fundamental theorem of statistical learning. Version v3.3.0-paper, commite48c0f6, 2026. 11
work page 2026
-
[8]
A. S. Kechris.Classical Descriptive Set Theory. Springer, 1995
work page 1995
- [9]
-
[10]
L. S. Krapp and L. Wirth. Measurability in the fundamental theorem of statistical learning,
-
[11]
Comparator: a trustworthy judge for Lean proofs
Lean Prover Community. Comparator: a trustworthy judge for Lean proofs. Software, 2025
work page 2025
-
[12]
S. Moran and A. Yehudayoff. Sample compression schemes for VC classes.Journal of the ACM, 63(3):21:1–21:21, 2016
work page 2016
- [13]
-
[14]
Pollard.Convergence of Stochastic Processes
D. Pollard.Convergence of Stochastic Processes. Springer Series in Statistics. Springer, 1984
work page 1984
-
[15]
N. Sauer. On the density of families of sets.Journal of Combinatorial Theory, Series A, 13(1):145–147, 1972
work page 1972
-
[16]
S. Shalev-Shwartz and S. Ben-David.Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014
work page 2014
-
[17]
S. Shelah. A combinatorial problem; stability and order for models and theories in infinitary languages.Pacific Journal of Mathematics, 41(1):247–261, 1972
work page 1972
-
[18]
M. Suslin. Sur une définition des ensembles mesurables B sans nombres transfinis.Comptes Rendus de l’Académie des Sciences, 164:88–91, 1917
work page 1917
-
[19]
A. W. van der Vaart and J. A. Wellner.Weak Convergence and Empirical Processes. Springer, 1996
work page 1996
-
[20]
A. W. van der Vaart and J. A. Wellner. Preservation theorems for Glivenko–Cantelli and uniform Glivenko–Cantelli classes. In E. Giné, D. M. Mason, and J. A. Wellner, editors,High Dimensional Probability II, volume 47 ofProgress in Probability, pages 115–133. Birkhäuser, 2000
work page 2000
-
[21]
V. N. Vapnik and A. Y. Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities.Theory of Probability and its Applications, 16(2):264–280, 1971. 12
work page 1971
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.