pith. sign in

arxiv: 2605.21492 · v1 · pith:G2HQKYFGnew · submitted 2026-04-08 · 💻 cs.LG · cs.AI· cs.LO· stat.ML

The Attribution Impossibility: No Feature Ranking Is Faithful, Stable, and Complete Under Collinearity

Pith reviewed 2026-05-22 01:34 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.LOstat.ML
keywords feature attributionexplainable AIcollinearityfeature rankingSHAPimpossibility theoremensemble methodsformal verification
0
0 comments X

The pith

No feature ranking can be faithful, stable, and complete at once when features are collinear.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that when input features are collinear, meaning one is a perfect linear function of another, it is impossible for any feature attribution method to produce rankings that are simultaneously faithful to the underlying model, stable against small perturbations, and complete in explaining the full output. This matters because collinearity appears in most real datasets and feature rankings underpin explanations, fairness audits, and model diagnostics. The authors prove the result from sixteen axioms, show that rankings for collinear pairs reduce to a coin flip, and classify all methods into two families: faithful-complete approaches that flip rankings up to half the time and ensemble approaches that report ties for symmetric features.

Core claim

No feature ranking can be simultaneously faithful, stable, and complete when features are collinear. For collinear pairs, ranking reduces to a coin flip. The design space theorem shows exactly two families of methods exist: faithful-complete methods, which are unstable with rankings that flip up to 50 percent of the time, and ensemble methods like DASH, which are stable and report ties for symmetric features. The attribution ratio diverges as 1 over 1 minus rho squared for gradient boosting, is infinite for Lasso, and converges for random forests. DASH is provably Pareto-optimal among unbiased aggregations and achieves the Cramer-Rao variance bound.

What carries the argument

The design space theorem derived from sixteen axioms on faithful, stable, and complete attribution, which partitions every possible method into one of two families and forces coin-flip behavior on perfectly collinear pairs.

If this is right

  • SHAP-based proxy discrimination audits are provably unreliable when collinear features are present.
  • Switching to conditional SHAP does not escape the impossibility when features have equal causal effects.
  • 68 percent of 77 public datasets exhibit attribution instability.
  • DASH achieves the Cramer-Rao lower bound on variance with a tight formula for ensemble size.
  • A Z-test workflow and single-model screening tool can diagnose instability in practice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners should first test for collinearity before relying on any ranking for high-stakes decisions.
  • Methods that explicitly report ties for collinear groups may become standard for stable explanations.
  • The same impossibility may apply to other attribution tasks such as local explanations or counterfactual generation under linear dependence.

Load-bearing premise

The sixteen formal axioms correctly capture what faithful, stable, and complete mean for real-world attribution requirements.

What would settle it

Discovery of a single attribution method that produces faithful, stable, and complete rankings on a dataset containing at least one pair of perfectly collinear features would falsify the impossibility.

Figures

Figures reproduced from arXiv: 2605.21492 by Bryan Arnold, David Rhoads, Drake Caraker.

Figure 1
Figure 1. Figure 1: The Attribution Impossibility. No feature ranking can be simultaneously faithful, stable, and [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Left: Within-group information lost by Dash (log2 m! bits per group, independent of M and ρ). Right: Between-group information per pair as a function of M. Dash sharpens between-group rankings with increasing M. At high ρ (≥ 0.9), convergence requires larger M. 5.5 Progressive DASH: Adaptive Ensemble Sizing Training M=25 models for every explanation is wasteful when most feature pairs are stable. Progressi… view at source ↗
Figure 3
Figure 3. Figure 3: The Attribution Design Space. Family A (single-model methods): faithful and complete but unstable (U=1/2). Family B (Dash ensemble): stable with ties (U=0, S increasing with M). The ideal point (S=1, U=0, C=complete) is infeasible. There is no third option. The single tradeoff axis. The Design Space Theorem collapses the three-dimensional space to a single axis: ensemble size M. M Stability S Unfaithfulnes… view at source ↗
Figure 4
Figure 4. Figure 4: Flip rate as a function of (ρ, ∆β). The impossibility region (dark) expands with ρ. At ρ ≥ 0.95, instability persists for all ∆β ≤ 1. 26 [PITH_FULL_IMAGE:figures/full_fig_p026_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Left: Multi-model Z-test statistic vs. flip rate on Breast Cancer (r = −0.89). Pairs below z = 1.96 (dashed) have unreliable rankings. Right: Single-model screen split-frequency diagnostic (r = −0.78). Both diagnostics predict which pairs are unstable. Restricted-range robustness. The headline correlation r = −0.89 between Zjk and flip rate on Breast Cancer includes many “easy” pairs with Z ≫ 10 and flip r… view at source ↗
Figure 6
Figure 6. Figure 6: Universal SNR calibration across 6 datasets (1,325 feature pairs). The theoretical [PITH_FULL_IMAGE:figures/full_fig_p036_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Empirical validation on synthetic Gaussian data ( [PITH_FULL_IMAGE:figures/full_fig_p038_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: SHAP ranking instability on Breast Cancer. Highly correlated pairs with similar importance (e.g., [PITH_FULL_IMAGE:figures/full_fig_p039_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Multi-model Z-test statistic vs. flip rate across four datasets (Breast Cancer, Ames, Digits, Adult). [PITH_FULL_IMAGE:figures/full_fig_p046_9.png] view at source ↗
read the original abstract

No feature ranking can be simultaneously faithful, stable, and complete when features are collinear. For collinear pairs, ranking reduces to a coin flip. We prove this impossibility, quantify it for four model classes, resolve it via ensemble averaging (DASH), and machine-verify it with 305 Lean 4 theorems. We characterize the complete attribution design space: exactly two families of methods exist -- faithful-complete methods (unstable, with rankings that flip up to 50% of the time) and ensemble methods like DASH (stable, reporting ties for symmetric features) -- and no method lies outside this dichotomy. The impossibility is quantitative: the attribution ratio diverges as 1/(1-rho^2) for gradient boosting, is infinite for Lasso, and converges for random forests. DASH (Diversified Aggregation of SHAP) is provably Pareto-optimal among unbiased aggregations, achieving the Cramer-Rao variance bound with a tight ensemble size formula. In a survey of 77 public datasets, 68% exhibit attribution instability. Switching to conditional SHAP does not escape the impossibility when features have equal causal effects. The framework includes practical diagnostics -- a Z-test workflow and single-model screening tool -- and has direct consequences for fairness auditing: SHAP-based proxy discrimination audits are provably unreliable under collinearity. The design space theorem, diagnostics, and impossibility are mechanically verified in Lean 4 (305 theorems from 16 axioms, 0 sorry) -- to our knowledge, the first formally verified impossibility in explainable AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper proves that no feature attribution method can simultaneously satisfy faithfulness, stability, and completeness when input features are collinear; for any collinear pair the induced ranking is equivalent to a fair coin flip. It formally characterizes the entire attribution design space as consisting of exactly two families (faithful-complete methods that are unstable, versus ensemble methods such as DASH that are stable but report ties), quantifies the divergence of the attribution ratio for gradient boosting, Lasso, and random forests, shows that DASH is Pareto-optimal among unbiased aggregations and attains the Cramér-Rao bound, reports that 68 % of 77 public datasets exhibit attribution instability, and demonstrates that conditional SHAP does not evade the impossibility when causal effects are equal. All core claims are machine-checked in Lean 4 (305 theorems, 16 axioms, zero sorries).

Significance. If the result holds, the work is significant for explainable AI. The machine-checked proof of an impossibility result is a rare and valuable contribution; the explicit two-family design-space theorem and the Cramér-Rao optimality of DASH supply both theoretical clarity and a concrete, provably stable alternative. The 77-dataset survey and the fairness-auditing implications give the claims immediate practical relevance. The formal verification strengthens confidence that the impossibility is not an artifact of informal reasoning.

minor comments (3)
  1. §4.2: the statement that 'DASH achieves the Cramér-Rao bound with a tight ensemble size formula' would benefit from an explicit cross-reference to the corresponding Lean theorem number so readers can locate the verified statement directly.
  2. Figure 3 caption: the legend for the 'coin-flip' baseline is not visually distinct from the DASH curve at small sample sizes; a different line style or marker would improve readability.
  3. The survey section reports 68 % instability but does not state the precise operational definition of 'instability' used to label a dataset (e.g., fraction of collinear pairs with ranking flip probability > 0.4). Adding this definition would aid reproducibility.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment, detailed summary of our contributions, and recommendation to accept. The recognition of the impossibility result, the two-family design-space characterization, the Cramér-Rao optimality of DASH, the empirical survey, and the Lean 4 formalization is appreciated and aligns with the manuscript's intent.

Circularity Check

0 steps flagged

No significant circularity; derivation is axiomatically self-contained

full rationale

The impossibility theorem and design-space dichotomy are derived directly from 16 stated axioms and mechanically verified via 305 Lean 4 theorems (0 sorry). The Cramer-Rao bound for DASH follows from the paper's own unbiased-aggregation assumptions as an independent consequence. No load-bearing step reduces to a fitted parameter, self-citation chain, or definitional renaming; the formal verification establishes independence from the target claims.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The formalization rests on 16 axioms whose specific content is not enumerated in the abstract; the central claim depends on domain definitions of attribution properties and collinearity rather than data-fitted constants or new postulated entities.

axioms (2)
  • domain assumption Formal definitions of faithful, stable, and complete feature attributions
    These definitions are invoked to prove that no method can satisfy all three simultaneously under collinearity.
  • domain assumption Collinearity modeled as linear dependence that forces ranking to a coin flip
    This modeling choice underpins the quantitative divergence results for gradient boosting, Lasso, and random forests.

pith-pipeline@v0.9.0 · 5821 in / 1469 out tokens · 67307 ms · 2026-05-22T01:34:34.294377+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages

  1. [1]

    Proceedings of the National Academy of Sciences , volume=

    Impossibility Theorems for Feature Attribution , author=. Proceedings of the National Academy of Sciences , volume=

  2. [2]

    Big Data , volume=

    Fair prediction with disparate impact: A study of bias in recidivism prediction instruments , author=. Big Data , volume=

  3. [3]

    Innovations in Theoretical Computer Science (ITCS) , year=

    Inherent trade-offs in the fair determination of risk scores , author=. Innovations in Theoretical Computer Science (ITCS) , year=

  4. [4]

    On the failings of

    Huang, Xuanxiang and Marques-Silva, Joao , journal=. On the failings of

  5. [5]

    Journal of Machine Learning Research , volume=

    Partial Order in Chaos: Consensus on Feature Attributions in the Rashomon Set , author=. Journal of Machine Learning Research , volume=

  6. [6]

    The Limits of

    Rao, Shrisha , journal=. The Limits of

  7. [7]

    International Conference on Machine Learning (ICML) , year=

    Position: Amazing Things Come From Having Many Good Models , author=. International Conference on Machine Learning (ICML) , year=

  8. [8]

    Advances in Neural Information Processing Systems , volume=

    A unified approach to interpreting model predictions , author=. Advances in Neural Information Processing Systems , volume=

  9. [9]

    1951 , publisher=

    Social Choice and Individual Values , author=. 1951 , publisher=

  10. [10]

    Social Choice Theory in

    Nipkow, Tobias , journal=. Social Choice Theory in

  11. [11]

    Statistical Learning Theory in

    Zhang, Yuanhe and Lee, Jason D and Liu, Fanghui , journal=. Statistical Learning Theory in. 2026 , url=

  12. [12]

    2024 , howpublished=

    Regulation (. 2024 , howpublished=

  13. [13]

    Hwang, Aida and Bell, Andrew and Fonseca, Jose and others , booktitle=

  14. [14]

    Advances in Neural Information Processing Systems , volume=

    Full-Gradient Representation for Neural Network Visualization , author=. Advances in Neural Information Processing Systems , volume=

  15. [15]

    Journal of Machine Learning Research , volume=

    All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously , author=. Journal of Machine Learning Research , volume=

  16. [16]

    Journal of Machine Learning Research , volume=

    Underspecification Presents Challenges for Credibility in Modern Machine Learning , author=. Journal of Machine Learning Research , volume=

  17. [17]

    International Conference on Machine Learning , year=

    Provably Better Explanations with Optimized Aggregation of Feature Attributions , author=. International Conference on Machine Learning , year=

  18. [18]

    Advances in Neural Information Processing Systems , year=

    Probabilistic Stability Guarantees for Feature Attributions , author=. Advances in Neural Information Processing Systems , year=

  19. [19]

    SSRN Working Paper , year=

    Mathematical Foundations of Explainability , author=. SSRN Working Paper , year=

  20. [20]

    Conference on Uncertainty in Artificial Intelligence , pages=

    Equivalence and Synthesis of Causal Models , author=. Conference on Uncertainty in Artificial Intelligence , pages=

  21. [21]

    2005 , publisher=

    Testing Statistical Hypotheses , author=. 2005 , publisher=

  22. [22]

    2009 , publisher=

    Introduction to Nonparametric Estimation , author=. 2009 , publisher=

  23. [23]

    Feature relevance quantification in explainable

    Janzing, Dominik and Minorics, Lenon and Bl. Feature relevance quantification in explainable. International Conference on Artificial Intelligence and Statistics , pages=

  24. [24]

    International Conference on Artificial Intelligence and Statistics , year=

    From Shapley Values to Generalized Additive Models and back , author=. International Conference on Artificial Intelligence and Statistics , year=

  25. [25]

    Advances in Neural Information Processing Systems , year=

    But Are You Sure? An Uncertainty-Aware Perspective on Explainability , author=. Advances in Neural Information Processing Systems , year=

  26. [26]

    Journal of Machine Learning Research , volume=

    Explaining by Removing: A Unified Framework for Model Explanation , author=. Journal of Machine Learning Research , volume=

  27. [27]

    Slack, Dylan and Hilgard, Sophie and Jia, Emily and Singh, Sameer and Lakkaraju, Himanshu , booktitle=. Fooling

  28. [28]

    ACM/IMS Journal of Data Science , year=

    On the Existence of Simpler Machine Learning Models , author=. ACM/IMS Journal of Data Science , year=

  29. [29]

    Algorithms to estimate

    Chen, Hugh and Covert, Ian C and Lundberg, Scott M and Lee, Su-In , journal=. Algorithms to estimate

  30. [30]

    2011 , howpublished=

    Supervisory Guidance on Model Risk Management (. 2011 , howpublished=

  31. [31]

    1946 , note=

    A Structure Theory for the Class of Exponential Families , author=. 1946 , note=

  32. [32]

    International Journal of Data Science and Analytics , year=

    General Pitfalls of Model-Agnostic Interpretation Methods for Machine Learning Models , author=. International Journal of Data Science and Analytics , year=

  33. [33]

    Machine Learning , year=

    Testing Conditional Independence in Supervised Learning Algorithms , author=. Machine Learning , year=

  34. [34]

    Explaining black box decisions by

    Mase, Masayoshi and Owen, Art B and Seiler, Benjamin , journal=. Explaining black box decisions by

  35. [35]

    The many

    Sundararajan, Mukund and Najmi, Amir , booktitle=. The many

  36. [36]

    Problems with

    Kumar, I Elizabeth and Venkatasubramanian, Suresh and Scheidegger, Carlos and Friedler, Sorelle , booktitle=. Problems with

  37. [37]

    Machine Learning , volume=

    Random Forests , author=. Machine Learning , volume=

  38. [38]

    Annals of Statistics , volume=

    Greedy Function Approximation: A Gradient Boosting Machine , author=. Annals of Statistics , volume=

  39. [39]

    ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , year=

    ``Why Should I Trust You?'': Explaining the Predictions of Any Classifier , author=. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining , year=

  40. [40]

    Advances in Neural Information Processing Systems , year=

    Unrestricted Permutation Forces Extrapolation , author=. Advances in Neural Information Processing Systems , year=

  41. [41]

    Journal of Computational and Graphical Statistics , year=

    Statistical Inference for Feature Rankings Under Model Multiplicity , author=. Journal of Computational and Graphical Statistics , year=

  42. [42]

    Explaining individual predictions when features are dependent: More accurate approximations to

    Aas, Kjersti and Jullum, Martin and L. Explaining individual predictions when features are dependent: More accurate approximations to. Artificial Intelligence , volume=

  43. [43]

    Asymmetric

    Frye, Christopher and Rowat, Colin and Feige, Ilya , booktitle=. Asymmetric

  44. [44]

    Wright, Marvin N and Watson, David S and K. Do. Statistics and Computing , year=

  45. [45]

    Contributions to the Theory of Games II , series=

    A Value for N-Person Games , author=. Contributions to the Theory of Games II , series=. 1953 , publisher=

  46. [46]

    de Moura, Leonardo and Ullrich, Sebastian , booktitle=. The

  47. [47]

    Certified Programs and Proofs (CPP) , year=

    The. Certified Programs and Proofs (CPP) , year=

  48. [48]

    Ghorbani, Amirata and Zou, James , booktitle=. Data

  49. [50]

    K. J. Arrow. Social Choice and Individual Values. Wiley, 1951

  50. [51]

    Bilodeau, N

    B. Bilodeau, N. Jaques, P. W. Koh, and B. Kim. Impossibility theorems for feature attribution. Proceedings of the National Academy of Sciences, 121 0 (2): 0 e2304406120, 2024

  51. [52]

    L. Breiman. Random forests. Machine Learning, 45: 0 5--32, 2001

  52. [53]

    Chouldechova

    A. Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5 0 (2): 0 153--163, 2017

  53. [54]

    D'Amour, K

    A. D'Amour, K. Heller, D. Moldovan, et al. Underspecification presents challenges for credibility in modern machine learning. Journal of Machine Learning Research, 23 0 (226): 0 1--61, 2022

  54. [55]

    de Moura and S

    L. de Moura and S. Ullrich. The L ean 4 theorem prover and programming language. In International Conference on Automated Deduction (CADE), 2021

  55. [56]

    Decker, L.-M

    T. Decker, L.-M. Herm, R. Feldhans, M. Kamp, and B. Hammer. Provably better explanations with optimized aggregation of feature attributions. In International Conference on Machine Learning, 2024

  56. [57]

    Regulation ( EU ) 2024/1689 (artificial intelligence act)

    EU Parliament . Regulation ( EU ) 2024/1689 (artificial intelligence act). Official Journal of the European Union, OJ L, 12.7.2024, 2024

  57. [58]

    Fisher, C

    A. Fisher, C. Rudin, and F. Dominici. All models are wrong, but many are useful: Learning a variable's importance by studying an entire class of prediction models simultaneously. Journal of Machine Learning Research, 20 0 (177): 0 1--81, 2019

  58. [59]

    J. H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29 0 (5): 0 1189--1232, 2001

  59. [60]

    Ghorbani and J

    A. Ghorbani and J. Zou. Data S hapley: Equitable valuation of data for machine learning. In International Conference on Machine Learning, 2019

  60. [61]

    Herren and P

    A. Herren and P. R. Hahn. Statistical inference for feature rankings under model multiplicity. Journal of Computational and Graphical Statistics, 2023

  61. [62]

    Hooker, L

    G. Hooker, L. Mentch, and S. Zhou. Unrestricted permutation forces extrapolation. In Advances in Neural Information Processing Systems, 2021

  62. [63]

    Huang and J

    X. Huang and J. Marques-Silva. On the failings of S hapley values for explainability. International Journal of Approximate Reasoning, 171: 0 109112, 2024

  63. [64]

    Hunt and C

    G. Hunt and C. Stein. A structure theory for the class of exponential families. 1946. Unpublished manuscript; results presented in Lehmann & Romano (2005), Ch. 6

  64. [65]

    Y. Jin, R. Gao, and B. Kim. Probabilistic stability guarantees for feature attributions. In Advances in Neural Information Processing Systems, 2025

  65. [66]

    Kleinberg, S

    J. Kleinberg, S. Mullainathan, and M. Raghavan. Inherent trade-offs in the fair determination of risk scores. In Innovations in Theoretical Computer Science (ITCS), 2017

  66. [67]

    Krishna, T

    S. Krishna, T. Han, A. Ber, J. Brinton, A. Lakshminarayanan, C. Robinson, and H. Lakkaraju. The disagreement problem in explainable machine learning: A practitioner's perspective. arXiv preprint arXiv:2202.01602, 2022

  67. [68]

    Laberge, Y

    G. Laberge, Y. Pequignot, A. Mathieu, F. Khomh, and M. Marchand. Partial order in chaos: Consensus on feature attributions in the rashomon set. Journal of Machine Learning Research, 24 0 (364): 0 1--50, 2023

  68. [69]

    E. L. Lehmann and J. P. Romano. Testing Statistical Hypotheses. Springer, 3rd edition, 2005

  69. [70]

    S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, volume 30, 2017

  70. [71]

    T. Nipkow. Social choice theory in HOL : Arrow and G ibbard- S atterthwaite. Journal of Automated Reasoning, 43: 0 289--304, 2009

  71. [72]

    Noguer i Alonso

    M. Noguer i Alonso. Mathematical foundations of explainability. SSRN Working Paper, 2025

  72. [73]

    Supervisory guidance on model risk management ( SR 11-7/ OCC 2011-12)

    Office of the Comptroller of the Currency . Supervisory guidance on model risk management ( SR 11-7/ OCC 2011-12). OCC Bulletin 2011-12, 2011

  73. [74]

    S. Rao. The limits of AI explainability: An algorithmic information theory approach. arXiv preprint arXiv:2504.20676, 2025

  74. [75]

    M. T. Ribeiro, S. Singh, and C. Guestrin. ``why should i trust you?'': Explaining the predictions of any classifier. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

  75. [76]

    Rudin et al

    C. Rudin et al. Position: Amazing things come from having many good models. In International Conference on Machine Learning (ICML), 2024

  76. [77]

    L. S. Shapley. A value for n-person games. In Contributions to the Theory of Games II, number 28 in Annals of Mathematics Studies, pages 307--317. Princeton University Press, 1953

  77. [78]

    Srinivas and F

    S. Srinivas and F. Fleuret. Full-gradient representation for neural network visualization. In Advances in Neural Information Processing Systems, volume 32, 2019

  78. [79]

    The L ean mathematical library

    The mathlib Community . The L ean mathematical library. In Certified Programs and Proofs (CPP), 2020

  79. [80]

    A. B. Tsybakov. Introduction to Nonparametric Estimation. Springer, 2009

  80. [81]

    Zhang, J

    Y. Zhang, J. D. Lee, and F. Liu. Statistical learning theory in L ean 4: Empirical processes from scratch. arXiv preprint arXiv:2602.02285, 2026. URL https://arxiv.org/abs/2602.02285