pith. sign in

arxiv: 2605.09221 · v1 · submitted 2026-05-09 · 💻 cs.LG · cs.AI

The Pok\'emon Theorem and other Fairness Impossibility Results

Pith reviewed 2026-05-12 03:13 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords fairness impossibilityPokémon theoremRKHSconditional mean embeddingsmaximum mean discrepancyfair feature learningbase rates
0
0 comments X

The pith

Fairness criteria act as linear constraints on conditional mean embeddings in an RKHS, overdetermined by unequal base rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper unifies several fairness impossibility results by representing them in a shared reproducing kernel Hilbert space geometry. Fairness notions become linear constraints on conditional mean embeddings of the data. When base rates differ across groups, the law of total expectation creates an overdetermined system that cannot be satisfied exactly. This yields the Pokémon theorem, which shows any finite collection of such constraints leaves a residual maximum mean discrepancy violation that decays at the Kolmogorov m-width rate under spectral regularity. The same geometry proves that requiring both parity and class-conditional separation in learned representations forces class collapse when base rates are unequal.

Core claim

Fairness criteria are linear constraints on conditional mean embeddings, and unequal base rates make the law of total expectation overdetermine those constraints. This view recovers the Kleinberg-Mullainathan-Raghavan dichotomy from group-conditional unbiasedness alone, proves the Pokémon theorem that any finite linear mean-fairness criteria leave a residual MMD violation decaying at the Kolmogorov m-width rate, and shows that parity plus class-conditional separation in representation space forces class collapse under unequal base rates. Approximate relaxations produce explicit signal and error frontiers.

What carries the argument

Conditional mean embeddings in an RKHS, which turn fairness criteria into linear constraints that become overdetermined when base rates differ.

If this is right

  • The Kleinberg-Mullainathan-Raghavan result follows from group-conditional unbiasedness alone.
  • Any finite set of linear mean-fairness criteria leaves a detectable residual MMD violation between groups.
  • Parity and class-conditional separation together force class collapse in representation space when base rates differ.
  • Approximate fairness relaxations admit explicit trade-offs between estimator error and fairness violation via signal and error frontiers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The decay rate of the residual MMD could be measured directly on fairness benchmark datasets to check spectral regularity assumptions.
  • The linear-constraint view may extend to other fairness definitions by embedding them into the same RKHS.
  • Algorithm designers could minimize the explicit MMD residual as a practical proxy for the impossibility gap.

Load-bearing premise

Fairness criteria can be faithfully represented as linear constraints on conditional mean embeddings in an RKHS and the law of total expectation overdetermines these constraints precisely when base rates are unequal.

What would settle it

A dataset with unequal base rates in which some model satisfies multiple linear mean-fairness criteria with exactly zero MMD between groups, or learns representations achieving both parity and separation without class collapse.

Figures

Figures reproduced from arXiv: 2605.09221 by Alex Smola, Daniel Matsui Smola.

Figure 1
Figure 1. Figure 1: Empirical illustrations on standard fairness benchmarks. [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
read the original abstract

Fairness impossibility results often look like distinct scalar incompatibility statements. We show that several share one RKHS geometry: fairness criteria are linear constraints on conditional mean embeddings, and unequal base rates make the law of total expectation overdetermine those constraints. This view yields four results. The Kleinberg--Mullainathan--Raghavan dichotomy needs only group-conditional unbiasedness, not full calibration. The \emph{Pok\'emon theorem} shows that a distinct group pair satisfying any finite collection of linear mean-fairness criteria leaves a residual violation witnessed by the MMD, decaying at the Kolmogorov $m$-width rate under spectral regularity. The same tools prove an impossibility for fair feature learning: parity and class-conditional separation in representation space force class collapse under unequal base rates. The approximate relaxations yield signal and error frontiers, allowing a trade-off between real-world estimators and fairness goals. Experiments on standard fairness benchmarks are consistent with our bounds.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript unifies several fairness impossibility results in machine learning by modeling fairness criteria as linear constraints on conditional mean embeddings in a reproducing kernel Hilbert space (RKHS). It shows that unequal base rates cause the law of total expectation to overdetermine these constraints, producing a nonzero residual violation witnessed by maximum mean discrepancy (MMD). The central Pokémon theorem bounds this residual's MMD decay at the Kolmogorov m-width rate under spectral regularity of the kernel operator. Additional results include a simplified view of the Kleinberg-Mullainathan-Raghavan dichotomy requiring only group-conditional unbiasedness, an impossibility for fair feature learning where parity and class-conditional separation force class collapse under unequal base rates, and approximate relaxations yielding signal-error frontiers. Experiments on standard fairness benchmarks are reported as consistent with the bounds.

Significance. If the derivations hold, the work is significant for supplying a geometric unification of fairness impossibilities with quantitative rates and trade-off frontiers, leveraging standard RKHS tools from kernel methods. This could enable tighter integration with existing kernel-based fairness techniques and guide practical relaxations. The explicit use of m-width bounds and the law of total expectation provides falsifiable predictions that strengthen the contribution beyond purely qualitative impossibilities.

major comments (2)
  1. [Pokémon theorem] Pokémon theorem section (abstract and main development): The claimed MMD residual decay at the Kolmogorov m-width rate under spectral regularity is load-bearing for the quantitative claim. No verification is provided that standard kernels (Gaussian, polynomial) or fairness datasets satisfy the required eigenvalue decay of the covariance operator on H; slow decay would make the bound vacuous, as noted in the stress-test concern. Adding a check, counterexample, or explicit assumption statement is needed.
  2. [Fair feature learning impossibility] Fair feature learning impossibility (abstract): The proof that parity and class-conditional separation force class collapse under unequal base rates rests on representing these as linear constraints on conditional mean embeddings μ_{X|G}. The step where the law of total expectation produces an overdetermined system should be shown explicitly (e.g., via an equation for the orthogonal complement residual) to confirm it is not an artifact of the embedding choice.
minor comments (2)
  1. [Abstract] Abstract: The statement that experiments are 'consistent with our bounds' lacks a cross-reference to the specific theorem or section containing the bounds, reducing clarity for readers.
  2. [Preliminaries] Notation throughout: The definition and use of conditional mean embeddings and the RKHS H should be introduced with a dedicated preliminary subsection to ensure consistent interpretation of the linear functionals.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which highlight important points for strengthening the quantitative claims and proof clarity in our manuscript. We address each major comment below and will make the suggested revisions to improve the presentation.

read point-by-point responses
  1. Referee: [Pokémon theorem] Pokémon theorem section (abstract and main development): The claimed MMD residual decay at the Kolmogorov m-width rate under spectral regularity is load-bearing for the quantitative claim. No verification is provided that standard kernels (Gaussian, polynomial) or fairness datasets satisfy the required eigenvalue decay of the covariance operator on H; slow decay would make the bound vacuous, as noted in the stress-test concern. Adding a check, counterexample, or explicit assumption statement is needed.

    Authors: We agree that the rate's applicability requires spectral regularity and that this should be stated explicitly to avoid any perception of vacuity. In the revised manuscript, we will add a dedicated assumption paragraph in the Pokémon theorem section stating: 'We assume the covariance operator C_H on the RKHS satisfies eigenvalue decay λ_k = O(k^{-α}) for some α > 1, which ensures the Kolmogorov m-width decays as O(m^{-α}). This holds for the Gaussian kernel on compact input domains (exponential decay) and polynomial kernels of fixed degree (finite rank, hence faster decay).' We will also include a brief verification note referencing standard results in kernel methods literature and confirm that the empirical MMD residuals in our experiments on fairness benchmarks are consistent with this decay under the kernels used. This clarifies the conditions without altering the theorem's statement. revision: yes

  2. Referee: [Fair feature learning impossibility] Fair feature learning impossibility (abstract): The proof that parity and class-conditional separation force class collapse under unequal base rates rests on representing these as linear constraints on conditional mean embeddings μ_{X|G}. The step where the law of total expectation produces an overdetermined system should be shown explicitly (e.g., via an equation for the orthogonal complement residual) to confirm it is not an artifact of the embedding choice.

    Authors: We thank the referee for this suggestion to make the derivation more transparent. In the revision, we will expand the fair feature learning impossibility section with an explicit step-by-step derivation. We will write the parity constraint as ⟨μ_{X|G=0} - μ_{X|G=1}, f⟩_H = 0 for all f ∈ H and the class-conditional separation as ⟨μ_{X|Y=y,G=0} - μ_{X|Y=y,G=1}, f⟩_H = 0. Applying the law of total expectation to the overall embedding μ_X = P(G=0)μ_{X|G=0} + P(G=1)μ_{X|G=1} and substituting the constraints, we show that unequal base rates P(G=0) ≠ P(G=1) force the only solution satisfying all linear constraints simultaneously to be μ_{X|Y=0} = μ_{X|Y=1} (class collapse). We will add the orthogonal complement residual equation: let V be the subspace spanned by the constraint functionals; then the residual r = μ_X - proj_V μ_X satisfies ||r||_H ≥ |P(G=0) - P(G=1)| · δ where δ is the separation gap, witnessed by MMD. This demonstrates the overdetermination arises directly from the geometry and base-rate weighting, independent of embedding choice. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation uses standard RKHS geometry and law of total expectation as external inputs

full rationale

The paper's chain begins with the standard representation of fairness criteria as linear functionals on conditional mean embeddings in an RKHS, then invokes the law of total expectation to produce an overdetermined system when base rates differ. The Pokémon theorem then bounds the resulting MMD residual via the Kolmogorov m-width of the constraint subspace under the additional assumption of spectral regularity on the covariance operator. These steps rest on textbook RKHS properties, MMD definitions, and m-width rates from approximation theory; none of the provided equations or claims reduce a derived quantity to a fitted parameter, a self-citation, or a renamed input. Spectral regularity is explicitly stated as a hypothesis rather than proved from the result itself. No self-citation load-bearing steps or ansatz smuggling appear in the abstract or described results. The derivation is therefore self-contained against external mathematical benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claims rest on representing fairness criteria as linear constraints in an RKHS and on the overdetermination induced by unequal base rates via the law of total expectation.

axioms (2)
  • domain assumption Fairness criteria are linear constraints on conditional mean embeddings in an RKHS.
    Stated as the unifying geometry in the abstract.
  • domain assumption Unequal base rates cause the law of total expectation to overdetermine the fairness constraints.
    Key premise that generates the impossibility results.

pith-pipeline@v0.9.0 · 5459 in / 1308 out tokens · 53995 ms · 2026-05-12T03:13:23.549214+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

9 extracted references · 9 canonical work pages

  1. [1]

    2023 , isbn =

    doi: 10.1145/3593013.3594007. Rachel K. E. Bellamy, Kuntal Dey, Michael Hind, Samuel C. Hoffman, Stephanie Houde, Kalapriya Kannan, Pranay Lohia, Jacquelyn Martino, Sameep Mehta, Aleksandra Mojsilovi´c, Seema Nagar, Karthikeyan Nate- san Ramamurthy, John Richards, Diptikalyan Saha, Prasanna Sattigeri, Moninder Singh, Kush R. Varshney, and Yunfeng Zhang. A...

  2. [2]

    Benjamini and Y

    doi: 10.1111/j.2517-6161.1995.tb02031.x. Andrea Caponnetto and Ernesto De Vito. Optimal rates for the regularized least-squares algorithm.Foundations of Computational Mathematics, 7(3):331–368, 2007. doi: 10.1007/s10208-006-0196-8. Tianqi Chen and Carlos Guestrin. XGBoost: A scalable tree boosting system. InProceedings of the 22nd ACM SIGKDD International...

  3. [3]

    Frances Ding, Moritz Hardt, John Miller, and Ludwig Schmidt

    doi: 10.1111/j.1745-3984.1971.tb00908.x. Frances Ding, Moritz Hardt, John Miller, and Ludwig Schmidt. Retiring adult: New datasets for fair machine learning. InAdvances in Neural Information Processing Systems 34 (NeurIPS 2021), pages 6478–6490, 2021. URL https://proceedings.neurips.cc/paper_files/paper/2021/file/ 32e54441e6382a7fbacbbbaf3c450059-Paper.pd...

  4. [4]

    URLhttps://proceedings.mlr.press/v80/hebert-johnson18a.html

    PMLR, 2018. URLhttps://proceedings.mlr.press/v80/hebert-johnson18a.html. Ben Hutchinson and Margaret Mitchell. 50 years of test (un)fairness: Lessons for machine learning. In Proceedings of the Conference on Fairness, Accountability, and Transparency (FAT* 2019), pages 49–58,

  5. [5]

    Elisa Celis, Sayash Kapoor, Farnood Salehi, and Nisheeth Vishnoi

    doi: 10.1145/3287560.3287600. Nikola Jovanovi´c, Mislav Balunovi ´c, Dimitar Iliev Dimitrov, and Martin Vechev. FARE: Provably fair representation learning with practical certificates. InProceedings of the 40th International Conference on Machine Learning (ICML 2023), volume 202 ofPMLR, pages 15401–15420, 2023. URL https: //proceedings.mlr.press/v202/jova...

  6. [6]

    Adrián Pérez-Suay, Paula Gordaliza, Jean-Michel Loubes, Dino Sejdinovic, and Gustau Camps-Valls

    doi: 10.1007/978-3-319-71249-9_21. Adrián Pérez-Suay, Paula Gordaliza, Jean-Michel Loubes, Dino Sejdinovic, and Gustau Camps-Valls. Fair kernel regression through cross-covariance operators.Transactions on Machine Learning Research, 2023. URLhttps://openreview.net/forum?id=MyQ1e1VQQ3. Nancy S. Petersen and Melvin R. Novick. An evaluation of some models fo...

  7. [7]

    2018 , Bdsk-Url-1 =

    URLhttps://papers.neurips.cc/paper/7151-on-fairness-and-calibration. Michael Reed and Barry Simon.Methods of Modern Mathematical Physics, Vol. I: Functional Analysis. Academic Press, New York, revised and enlarged edition, 1980. ISBN 9780125850506. Bernhard Schölkopf and Alexander J. Smola.Learning with Kernels: Support Vector Ma- chines, Regularization, ...

  8. [8]

    linear mean-fairness criteria

    URLhttps://www.jmlr.org/papers/v24/23-0389.html. Shaokui Wei, Jiayin Liu, Bing Li, and Hongyuan Zha. Mean parity fair regression in RKHS. InProceedings of the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023), volume 206 ofPMLR, pages 4602–4628, 2023. URLhttps://proceedings.mlr.press/v206/wei23a.html. Susan Wei and Mar...

  9. [9]

    Empirical base rates on the chosen subsample:ˆpa ≈0.44,ˆpb ≈0.36,| c∆p| ≈0.085, cPr[G=a]≈0.62. Splits.All datasets are partitioned into train / validation / test at fractions 0.70/0.15/0.15, stratified jointly on (Y, G) with a minimum of 10 samples per (y, g)∈ {0,1} × {a, b} subgroup per split (assertion-enforced). The split is deterministic given the see...