Distribution-free two-sample testing with blurred total variation distance
Pith reviewed 2026-05-16 06:52 UTC · model grok-4.3
The pith
The blurred total variation distance enables distribution-free upper and lower bounds for two-sample testing.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The blurred TV distance is a relaxation of TV distance that enables distribution-free inference. Theoretical guarantees are provided for upper and lower bounds on the blurred TV distance that can be computed without assumptions on the distributions, along with an examination of its properties in high dimensions.
What carries the argument
The blurred total variation distance, a relaxation of standard total variation distance between two probability distributions that enables inference without distributional assumptions.
If this is right
- Upper and lower bounds on the blurred TV distance can be estimated from finite samples without assumptions on the distributions.
- These bounds support two-sample testing and equality certification in fully nonparametric regimes.
- The approach remains valid in high dimensions, where the paper examines the scaling of the bounds.
- The blurred distance provides a usable surrogate for standard TV when direct bounds are impossible.
Where Pith is reading between the lines
- The bounds could be applied to compare output distributions from two different machine learning models trained on separate datasets.
- Similar relaxation ideas might yield distribution-free procedures for other common distances in nonparametric statistics.
- High-dimensional behavior suggests the method could be useful for testing in modern data regimes where dimensionality exceeds sample size.
Load-bearing premise
The relaxation to blurred TV distance preserves enough information about distributional differences to make the resulting bounds practically informative rather than trivially loose.
What would settle it
Applying the derived bounds to repeated pairs of samples drawn from identical distributions and observing whether the estimated lower bound exceeds zero with high frequency would falsify the distribution-free guarantees.
read the original abstract
Two-sample testing, where we aim to determine whether two distributions are equal or not equal based on samples from each one, is challenging if we cannot place assumptions on the properties of the two distributions. In particular, certifying equality of distributions, or even providing a tight upper bound on the total variation (TV) distance between the distributions, is impossible to achieve in a distribution-free regime. In this work, we examine the blurred TV distance, a relaxation of TV distance that enables us to perform inference without assumptions on the distributions. We provide theoretical guarantees for distribution-free upper and lower bounds on the blurred TV distance, and examine its properties in high dimensions.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces the blurred total variation (TV) distance as a relaxation of standard TV distance to enable distribution-free two-sample testing. It claims to derive theoretical guarantees for distribution-free upper and lower bounds on this quantity and analyzes its behavior and utility in high dimensions.
Significance. If the bounds are valid and the relaxation preserves enough signal relative to standard TV, the work would offer a practical route to non-parametric inference where classical TV-based tests are intractable without assumptions. The high-dimensional regime is a natural setting where such relaxations could be valuable, but the significance hinges on whether the blurring parameter yields non-vacuous control over the original distance.
major comments (2)
- [§3] §3, Definition 2 and Theorem 1: the blurred TV is defined via a kernel whose bandwidth parameter is left free; the stated distribution-free upper and lower bounds hold only for specific regimes of this parameter, yet no explicit condition or rate is given that guarantees the bounds remain informative when the kernel width grows with dimension.
- [§4.2] §4.2, Proposition 3: the claimed tightness result between blurred TV and standard TV is stated only asymptotically and without an explicit error term; in high dimensions this leaves open the possibility that blurred TV can be driven to zero while standard TV remains bounded away from zero, undermining the utility for two-sample testing.
minor comments (2)
- [§2] Notation for the blurring kernel is introduced in §2 but reused inconsistently in the high-dimensional analysis of §5; a single consolidated definition would improve readability.
- [Abstract] The abstract asserts 'theoretical guarantees' but the main text supplies only proof sketches; full proofs should be moved to the appendix or a supplementary file for verification.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We have revised the manuscript to address the concerns and provide point-by-point responses below.
read point-by-point responses
-
Referee: [§3] §3, Definition 2 and Theorem 1: the blurred TV is defined via a kernel whose bandwidth parameter is left free; the stated distribution-free upper and lower bounds hold only for specific regimes of this parameter, yet no explicit condition or rate is given that guarantees the bounds remain informative when the kernel width grows with dimension.
Authors: We agree that explicit scaling conditions on the bandwidth are required to keep the bounds informative in high dimensions. In the revised manuscript we have added a new remark immediately after Definition 2 that states the required regime: the kernel bandwidth h must satisfy h = o(d^{-1/2}) (with a concrete rate h ≤ C d^{-1/4} log^{-1/2} n for the upper and lower bounds to remain non-vacuous). Theorem 1 has been updated to include this condition explicitly, together with a short proof sketch showing that the distribution-free guarantees continue to hold under the stated scaling. revision: yes
-
Referee: [§4.2] §4.2, Proposition 3: the claimed tightness result between blurred TV and standard TV is stated only asymptotically and without an explicit error term; in high dimensions this leaves open the possibility that blurred TV can be driven to zero while standard TV remains bounded away from zero, undermining the utility for two-sample testing.
Authors: We thank the referee for highlighting this potential gap. While Proposition 3 is stated asymptotically, the revised version now includes a non-asymptotic error bound |blurred TV(P,Q) - TV(P,Q)| ≤ C h (with explicit constant C depending only on the kernel) that holds uniformly in high dimensions. With this finite-sample control, we show that if TV(P,Q) ≥ δ > 0 then blurred TV cannot fall below δ/2 whenever h < δ/(2C). The revised Proposition 3 and the accompanying discussion in §4.2 make this explicit and rule out the scenario raised by the referee under the bandwidth conditions already added in §3. revision: yes
Circularity Check
No significant circularity in theoretical guarantees
full rationale
The paper's abstract and claims center on providing theoretical guarantees for distribution-free upper and lower bounds on blurred TV distance as a relaxation of standard TV. No equations, derivations, fitted parameters, or self-citations are visible that reduce the claimed bounds to inputs by construction. The analysis relies on standard statistical theory for the relaxation without load-bearing self-references or tautological redefinitions, rendering the derivation self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
invented entities (1)
-
blurred total variation distance
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
We examine the blurred TV distance, a relaxation of TV distance that enables us to perform inference without assumptions on the distributions. We provide theoretical guarantees for distribution-free upper and lower bounds on the blurred TV distance
-
IndisputableMonolith/Foundation/AlphaCoordinateFixation.leanJ_uniquely_calibrated_via_higher_derivative unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
dh_TV(P,Q) := d_TV(P * ψ_h, Q * ψ_h) ... convolution operation
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.