Approximate Replicability in Learning

Christopher Ye; Max Hopkins; Russell Impagliazzo

arxiv: 2510.20200 · v2 · submitted 2025-10-23 · 💻 cs.LG

Approximate Replicability in Learning

Max Hopkins , Russell Impagliazzo , Christopher Ye This is my paper

Pith reviewed 2026-05-18 04:09 UTC · model grok-4.3

classification 💻 cs.LG

keywords replicabilityPAC learningagnostic learningsample complexitystabilityrelaxed replicabilitymachine learning

0 comments

The pith

Three relaxed versions of replicability allow near-optimal agnostic PAC learners.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Full replicability demands that algorithms output identical hypotheses when their training sets are resampled using shared randomness, yet this strict stability is impossible even for threshold learning. The paper defines three natural relaxations: pointwise replicability, which requires consistency only on any single fixed input; approximate replicability, which requires the output hypothesis to be consistent on most of the underlying distribution; and semi-replicability, which allows full replicability plus shared unlabeled samples. Under constant replicability parameters, each relaxation admits agnostic PAC learners whose sample complexity is close to the optimum without any replicability constraint.

Core claim

In all three cases, for constant replicability we obtain close to sample-optimal agnostic PAC learners: pointwise and approximate replicability are achievable using O(d/α² + 1/α⁴) samples, while semi-replicability requires Θ(d²/α²) labeled samples.

What carries the argument

The three relaxations of replicability (pointwise consistency on fixed inputs, approximate consistency on most of the distribution, and semi-replicability with shared unlabeled data) that weaken full stability while preserving learning guarantees.

If this is right

Pointwise replicability admits agnostic PAC learners with O(d/α² + 1/α⁴) samples.
Approximate replicability achieves the same sample bound.
Semi-replicability solves the task using Θ(d²/α²) labeled samples together with shared unlabeled data.
All three bounds are close to the information-theoretic optimum for agnostic PAC learning without replicability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

These relaxations could enable stable algorithms for tasks such as threshold learning that are known to be impossible under strict replicability.
The gap between the O(d/α²) and Θ(d²/α²) bounds highlights how access to unlabeled data can reduce labeled-sample demand while retaining replicability.
The same relaxations might be tested for compatibility with other stability requirements such as differential privacy.

Load-bearing premise

The three proposed relaxations are natural and practically meaningful approximations to full replicability.

What would settle it

A distribution and hypothesis class where every constant-replicable pointwise learner requires asymptotically more than O(d/α² + 1/α⁴) samples would refute the near-optimality result.

read the original abstract

Replicability, introduced by (Impagliazzo et al. STOC '22), is the notion that algorithms should remain stable under a resampling of their inputs (given access to shared randomness). While a strong and interesting notion of stability, the cost of replicability can be prohibitive: there is no replicable algorithm, for instance, for tasks as simple as threshold learning (Bun et al. STOC '23). Given such strong impossibility results we ask: under what approximate notions of replicability is learning possible? In this work, we propose three natural relaxations of replicability in the context of PAC learning: (1) Pointwise: the learner must be consistent on any fixed input, but not across all inputs simultaneously, (2) Approximate: the learner must output hypotheses that classify most of the distribution consistently, (3) Semi: the algorithm is fully replicable, but may additionally use shared unlabeled samples. In all three cases, for constant replicability we obtain close to sample-optimal agnostic PAC learners: 1) and 2) are achievable using $O(d/\alpha^2 + 1/\alpha^{4})$ samples, while 3) requires $\Theta(d^2/\alpha^2)$ labeled samples.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Relaxing replicability via pointwise, approximate, and semi notions recovers near-optimal agnostic PAC sample bounds that strict replicability rules out.

read the letter

The main point is that three targeted relaxations of replicability let the authors get sample complexities close to the usual agnostic PAC rates for constant replicability parameters, even though full replicability is impossible for simple tasks like threshold learning. They define pointwise replicability as stability on any fixed input without requiring it across all inputs simultaneously, approximate replicability as output hypotheses that agree on most of the distribution, and semi-replicability as full replicability plus shared unlabeled samples. For the first two they achieve O(d/α² + 1/α⁴) samples; the third requires Θ(d²/α²) labeled samples. These bounds are new and match the relaxations they introduce. The paper does the definitions cleanly in section 3 and derives the upper bounds from uniform convergence plus a stabilization step whose cost lines up with the stated expressions. The lower bounds follow standard information-theoretic arguments. The stress-test note confirms no hidden assumptions break the derivations, and the work sits on top of the earlier impossibility results without circularity. One soft spot is that the practical usefulness of these particular relaxations still needs more motivation beyond escaping the known impossibilities; they are formally coherent but could feel tailored to the math. The citation pattern is appropriate and points to the right prior papers. This is for learning theorists who care about stability constraints on PAC learners. A reader working on replicability or related algorithmic stability questions will get concrete, usable bounds and clear definitions to build on. It deserves a serious referee because the results are technically grounded and address a concrete gap left by the strong negative results.

Referee Report

1 major / 3 minor

Summary. The paper introduces three relaxations of replicability (pointwise, approximate, and semi-replicability) in the PAC learning setting. It shows that, for constant replicability parameters, these allow agnostic PAC learners with sample complexities O(d/α² + 1/α⁴) for pointwise and approximate replicability, and Θ(d²/α²) labeled samples for semi-replicability, building on prior impossibility results for full replicability.

Significance. If the results hold, this work is significant as it identifies practical approximations to replicability that restore learnability for tasks where full replicability is impossible (e.g., threshold learning). The formal definitions and derivations from uniform convergence plus stabilization yield near-optimal bounds, extending Impagliazzo et al. (STOC '22) and Bun et al. (STOC '23) with new stability conditions that suffice for PAC guarantees.

major comments (1)

§3: The stability conditions for the three relaxations are defined precisely enough to support the PAC claims in Theorems 4.1, 4.3 and 5.2, but the manuscript should explicitly verify that these conditions are strictly weaker than full replicability (i.e., recover the original definition only in the limit of the approximation parameters).

minor comments (3)

Abstract: The phrase 'close to sample-optimal' would be strengthened by a one-sentence comparison to the standard agnostic PAC bound O(d/α² + 1/α²), clarifying the precise overhead introduced by each relaxation.
§4.1 and §4.3: The replicability-stabilization step is invoked to obtain the stated sample bounds; a short high-level description of how the overhead is derived (beyond 'standard uniform convergence') would improve accessibility.
Theorem 5.2: The Θ(d²/α²) lower bound for semi-replicability is stated for labeled samples; confirming whether the quadratic d dependence persists when unlimited shared unlabeled samples are available would clarify the result.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the positive assessment and constructive feedback. The single major comment is addressed below; we will incorporate the requested clarification as a minor revision.

read point-by-point responses

Referee: §3: The stability conditions for the three relaxations are defined precisely enough to support the PAC claims in Theorems 4.1, 4.3 and 5.2, but the manuscript should explicitly verify that these conditions are strictly weaker than full replicability (i.e., recover the original definition only in the limit of the approximation parameters).

Authors: We agree that an explicit verification would improve clarity. In the revised manuscript we will add a short remark (or subsection) in §3 showing that each of the three definitions recovers the original replicability notion of Impagliazzo et al. (STOC '22) in the appropriate limit: for pointwise replicability as the per-point failure probability tends to zero; for approximate replicability as the measure of the disagreeing set tends to zero; and for semi-replicability as the number of shared unlabeled samples tends to infinity. The argument is by direct substitution into the respective definitions and does not alter any of the existing theorems or proofs. revision: yes

Circularity Check

0 steps flagged

Minor self-citation to replicability definition; new relaxations and bounds derived independently

full rationale

The paper formally introduces three new relaxations of replicability (pointwise, approximate, and semi) in Section 3 with precise stability conditions that directly enable the PAC guarantees. Sample bounds in Theorems 4.1, 4.3, and 5.2 are obtained via standard uniform convergence combined with a replicability-stabilization overhead, without any reduction to fitted parameters, self-definitions, or ansatzes. The citation to Impagliazzo et al. (STOC '22) supplies the baseline full-replicability notion and known impossibilities rather than serving as a load-bearing justification for the approximate variants; the central claims remain self-contained with independent content from classical learning theory.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 3 invented entities

The central claims rest on the newly introduced definitions of the three relaxations and on the standard background assumptions of agnostic PAC learning (i.i.d. samples, finite VC dimension d). No numerical parameters are fitted to data; the results are asymptotic bounds expressed in d and α.

axioms (1)

domain assumption Standard agnostic PAC learning setting: i.i.d. samples drawn from an unknown distribution over a domain whose hypothesis class has VC dimension d
The stated sample complexities are expressed in terms of d and the replicability parameter α under this background model.

invented entities (3)

Pointwise replicability no independent evidence
purpose: Relaxation requiring consistency only on any single fixed input rather than across all inputs simultaneously
New definition introduced to circumvent strict replicability impossibilities.
Approximate replicability no independent evidence
purpose: Relaxation requiring the output hypothesis to classify most of the distribution consistently
New definition introduced to circumvent strict replicability impossibilities.
Semi-replicability no independent evidence
purpose: Full replicability augmented by access to shared unlabeled samples
New definition introduced to circumvent strict replicability impossibilities.

pith-pipeline@v0.9.0 · 5744 in / 1545 out tokens · 63180 ms · 2026-05-18T04:09:38.254773+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We propose three natural relaxations of replicability... pointwise... approximate... semi... Theorems 1.4, 1.6, 1.10 give sample bounds via boosting and hypothesis selection.
IndisputableMonolith/Foundation/ArithmeticFromLogic.lean LogicNat recovery unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Lower bounds via reduction to replicable bias estimation on shattered sets (Theorem 1.5, 1.9).

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.