arxiv: 2604.04535 · v1 · submitted 2026-04-06 · 💻 cs.LG · cs.CC· cs.IT· math.IT

Recognition: 2 theorem links

· Lean Theorem

Learning from Equivalence Queries, Revisited

Mark Braverman , Roi Livni , Yishay Mansour , Shay Moran , Kobbi Nissim

Authors on Pith no claims yet

Pith reviewed 2026-05-10 20:07 UTC · model grok-4.3

classification 💻 cs.LG cs.CCcs.ITmath.IT

keywords equivalence queriessymmetric counterexamplesfull-information feedbackbandit feedbacklearning theoryquery complexityadversarial learning

0 comments

The pith

Restricting to symmetric counterexample generators produces tight bounds on the number of rounds needed to learn from equivalence queries under full-information and bandit feedback.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reconsiders the equivalence query learning model in which a learner proposes hypotheses and receives counterexamples when they fail to match the target. It focuses on a restricted class of counterexample generators called symmetric, which select examples based solely on the symmetric difference between the current hypothesis and the target. Within this class the authors derive matching upper and lower bounds on the number of rounds required to identify the target, both when the learner observes the correct label of each counterexample and when it receives only a binary error signal. A reader would care because this setup captures the periodic update cycles of deployed systems such as recommenders or generative models, where feedback is rarely fully adversarial.

Core claim

In the symmetric equivalence-query model the number of learning rounds is tightly bounded in the full-information setting, where each counterexample arrives with its correct label, and in the bandit setting, where only the fact of disagreement is revealed. The bounds are obtained by viewing the interaction as a game between the learner and a symmetric adversary, then applying adaptive weighting of hypotheses together with minimax arguments to determine the exact round complexity.

What carries the argument

Symmetric counterexample generators, defined as mechanisms that choose counterexamples using only the symmetric difference between the proposed hypothesis and the unknown target concept.

If this is right

The same tight bounds hold whether the learner receives full labels or only bandit error signals.
Natural mechanisms such as random counterexamples or selection of the simplest counterexample fall inside the symmetric class and therefore inherit the bounds.
The game-theoretic analysis supplies both upper and lower bounds that match, so the round complexity cannot be improved inside the model.
The framework separates the effect of adversarial strength from the effect of information type.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If real user feedback in deployed systems is approximately symmetric, the derived bounds give a concrete prediction for how many updates are needed before a model stabilizes.
The separation between symmetric and fully adversarial generators suggests a way to measure the robustness of an interactive learning system by testing how close its feedback is to symmetry.

Load-bearing premise

The counterexample generators are symmetric, so their choice depends only on the symmetric difference between the hypothesis and the target.

What would settle it

An explicit symmetric generator together with a learning algorithm for which the number of rounds strictly exceeds the derived upper bound would falsify the tightness claim.

Figures

Figures reproduced from arXiv: 2604.04535 by Kobbi Nissim, Mark Braverman, Roi Livni, Shay Moran, Yishay Mansour.

**Figure 1.** Figure 1: A schematic example of a (ternary) decision tree when Y = {0, 1, 2}. Internal nodes are labeled by instances, and edges are labeled by outcomes in Y. Leaves correspond to prediction paths and carry no instance labels. Example 2 (Decision-tree–induced counterexample generators). A decision tree is a rooted tree whose internal nodes are labeled by instances from X and whose outgoing edges are labeled by elem… view at source ↗

read the original abstract

Modern machine learning systems, such as generative models and recommendation systems, often evolve through a cycle of deployment, user interaction, and periodic model updates. This differs from standard supervised learning frameworks, which focus on loss or regret minimization over a fixed sequence of prediction tasks. Motivated by this setting, we revisit the classical model of learning from equivalence queries, introduced by Angluin (1988). In this model, a learner repeatedly proposes hypotheses and, when a deployed hypothesis is inadequate, receives a counterexample. Under fully adversarial counterexample generation, however, the model can be overly pessimistic. In addition, most prior work assumes a \emph{full-information} setting, where the learner also observes the correct label of the counterexample, an assumption that is not always natural. We address these issues by restricting the environment to a broad class of less adversarial counterexample generators, which we call \emph{symmetric}. Informally, such generators choose counterexamples based only on the symmetric difference between the hypothesis and the target. This class captures natural mechanisms such as random counterexamples (Angluin and Dohrn, 2017; Bhatia, 2021; Chase, Freitag, and Reyzin, 2024), as well as generators that return the simplest counterexample according to a prescribed complexity measure. Within this framework, we study learning from equivalence queries under both full-information and bandit feedback. We obtain tight bounds on the number of learning rounds in both settings and highlight directions for future work. Our analysis combines a game-theoretic view of symmetric adversaries with adaptive weighting methods and minimax arguments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Restricting to symmetric counterexamples yields tight bounds for equivalence query learning in full and bandit feedback.

read the letter

The key point is that this paper takes the equivalence query model and relaxes the adversary to symmetric counterexample generators, which choose examples based solely on the symmetric difference. Within that, they derive tight bounds on the number of rounds for both full information feedback and bandit feedback. They formalize the symmetric class and model the interaction as a game, using adaptive weighting for upper bounds and minimax arguments for lower bounds to get matching results. The bandit analysis is a clear addition since it drops the full-label assumption and fits settings where you only learn the hypothesis was wrong. The derivations stay self-contained inside the stated model with no circular steps or external assumptions needed for the tightness claims. This gives a less pessimistic view than fully adversarial counterexamples and aligns better with some interactive systems like recommendations or model updates from user feedback. The main limitation is the symmetry restriction itself. It covers random or minimal counterexamples but rules out generators that might use extra context or choose asymmetrically. The paper treats this as a deliberate modeling choice rather than claiming it fits every case, so the practical reach depends on how common symmetry is. There are no simulations or concrete examples, keeping it purely theoretical. This work is for learning theorists focused on query models or limited-feedback learning. A reader who wants refinements to Angluin-style results with explicit bounds would find the framework and matching upper/lower bounds useful. It deserves peer review. The setup is formal, the arguments are developed, and experts can check the details and suggest extensions.

Referee Report

0 major / 2 minor

Summary. The paper revisits the equivalence query model of Angluin (1988) by restricting the adversary to the class of symmetric counterexample generators, which select counterexamples based only on the symmetric difference between the current hypothesis and the target. Within this model the authors formulate the interaction as a game, derive matching upper bounds via adaptive weighting and lower bounds via minimax arguments, and obtain tight bounds on the number of rounds required under both full-information feedback and bandit feedback.

Significance. If the derivations hold, the work supplies a self-contained, less pessimistic framework that captures natural mechanisms such as random counterexamples while still admitting tight, matching bounds in both feedback regimes. The formal definition of the symmetric class, the game-theoretic formulation, and the explicit upper/lower-bound arguments are strengths that make the tightness claims internally consistent within the stated model.

minor comments (2)

[Abstract] Abstract: the statement that tight bounds are obtained would be strengthened by a one-sentence indication of the functional dependence of the bound (e.g., on the size of the hypothesis class or on a complexity measure of the symmetric difference).
[Introduction] The manuscript would benefit from an explicit statement, early in the introduction, of the precise definition of a symmetric counterexample generator (currently only described informally).

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their accurate summary of our paper, for highlighting the significance of the symmetric counterexample model and the matching upper and lower bounds in both feedback regimes, and for recommending minor revision. No specific major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper explicitly defines the symmetric counterexample class (dependence only on symmetric difference) and derives tight round bounds via an independent game-theoretic interaction model, adaptive weighting for upper bounds, and minimax arguments for lower bounds. These techniques operate directly on the stated restricted model without reducing any claim to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation chain remains self-contained and falsifiable within the given framework.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Based on the abstract alone, the central claim rests on the modeling choice of symmetric generators and standard assumptions from learning theory; no explicit free parameters or invented entities beyond the new class are described.

axioms (1)

domain assumption Counterexample generators are symmetric, choosing based only on the symmetric difference between hypothesis and target.
This is the key restriction that makes the model less pessimistic than fully adversarial.

invented entities (1)

symmetric counterexample generator no independent evidence
purpose: To capture natural, less adversarial feedback mechanisms in equivalence query learning.
Defined to include random counterexamples and simplest-counterexample generators; no independent evidence provided.

pith-pipeline@v0.9.0 · 5606 in / 1296 out tokens · 158535 ms · 2026-05-10T20:07:41.404414+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel; dAlembert_to_ODE_general_theorem echoes

?

echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

Definition 1 (Symmetric counterexample generators) ... CE(h, c|...) = CE(c, h|...) ... Payoff(h, c) + Payoff(c, h) ≤ 1 ... minimax theorem implies ... Payoff(p, q) ≤ 1/2 ... version space shrinks by a factor of at least 2 ... Ldim drops by at least one with constant probability
IndisputableMonolith/Foundation/BranchSelection.lean branch_selection; RCLCombiner_isCoupling_iff echoes

?

echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.

RCLCombiner ... interactionDefect ... IsCouplingCombiner ... branch_selection

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages

[1]

(b) Define the zero-sum game onVwith payoff as in (6)

While|V|>1: (a) Letd V = Ldim(V). (b) Define the zero-sum game onVwith payoff as in (6). (c) Compute a minimax mixed strategyp V for the learner. (d) Sampleh∼p V and submithas an equivalence query. (e) If the adversary accepts, terminate. (f) Otherwise receive counterexample(x, y)and update V←V x→y. Lemma 6Fix a nonempty version spaceV⊆ Hand letd V = Ldim...

work page
[2]

ProofWe first establish the following symmetry inequality for pure strategies: for allh, c∈V, Payoff(h, c) + Payoff(c, h)≤1.(7) Fixh, c∈V

In particular, there exists a (possibly randomized) learner strategyp V overVsuch that for everyc∈V, Payoff(pV , c)≤ 1 2 . ProofWe first establish the following symmetry inequality for pure strategies: for allh, c∈V, Payoff(h, c) + Payoff(c, h)≤1.(7) Fixh, c∈V. Ifh=c, thenPayoff(h, h) = 0by definition, and (7) holds trivially. Assume henceforth thath̸=c, ...

work page
[3]

Since the game is finite (both players’ strategy sets are the finite setV⊆ H), von Neumann’s minimax theorem applies (von Neumann, 1928; Cesa-Bianchi and Lugosi, 2006)

In other words, for every adversary strategyqthere exists a learner strategyp(such asp=q) satisfyingPayoff(p, q)≤ 1 2, and hence themaximinvalue of the game is at most 1 2. Since the game is finite (both players’ strategy sets are the finite setV⊆ H), von Neumann’s minimax theorem applies (von Neumann, 1928; Cesa-Bianchi and Lugosi, 2006). Therefore, the ...

work page 1928
[4]

Fix a shattered Littlestone treeTof depthd= Ldim(H)

work page
[5]

Letν(h)be the deepest node reached

For each hypothesish∈ H, associate a nodeν(h)obtained by traversingTfrom the root according to the predictions ofh, stopping when no outgoing edge is consistent withh. Letν(h)be the deepest node reached

work page
[6]

(b) Otherwise, letxbe the instance labelingν:= LCA(ν(h), ν(c))

Upon receiving an equivalence queryhwith target conceptc: (a) Ifν(h) =ν(c), accepthand terminate. (b) Otherwise, letxbe the instance labelingν:= LCA(ν(h), ν(c)). (c) Return(x, c(x))as a counterexample. SinceTis shattered, for every root-to-leaf pathπinTthere exists an hypothesis inHthat is consistent with the labels alongπ. Fix, for each leafℓ(equivalentl...

work page
[7]

Input:x=x t+1 andi∈[k]

work page
[8]

For eachj∈[k]assignκ t(x, j) :=µ t{s:r s,t(x) =j}

work page
[9]

For a string of values(s, j)let ν(s, j) =    0ifj=i µt(s)· 1− k−1 k3 + κt(x,i) κt(x,j)·k ifr s,t(x) =j(andj̸=i) µt(s)· 1 k3 otherwise

work page
[10]

Normalizeνto obtain probability distributionµ t+1: µi t+1(s, j) :=ν(s, j)·  X s′,j′ ν(s′, j′)   −1 Next, we let: Φt(h;µ t) = (8 lnk)L t(h)−lnµ t(h),Φ (x,i) t (h;µ t) = (8 lnk)L x t (h)−lnµ i t+1(h). Then, the payoff matrix at roundtis defined as follows Payoff(h, c) =−1ifh=cand otherwise: Payoff(h, c) := E x∼CE(c,h|history) (Payoffx(h, c)) := E x∼CE(c...

work page
[11]

Initializeµ 0 to be the unit mass on empty string

work page
[12]

almost uniform

While the adversary doesn’t accept: (a) Define the zero-sum game onHwith payoff as in eq. (9). (b) Compute a minimax mixed strategyp t for the learner as in eq. (10). (c) Sampleh∼p t and submithas an equivalence query. (d) If the adversary accepts, terminate. (e) Otherwise receive counterexamplex t+1 =xand updateµ t+1 :=µ h(x) t+1 according to the update ...

work page