Recognition: 2 theorem links
· Lean TheoremLearning from Equivalence Queries, Revisited
Pith reviewed 2026-05-10 20:07 UTC · model grok-4.3
The pith
Restricting to symmetric counterexample generators produces tight bounds on the number of rounds needed to learn from equivalence queries under full-information and bandit feedback.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In the symmetric equivalence-query model the number of learning rounds is tightly bounded in the full-information setting, where each counterexample arrives with its correct label, and in the bandit setting, where only the fact of disagreement is revealed. The bounds are obtained by viewing the interaction as a game between the learner and a symmetric adversary, then applying adaptive weighting of hypotheses together with minimax arguments to determine the exact round complexity.
What carries the argument
Symmetric counterexample generators, defined as mechanisms that choose counterexamples using only the symmetric difference between the proposed hypothesis and the unknown target concept.
If this is right
- The same tight bounds hold whether the learner receives full labels or only bandit error signals.
- Natural mechanisms such as random counterexamples or selection of the simplest counterexample fall inside the symmetric class and therefore inherit the bounds.
- The game-theoretic analysis supplies both upper and lower bounds that match, so the round complexity cannot be improved inside the model.
- The framework separates the effect of adversarial strength from the effect of information type.
Where Pith is reading between the lines
- If real user feedback in deployed systems is approximately symmetric, the derived bounds give a concrete prediction for how many updates are needed before a model stabilizes.
- The separation between symmetric and fully adversarial generators suggests a way to measure the robustness of an interactive learning system by testing how close its feedback is to symmetry.
Load-bearing premise
The counterexample generators are symmetric, so their choice depends only on the symmetric difference between the hypothesis and the target.
What would settle it
An explicit symmetric generator together with a learning algorithm for which the number of rounds strictly exceeds the derived upper bound would falsify the tightness claim.
Figures
read the original abstract
Modern machine learning systems, such as generative models and recommendation systems, often evolve through a cycle of deployment, user interaction, and periodic model updates. This differs from standard supervised learning frameworks, which focus on loss or regret minimization over a fixed sequence of prediction tasks. Motivated by this setting, we revisit the classical model of learning from equivalence queries, introduced by Angluin (1988). In this model, a learner repeatedly proposes hypotheses and, when a deployed hypothesis is inadequate, receives a counterexample. Under fully adversarial counterexample generation, however, the model can be overly pessimistic. In addition, most prior work assumes a \emph{full-information} setting, where the learner also observes the correct label of the counterexample, an assumption that is not always natural. We address these issues by restricting the environment to a broad class of less adversarial counterexample generators, which we call \emph{symmetric}. Informally, such generators choose counterexamples based only on the symmetric difference between the hypothesis and the target. This class captures natural mechanisms such as random counterexamples (Angluin and Dohrn, 2017; Bhatia, 2021; Chase, Freitag, and Reyzin, 2024), as well as generators that return the simplest counterexample according to a prescribed complexity measure. Within this framework, we study learning from equivalence queries under both full-information and bandit feedback. We obtain tight bounds on the number of learning rounds in both settings and highlight directions for future work. Our analysis combines a game-theoretic view of symmetric adversaries with adaptive weighting methods and minimax arguments.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper revisits the equivalence query model of Angluin (1988) by restricting the adversary to the class of symmetric counterexample generators, which select counterexamples based only on the symmetric difference between the current hypothesis and the target. Within this model the authors formulate the interaction as a game, derive matching upper bounds via adaptive weighting and lower bounds via minimax arguments, and obtain tight bounds on the number of rounds required under both full-information feedback and bandit feedback.
Significance. If the derivations hold, the work supplies a self-contained, less pessimistic framework that captures natural mechanisms such as random counterexamples while still admitting tight, matching bounds in both feedback regimes. The formal definition of the symmetric class, the game-theoretic formulation, and the explicit upper/lower-bound arguments are strengths that make the tightness claims internally consistent within the stated model.
minor comments (2)
- [Abstract] Abstract: the statement that tight bounds are obtained would be strengthened by a one-sentence indication of the functional dependence of the bound (e.g., on the size of the hypothesis class or on a complexity measure of the symmetric difference).
- [Introduction] The manuscript would benefit from an explicit statement, early in the introduction, of the precise definition of a symmetric counterexample generator (currently only described informally).
Simulated Author's Rebuttal
We thank the referee for their accurate summary of our paper, for highlighting the significance of the symmetric counterexample model and the matching upper and lower bounds in both feedback regimes, and for recommending minor revision. No specific major comments were raised in the report.
Circularity Check
No significant circularity detected
full rationale
The paper explicitly defines the symmetric counterexample class (dependence only on symmetric difference) and derives tight round bounds via an independent game-theoretic interaction model, adaptive weighting for upper bounds, and minimax arguments for lower bounds. These techniques operate directly on the stated restricted model without reducing any claim to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain. The derivation chain remains self-contained and falsifiable within the given framework.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Counterexample generators are symmetric, choosing based only on the symmetric difference between hypothesis and target.
invented entities (1)
-
symmetric counterexample generator
no independent evidence
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel; dAlembert_to_ODE_general_theorem echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
Definition 1 (Symmetric counterexample generators) ... CE(h, c|...) = CE(c, h|...) ... Payoff(h, c) + Payoff(c, h) ≤ 1 ... minimax theorem implies ... Payoff(p, q) ≤ 1/2 ... version space shrinks by a factor of at least 2 ... Ldim drops by at least one with constant probability
-
IndisputableMonolith/Foundation/BranchSelection.leanbranch_selection; RCLCombiner_isCoupling_iff echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
RCLCombiner ... interactionDefect ... IsCouplingCombiner ... branch_selection
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
(b) Define the zero-sum game onVwith payoff as in (6)
While|V|>1: (a) Letd V = Ldim(V). (b) Define the zero-sum game onVwith payoff as in (6). (c) Compute a minimax mixed strategyp V for the learner. (d) Sampleh∼p V and submithas an equivalence query. (e) If the adversary accepts, terminate. (f) Otherwise receive counterexample(x, y)and update V←V x→y. Lemma 6Fix a nonempty version spaceV⊆ Hand letd V = Ldim...
-
[2]
In particular, there exists a (possibly randomized) learner strategyp V overVsuch that for everyc∈V, Payoff(pV , c)≤ 1 2 . ProofWe first establish the following symmetry inequality for pure strategies: for allh, c∈V, Payoff(h, c) + Payoff(c, h)≤1.(7) Fixh, c∈V. Ifh=c, thenPayoff(h, h) = 0by definition, and (7) holds trivially. Assume henceforth thath̸=c, ...
-
[3]
In other words, for every adversary strategyqthere exists a learner strategyp(such asp=q) satisfyingPayoff(p, q)≤ 1 2, and hence themaximinvalue of the game is at most 1 2. Since the game is finite (both players’ strategy sets are the finite setV⊆ H), von Neumann’s minimax theorem applies (von Neumann, 1928; Cesa-Bianchi and Lugosi, 2006). Therefore, the ...
work page 1928
-
[4]
Fix a shattered Littlestone treeTof depthd= Ldim(H)
-
[5]
Letν(h)be the deepest node reached
For each hypothesish∈ H, associate a nodeν(h)obtained by traversingTfrom the root according to the predictions ofh, stopping when no outgoing edge is consistent withh. Letν(h)be the deepest node reached
-
[6]
(b) Otherwise, letxbe the instance labelingν:= LCA(ν(h), ν(c))
Upon receiving an equivalence queryhwith target conceptc: (a) Ifν(h) =ν(c), accepthand terminate. (b) Otherwise, letxbe the instance labelingν:= LCA(ν(h), ν(c)). (c) Return(x, c(x))as a counterexample. SinceTis shattered, for every root-to-leaf pathπinTthere exists an hypothesis inHthat is consistent with the labels alongπ. Fix, for each leafℓ(equivalentl...
-
[7]
Input:x=x t+1 andi∈[k]
-
[8]
For eachj∈[k]assignκ t(x, j) :=µ t{s:r s,t(x) =j}
-
[9]
For a string of values(s, j)let ν(s, j) = 0ifj=i µt(s)· 1− k−1 k3 + κt(x,i) κt(x,j)·k ifr s,t(x) =j(andj̸=i) µt(s)· 1 k3 otherwise
-
[10]
Normalizeνto obtain probability distributionµ t+1: µi t+1(s, j) :=ν(s, j)· X s′,j′ ν(s′, j′) −1 Next, we let: Φt(h;µ t) = (8 lnk)L t(h)−lnµ t(h),Φ (x,i) t (h;µ t) = (8 lnk)L x t (h)−lnµ i t+1(h). Then, the payoff matrix at roundtis defined as follows Payoff(h, c) =−1ifh=cand otherwise: Payoff(h, c) := E x∼CE(c,h|history) (Payoffx(h, c)) := E x∼CE(c...
-
[11]
Initializeµ 0 to be the unit mass on empty string
-
[12]
While the adversary doesn’t accept: (a) Define the zero-sum game onHwith payoff as in eq. (9). (b) Compute a minimax mixed strategyp t for the learner as in eq. (10). (c) Sampleh∼p t and submithas an equivalence query. (d) If the adversary accepts, terminate. (e) Otherwise receive counterexamplex t+1 =xand updateµ t+1 :=µ h(x) t+1 according to the update ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.