Recognition: 2 theorem links
A Controlled Counterexample to Strong Proxy-Based Explanations of OOD Performance in a Fixed Pretraining-and-Probing Setup
Pith reviewed 2026-05-13 02:05 UTC · model grok-4.3
The pith
A proxy for total learned structure can fail to track the task-relevant structure that drives OOD performance, even in a controlled setting.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a fixed pretraining-and-probing setup motivated by computationally bounded notions of learned structure (including epiplexity), three quantities can separate: a formal structure quantity, its operational proxy, and the task-relevant structure for a target family. This separation allows the OOD accuracy ranking of two pretraining datasets to disagree with their proxy ranking, as shown first in a controlled mathematical construction and then in a synthetic sequence-model experiment where the reversal occurs in two of three seeds, with auxiliary diagnostics supporting the interpretation.
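The claimed separation can be illustrated with a minimal toy sketch. This is not the paper's actual construction; the quantities and numbers below are hypothetical stand-ins chosen only to show how a task-agnostic proxy ranking and an OOD accuracy ranking can reverse:

```python
# Toy illustration (not the paper's construction): a proxy that scores
# total learned structure can rank two datasets opposite to their OOD
# probe accuracy when task-relevant structure is what drives accuracy.

datasets = {
    # (total_structure, task_relevant) are hypothetical values
    "D_A": {"total_structure": 120.0, "task_relevant": 10.0},
    "D_B": {"total_structure": 80.0, "task_relevant": 35.0},
}

def proxy_score(d):
    """Task-agnostic proxy: tracks total learned structure only."""
    return d["total_structure"]

def ood_accuracy(d):
    """Stylized OOD probe accuracy: driven by task-relevant structure."""
    return min(1.0, 0.5 + d["task_relevant"] / 100.0)

proxy_rank = sorted(datasets, key=lambda k: proxy_score(datasets[k]), reverse=True)
ood_rank = sorted(datasets, key=lambda k: ood_accuracy(datasets[k]), reverse=True)

print(proxy_rank)  # ['D_A', 'D_B']
print(ood_rank)    # ['D_B', 'D_A']
```

The rankings disagree because the proxy aggregates structure the target family never uses; this is the mechanism the construction isolates.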
What carries the argument
The controlled construction that separates a formal structure quantity, its operational proxy, and the task-relevant structure for a target family, instantiated in a synthetic sequence-model experiment with all-sample evaluation.
If this is right
- Proxy rankings of pretraining datasets can disagree with rankings by OOD probe accuracy.
- Strong proxy-based explanations of OOD performance require the proxy to track task-relevant structure.
- The counterexample identifies a boundary condition rather than rejecting structure-based explanations in general.
- Task-agnostic structure proxies need validation against downstream task relevance before explaining transfer differences.
- The separation mechanism can be realized both formally and in concrete sequence-model training.
Where Pith is reading between the lines
- Explanations of corpus transfer might need proxies that isolate task-specific structure instead of total structure.
- Discrepancies observed when comparing real pretraining corpora could arise from the same kind of separation shown here.
- Extending the construction to natural language data or larger models would test whether the boundary persists outside synthetic settings.
- The result suggests checking alignment between proxy and task performance before using proxies for causal claims about pretraining.
Load-bearing premise
That the synthetic sequence-model experiment and the controlled construction adequately capture the dynamics of real-world pretraining and OOD probing scenarios.
What would settle it
Replicating the synthetic experiment across additional seeds, architectures, or sequence lengths and consistently finding that proxy rankings match OOD accuracy rankings would challenge the counterexample.
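The replication check described above reduces to a per-seed sign comparison. The sketch below uses hypothetical gap values (a real replication would train models per seed); `is_reversal` simply tests whether the proxy gap and the OOD accuracy gap have opposite signs:

```python
# Sketch of the seed-replication check (hypothetical numbers): for each
# seed, record whether the OOD-accuracy ranking of the two pretraining
# datasets agrees with the proxy ranking.

results = [
    # (seed, proxy_gap = S(D_A) - S(D_B), ood_gap = acc(D_A) - acc(D_B))
    (0, +4.2, -0.07),  # reversal: proxy prefers D_A, OOD prefers D_B
    (1, +3.9, -0.05),  # reversal
    (2, +4.0, +0.01),  # agreement
]

def is_reversal(proxy_gap, ood_gap):
    """Rankings disagree exactly when the two gaps have opposite signs."""
    return proxy_gap * ood_gap < 0

reversals = sum(is_reversal(p, o) for _, p, o in results)
print(f"reversal in {reversals} of {len(results)} seeds")
```

Consistently finding zero reversals across many seeds, architectures, or sequence lengths is what would challenge the counterexample's robustness.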
Original abstract
Task-agnostic structure proxies are often used to interpret why one pretraining corpus transfers better than another, but such explanations require the proxy to track the structure that matters for the downstream task. We test this requirement in a fixed pretraining-and-probing setup motivated by computationally bounded notions of learned structure, including epiplexity. The core question is whether a proxy ranking of two pretraining datasets must agree with their ranking by OOD probe accuracy. We show that it need not. First, we give a controlled construction in which a formal structure quantity, its operational proxy, and the task-relevant structure for a target family separate. We then instantiate the same mechanism in a synthetic sequence-model experiment: under the primary all-sample evaluation, the OOD accuracy ranking reverses the proxy ranking in two of three seeds, with auxiliary diagnostics and ablations supporting the same interpretation. The counterexample does not reject structure-based explanations in general; it identifies a boundary on strong proxy-based explanations. A proxy for total learned structure can fail to track the task-relevant structure that drives OOD performance, even in a controlled setting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that task-agnostic structure proxies (e.g., those motivated by epiplexity) need not track the task-relevant structure driving OOD probe accuracy, even inside a fixed pretraining-and-probing setup. It demonstrates this via an explicit controlled construction that separates a formal structure quantity, its operational proxy, and the task-relevant structure for a target family, followed by a synthetic sequence-model instantiation in which the OOD accuracy ranking reverses the proxy ranking under the primary all-sample metric in two of three seeds, with supporting diagnostics and ablations.
Significance. If the result holds, the work is significant as a narrowly scoped existence proof that identifies a boundary condition for strong proxy-based explanations of OOD transfer. By using a parameter-free theoretical separation and a reproducible synthetic experiment with auxiliary checks, it shows that proxies for total learned structure can fail to align with OOD performance even when pretraining and probing are held fixed. This is a useful caution for interpretability research without claiming generality to production pipelines, and the explicit scoping strengthens rather than weakens the contribution.
Minor comments (2)
- §4 (Synthetic experiment): the primary all-sample metric and the decision to report the reversal in two of three seeds are central to the empirical claim; a brief justification for the metric choice and seed count (e.g., via power considerations or additional ablations) would make the robustness of the reversal clearer without altering the existence result.
- §3 (Notation and definitions): the precise operational definition of the structure proxy (including how epiplexity is computed in the construction) should be stated in a single self-contained paragraph or equation early in §3, so that the separation argument can be verified independently of the later experiment.
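The second comment asks for a self-contained operational definition of the proxy. The paper's actual definition is not reproduced here; as a hedged illustration of what such a definition could look like, the sketch below computes a prequential (online MDL) code length in the spirit of the MDL-probing literature the paper cites:

```python
import math

# Hedged sketch of one common operational structure proxy: prequential
# (online MDL) code length. The paper's actual proxy may differ; this
# only illustrates the kind of self-contained definition requested.

def prequential_code_length(probs):
    """Total code length in bits: sum over steps t of -log2 p_t, where
    p_t is the probability the online-trained model assigns to the true
    next symbol after seeing the first t examples."""
    return sum(-math.log2(p) for p in probs)

# Hypothetical online-learning probabilities for two datasets:
probs_A = [0.5, 0.6, 0.8, 0.9, 0.95]   # model compresses D_A quickly
probs_B = [0.5, 0.55, 0.6, 0.65, 0.7]  # D_B compresses more slowly

# Under this proxy, lower code length = more extractable structure.
print(prequential_code_length(probs_A) < prequential_code_length(probs_B))  # True
```

Stating the proxy this concretely (inputs, model access, and units) is what would let the §3 separation argument be checked without reference to the experiment.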
Simulated Author's Rebuttal
We thank the referee for the positive and accurate summary of our contribution, as well as for recognizing its significance as a narrowly scoped existence proof. The referee correctly identifies the core result: a controlled separation showing that task-agnostic structure proxies need not track the structure relevant to OOD probe accuracy, even under fixed pretraining and probing. No specific major comments were raised in the report. We are prepared to incorporate any minor revisions requested by the editor.
Circularity Check
No significant circularity
Full rationale
The paper advances an existence proof via explicit construction and synthetic instantiation rather than a derivation chain. It separates a formal structure quantity, its proxy, and task-relevant structure by design, then shows reversal under controlled conditions. No step claims a first-principles prediction that reduces to fitted inputs, self-citation load-bearing premises, or ansatz smuggling; the central claim is scoped as identifying a boundary on strong proxy explanations, not deriving a general result from its own outputs.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: The pretraining-and-probing setup isolates the effect of data structure on OOD performance.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear)
  The relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "A proxy for total learned structure can fail to track the task-relevant structure that drives OOD performance, even in a controlled setting."
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection (unclear)
  The relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "Proposition 1... S(D_A) > S(D_B) while OODPerf(D_A -> T) < OODPerf(D_B -> T)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [9] Guillaume Alain and Yoshua Bengio. Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644, 2016.
- [10] Marc Finzi, Yiding Jiang, J. Zico Kolter, Shikai Qiu, Pavel Izmailov, and Andrew Gordon Wilson. From entropy to epiplexity: Rethinking information for computationally bounded intelligence. arXiv preprint arXiv:2601.03220, 2026.
- [11] Amirata Ghorbani and James Zou. Data Shapley: Equitable valuation of data for machine learning. In Proceedings of the 36th International Conference on Machine Learning, 2019.
- [12] Peter D. Grünwald. The Minimum Description Length Principle. MIT Press, 2007.
- [13] John Hewitt and Percy Liang. Designing and interpreting probes with control tasks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019.
- [14] Jorma Rissanen. Modeling by shortest data description. Automatica, 14(5):465–471, 1978.
- [15] Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, and Yejin Choi. Dataset cartography: Mapping and diagnosing datasets with training dynamics. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020.
- [16] Elena Voita and Ivan Titov. Information-theoretic probing with minimum description length. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020.