Recognition: 2 theorem links
A Controlled Counterexample to Strong Proxy-Based Explanations of OOD Performance in a Fixed Pretraining-and-Probing Setup
Pith reviewed 2026-05-13 02:05 UTC · model grok-4.3
The pith
A proxy for total learned structure can fail to track the task-relevant structure that drives OOD performance, even in a controlled setting.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a fixed pretraining-and-probing setup motivated by computationally bounded notions of learned structure (including epiplexity), three quantities can separate: a formal structure quantity, its operational proxy, and the task-relevant structure for a target family. This separation allows the OOD accuracy ranking of two pretraining datasets to disagree with their proxy ranking, as shown first in a controlled mathematical construction and then in a synthetic sequence-model experiment where the reversal occurs in two of three seeds, with auxiliary diagnostics supporting the interpretation.
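The claimed separation can be illustrated with a minimal toy sketch. This is not the paper's actual construction; the quantities and numbers below are hypothetical stand-ins chosen only to show how a task-agnostic proxy ranking and an OOD accuracy ranking can reverse:

```python
# Toy illustration (not the paper's construction): a proxy that scores
# total learned structure can rank two datasets opposite to their OOD
# probe accuracy when task-relevant structure is what drives accuracy.

datasets = {
    # (total_structure, task_relevant) are hypothetical values
    "D_A": {"total_structure": 120.0, "task_relevant": 10.0},
    "D_B": {"total_structure": 80.0, "task_relevant": 35.0},
}

def proxy_score(d):
    """Task-agnostic proxy: tracks total learned structure only."""
    return d["total_structure"]

def ood_accuracy(d):
    """Stylized OOD probe accuracy: driven by task-relevant structure."""
    return min(1.0, 0.5 + d["task_relevant"] / 100.0)

proxy_rank = sorted(datasets, key=lambda k: proxy_score(datasets[k]), reverse=True)
ood_rank = sorted(datasets, key=lambda k: ood_accuracy(datasets[k]), reverse=True)

print(proxy_rank)  # ['D_A', 'D_B']
print(ood_rank)    # ['D_B', 'D_A']
```

The rankings disagree because the proxy aggregates structure the target family never uses; this is the mechanism the construction isolates.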
What carries the argument
The controlled construction that separates a formal structure quantity, its operational proxy, and the task-relevant structure for a target family, instantiated in a synthetic sequence-model experiment with all-sample evaluation.
If this is right
- Proxy rankings of pretraining datasets can disagree with rankings by OOD probe accuracy.
- Strong proxy-based explanations of OOD performance require the proxy to track task-relevant structure.
- The counterexample identifies a boundary condition rather than rejecting structure-based explanations in general.
- Task-agnostic structure proxies need validation against downstream task relevance before explaining transfer differences.
- The separation mechanism can be realized both formally and in concrete sequence-model training.
Where Pith is reading between the lines
- Explanations of corpus transfer might need proxies that isolate task-specific structure instead of total structure.
- Discrepancies observed when comparing real pretraining corpora could arise from the same kind of separation shown here.
- Extending the construction to natural language data or larger models would test whether the boundary persists outside synthetic settings.
- The result suggests checking alignment between proxy and task performance before using proxies for causal claims about pretraining.
Load-bearing premise
That the synthetic sequence-model experiment and the controlled construction adequately capture the dynamics of real-world pretraining and OOD probing scenarios.
What would settle it
Replicating the synthetic experiment across additional seeds, architectures, or sequence lengths and consistently finding that proxy rankings match OOD accuracy rankings would challenge the counterexample.
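The replication check described above reduces to a per-seed sign comparison. The sketch below uses hypothetical gap values (a real replication would train models per seed); `is_reversal` simply tests whether the proxy gap and the OOD accuracy gap have opposite signs:

```python
# Sketch of the seed-replication check (hypothetical numbers): for each
# seed, record whether the OOD-accuracy ranking of the two pretraining
# datasets agrees with the proxy ranking.

results = [
    # (seed, proxy_gap = S(D_A) - S(D_B), ood_gap = acc(D_A) - acc(D_B))
    (0, +4.2, -0.07),  # reversal: proxy prefers D_A, OOD prefers D_B
    (1, +3.9, -0.05),  # reversal
    (2, +4.0, +0.01),  # agreement
]

def is_reversal(proxy_gap, ood_gap):
    """Rankings disagree exactly when the two gaps have opposite signs."""
    return proxy_gap * ood_gap < 0

reversals = sum(is_reversal(p, o) for _, p, o in results)
print(f"reversal in {reversals} of {len(results)} seeds")
```

Consistently finding zero reversals across many seeds, architectures, or sequence lengths is what would challenge the counterexample's robustness.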
Original abstract
Task-agnostic structure proxies are often used to interpret why one pretraining corpus transfers better than another, but such explanations require the proxy to track the structure that matters for the downstream task. We test this requirement in a fixed pretraining-and-probing setup motivated by computationally bounded notions of learned structure, including epiplexity. The core question is whether a proxy ranking of two pretraining datasets must agree with their ranking by OOD probe accuracy. We show that it need not. First, we give a controlled construction in which a formal structure quantity, its operational proxy, and the task-relevant structure for a target family separate. We then instantiate the same mechanism in a synthetic sequence-model experiment: under the primary all-sample evaluation, the OOD accuracy ranking reverses the proxy ranking in two of three seeds, with auxiliary diagnostics and ablations supporting the same interpretation. The counterexample does not reject structure-based explanations in general; it identifies a boundary on strong proxy-based explanations. A proxy for total learned structure can fail to track the task-relevant structure that drives OOD performance, even in a controlled setting.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that task-agnostic structure proxies (e.g., those motivated by epiplexity) need not track the task-relevant structure driving OOD probe accuracy, even inside a fixed pretraining-and-probing setup. It demonstrates this via an explicit controlled construction that separates a formal structure quantity, its operational proxy, and the task-relevant structure for a target family, followed by a synthetic sequence-model instantiation in which the OOD accuracy ranking reverses the proxy ranking under the primary all-sample metric in two of three seeds, with supporting diagnostics and ablations.
Significance. If the result holds, the work is significant as a narrowly scoped existence proof that identifies a boundary condition for strong proxy-based explanations of OOD transfer. By using a parameter-free theoretical separation and a reproducible synthetic experiment with auxiliary checks, it shows that proxies for total learned structure can fail to align with OOD performance even when pretraining and probing are held fixed. This is a useful caution for interpretability research without claiming generality to production pipelines, and the explicit scoping strengthens rather than weakens the contribution.
Minor comments (2)
- §4 (Synthetic experiment): the primary all-sample metric and the decision to report the reversal in two of three seeds are central to the empirical claim; a brief justification for the metric choice and seed count (e.g., via power considerations or additional ablations) would make the robustness of the reversal clearer without altering the existence result.
- §3 (Notation and definitions): the precise operational definition of the structure proxy (including how epiplexity is computed in the construction) should be stated in a single self-contained paragraph or equation early in §3, so that the separation argument can be verified independently of the later experiment.
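The second comment asks for a self-contained operational definition of the proxy. The paper's actual definition is not reproduced here; as a hedged illustration of what such a definition could look like, the sketch below computes a prequential (online MDL) code length in the spirit of the MDL-probing literature the paper cites:

```python
import math

# Hedged sketch of one common operational structure proxy: prequential
# (online MDL) code length. The paper's actual proxy may differ; this
# only illustrates the kind of self-contained definition requested.

def prequential_code_length(probs):
    """Total code length in bits: sum over steps t of -log2 p_t, where
    p_t is the probability the online-trained model assigns to the true
    next symbol after seeing the first t examples."""
    return sum(-math.log2(p) for p in probs)

# Hypothetical online-learning probabilities for two datasets:
probs_A = [0.5, 0.6, 0.8, 0.9, 0.95]   # model compresses D_A quickly
probs_B = [0.5, 0.55, 0.6, 0.65, 0.7]  # D_B compresses more slowly

# Under this proxy, lower code length = more extractable structure.
print(prequential_code_length(probs_A) < prequential_code_length(probs_B))  # True
```

Stating the proxy this concretely (inputs, model access, and units) is what would let the §3 separation argument be checked without reference to the experiment.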
Simulated Author's Rebuttal
We thank the referee for the positive and accurate summary of our contribution, as well as for recognizing its significance as a narrowly scoped existence proof. The referee correctly identifies the core result: a controlled separation showing that task-agnostic structure proxies need not track the structure relevant to OOD probe accuracy, even under fixed pretraining and probing. No specific major comments were raised in the report. We are prepared to incorporate any minor revisions requested by the editor.
Circularity Check
No significant circularity
Full rationale
The paper advances an existence proof via explicit construction and synthetic instantiation rather than a derivation chain. It separates a formal structure quantity, its proxy, and task-relevant structure by design, then shows reversal under controlled conditions. No step claims a first-principles prediction that reduces to fitted inputs, self-citation load-bearing premises, or ansatz smuggling; the central claim is scoped as identifying a boundary on strong proxy explanations, not deriving a general result from its own outputs.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: The pretraining-and-probing setup isolates the effect of data structure on OOD performance.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel (unclear)
  The relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "A proxy for total learned structure can fail to track the task-relevant structure that drives OOD performance, even in a controlled setting."
- IndisputableMonolith/Foundation/BranchSelection.lean · branch_selection (unclear)
  The relation between the paper passage and the cited Recognition theorem is unclear.
  Passage: "Proposition 1... S(D_A) > S(D_B) while OODPerf(D_A -> T) < OODPerf(D_B -> T)"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [9] Guillaume Alain and Yoshua Bengio. Understanding intermediate layers using linear classifier probes. arXiv preprint arXiv:1610.01644, 2016.
- [10] Marc Finzi, Yiding Jiang, J. Zico Kolter, Shikai Qiu, Pavel Izmailov, and Andrew Gordon Wilson. From entropy to epiplexity: Rethinking information for computationally bounded intelligence. arXiv preprint arXiv:2601.03220, 2026.
- [11] Amirata Ghorbani and James Zou. Data Shapley: Equitable valuation of data for machine learning. In Proceedings of the 36th International Conference on Machine Learning, 2019.
- [12] Peter D. Grünwald. The Minimum Description Length Principle. MIT Press, 2007.
- [13] John Hewitt and Percy Liang. Designing and interpreting probes with control tasks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019.
- [14] Jorma Rissanen. Modeling by shortest data description. Automatica, 14(5):465–471, 1978.
- [15] Swabha Swayamdipta, Roy Schwartz, Nicholas Lourie, Yizhong Wang, Hannaneh Hajishirzi, Noah A. Smith, and Yejin Choi. Dataset cartography: Mapping and diagnosing datasets with training dynamics. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020.
- [16] Elena Voita and Ivan Titov. Information-theoretic probing with minimum description length. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020.