Representation Without Reward: A JEPA Audit for LLM Fine-Tuning
Pith reviewed 2026-05-19 16:23 UTC · model grok-4.3
The pith
JEPA-style auxiliaries change LLM hidden-state geometry but leave task accuracy unchanged on language-to-regex generation
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In LLM fine-tuning for natural-language-to-regex generation, auxiliary objectives intended to shape hidden-state geometry produce measurable shifts in representation statistics and gradient alignment, including the first positive cosine with cross-entropy observed for a decoder-visible JEPA construction, yet none deliver task accuracy improvements that survive Bonferroni or Holm-Bonferroni correction. Exact-match scores stay inside seed noise for both LoRA and full-parameter regimes. The findings therefore establish a weak coupling between hidden-state representation work and decoded-task accuracy, reframing JEPA evaluation around the question of when useful geometry becomes visible as task-
What carries the argument
The decoder-visible JEPA objective constructed to lie in cross-entropy's positive cone, tested as one of twenty-two auxiliaries for whether induced hidden-state changes reach the language-model head and improve exact-match accuracy.
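The "positive cone" condition can be read as a sign check on the cosine between the auxiliary-loss gradient and the cross-entropy gradient. The following is a minimal sketch of that check, assuming both gradients have already been flattened into equal-length vectors. The function name `gradient_cosine` and the toy values are illustrative and not taken from the paper, which may compute the quantity over a different parameter subset or average it across batches.

```python
import math

def gradient_cosine(g_aux, g_ce):
    """Cosine similarity between two flattened gradient vectors."""
    if len(g_aux) != len(g_ce):
        raise ValueError("gradients must have the same length")
    dot = sum(a * c for a, c in zip(g_aux, g_ce))
    norm_aux = math.sqrt(sum(a * a for a in g_aux))
    norm_ce = math.sqrt(sum(c * c for c in g_ce))
    if norm_aux == 0.0 or norm_ce == 0.0:
        # Convention chosen for this sketch: a zero gradient has no direction.
        return 0.0
    return dot / (norm_aux * norm_ce)

# Toy, hypothetical gradients (not values from the paper).
g_ce = [1.0, 0.0]
aligned = gradient_cosine([1.0, 1.0], g_ce)       # positive: inside the cone
conflicting = gradient_cosine([-1.0, 1.0], g_ce)  # negative: outside the cone
```

A positive cosine means that, to first order, a small step along the negative auxiliary gradient also lowers cross-entropy. It says nothing by itself about exact-match accuracy, which is the gap the paper reports.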
Load-bearing premise
The natural-language-to-regex generation task with exact-match metric is representative enough that a null result generalizes to weak coupling between hidden geometry and task signal across LLM fine-tuning.
What would settle it
A statistically significant exact-match gain from the decoder-visible JEPA auxiliary on a second generation task such as text-to-SQL, after the same multiple-testing corrections, would falsify the weak-coupling claim.
Original abstract
Joint-embedding predictive architectures (JEPAs) propose that a model should learn more useful abstractions when trained to predict latent representations rather than observed outputs. For autoregressive language-model fine-tuning the principle entails a stricter requirement: the induced hidden-state geometry must reach the language-model head and improve the decoded task metric. We test that requirement under a fixed Llama-3.2-1B-Instruct LoRA harness on natural-language-to-regex generation, comparing twenty-two training-time auxiliaries across trajectory-shape regularisation, distributional constraints, predictor/target asymmetry, Fisher-metric Jacobi residuals, and a decoder-visible JEPA objective constructed to lie in cross-entropy's positive cone. The empirical answer is a structured null: several auxiliaries clear single-cell paired $\alpha = 0.10$ without correction (T3-Local at $\Delta = +2.53$ pp, $p = 0.003$ being the strongest), but none survives Bonferroni or Holm-Bonferroni at the relevant family-wise threshold, even though many change curvature, anisotropy, variance, and gradient direction. Decoder-visible JEPA yields the first positive auxiliary–cross-entropy gradient cosine in the study, yet exact match remains inside seed noise; a full-fine-tuning replication of the same auxiliary at $n = 5$ seeds reproduces the null on both benchmarks (TURK: $\Delta = +0.04$ pp, $p_{\text{paired}} = 0.96$; SYNTH: $\Delta = +0.52$ pp, $p_{\text{paired}} = 0.28$), so the null is robust across LoRA and full fine-tuning for the decoder-visible construction. Hidden-state representation work and decoded-task accuracy are therefore weakly coupled in this regime; we accordingly reframe LLM-domain JEPA evaluation as a coupling problem, in which the operative question is under which metrics useful hidden geometry becomes decoder-visible task signal.
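The paired per-seed comparison mentioned in the abstract can be illustrated with a small sketch. The abstract does not say which paired test was used, so the code below swaps in an exact sign-flip permutation test as a stand-in. The function name `paired_sign_flip_p` and all scores are hypothetical.

```python
from itertools import product

def paired_sign_flip_p(treatment, baseline):
    """Mean paired difference and two-sided exact sign-flip p-value."""
    diffs = [t - b for t, b in zip(treatment, baseline)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    hits = 0
    total = 0
    # Under the null, each seed's difference is equally likely to be + or -.
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
        total += 1
    return sum(diffs) / n, hits / total

# Hypothetical exact-match scores (%) for 5 seeds; not data from the paper.
baseline = [60.0, 61.0, 59.5, 60.5, 61.5]
with_aux = [60.5, 62.0, 61.0, 62.5, 64.0]
mean_delta, p = paired_sign_flip_p(with_aux, baseline)
```

With five seeds there are only 32 sign patterns, so the smallest two-sided p-value this particular test can return is 2/32 = 0.0625, even when every seed improves as in the toy data. That coarseness is a property of the stand-in test and is not a statement about how the paper's p-values were obtained.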
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This paper audits Joint-Embedding Predictive Architectures (JEPAs) for autoregressive LLM fine-tuning by testing whether 22 training auxiliaries (trajectory regularizers, distributional constraints, predictor/target asymmetry, Fisher-metric Jacobi residuals, and a decoder-visible JEPA objective) improve hidden-state geometry in a way that reaches the language-model head and raises decoded-task accuracy. Experiments use a fixed Llama-3.2-1B-Instruct LoRA harness on natural-language-to-regex generation with exact-match metric. Results show a structured null: nominal gains (e.g., T3-Local at +2.53 pp, p=0.003) fail Bonferroni/Holm correction; decoder-visible JEPA produces the first positive auxiliary–cross-entropy gradient cosine yet yields no task improvement. The null replicates under full fine-tuning (TURK and SYNTH benchmarks). The authors conclude that hidden-state representation work and decoded accuracy are weakly coupled in this regime and reframe LLM-domain JEPA evaluation as a coupling problem.
Significance. If the null holds, the work supplies controlled evidence that JEPA-style auxiliaries can alter hidden-state curvature, anisotropy, and gradient direction without producing decoder-visible task gains on this benchmark. The statistical design (paired tests, seed variation, Bonferroni/Holm correction) and the full-fine-tuning replication are clear strengths that make the reported null reliable within the studied setup. The result usefully separates representation learning from task-signal transmission in autoregressive fine-tuning.
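The multiple-testing step can be made concrete with a short sketch of the Bonferroni and Holm-Bonferroni procedures. The family below is hypothetical: it combines the strongest reported p-value (0.003) with 21 made-up values to form 22 tests. The family size and family-wise threshold actually used in the paper are not stated in this summary, so the toy numbers should not be read as reproducing its result.

```python
def bonferroni_reject(p_values, alpha):
    """Reject H_i when p_i <= alpha / m (single-step correction)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def holm_reject(p_values, alpha):
    """Holm step-down: compare the k-th smallest p-value to alpha / (m - k)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down stops at the first non-rejection
    return reject

# Hypothetical family of 22 p-values; only 0.003 appears in the abstract.
p_values = [0.003] + [0.05 + 0.04 * i for i in range(21)]
strict = holm_reject(p_values, alpha=0.05)
loose = holm_reject(p_values, alpha=0.10)
```

In this toy family the p = 0.003 entry is rejected at a family-wise level of 0.10 (threshold 0.10 / 22 ≈ 0.0045) but not at 0.05 (threshold 0.05 / 22 ≈ 0.0023). Whether a nominal gain "survives" therefore depends on both the size of the family and the chosen threshold.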
major comments (1)
- [Abstract] The reframing of LLM-domain JEPA evaluation as a 'coupling problem' is presented as following from the observed null. The null is robust for the reported NL-to-regex task and exact-match metric (including the full-fine-tuning replication), but the extension to a general reframing assumes this narrow, constrained-output, 0/1-metric setup is representative of regimes in which representation geometry more continuously affects generation quality. A brief discussion of scope or a second task would strengthen the claim.
Simulated Author's Rebuttal
We thank the referee for the constructive comment on scope. We address the concern below and will revise the manuscript to clarify the intended domain of the reframing.
Point-by-point responses
Referee: [Abstract] The reframing of LLM-domain JEPA evaluation as a 'coupling problem' is presented as following from the observed null. The null is robust for the reported NL-to-regex task and exact-match metric (including the full-fine-tuning replication), but the extension to a general reframing assumes this narrow, constrained-output, 0/1-metric setup is representative of regimes in which representation geometry more continuously affects generation quality. A brief discussion of scope or a second task would strengthen the claim.
Authors: We agree that the reframing should be explicitly scoped to the studied regime rather than presented as fully general. The manuscript already qualifies the setting as natural-language-to-regex generation under exact-match evaluation and demonstrates robustness via the full-fine-tuning replication on both TURK and SYNTH. To address the referee's point directly, we will revise the abstract to state that the coupling problem is identified 'in this regime' and add a short paragraph in the discussion section noting that the weak coupling between hidden-state geometry and decoded accuracy may not hold under open-ended generation or continuous quality metrics. We do not add a second task at this stage because the current experimental harness (fixed Llama-3.2-1B LoRA, 22 auxiliaries, paired seed design, multiple-testing correction) is already resource-intensive; the added scope discussion will make the boundary conditions of the claim transparent without overclaiming generality.
Revision: yes
Circularity Check
No circularity: empirical null-result study with interpretive reframing
The paper reports results from controlled fine-tuning experiments comparing 22 training auxiliaries on a natural-language-to-regex task under fixed LoRA and full fine-tuning regimes. The central claim of weak coupling between hidden-state geometry and decoded-task accuracy follows from the observed structured null on exact-match metrics (none surviving multiple-testing correction, with decoder-visible JEPA also null). This is an empirical conclusion from external benchmarks rather than any self-referential equation, fitted parameter renamed as prediction, or self-citation chain. The reframing as a coupling problem is an interpretive step based on the null findings and does not reduce to quantities defined inside the study by construction. No load-bearing self-citations, uniqueness theorems, or ansatzes are present in the derivation.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The natural-language-to-regex task with exact-match accuracy is representative of broader LLM fine-tuning regimes for testing representation-task coupling.