pith:PYE5AVJQ
Representation Without Reward: A JEPA Audit for LLM Fine-Tuning
JEPA-style auxiliaries change LLM hidden-state geometry but leave task accuracy unchanged on language-to-regex generation
arxiv:2605.15394 v1 · 2026-05-14 · cs.LG · cs.AI · stat.ML
Add to your LaTeX paper
\usepackage{pith}
\pithnumber{PYE5AVJQ7LLVLKZHZ3CL2Q4JIZ}
Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.
Claims
Hidden-state representation work and decoded-task accuracy are therefore weakly coupled in this regime; we accordingly reframe LLM-domain JEPA evaluation as a coupling problem.
The natural-language-to-regex generation task with exact-match metric is sufficiently representative that a null result on it generalizes to the broader claim of weak coupling between hidden geometry and task signal in LLM fine-tuning.
An empirical audit of 22 JEPA-style training auxiliaries on Llama-3.2-1B fine-tuning for regex generation finds no statistically significant task improvement after multiple-testing correction, even when auxiliaries visibly alter hidden-state geometry.
Receipt and verification
| Field | Value |
|---|---|
| First computed | 2026-05-20T00:00:56.350056Z |
| Builder | pith-number-builder-2026-05-17-v1 |
| Signature | Pith Ed25519 (pith-v1-2026-05) · public key |
| Schema | pith-number/v1.0 |
Canonical hash
7e09d05530fad755ab27cec4bd43894671e5ec255d7498f107a4968e91815db3
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/PYE5AVJQ7LLVLKZHZ3CL2Q4JIZ \
| jq -c '.canonical_record' \
| python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: 7e09d05530fad755ab27cec4bd43894671e5ec255d7498f107a4968e91815db3
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "3de13ac3eea96cb0c4d0c8c3dc748ab683b560aa36b140508fb53cfc0630ac65",
    "cross_cats_sorted": [
      "cs.AI",
      "stat.ML"
    ],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T20:27:32Z",
    "title_canon_sha256": "8921e762b44b31f71bf28a3d2f0d0d6aa84b965a52da3e407da36f114436b3b8"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.15394",
    "kind": "arxiv",
    "version": 1
  }
}
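The record above can also be checked offline, without curl or jq. The following is a minimal sketch that applies the same canonicalization as the one-liner in the verification command (sorted keys, compact separators, no ASCII escaping, UTF-8 bytes) and then hashes the result with SHA-256. The names `RECORD_JSON`, `EXPECTED`, `canonical_bytes`, and `canonical_hash` are illustrative and not part of any Pith tooling. It also assumes the record pasted here is identical to the `canonical_record` field the server returns.

```python
import hashlib
import json

# Record copied from the "Canonical record JSON" section of this page.
RECORD_JSON = """
{
  "metadata": {
    "abstract_canon_sha256": "3de13ac3eea96cb0c4d0c8c3dc748ab683b560aa36b140508fb53cfc0630ac65",
    "cross_cats_sorted": ["cs.AI", "stat.ML"],
    "license": "http://creativecommons.org/licenses/by-nc-nd/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-14T20:27:32Z",
    "title_canon_sha256": "8921e762b44b31f71bf28a3d2f0d0d6aa84b965a52da3e407da36f114436b3b8"
  },
  "schema_version": "1.0",
  "source": {"id": "2605.15394", "kind": "arxiv", "version": 1}
}
"""

# Hash published on this page under "Canonical hash".
EXPECTED = "7e09d05530fad755ab27cec4bd43894671e5ec255d7498f107a4968e91815db3"


def canonical_bytes(record: dict) -> bytes:
    """Serialize with sorted keys and compact separators, then encode as UTF-8."""
    text = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return text.encode("utf-8")


def canonical_hash(record: dict) -> str:
    """Return the hex SHA-256 digest of the canonical serialization."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


record = json.loads(RECORD_JSON)
digest = canonical_hash(record)
print(digest)
print("match" if digest == EXPECTED else "mismatch")
```

Because the record is parsed and re-serialized before hashing, whitespace and key order in the pasted JSON do not affect the digest; only the keys and values do.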