{"paper":{"title":"Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Layer-wise dynamics in language models reveal performance signals beyond final representations.","cross_cats":["cs.CL"],"primary_cat":"cs.LG","authors_text":"Jingzhou Jiang, Kar Yan Tam, Yi Yang","submitted_at":"2026-05-12T20:22:45Z","abstract_excerpt":"Hidden states change substantially across the layers of modern language models, but most layer-wise analyses focus on one aspect of that change. We propose Layer-wise Representation Dynamics (LRD), a framework with three layer-wise measurement families: Frenet (Grassmann speed and curvature) for global subspace motion, Neighborhood Retention Score (NRS) for local nearest-neighbor retention, and Graph Filtration Mutual Information (GFMI) for alignment with the final layer. Applying LRD to 31 models (encoder-based and decoder-based embedders, plus base LLMs) on 30 MTEB tasks reveals architectura"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Applying LRD to 31 models on 30 MTEB tasks reveals architectural and task-level differences that are not apparent from final-layer representations alone... 
These results show that layer-wise structure provides signal for both interpretation and deployment decisions.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That the three proposed measurements (Frenet, NRS, GFMI) capture dynamics that are causally relevant to downstream performance rather than merely correlated on the tested set of models and tasks.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"LRD framework with Frenet, NRS, and GFMI metrics shows layer-wise structure in 31 models provides usable signal for model selection and pruning on MTEB tasks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Layer-wise dynamics in language models reveal performance signals beyond final representations.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"b6e1b779dc2f0da069b8c09b4670d909665c5b3f0d9bade66ad1ddc8f6add54a"},"source":{"id":"2605.12714","kind":"arxiv","version":1},"verdict":{"id":"d454eae7-13eb-4855-b129-bf9228cf00af","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T21:45:21.962818Z","strongest_claim":"Applying LRD to 31 models on 30 MTEB tasks reveals architectural and task-level differences that are not apparent from final-layer representations alone... 
These results show that layer-wise structure provides signal for both interpretation and deployment decisions.","one_line_summary":"LRD framework with Frenet, NRS, and GFMI metrics shows layer-wise structure in 31 models provides usable signal for model selection and pruning on MTEB tasks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That the three proposed measurements (Frenet, NRS, GFMI) capture dynamics that are causally relevant to downstream performance rather than merely correlated on the tested set of models and tasks.","pith_extraction_headline":"Layer-wise dynamics in language models reveal performance signals beyond final representations."},"references":{"count":78,"sample":[{"doi":"","year":2008,"title":"Princeton University Press","work_id":"b00efaf2-cb0f-41ff-b747-f56be79d133b","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"The Falcon Series of Open Language Models","work_id":"9ef058cb-28ba-4128-b9b7-a707f2fd36b3","ref_index":2,"cited_arxiv_id":"2311.16867","is_internal_anchor":true},{"doi":"","year":2003,"title":"Laplacian eigenmaps for dimensionality reduction and data representation. Neural computation, 15(6):1373–1396","work_id":"ed1bd55d-4e5b-400a-8a7e-890c11077e4e","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2006,"title":"Manifold regularization: A geometric framework for learning from labeled and unlabeled examples. Journal of machine learning research, 7(11), 2006","work_id":"90040779-9acf-4503-887d-54c0ac930473","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2016,"title":"A full-text learning to rank dataset for medical information 
retrieval","work_id":"5d603f1c-c8f0-4adc-b6a8-052eaf2ce678","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":78,"snapshot_sha256":"0a5fdadda6775140c24377c3bcc5e479c74cf57452446e81a91d6569e112585e","internal_anchors":10},"formal_canon":{"evidence_count":2,"snapshot_sha256":"839a6326f7635f51ba8302bd0233610f2699c6505925dde487886fdb60276dc2"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}