2 Pith papers cite this work.
citing papers explorer
- Uncovering Symmetry Transfer in Large Language Models via Layer-Peeled Optimization
  Symmetries in next-token prediction targets induce corresponding geometric symmetries such as circulant matrices and equiangular tight frames in the optimal weights and embeddings of a layer-peeled LLM surrogate model.
- NEO: No-Optimization Test-Time Adaptation through Latent Re-Centering
  NEO performs test-time adaptation by re-centering target latent embeddings at the origin, boosting accuracy on distribution-shifted datasets like ImageNet-C with no optimization or hyperparameters and minimal extra compute.