Enforcing feature- and label-permutation equivariance in transformers for in-context classification yields an identifiable emergent update rule driven by mixed feature-label Gram matrices that amplifies class separation.
•Test Update ($U_j$): $U_j = \alpha' \sum_{i \in S_{c^\star}} A_i^{(t)} \langle x_i^{(t)}, x_j^{(t)} \rangle + \alpha' \sum_{i \notin S_{c^\star}} A_i^{(t)} \langle x_i^{(t)}, x_j^{(t)} \rangle$
Cited by 1 Pith paper.
Fields: cs.LG (1). Years: 2026 (1). Verdicts: UNVERDICTED (1).
Representative citing papers:
Layerwise Dynamics for In-Context Classification in Transformers