Unsupervised cross-lingual representation learning at scale
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.CL (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
Mean-Pooled Cosine Similarity is Not Length-Invariant: Theory and Cross-Domain Evidence for a Length-Invariant Alternative
Mean-pooled cosine similarity grows with sequence length in anisotropic transformer embeddings independent of content, while CKA shows far less length dependence across code, translation, and vision tasks.