RoVE rotates value embeddings simultaneously with keys in attention to make values position-dependent, reframing RoPE as attentive convolution and reporting gains on long-context tasks in 124M and 354M GPT-2 models.
arXiv:2002.03830 [cs]
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
- RoVE: Rotary Value Embeddings Attention for Relative Position-dependent Value Pathways