plainlm: Language model pretraining in pytorch

Niccolò Ajroldi · 2024

1 Pith paper cites this work. Polarity classification is still indexing.


representative citing papers

Selective Rotary Position Embedding · cs.CL · 2025-11-21 · unverdicted · novelty 7.0

Selective RoPE adds input-dependent rotations to generalize RoPE, showing implicit positional structure in softmax attention and improving performance on language modeling, copying, state tracking, and retrieval when added to gated transformers.

citing papers explorer


Selective Rotary Position Embedding · cs.CL · 2025-11-21 · unverdicted · none · ref 1
