2505.10222 , archivePrefix=

Jintian Shao, Hongyi Huang, Jiayi Wu, Beiwen Zhang, ZhiYu Wu, You Shan, MingKai Zheng · 2025 · arXiv 2505.10222

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

representative citing papers

Robust Filter Attention: Self-Attention as Precision-Weighted State Estimation

cs.LG · 2025-09-04 · unverdicted · novelty 7.0

Robust Filter Attention models self-attention as consistency-based state estimation under a linear SDE for token trajectories, matching standard attention complexity while showing lower perplexity and better zero-shot length extrapolation than RoPE on language benchmarks.

RT-Transformer: The Transformer Block as a Spherical State Estimator

cs.LG · 2026-05-10 · unverdicted · novelty 6.0

Transformer components arise as the natural solution to precision-weighted directional state estimation on the hypersphere.

citing papers explorer

Showing 2 of 2 citing papers.

Robust Filter Attention: Self-Attention as Precision-Weighted State Estimation cs.LG · 2025-09-04 · unverdicted · none · ref 79
Robust Filter Attention models self-attention as consistency-based state estimation under a linear SDE for token trajectories, matching standard attention complexity while showing lower perplexity and better zero-shot length extrapolation than RoPE on language benchmarks.
RT-Transformer: The Transformer Block as a Spherical State Estimator cs.LG · 2026-05-10 · unverdicted · none · ref 195
Transformer components arise as the natural solution to precision-weighted directional state estimation on the hypersphere.

2505.10222 , archivePrefix=

fields

years

verdicts

representative citing papers

citing papers explorer