Fourier position embedding: Enhancing attention’s periodic extension for length generalization
5 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: background 1
citation-polarity summary: background 1
verdicts: UNVERDICTED 5
citing papers explorer
- Weierstrass Positional Encoding for Vision Transformers
  WePE encodes 2D patch positions in Vision Transformers via Weierstrass elliptic functions on the complex plane to exploit double periodicity and derive relative positions algebraically.
- Convergent Evolution: How Different Language Models Learn Similar Number Representations
  Diverse language models converge on similar periodic number features with a two-tier hierarchy of Fourier sparsity and geometric separability, acquired via language co-occurrences or multi-token arithmetic.
- UCAN: Unified Convolutional Attention Network for Expansive Receptive Fields in Lightweight Super-Resolution
  UCAN unifies window-based spatial attention and Hedgehog Attention with a distillation-based large-kernel module and cross-layer sharing to deliver competitive PSNR at low MACs in lightweight super-resolution.
- Kimi Linear: An Expressive, Efficient Attention Architecture
  Kimi Linear hybridizes linear attention with a new KDA module to beat full attention on tasks while slashing KV cache by 75% and speeding decoding up to 6x.
- Mitigating Coordinate Prediction Bias from Positional Encoding Failures
  VPSG corrects predictable directional coordinate biases in MLLMs by shuffling visual positional encodings to isolate unconditioned tendencies and steering digit decoding with a lightweight finite-state machine, yielding accuracy gains on ScreenSpot-Pro without retraining.