Attention is a smoothed cubic spline

4 Pith papers cite this work. Polarity classification is still indexing.

years

2026: 2 · 2025: 2

verdicts

UNVERDICTED: 4

representative citing papers

Copositive Matrices with Ordered Off-Diagonal Entries

math.OC · 2026-05-15 · unverdicted · novelty 7.0

Copositive matrices with nondecreasing off-diagonal entries admit a PSD plus nonnegative decomposition, which implies exactness of a natural relaxation for separable quadratic optimization over the simplex.

Algebraic Invariants of Lightning Self-Attention

math.AG · 2026-04-17 · unverdicted · novelty 5.0

Lightning self-attention coefficients are coordinates on an algebraic variety obeying Chow-type, low-rank, Veronese-type, and Sylvester-resultant invariants.

A Mathematical Explanation of Transformers

cs.LG · 2025-10-05 · unverdicted · novelty 5.0

The Transformer is interpreted as a discretization of a structured integro-differential equation on continuous domains for tokens and features, unifying attention, feedforward, and normalization via operator and variational views.

citing papers explorer

Showing 4 of 4 citing papers.

  • Copositive Matrices with Ordered Off-Diagonal Entries · math.OC · 2026-05-15 · unverdicted · none · ref 208

    Copositive matrices with nondecreasing off-diagonal entries admit a PSD plus nonnegative decomposition, which implies exactness of a natural relaxation for separable quadratic optimization over the simplex.

  • Transformers for Learning on Noisy and Task-Level Manifolds: Approximation and Generalization Insights · cs.LG · 2025-05-06 · unverdicted · none · ref 8

    Transformers achieve approximation and generalization error bounds for noisy manifold regression that scale with the intrinsic dimension of the task-level manifold.

  • Algebraic Invariants of Lightning Self-Attention · math.AG · 2026-04-17 · unverdicted · none · ref 26

    Lightning self-attention coefficients are coordinates on an algebraic variety obeying Chow-type, low-rank, Veronese-type, and Sylvester-resultant invariants.

  • A Mathematical Explanation of Transformers · cs.LG · 2025-10-05 · unverdicted · none · ref 24

    The Transformer is interpreted as a discretization of a structured integro-differential equation on continuous domains for tokens and features, unifying attention, feedforward, and normalization via operator and variational views.