For uniform keys on the d-dimensional sphere, softmax attention becomes selective at inverse temperature scaling β_n* ≍ n^{2/(d-1)}, with explicit limiting laws for attention weights and outputs in each regime.
4 Pith papers cite this work. Polarity classification is still being indexed.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
citing papers explorer
- Scaling Limits of Long-Context Transformers
  For uniform keys on the d-dimensional sphere, softmax attention becomes selective at inverse temperature scaling β_n* ≍ n^{2/(d-1)}, with explicit limiting laws for attention weights and outputs in each regime.
- FLASH: Efficient Visuomotor Policy via Sparse Sampling
  FLASH Policy uses sparse Legendre polynomial trajectory fitting and history-anchored flow matching to enable single-step inference for visuomotor control, reporting 31.4 ms per-episode latency and >=92% success on five simulated plus two real manipulation tasks.
- Simultaneous Inference for Nonlinear Time Series, a Sieve M-regression Approach
  Establishes a uniform Bahadur representation for sieve M-estimators under temporal dependence and constructs valid simultaneous confidence regions using Gaussian approximation and self-convolved bootstrap.
- Range characterization of the weighted divergent beam and cone integral transforms
  Range characterizations are established for the k-weighted conical Radon transform and Compton transform by factoring into divergent beam and spherical section transforms and combining with prior consistency conditions.