Mathematical Models and Methods in Applied Sciences
3 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: method (2)
polarities: use method (2)
representative citing papers
Establishes exponential convergence in Wasserstein distance for the mean-field limit and finite-particle approximation of a consensus-based method solving nonconvex bi-level optimization problems.
Consensus-based optimization with structure preservation and cross-dimensional interactions can accurately approximate quantum entanglement.
citing papers explorer
- Scaling Limits of Long-Context Transformers
  For uniform keys on the d-dimensional sphere, softmax attention becomes selective at inverse temperature scaling β_n* ≍ n^{2/(d-1)}, with explicit limiting laws for attention weights and outputs in each regime.
- Convergence of Consensus-Based Particle Methods for Nonconvex Bi-Level Optimization
  Establishes exponential convergence in Wasserstein distance for the mean-field limit and finite-particle approximation of a consensus-based method solving nonconvex bi-level optimization problems.
- Computation of entanglement for quantum states by a Consensus-Based Optimization method
  Consensus-based optimization with structure preservation and cross-dimensional interactions can accurately approximate quantum entanglement.