Banach space representer theorems for neural networks and ridge splines
2 Pith papers cite this work. Polarity classification is still indexing.
citation summary
fields: cs.LG 2
years: 2026 2
verdicts: UNVERDICTED 2
roles: background 1
polarities: unclear 1
citing papers explorer
- Exact Convex Reformulations of Linear Neural Networks via Completely Positive Lifting
  The training problem for deep linear neural networks under squared loss admits an exact convex reformulation in a lifted space over a generalized completely positive cone, with dimension independent of depth.
- Transformers Can Implement Preconditioned Richardson Iteration for In-Context Gaussian Kernel Regression
  A single-head softmax transformer with O(log(1/ε)) blocks and O(√(N/ε)) MLP width implements preconditioned Richardson iteration to achieve ε-accurate Gaussian KRR predictions on length-N prompts under bounded data.