Dokl. Akad. Nauk SSSR

A method for solving the convex programming problem with convergence rate O(1/k^2)

2 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

Efficient Gradient Methods for Distributed Saddle Problems

math.OC · 2026-05-18 · unverdicted · novelty 7.0

A novel decoupled method for distributed saddle problems achieves optimal communication complexity via multi-stage residual norm minimization, with a matching lower bound and extension to variational inequalities.

MDN: Parallelizing Stepwise Momentum for Delta Linear Attention

cs.LG · 2026-05-07 · unverdicted · novelty 5.0

MDN parallelizes stepwise momentum for delta linear attention using geometric reordering and dynamical systems analysis, yielding performance gains over Mamba2 and GDN on 400M and 1.3B models.

citing papers explorer

Showing 2 of 2 citing papers.

Efficient Gradient Methods for Distributed Saddle Problems

math.OC · 2026-05-18 · unverdicted · none · ref 46

A novel decoupled method for distributed saddle problems achieves optimal communication complexity via multi-stage residual norm minimization, with a matching lower bound and extension to variational inequalities.

MDN: Parallelizing Stepwise Momentum for Delta Linear Attention

cs.LG · 2026-05-07 · unverdicted · none · ref 81

MDN parallelizes stepwise momentum for delta linear attention using geometric reordering and dynamical systems analysis, yielding performance gains over Mamba2 and GDN on 400M and 1.3B models.
