Tight analyses for non-smooth stochastic gradient descent
2 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (2)
verdicts: UNVERDICTED (2)
representative citing papers
SGD with greedy step size on smooth quadratics in the interpolation regime attains O(1/t^{3/4}) last-iterate convergence.
citing papers explorer
- Gradient Descent's Last Iterate is Often (slightly) Suboptimal
Proves it is impossible to achieve optimal last-iterate rates for GD and SGD without knowing the horizon T in advance, incurring an unavoidable poly-log factor penalty even in the deterministic case.
- Last-Iterate Convergence of Randomized Kaczmarz and SGD with Greedy Step Size
SGD with greedy step size on smooth quadratics in the interpolation regime attains O(1/t^{3/4}) last-iterate convergence.