Title resolution pending

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

years

2026: 2, 2025: 1

verdicts

UNVERDICTED 3

roles

background 1

polarities

background 1

representative citing papers

Muon Does Not Converge on Convex Lipschitz Functions

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

Muon does not converge on convex Lipschitz functions regardless of learning rate, while error feedback restores theoretical convergence but degrades performance on CIFAR-10 and nanoGPT tasks.

DADA: Dual Averaging with Distance Adaptation

math.OC · 2025-01-17 · unverdicted · novelty 5.0

DADA is a parameter-free dual averaging method for convex optimization that adapts to local function growth. It applies to nonsmooth, smooth, Hölder-smooth, and other function classes, on both constrained and unbounded domains, without prior knowledge of the iteration count or target accuracy.

citing papers explorer

Showing 3 of 3 citing papers.

  • Muon Does Not Converge on Convex Lipschitz Functions cs.LG · 2026-05-09 · unverdicted · none · ref 91

    Muon does not converge on convex Lipschitz functions regardless of learning rate, while error feedback restores theoretical convergence but degrades performance on CIFAR-10 and nanoGPT tasks.

  • Constrained Stochastic Spectral Preconditioning Converges for Nonconvex Objectives math.OC · 2026-05-12 · unverdicted · none · ref 64

    Proximal stochastic spectral preconditioning converges for nonconvex constrained objectives under heavy-tailed noise, with a variance-reduced version achieving faster rates and a refined analysis of Muon iterations.

  • DADA: Dual Averaging with Distance Adaptation math.OC · 2025-01-17 · unverdicted · none · ref 23

    DADA is a parameter-free dual averaging method for convex optimization that adapts to local function growth. It applies to nonsmooth, smooth, Hölder-smooth, and other function classes, on both constrained and unbounded domains, without prior knowledge of the iteration count or target accuracy.