New in-expectation convergence guarantees for SMD, ASMD (convex) and SGD, SGDM (nonconvex) under heavy-tailed noise without bounded-domain restrictions or algorithmic modifications.
Stochastic nonsmooth convex optimization with heavy-tailed noises: High-probability bound, in-expectation rate and initial distance adaptation. arXiv preprint arXiv:2303.12277, 2023.
3 Pith papers cite this work. Polarity classification is still indexing.
Representative citing papers
DADA is a parameter-free dual averaging method for convex optimization that adapts to local function growth. It applies to nonsmooth, smooth, Hölder-smooth, and other function classes, on both constrained and unbounded domains, without prior knowledge of the iteration count or target accuracy.
citing papers explorer
- Statistical Inference for Stochastic Gradient Descent Beyond Finite Variance
Presents a self-normalized subsampling procedure for asymptotically valid confidence regions from SGD iterates under both finite and infinite variance assumptions.