{"paper":{"title":"Beyond Bounded Variance: Variance-Reduced Normalized Methods for Nonconvex Optimization under Blum-Gladyshev Noise","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Normalized stochastic gradient descent with momentum converges under BG-0 noise with O(ε^{-6}) oracle complexity using one gradient per step.","cross_cats":["math.OC"],"primary_cat":"cs.LG","authors_text":"Abolfazl Hashemi, Antesh Upadhyay, Arda Fazla","submitted_at":"2026-05-14T18:27:49Z","abstract_excerpt":"We study nonconvex stochastic optimization under the Blum-Gladyshev ($\\mathsf{BG}$-0) noise model, where the stochastic gradient variance grows quadratically with the distance from the initialization. We consider this problem under both standard smoothness and the symmetric generalized-smoothness framework, which captures objectives whose local curvature can scale with the gradient norm. We prove that normalized stochastic gradient descent with momentum, using only one stochastic gradient per iteration, converges under $\\mathsf{BG}$-0 noise with oracle complexity $O(\\varepsilon^{-6})$. This ra"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We prove that normalized stochastic gradient descent with momentum, using only one stochastic gradient per iteration, converges under BG-0 noise with oracle complexity O(ε^{-6}). This rate holds both for standard smoothness and for α-symmetric generalized smoothness.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The stochastic gradients satisfy the Blum-Gladyshev (BG-0) noise model in which the variance grows quadratically with the distance from the initialization point (stated in the problem setup and used throughout the convergence analysis).","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Normalized momentum SGD and variance-reduced STORM achieve O(ε^{-6}) and O(ε^{-4}) oracle complexities respectively under quadratic distance-dependent noise in nonconvex stochastic optimization.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Normalized stochastic gradient descent with momentum converges under BG-0 noise with O(ε^{-6}) oracle complexity using one gradient per step.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"7a7b53b402bf069b2cd37ff8a73209cd9357ee8d19f404bc624ac309fbb8f173"},"source":{"id":"2605.15314","kind":"arxiv","version":1},"verdict":{"id":"d7443536-b25a-4ce7-9a0a-4377b379317d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T16:07:08.135682Z","strongest_claim":"We prove that normalized stochastic gradient descent with momentum, using only one stochastic gradient per iteration, converges under BG-0 noise with oracle complexity O(ε^{-6}). This rate holds both for standard smoothness and for α-symmetric generalized smoothness.","one_line_summary":"Normalized momentum SGD and variance-reduced STORM achieve O(ε^{-6}) and O(ε^{-4}) oracle complexities respectively under quadratic distance-dependent noise in nonconvex stochastic optimization.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The stochastic gradients satisfy the Blum-Gladyshev (BG-0) noise model in which the variance grows quadratically with the distance from the initialization point (stated in the problem setup and used throughout the convergence analysis).","pith_extraction_headline":"Normalized stochastic gradient descent with momentum converges under BG-0 noise with O(ε^{-6}) oracle complexity using one gradient per step."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.15314/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"doi_title_agreement","ran_at":"2026-05-19T16:31:18.318118Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T16:16:11.920756Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"claim_evidence","ran_at":"2026-05-19T14:41:54.215526Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T13:33:22.772438Z","status":"skipped","version":"1.0.0","findings_count":0}],"snapshot_sha256":"d8f756ca59876d124e3e8b139d85c3b9033a7ff9293c5c258e5ad306a10a1d31"},"references":{"count":46,"sample":[{"doi":"","year":2025,"title":"Towards weaker variance assumptions for stochastic optimization.arXiv preprint arXiv:2504.09951, 2025","work_id":"6875dcae-6a12-43c1-9bf4-a96c6fe9e5e7","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"Lower bounds for non-convex stochastic optimization.Mathematical Programming, 199(1):165–214, 2023","work_id":"4b7b7601-c041-4fb5-8269-b79d5b18b79e","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2011,"title":"Non-asymptotic analysis of stochastic approximation algorithms for machine learning","work_id":"99709799-98a4-4718-92b9-b912349e1e4a","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":1954,"title":"Approximation methods which converge with probability one.The Annals of Mathematical Statistics, pages 382–386, 1954","work_id":"f4062c65-302e-498a-a472-a966b60a41c4","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2018,"title":"Optimization methods for large-scale learning","work_id":"330b1c2c-ad06-4fe3-b7fc-a305039b1943","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":46,"snapshot_sha256":"20e776866599e4bbbba0566a6cc8cdf265c67660e3669c81779f910bc0023f25","internal_anchors":1},"formal_canon":{"evidence_count":2,"snapshot_sha256":"1d5ce154eb1f6d5d8cc4677c26e02a03e48cbcddb4d9cd79655094e1e08999ca"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}