{"paper":{"title":"High-Probability Guarantees for Random Zeroth-Order (Stochastic) Gradient Descent","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Random zeroth-order gradient descent reaches ε-suboptimality with probability 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) queries for smooth strongly convex functions.","cross_cats":[],"primary_cat":"math.OC","authors_text":"Haishan Ye","submitted_at":"2026-04-26T09:02:05Z","abstract_excerpt":"Zeroth-order optimization aims to minimize an objective function using only function evaluations, and is therefore fundamental in black-box optimization, hyperparameter tuning, bandit learning, and adversarial machine learning. While classical zeroth-order methods are well understood in expectation, much less is known about their high-probability behavior, especially for smooth and strongly convex objectives. In this paper, we establish high-probability convergence guarantees for random zeroth-order gradient descent in both deterministic and stochastic settings. For deterministic $L$-smooth an"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"For deterministic L-smooth and μ-strongly convex objectives of d-dimension, the classical two-query random zeroth-order method finds an ε-suboptimal solution with probability at least 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) function queries. \nFor stochastic objectives under bounded-noise, random zeroth-order SGD achieves the same with O(d log(1/ε)(log(1/ε)+log(1/δ))/ε) queries.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The objective is L-smooth and μ-strongly convex (deterministic case) or has bounded noise without uniformly bounded stochastic gradients (stochastic case); these are invoked to derive the stated query complexities but their necessity for the high-probability result is not relaxed in the abstract.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Random zeroth-order gradient descent reaches ε-suboptimal solutions with probability 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) queries deterministically and O(d log(1/ε)(log(1/ε)+log(1/δ))/ε) queries under bounded stochastic noise.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Random zeroth-order gradient descent reaches ε-suboptimality with probability 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) queries for smooth strongly convex functions.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"5bb2a04d37b030016610e3ec1b6423c3fa7afedf47d8d7ea0fe9bbba2e123b05"},"source":{"id":"2604.23613","kind":"arxiv","version":2},"verdict":{"id":"7b712e9e-3fec-4f67-8f27-f20344d87c54","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-08T05:45:08.341843Z","strongest_claim":"For deterministic L-smooth and μ-strongly convex objectives of d-dimension, the classical two-query random zeroth-order method finds an ε-suboptimal solution with probability at least 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) function queries. \nFor stochastic objectives under bounded-noise, random zeroth-order SGD achieves the same with O(d log(1/ε)(log(1/ε)+log(1/δ))/ε) queries.","one_line_summary":"Random zeroth-order gradient descent reaches ε-suboptimal solutions with probability 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) queries deterministically and O(d log(1/ε)(log(1/ε)+log(1/δ))/ε) queries under bounded stochastic noise.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The objective is L-smooth and μ-strongly convex (deterministic case) or has bounded noise without uniformly bounded stochastic gradients (stochastic case); these are invoked to derive the stated query complexities but their necessity for the high-probability result is not relaxed in the abstract.","pith_extraction_headline":"Random zeroth-order gradient descent reaches ε-suboptimality with probability 1-δ using O((dL/μ)log(1/ε) + log(1/δ)) queries for smooth strongly convex functions."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2604.23613/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-21T08:37:52.229328Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T22:56:21.762503Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"bc245d88e81ed969ac84eb025b90b20c273f3798bcd1040e7d6522cb2e1e25a5"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}