{"record_type":"pith_number_record","schema_url":"https://pith.science/schemas/pith-number/v1.json","pith_number":"pith:2025:MDJRXQ3VZX2YYDOUQKAZ5ZIR7E","short_pith_number":"pith:MDJRXQ3V","schema_version":"1.0","canonical_sha256":"60d31bc375cdf58c0dd482819ee511f92ce49bc55800c56ba74de8ebc9d5a9fd","source":{"kind":"arxiv","id":"2510.06141","version":6},"attestation_state":"computed","paper":{"title":"High-Probability Convergence Guarantees of Decentralized SGD","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Decentralized SGD converges in high probability under the same cost conditions as mean-squared error convergence.","cross_cats":["cs.MA","math.OC"],"primary_cat":"cs.LG","authors_text":"Aleksandar Armacki, Ali H. Sayed","submitted_at":"2025-10-07T17:15:08Z","abstract_excerpt":"Convergence in high-probability (HP) has attracted increasing interest, due to implying exponentially decaying tail bounds and strong guarantees for individual runs of an algorithm. While many works study HP guarantees in centralized settings, much less is understood in the decentralized setup, where existing works require strong assumptions, like uniformly bounded gradients, or asymptotically vanishing noise. This results in a significant gap between the assumptions used to establish convergence in the HP and the mean-squared error (MSE) sense, and is also contrary to centralized settings, wh"},"verification_status":{"content_addressed":true,"pith_receipt":true,"author_attested":false,"weak_author_claims":0,"strong_author_claims":0,"externally_anchored":false,"storage_verified":false,"citation_signatures":0,"replication_records":0,"graph_snapshot":true,"references_resolved":false,"formal_links_present":false},"canonical_record":{"source":{"id":"2510.06141","kind":"arxiv","version":6},"metadata":{"license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","primary_cat":"cs.LG","submitted_at":"2025-10-07T17:15:08Z","cross_cats_sorted":["cs.MA","math.OC"],"title_canon_sha256":"945ee39b323cbbb4c27d71fa1ad428692319b38fea223f88d125d5e8c8004e14","abstract_canon_sha256":"9c2cc01ab7754beaeabacc352b1b2e41035d22d5802b5e2556450524975ef797"},"schema_version":"1.0"},"receipt":{"kind":"pith_receipt","key_id":"pith-v1-2026-05","algorithm":"ed25519","signed_at":"2026-05-22T01:03:47.740800Z","signature_b64":"RCrut4crqmvY/gJU4foi5XfAuo3nzh1k+ZuPIfP+LDmk3EnJIf6AmmFNQTMiDWH13pUMWi0TJamCySreyuTyAw==","signed_message":"canonical_sha256_bytes","builder_version":"pith-number-builder-2026-05-17-v1","receipt_version":"0.3","canonical_sha256":"60d31bc375cdf58c0dd482819ee511f92ce49bc55800c56ba74de8ebc9d5a9fd","last_reissued_at":"2026-05-22T01:03:47.739841Z","signature_status":"signed_v1","first_computed_at":"2026-05-22T01:03:47.739841Z","public_key_fingerprint":"8d4b5ee74e4693bcd1df2446408b0d54"},"graph_snapshot":{"paper":{"title":"High-Probability Convergence Guarantees of Decentralized SGD","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Decentralized SGD converges in high probability under the same cost conditions as mean-squared error convergence.","cross_cats":["cs.MA","math.OC"],"primary_cat":"cs.LG","authors_text":"Aleksandar Armacki, Ali H. Sayed","submitted_at":"2025-10-07T17:15:08Z","abstract_excerpt":"Convergence in high-probability (HP) has attracted increasing interest, due to implying exponentially decaying tail bounds and strong guarantees for individual runs of an algorithm. While many works study HP guarantees in centralized settings, much less is understood in the decentralized setup, where existing works require strong assumptions, like uniformly bounded gradients, or asymptotically vanishing noise. This results in a significant gap between the assumptions used to establish convergence in the HP and the mean-squared error (MSE) sense, and is also contrary to centralized settings, wh"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We show that DSGD converges in HP under the same conditions on the cost as in the MSE sense, removing the restrictive assumptions used in prior works. Our sharp analysis yields order-optimal rates for both non-convex and strongly convex costs and establishes a linear speed-up in the number of users.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The noise is light-tailed so that moment-generating functions exist and can be bounded, which is weaker than uniform gradient bounds but still requires a specific tail condition on the stochastic gradients that may not hold for arbitrary data distributions.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Decentralized SGD achieves high-probability convergence with order-optimal rates and linear speedup in the number of users under standard smoothness and convexity conditions on the cost function.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Decentralized SGD converges in high probability under the same cost conditions as mean-squared error convergence.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"4bf4a4445837577014aa24b74530d03d93a257649a6f21e4bdd2b03d2ebcd450"},"source":{"id":"2510.06141","kind":"arxiv","version":6},"verdict":{"id":"da5f4c20-abd3-4947-afc0-7c86d54b70b7","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-18T08:42:09.459510Z","strongest_claim":"We show that DSGD converges in HP under the same conditions on the cost as in the MSE sense, removing the restrictive assumptions used in prior works. Our sharp analysis yields order-optimal rates for both non-convex and strongly convex costs and establishes a linear speed-up in the number of users.","one_line_summary":"Decentralized SGD achieves high-probability convergence with order-optimal rates and linear speedup in the number of users under standard smoothness and convexity conditions on the cost function.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The noise is light-tailed so that moment-generating functions exist and can be bounded, which is weaker than uniform gradient bounds but still requires a specific tail condition on the stochastic gradients that may not hold for arbitrary data distributions.","pith_extraction_headline":"Decentralized SGD converges in high probability under the same cost conditions as mean-squared error convergence."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2510.06141/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"},"aliases":[{"alias_kind":"arxiv","alias_value":"2510.06141","created_at":"2026-05-22T01:03:47.739981+00:00"},{"alias_kind":"arxiv_version","alias_value":"2510.06141v6","created_at":"2026-05-22T01:03:47.739981+00:00"},{"alias_kind":"doi","alias_value":"10.48550/arxiv.2510.06141","created_at":"2026-05-22T01:03:47.739981+00:00"},{"alias_kind":"pith_short_12","alias_value":"MDJRXQ3VZX2Y","created_at":"2026-05-22T01:03:47.739981+00:00"},{"alias_kind":"pith_short_16","alias_value":"MDJRXQ3VZX2YYDOU","created_at":"2026-05-22T01:03:47.739981+00:00"},{"alias_kind":"pith_short_8","alias_value":"MDJRXQ3V","created_at":"2026-05-22T01:03:47.739981+00:00"}],"events":[],"event_summary":{},"paper_claims":[],"inbound_citations":{"count":0,"internal_anchor_count":0,"sample":[]},"formal_canon":{"evidence_count":0,"sample":[],"anchors":[]},"links":{"html":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E","json":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E.json","graph_json":"https://pith.science/api/pith-number/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/graph.json","events_json":"https://pith.science/api/pith-number/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/events.json","paper":"https://pith.science/paper/MDJRXQ3V"},"agent_actions":{"view_html":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E","download_json":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E.json","view_paper":"https://pith.science/paper/MDJRXQ3V","resolve_alias":"https://pith.science/api/pith-number/resolve?arxiv=2510.06141&json=true","fetch_graph":"https://pith.science/api/pith-number/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/graph.json","fetch_events":"https://pith.science/api/pith-number/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/events.json","actions":{"anchor_timestamp":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/action/timestamp_anchor","attest_storage":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/action/storage_attestation","attest_author":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/action/author_attestation","sign_citation":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/action/citation_signature","submit_replication":"https://pith.science/pith/MDJRXQ3VZX2YYDOUQKAZ5ZIR7E/action/replication_record"}},"created_at":"2026-05-22T01:03:47.739981+00:00","updated_at":"2026-05-22T01:03:47.739981+00:00"}