{"paper":{"title":"Population Risk Bounds for Kolmogorov-Arnold Networks Trained by DP-SGD with Correlated Noise","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Kolmogorov-Arnold Networks receive population risk bounds under mini-batch DP-SGD with correlated noise.","cross_cats":["stat.ML"],"primary_cat":"cs.LG","authors_text":"Christoph Lampert, Jan Schuchardt, Junyu Zhou, Marius Kloft, Nikita Kalinin, Puyu Wang, Sophie Fellenz","submitted_at":"2026-05-12T18:44:47Z","abstract_excerpt":"We establish the first population risk bounds for Kolmogorov-Arnold Networks (KANs) trained by mini-batch SGD with gradient clipping, covering non-private SGD as well as differentially private SGD (DP-SGD) with Gaussian perturbations that interpolate between independent and temporally correlated noise. This setting is substantially closer to practice than prior KAN theory along two axes: training is by mini-batch SGD, the standard recipe for modern networks, rather than full-batch gradient descent (GD); and correlated-noise mechanisms have empirically shown a more favorable privacy-utility tra"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We establish the first population risk bounds for Kolmogorov-Arnold Networks (KANs) trained by mini-batch SGD with gradient clipping, covering non-private SGD as well as differentially private SGD (DP-SGD) with Gaussian perturbations that interpolate between independent and temporally correlated noise.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The new analysis route for correlated-noise DP training in the non-convex regime relies on an auxiliary unprojected dynamics, a shifted iterate absorbing noise, and a high-probability bootstrap certifying projection inactivity; if these constructs fail to control the temporal dependence or clipping effects under the paper's noise model, the population risk bounds do 
not hold.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"First population risk bounds for KANs under mini-batch DP-SGD with correlated noise, using a new non-convex optimization analysis combined with stability-based generalization.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Kolmogorov-Arnold Networks receive population risk bounds under mini-batch DP-SGD with correlated noise.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"ab1bbb4363b562f10002f51631b88a8ca9cb5720649a2b78be370d606080231e"},"source":{"id":"2605.12648","kind":"arxiv","version":1},"verdict":{"id":"f4bd3885-a81c-48b4-8c24-3c8eeb83e0da","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-14T21:34:44.874838Z","strongest_claim":"We establish the first population risk bounds for Kolmogorov-Arnold Networks (KANs) trained by mini-batch SGD with gradient clipping, covering non-private SGD as well as differentially private SGD (DP-SGD) with Gaussian perturbations that interpolate between independent and temporally correlated noise.","one_line_summary":"First population risk bounds for KANs under mini-batch DP-SGD with correlated noise, using a new non-convex optimization analysis combined with stability-based generalization.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The new analysis route for correlated-noise DP training in the non-convex regime relies on an auxiliary unprojected dynamics, a shifted iterate absorbing noise, and a high-probability bootstrap certifying projection inactivity; if these constructs fail to control the temporal dependence or clipping effects under the paper's noise model, the population risk bounds do not hold.","pith_extraction_headline":"Kolmogorov-Arnold Networks receive 
population risk bounds under mini-batch DP-SGD with correlated noise."},"references":{"count":70,"sample":[{"doi":"","year":2019,"title":"A convergence theory for deep learning via over-parameterization","work_id":"2a252a94-ee15-4740-ac95-48e4c0bfa182","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2023,"title":"A smooth binary mechanism for efficient private continual observation. Advances in Neural Information Processing Systems, 36:49133–49145, 2023","work_id":"1906ab45-e80c-4c97-bb7d-7a0dc9890a0c","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"The hitchhiker’s guide to efficient, end-to-end, and tight DP auditing","work_id":"aab8c523-1506-462e-9db1-af6944f76fe3","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2019,"title":"Fine-grained analysis of optimization and generalization for overparameterized two-layer neural networks","work_id":"2f8744c1-d98b-482b-acc4-22e10362f492","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2013,"title":"Beyond differential privacy: Composition theorems and relational logic for f-divergences between probabilistic programs","work_id":"323bbaa3-5240-4f18-9ba7-ca58a10e0c23","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":70,"snapshot_sha256":"c1b3269b1e8d00867cf9a927756b0c8cd579fc3562e0a994c53e73e67642967d","internal_anchors":2},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}