{"paper":{"title":"Grokking or Glitching? How Low-Precision Drives Slingshot Loss Spikes","license":"http://creativecommons.org/licenses/by/4.0/","headline":"Floating-point precision limits trigger slingshot loss spikes by creating numerical feature inflation in neural network training.","cross_cats":["cs.CL","math.OC","stat.ML"],"primary_cat":"cs.LG","authors_text":"Jianjun Cao, Liu Hanqing, Yuanze Li, Zijian Zhou","submitted_at":"2026-05-07T12:45:21Z","abstract_excerpt":"Deep neural networks exhibit periodic loss spikes during unregularized long-term training, a phenomenon known as the \"Slingshot Mechanism.\" Existing work usually attributes this to intrinsic optimization dynamics, but its triggering mechanism remains unclear. This paper proves that this phenomenon is a result of floating-point arithmetic precision limits. As training enters a high-confidence stage, the difference between the correct-class logit and the other logits may exceed the absorption-error threshold. Then during backpropagation, the gradient of the correct class is rounded exactly to ze"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"This paper proves that this phenomenon is a result of floating-point arithmetic precision limits. ... 
We prove that this drift forms a positive feedback loop with the feature, causing the global classifier mean and the global feature mean to grow exponentially.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The assumption that, once the logit difference exceeds the absorption-error threshold, the gradient of the correct class is rounded exactly to zero during backpropagation while incorrect-class gradients remain nonzero, and that this imbalance necessarily creates an exponential positive feedback loop with the features.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Slingshot loss spikes arise from floating-point precision limits that round correct-class gradients to zero, breaking zero-sum constraints and driving exponential parameter growth through numerical feature inflation.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Floating-point precision limits trigger slingshot loss spikes by creating numerical feature inflation in neural network training.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"4403ecc533f8b8e57dc16cf4bc478615390e44a5f2ff1722747c74a475579148"},"source":{"id":"2605.06152","kind":"arxiv","version":3},"verdict":{"id":"73fe5ca6-aeea-4410-b502-b00c8227078a","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-13T07:06:05.189778Z","strongest_claim":"This paper proves that this phenomenon is a result of floating-point arithmetic precision limits. ... 
We prove that this drift forms a positive feedback loop with the feature, causing the global classifier mean and the global feature mean to grow exponentially.","one_line_summary":"Slingshot loss spikes arise from floating-point precision limits that round correct-class gradients to zero, breaking zero-sum constraints and driving exponential parameter growth through numerical feature inflation.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The assumption that, once the logit difference exceeds the absorption-error threshold, the gradient of the correct class is rounded exactly to zero during backpropagation while incorrect-class gradients remain nonzero, and that this imbalance necessarily creates an exponential positive feedback loop with the features.","pith_extraction_headline":"Floating-point precision limits trigger slingshot loss spikes by creating numerical feature inflation in neural network training."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.06152/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"claim_evidence","ran_at":"2026-05-20T13:02:04.285171Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-20T08:36:17.189756Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-19T19:01:19.348912Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T12:55:23.639732Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"5704dbfe57f64c954f96da93292ab57ce4e9d70270532fa950c2c4fa2712e08a"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":2,"snapshot_sha256":"b9b5ab6c7044a478fdc62f310b3c9f9d4878575de948cce219a613dc5ec39778"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}