{"paper":{"title":"Hydra: Efficient, Correct Code Generation via Checkpoint-and-Rollback Support","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"Hydra enables asynchronous compile checks and targeted rollback repairs for LLM-generated C/C++ code.","cross_cats":["cs.AI","cs.PL"],"primary_cat":"cs.SE","authors_text":"Alexander Du, Danyang Zhuo, Jianjun Ou, Matthew Lentz","submitted_at":"2026-05-14T03:18:16Z","abstract_excerpt":"Large language models are increasingly used for code generation, but many generated programs fail to compile, a prerequisite for further correctness checks such as unit tests. Existing solutions for repairing static errors are costly in both latency and token consumption. Post-hoc repair delays error detection until generation completes and commonly regenerates large regions of previously valid code. Constrained semantic decoding checks after each token, incurring per-token overhead while limiting repair to the current token even when the root cause lies earlier.\n  We present Hydra, a system f"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"Paired with a token-efficient repair strategy, Hydra reduces latency by up to 71% and token consumption by up to 70% relative to post-hoc repair on C/C++ code generation tasks that encounter static errors.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"That retrofitting the Clang C/C++ compiler with modest modifications is sufficient to provide reliable checkpoint-and-rollback support without introducing substantial overhead or breaking compatibility for typical code generation workloads.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Hydra enables asynchronous static error checking and targeted checkpoint-rollback repair during LLM code generation, cutting latency by 
up to 71% and token use by up to 70% versus post-hoc repair on C/C++ tasks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"Hydra enables asynchronous compile checks and targeted rollback repairs for LLM-generated C/C++ code.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"1df7969ab240d05ddbf9aba7aaf73eb2ec51119ac8760483aff134379138b8be"},"source":{"id":"2605.15238","kind":"arxiv","version":1},"verdict":{"id":"3dc3bb36-d03d-40eb-925c-8dec88008e7d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-19T16:28:08.503480Z","strongest_claim":"Paired with a token-efficient repair strategy, Hydra reduces latency by up to 71% and token consumption by up to 70% relative to post-hoc repair on C/C++ code generation tasks that encounter static errors.","one_line_summary":"Hydra enables asynchronous static error checking and targeted checkpoint-rollback repair during LLM code generation, cutting latency by up to 71% and token use by up to 70% versus post-hoc repair on C/C++ tasks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"That retrofitting the Clang C/C++ compiler with modest modifications is sufficient to provide reliable checkpoint-and-rollback support without introducing substantial overhead or breaking compatibility for typical code generation workloads.","pith_extraction_headline":"Hydra enables asynchronous compile checks and targeted rollback repairs for LLM-generated C/C++ code."},"integrity":{"clean":false,"summary":{"advisory":1,"critical":0,"by_detector":{"doi_compliance":{"total":1,"advisory":1,"critical":0,"informational":0}},"informational":0},"endpoint":"/pith/2605.15238/integrity.json","findings":[{"note":"DOI in the printed bibliography is fragmented by whitespace or line breaks. 
A longer candidate (10.5555/1387589.1387601) was visible in the surrounding text but could not be confirmed against doi.org as printed.","detector":"doi_compliance","severity":"advisory","ref_index":9,"audited_at":"2026-05-19T16:36:34.879914Z","detected_doi":"10.5555/1387589.1387601","finding_type":"recoverable_identifier","verdict_class":"incontrovertible","detected_arxiv_id":null}],"available":true,"detectors_run":[{"name":"doi_title_agreement","ran_at":"2026-05-19T17:01:18.457683Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T16:36:34.879914Z","status":"completed","version":"1.0.0","findings_count":1},{"name":"claim_evidence","ran_at":"2026-05-19T16:01:54.929824Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"ai_meta_artifact","ran_at":"2026-05-19T13:33:22.824902Z","status":"skipped","version":"1.0.0","findings_count":0}],"snapshot_sha256":"775e732fa7d6441e9d487aa2c6a2e0a1a4f4ef3e0cc5ff2b44d97e4662994d0e"},"references":{"count":42,"sample":[{"doi":"","year":2023,"title":"Lakshya Agrawal, Aditya Kanade, Navin Goyal, Shuvendu K Lahiri, and Sriram Rajamani. 2023. Monitor-guided decoding of code LMs with static analysis of repository context. In Conference on Neural Inform","work_id":"ff496f85-8752-46c6-bfb7-21a3e351b238","ref_index":1,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2026,"title":"Anthropic. 2026. 
Claude Code. https://claude.com/product/claude-code Accessed: 2026-04-09","work_id":"07b1757e-3ba0-4d96-ad71-f33105983384","ref_index":2,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.18653/v1/2024.findings-","year":2024,"title":"In Findings of the Association for Computational Linguistics: EMNLP 2024, Yaser Al-Onaizan, Mohit Bansal, and Yun-Nung Chen (Eds.)","work_id":"fffd831e-2bf2-4f3a-b0a5-062434ae1d3e","ref_index":3,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"10.1145/3689728","year":2024,"title":"Statically contextualizing large language models with typed holes","work_id":"2f105b35-00d3-4807-839b-cb19f8b7a18d","ref_index":4,"cited_arxiv_id":"","is_internal_anchor":false},{"doi":"","year":2025,"title":"Test Intention Guided LLM-Based Unit Test Generation","work_id":"c383b7e9-bd1e-4268-b134-a4e43d7560a3","ref_index":5,"cited_arxiv_id":"","is_internal_anchor":false}],"resolved_work":42,"snapshot_sha256":"237f2d80289bc8f3eae74f36e314f5d00051601146d43cb6dd91e2a8f7e8e411","internal_anchors":3},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}