{"paper":{"title":"CECOR: Correction-oriented synthetic data construction for factual error correction","license":"http://creativecommons.org/licenses/by/4.0/","headline":"CECoR corrects multi-hop factual errors by decomposing claims into steps and injecting perturbations to synthesize training data.","cross_cats":[],"primary_cat":"cs.CL","authors_text":"Chenyang Wang, Dongxiao He, Jianbiao Yang, Jianwu Dang, Lei Zhu, Longbiao Wang, Xiaobao Wang","submitted_at":"2026-05-04T07:08:39Z","abstract_excerpt":"Factual Error Correction (FEC) aims to revise inaccurate text into statements that are factually consistent with external evidence. Although recent methods perform well on single-hop correction, they often treat claims as atomic units and struggle with multi-hop cases that require compositional reasoning across multiple evidence sources. This challenge is further amplified by limited paired data and difficulties in locating semantic errors within complex reasoning chains. We present CECoR (Compositional Error Correction via Reasoning-aware Synthesis), a reasoning-aware framework that introduce"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"CECoR achieves strong performance on multi-hop benchmarks, outperforming both distantly supervised methods and few-shot LLM baselines. It also generalizes effectively to single-hop correction and remains stable under noisy evidence.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The assumption that controlled perturbations injected after decomposition produce high-quality, realistic training pairs that faithfully represent the distribution of real semantic errors in multi-hop claims.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"CECoR decomposes multi-hop claims into steps, synthesizes training pairs via perturbation injection, and uses supervised fine-tuning plus reinforcement learning to improve factual error correction on multi-hop benchmarks.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"CECoR corrects multi-hop factual errors by decomposing claims into steps and injecting perturbations to synthesize training data.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"20faccf0aa32586194b68fb8fbcce1bd313475643563abd0b01c404ace4a2e67"},"source":{"id":"2605.02277","kind":"arxiv","version":2},"verdict":{"id":"e965608d-b8e6-4a80-93ed-7c6864e53a0d","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-08T19:12:58.179015Z","strongest_claim":"CECoR achieves strong performance on multi-hop benchmarks, outperforming both distantly supervised methods and few-shot LLM baselines. It also generalizes effectively to single-hop correction and remains stable under noisy evidence.","one_line_summary":"CECoR decomposes multi-hop claims into steps, synthesizes training pairs via perturbation injection, and uses supervised fine-tuning plus reinforcement learning to improve factual error correction on multi-hop benchmarks.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The assumption that controlled perturbations injected after decomposition produce high-quality, realistic training pairs that faithfully represent the distribution of real semantic errors in multi-hop claims.","pith_extraction_headline":"CECoR corrects multi-hop factual errors by decomposing claims into steps and injecting perturbations to synthesize training data."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2605.02277/integrity.json","findings":[],"available":true,"detectors_run":[{"name":"ai_meta_artifact","ran_at":"2026-05-20T16:35:11.551781Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_title_agreement","ran_at":"2026-05-20T03:31:22.966478Z","status":"completed","version":"1.0.0","findings_count":0},{"name":"doi_compliance","ran_at":"2026-05-19T16:30:55.456831Z","status":"completed","version":"1.0.0","findings_count":0}],"snapshot_sha256":"c59ba20c9578a41eceeb8bb576488ead8b1af5f5a004021636f9264897484bef"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":1,"snapshot_sha256":"3e42efc395affab8a16ec9c74e07348e1a6313d629a2e22d7423e9fb413d7273"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}