Documentation-Guided Agentic Codebase Migration from C to Rust
Pith reviewed 2026-05-20 21:14 UTC · model grok-4.3
The pith
Architecture-aware documentation guides agents to migrate entire C repositories to Rust
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RustPrint first converts the source C repository into architecture-aware documentation that captures module structure, data flow, APIs, and design rationale. Coding agents then use this documentation as a blueprint to plan Rust crates, implement modules, check for compilability, reduce unsafe code, and iteratively refine the translated code. The system compares the documentation generated from the Rust output to the source documentation to identify mismatches for repair and also translates and runs the original test suites to guide fixes based on runtime failures.
What carries the argument
Architecture-aware documentation generated from the source repository, used as a migration blueprint that agents follow for planning, implementation, compilation checks, and repair via documentation mismatches and test failures.
If this is right
- RustPrint produces compilable Rust code for every one of the eight tested C repositories under both open-weight and closed-weight LLM backbones.
- With the Kimi-K2-Instruct backbone, the system reaches 93.26% feature preservation and a 95.17% cross-evaluation test pass rate, exceeding the agentic Claude Code baseline (52.52% and 79.85%, respectively).
- The prior LLM-based translators Self-Repair and EvoC2Rust fail to produce repository-wide compilable output on the same targets.
- Documentation mismatches between source and translated versions, together with test-suite failures, supply targeted repair signals that improve the final Rust code.
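The repair signal in the last bullet can be pictured as a set comparison between two documentation summaries. The sketch below is an illustration only, not RustPrint's implementation. It assumes each documentation is reduced to a hypothetical mapping from module name to a set of API names; the function `doc_mismatches` and the sample data are invented for this example.

```python
from typing import Dict, List, Set

# Hypothetical summary format (an assumption): module name -> documented API names.
DocSummary = Dict[str, Set[str]]


def doc_mismatches(source: DocSummary, translated: DocSummary) -> List[str]:
    """Return repair hints for APIs documented in the source summary
    but absent from the translated summary."""
    hints = []
    for module in sorted(source):
        # A module missing from the translated docs counts as fully missing.
        missing = source[module] - translated.get(module, set())
        for api in sorted(missing):
            hints.append(f"{module}: missing {api}")
    return hints


# Illustrative data, not taken from the paper.
c_docs = {"parser": {"parse_file", "parse_line"}, "io": {"read_all"}}
rust_docs = {"parser": {"parse_file"}, "io": {"read_all"}}

hints = doc_mismatches(c_docs, rust_docs)
print(hints)  # ['parser: missing parse_line']
```

A real comparison would presumably also cover data flow and design rationale, which do not reduce to name sets; this sketch covers only the API-level case.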
Where Pith is reading between the lines
- The same documentation-first coordination could be tried for other source-to-target language pairs if comparable architecture documentation can be extracted automatically.
- Performance may vary with the fidelity of the initial documentation, so controlled tests that degrade the blueprint detail would reveal how much accuracy is required.
- The repair loop that compares generated documentation and runs tests might apply to other agentic coding tasks such as large-scale refactoring or feature addition.
Load-bearing premise
The architecture-aware documentation generated from the source repository accurately captures module structure, data flow, APIs, and design rationale in sufficient detail to serve as an effective migration blueprint that agents can use for planning and repair.
What would settle it
Rerun the framework on one of the same repositories with deliberately incomplete or inaccurate documentation, then check whether the agents still produce fully compilable, feature-preserving Rust code. A sharp drop in either measure would indicate that blueprint quality is essential; little or no change would indicate that it is not.
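One way to set up such a test is to remove a controlled fraction of entries from the blueprint before handing it to the agents. The sketch below is a hypothetical harness, not something the paper describes. It assumes the blueprint can be summarized as a mapping from module name to API names, and `degrade_blueprint` and its parameters are invented for illustration.

```python
import random
from typing import Dict, Set

# Hypothetical blueprint summary (an assumption): module name -> API names.
DocSummary = Dict[str, Set[str]]


def degrade_blueprint(docs: DocSummary, drop_fraction: float, seed: int = 0) -> DocSummary:
    """Return a copy of docs with about drop_fraction of its API entries removed."""
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed keeps the degradation reproducible
    entries = sorted((module, api) for module, apis in docs.items() for api in apis)
    n_drop = round(len(entries) * drop_fraction)
    dropped = set(rng.sample(entries, n_drop))
    return {
        module: {api for api in apis if (module, api) not in dropped}
        for module, apis in docs.items()
    }


# Illustrative data, not taken from the paper.
docs = {"parser": {"parse_file", "parse_line"}, "io": {"read_all", "write_all"}}
degraded = degrade_blueprint(docs, drop_fraction=0.5)
print(sum(len(apis) for apis in degraded.values()))  # 2 entries remain out of 4
```

Sweeping `drop_fraction` over several values and recording compilation and feature preservation at each level would give a rough dose-response curve for blueprint fidelity.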
Original abstract
Migrating legacy C repositories to Rust promises stronger memory safety, but existing translators often work at the level of files or functions and miss architectural intent. We present RustPrint, a documentation-guided agentic framework for repository-level C-to-Rust migration. RustPrint first converts the source repository into architecture-aware documentation and treats it as a migration blueprint capturing module structure, data flow, APIs, and design rationale. Coding agents then use this blueprint to plan crates, implement modules, check compilability, reduce unsafe code, and iteratively refine the translated repository. RustPrint next compares documentation from the Rust output against the source documentation and uses mismatches as repair signals. It also translates and runs source test suites so runtime failures can guide targeted fixes. Experiments on eight real-world C repositories ranging from 11K to 84K LoC show that RustPrint compiles every target under both an open-weight (Kimi-K2-Instruct) and a closed-weight (GPT-5.4) backbone, while prior LLM-based translators (Self-Repair, EvoC2Rust) fail repository-wide. With the open-weight Kimi-K2-Instruct backbone, RustPrint exceeds an agentic Claude Code baseline on feature preservation (93.26% vs. 52.52%) and on cross-evaluation test pass rate (95.17% vs. 79.85%). These results suggest that documentation-guided coordination is a useful direction for scalable codebase migration.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces RustPrint, a documentation-guided agentic framework for repository-level C-to-Rust migration. It first generates architecture-aware documentation from the source C codebase to serve as a migration blueprint capturing module structure, data flow, APIs, and design rationale. LLM-based coding agents then use this blueprint to plan crates, implement modules, check compilability, reduce unsafe code, and iteratively refine the translation. The framework compares documentation from the Rust output against the source for repair signals and translates/runs source test suites to guide fixes. Experiments on eight real-world C repositories (11K–84K LoC) report that RustPrint achieves full compilation success under both Kimi-K2-Instruct and GPT-5.4 backbones, while prior methods (Self-Repair, EvoC2Rust) fail repository-wide; with the open-weight backbone it also outperforms an agentic Claude Code baseline on feature preservation (93.26% vs. 52.52%) and cross-evaluation test pass rate (95.17% vs. 79.85%).
Significance. If the results hold under scrutiny, the work provides empirical support for documentation-guided agentic coordination as a scalable approach to repository-wide migration, addressing a key limitation of prior file- or function-level translators that miss architectural intent. The evaluation on multiple large, real-world repositories and across open- and closed-weight LLM backbones is a clear strength, as is the use of concrete, multi-faceted metrics (compilation success, feature preservation, and runtime test pass rates) rather than synthetic benchmarks. These elements position the paper as a useful contribution to automated software migration and LLM-agent tooling in software engineering.
major comments (2)
- [§3] §3 (Documentation Generation and Blueprint Usage): the central claim attributes repository-wide compilation and the 93.26% feature-preservation gain to the architecture-aware documentation serving as an effective migration blueprint, yet the manuscript reports no quantitative fidelity metric (e.g., API extraction precision/recall against ground-truth headers or human-rated coverage of cross-module invariants) and no ablation that removes the documentation component while retaining the agentic repair loops and test feedback.
- [§4.2] §4.2 (Baseline Comparisons): the reported superiority over the agentic Claude Code baseline (93.26% vs. 52.52% feature preservation) does not specify whether the baseline was given equivalent access to source-derived architectural documentation or the same iterative repair and test-execution harness; without this control, the performance delta cannot be confidently attributed to the documentation-guided mechanism.
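The fidelity metric requested in the first major comment can be made concrete with a small calculation. The sketch below shows standard set-based precision and recall for API extraction. The function name and the sample API names are hypothetical, and the paper reports no such measurement.

```python
from typing import Set, Tuple


def precision_recall(extracted: Set[str], ground_truth: Set[str]) -> Tuple[float, float]:
    """Set-based precision and recall of extracted API names against a reference set."""
    true_positives = len(extracted & ground_truth)
    # Empty sets yield 0.0 by convention here to avoid division by zero.
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Illustrative data: APIs declared in headers vs. APIs named in generated docs.
header_apis = {"buf_new", "buf_free", "buf_push", "buf_len"}
documented_apis = {"buf_new", "buf_free", "buf_push", "buf_reset"}

precision, recall = precision_recall(documented_apis, header_apis)
print(precision, recall)  # 0.75 0.75
```

Here three of the four documented names appear in the headers (precision 0.75) and three of the four header names are documented (recall 0.75). Coverage of cross-module invariants would need human rating and is not captured by this calculation.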
minor comments (2)
- [Abstract] The abstract and §4 refer to “cross-evaluation test pass rate” without a concise definition or pointer to the exact protocol used to generate and execute the cross-evaluated test suites.
- [§4] Figure captions and table headers in the experimental section would benefit from explicit column definitions (e.g., what “feature preservation” counts as a preserved feature) to improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of the work's significance and for the constructive comments. We address each major comment below and describe the revisions we will make to strengthen the claims regarding the documentation blueprint and baseline controls.
- Referee: [§3] §3 (Documentation Generation and Blueprint Usage): the central claim attributes repository-wide compilation and the 93.26% feature-preservation gain to the architecture-aware documentation serving as an effective migration blueprint, yet the manuscript reports no quantitative fidelity metric (e.g., API extraction precision/recall against ground-truth headers or human-rated coverage of cross-module invariants) and no ablation that removes the documentation component while retaining the agentic repair loops and test feedback.
  Authors: We agree that a quantitative fidelity metric for the generated documentation and an ablation isolating its contribution would provide stronger support for the central claim. In the revised manuscript we will add (i) precision/recall metrics for API and module-structure extraction against ground-truth headers on a representative subset of the eight repositories and (ii) an ablation in which the agentic loops and test-feedback harness operate without the architecture-aware documentation. These additions will be reported in an expanded §3 and §5.
  Revision: yes
- Referee: [§4.2] §4.2 (Baseline Comparisons): the reported superiority over the agentic Claude Code baseline (93.26% vs. 52.52% feature preservation) does not specify whether the baseline was given equivalent access to source-derived architectural documentation or the same iterative repair and test-execution harness; without this control, the performance delta cannot be confidently attributed to the documentation-guided mechanism.
  Authors: The agentic Claude Code baseline was run with the identical iterative repair and test-execution harness used by RustPrint but without access to the source-derived architectural documentation. We will revise §4.2 to state this configuration explicitly and to clarify that the documentation blueprint is the sole differing component. If the referee considers an additional controlled run necessary, we can perform it in the revision.
  Revision: yes
Circularity Check
No circularity: empirical evaluation on external benchmarks
The paper describes an agentic migration system (RustPrint) that generates architecture-aware documentation from C repositories and uses it to guide LLM agents for translation, compilation checking, and repair. All load-bearing claims rest on direct experimental measurements across eight real-world repositories (11K–84K LoC), including repository-wide compilation success, feature preservation (93.26%), and cross-evaluation test pass rates (95.17%), compared against independent baselines (Self-Repair, EvoC2Rust, agentic Claude Code). No equations, fitted parameters, self-citations, or uniqueness theorems are invoked to derive results; the architecture-aware documentation is treated as an input artifact whose effectiveness is measured externally rather than assumed by construction. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: large language models can effectively interpret architecture-aware documentation to coordinate repository-level code planning, implementation, and repair.
invented entities (1)
- Architecture-aware documentation as migration blueprint: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "RustPrint first converts the source repository into architecture-aware documentation and treats it as a migration blueprint capturing module structure, data flow, APIs, and design rationale."
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "Experiments on eight real-world C repositories ranging from 11K to 84K LoC"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.