LongRTL: Graph-Similarity-Guided LLM-driven Long Context RTL Optimization
Pith reviewed 2026-06-27 15:00 UTC · model grok-4.3
The pith
Graph similarity on ASTs lets LLMs optimize long RTL designs by partitioning them into subtrees, optimizing the parts with retrieval, and reassembling them while preserving overall function.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors claim that AST graph similarity to reusable design templates identifies semantically meaningful subtrees for independent optimization; multi-modal RAG then generates improved submodule code; and logic-aware Graph-RAG reassembly produces a complete design that maintains global functional equivalence, thereby scaling LLM optimization to long-context industrial RTL.
What carries the argument
The Partition Agent that decomposes RTL designs into AST subtrees guided by graph similarity to reusable design templates.
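To make "graph similarity to templates" concrete, here is a minimal sketch that scores a toy AST subtree against templates using multiset Jaccard similarity over parent-child label pairs. The abstract does not name the paper's metric, so this is only one plausible stand-in. The tuple-based AST encoding and the names `edge_bag`, `similarity`, and `best_template` are hypothetical.

```python
from collections import Counter

# Toy AST encoding (an assumption, not the paper's): (label, [children]).
def edge_bag(node):
    """Multiset of (parent_label, child_label) edges in a subtree."""
    label, children = node
    bag = Counter()
    for child in children:
        bag[(label, child[0])] += 1
        bag += edge_bag(child)
    return bag

def similarity(a, b):
    """Multiset Jaccard similarity of two subtrees' edge bags, in [0, 1]."""
    bag_a, bag_b = edge_bag(a), edge_bag(b)
    union = sum((bag_a | bag_b).values())
    if union == 0:
        # Two leaves: fall back to comparing their labels.
        return 1.0 if a[0] == b[0] else 0.0
    return sum((bag_a & bag_b).values()) / union

def best_template(subtree, templates):
    """Return (name, score) of the closest template."""
    scored = ((name, similarity(subtree, t)) for name, t in templates.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical templates and a candidate subtree.
adder = ("Assign", [("Lvalue", []), ("Plus", [("Id", []), ("Id", [])])])
mux = ("Assign", [("Lvalue", []), ("Cond", [("Id", []), ("Id", []), ("Id", [])])])
candidate = ("Assign", [("Lvalue", []), ("Plus", [("Id", []), ("Const", [])])])

name, score = best_template(candidate, {"adder": adder, "mux": mux})
print(name, score)
```

A purely structural score like this says nothing about signal widths or timing, which is one reason the premise below carries so much weight.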
If this is right
- Enables structure-aware optimization on entangled, poorly modularized long RTL code.
- Preserves global functional equivalence through the reconstruction step.
- Scales LLM use from toy examples to industrial-scale hardware codebases.
- Combines partitioning, multi-modal RAG optimization, and Graph-RAG reassembly into one workflow.
Where Pith is reading between the lines
- The same partitioning-plus-reassembly pattern could apply to long code in other structured domains such as verification scripts or embedded software.
- Automated equivalence checking during reassembly might be added as an explicit step to catch errors earlier.
- The method could be tested by measuring optimization quality on public large RTL repositories with known golden outputs.
- Integration with existing simulation or synthesis tools would give immediate feedback on whether each optimized subtree still works.
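The per-subtree feedback idea in the list above could look like the following gate, which keeps the original part whenever a check rejects the rewrite. This is a sketch under stated assumptions: `optimize` stands in for an LLM call and `equivalent` for a simulator or equivalence checker. The toy stand-ins below are string manipulations for illustration only, not anything the paper describes.

```python
from typing import Callable, Dict

def reassemble_with_gate(
    parts: Dict[str, str],
    optimize: Callable[[str], str],
    equivalent: Callable[[str, str], bool],
) -> Dict[str, str]:
    """Accept an optimized part only if it passes the check; else keep the original."""
    result = {}
    for name, original in parts.items():
        candidate = optimize(original)
        result[name] = candidate if equivalent(original, candidate) else original
    return result

def toy_optimize(code: str) -> str:
    # Drops a redundant "+ 0", but also (wrongly) narrows any 8-bit bus.
    return code.replace(" + 0", "").replace("[7:0]", "[3:0]")

def toy_equivalent(original: str, candidate: str) -> bool:
    # Crude proxy for a real check: the declared 8-bit width must be unchanged.
    return ("[7:0]" in original) == ("[7:0]" in candidate)

parts = {
    "sum": "assign s = a + b + 0;",
    "bus": "wire [7:0] d = x + 0;",
}
merged = reassemble_with_gate(parts, toy_optimize, toy_equivalent)
print(merged)
```

Here the `sum` rewrite is accepted and the `bus` rewrite is rejected because it changed the declared width, so the original text of `bus` is kept.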
Load-bearing premise
That AST graph similarity to design templates will reliably find subtrees whose independent optimization and logic-aware reassembly will keep the original design's behavior unchanged.
What would settle it
Apply the full pipeline to a known large open RTL design, then run the original and optimized versions through the same testbench and check whether every output vector matches.
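The comparison itself is simple to sketch. In the example below, plain Python functions stand in for the simulated original and optimized designs, and the 4-bit adder, `first_mismatch`, and both "optimized" variants are hypothetical. In practice the vectors would come from running both RTL versions in a simulator.

```python
from itertools import product

WIDTH = 4                     # assumed bus width for the toy example
MASK = (1 << WIDTH) - 1

def original(a, b):
    """Reference behavior: WIDTH-bit wrapping adder."""
    return (a + b) & MASK

def optimized_ok(a, b):
    """Restructured adder (XOR plus shifted carry) that keeps the wrap-around."""
    return ((a ^ b) + ((a & b) << 1)) & MASK

def optimized_bad(a, b):
    """Rewrite that silently widens the result by dropping the mask."""
    return a + b

def first_mismatch(ref, dut, stimuli):
    """Return the first (inputs, expected, got) that differs, or None."""
    for vec in stimuli:
        expected, got = ref(*vec), dut(*vec)
        if expected != got:
            return vec, expected, got
    return None

# Exhaustive stimuli are feasible only because the inputs are 4 bits wide.
stimuli = list(product(range(1 << WIDTH), repeat=2))

print(first_mismatch(original, optimized_ok, stimuli))
print(first_mismatch(original, optimized_bad, stimuli))
```

A clean run shows agreement only on the vectors exercised. For a large design the testbench is not exhaustive, so matching outputs is evidence of equivalence rather than proof.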
Original abstract
Large Language Models (LLMs) show great promise in RTL code generation and optimization. However, real-world RTL designs are typically long, entangled, and poorly modularized, posing a major challenge due to context-length limitations and lack of structure. To overcome these obstacles, we propose a scalable LLM-based RTL optimization framework guided by graph similarity. Our method introduces three collaborative agents: (1) a Partition Agent that decomposes RTL designs into semantically meaningful AST subtrees, guided by AST graph similarity to reusable design templates; (2) an Optimization Agent that generates RTL submodule code based on partitioned subtrees using multi-modal Retrieval-Augmented Generation (RAG) with both AST and RTL guidance; and (3) a Reconstruction Agent that reassembles optimized submodules based on logic-aware ordering and Graph-RAG prompting, ensuring global functional equivalence. Together, these components enable robust, structure-aware optimization of long-context RTL designs, bridging the gap between toy examples and industrial-scale hardware codebases.
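The abstract does not define "logic-aware ordering". One plausible reading is dependency order, in which the submodule that drives a signal is placed before the submodules that read it. The sketch below illustrates that reading with the standard-library `graphlib.TopologicalSorter` (Python 3.9+). The submodule names, signal sets, and `dependency_order` are hypothetical.

```python
from graphlib import TopologicalSorter

def dependency_order(submodules):
    """Order submodules so each signal's driver precedes its readers.

    `submodules` maps name -> (signals_read, signals_driven).
    """
    # Map every driven signal to the submodule that drives it.
    driver = {
        sig: name
        for name, (_, drives) in submodules.items()
        for sig in drives
    }
    # For each submodule, collect the other submodules it depends on.
    graph = {
        name: {driver[sig] for sig in reads if sig in driver and driver[sig] != name}
        for name, (reads, _) in submodules.items()
    }
    return list(TopologicalSorter(graph).static_order())

# Hypothetical three-stage datapath, listed out of order on purpose.
subs = {
    "writeback": ({"alu_out"}, {"rd_data"}),
    "alu": ({"op_a", "op_b"}, {"alu_out"}),
    "decode": ({"instr"}, {"op_a", "op_b"}),
}
order = dependency_order(subs)
print(order)
```

Feedback between submodules would make `static_order()` raise `graphlib.CycleError`, so a real flow would need some rule for breaking cycles, for example at register boundaries.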
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes LongRTL, a graph-similarity-guided LLM framework for optimizing long-context RTL designs. It describes three collaborative agents: a Partition Agent that decomposes RTL into AST subtrees via graph similarity to reusable templates, an Optimization Agent that generates optimized submodule code using multi-modal RAG, and a Reconstruction Agent that reassembles submodules via logic-aware ordering and Graph-RAG prompting while claiming to preserve global functional equivalence. The work positions this as a scalable approach bridging toy examples to industrial-scale hardware codebases.
Significance. If the equivalence-preservation and scalability claims hold, the multi-agent graph-guided decomposition could meaningfully extend LLM applicability to complex, long RTL modules that exceed context limits. The combination of AST similarity partitioning with RAG-based optimization is a plausible direction for structure-aware hardware code improvement. However, the complete absence of any experimental results, benchmarks, error metrics, or verification procedures in the manuscript leaves the practical significance unassessable.
Major comments (3)
- [Abstract] The central guarantee that the Reconstruction Agent 'ensures global functional equivalence' is asserted without any described mechanism (formal verification, equivalence checking, post-reassembly simulation, or even functional test vectors). This assumption is load-bearing for the entire framework.
- [Abstract] No experimental results, benchmarks, error metrics, comparisons to baselines, or even small-scale case studies are reported to support the claim of 'robust, structure-aware optimization' for long-context or industrial-scale RTL. The soundness of the central claim therefore cannot be evaluated from the provided text.
- [Abstract] The Partition Agent's reliance on AST graph similarity to 'reusable design templates' to produce 'semantically meaningful subtrees' whose independent optimization preserves semantics is stated without any supporting evidence, test cases, or discussion of failure modes (e.g., altered signal widths or state-machine semantics).
Minor comments (1)
- [Abstract] The abstract introduces several agent names and technical terms (Partition Agent, multi-modal RAG, Graph-RAG) without defining their precise interfaces or data flows, which would benefit from a high-level diagram or pseudocode even in an early draft.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive feedback. We address each major comment point by point below, with planned revisions where appropriate. The manuscript presents a framework proposal, and we acknowledge areas where additional clarification or discussion is warranted.
Point-by-point responses
- Referee: [Abstract] The central guarantee that the Reconstruction Agent 'ensures global functional equivalence' is asserted without any described mechanism (formal verification, equivalence checking, post-reassembly simulation, or even functional test vectors). This assumption is load-bearing for the entire framework.
  Authors: We agree that the manuscript asserts preservation of global functional equivalence via the Reconstruction Agent's logic-aware ordering and Graph-RAG prompting without specifying a verification mechanism. The design intends to maintain equivalence through structural preservation during reassembly, but no formal or empirical verification procedure is detailed. We will revise the relevant sections to include a discussion of verification strategies, such as post-reconstruction simulation or equivalence checking where feasible.
  Revision: yes
- Referee: [Abstract] No experimental results, benchmarks, error metrics, comparisons to baselines, or even small-scale case studies are reported to support the claim of 'robust, structure-aware optimization' for long-context or industrial-scale RTL. The soundness of the central claim therefore cannot be evaluated from the provided text.
  Authors: The current manuscript focuses on describing the proposed three-agent framework as a conceptual approach to overcome context limitations in RTL optimization. We acknowledge the absence of empirical results, benchmarks, or case studies, which limits direct assessment of practical performance. We will revise the abstract and introduction to more clearly position the work as a framework proposal and outline directions for future empirical evaluation.
  Revision: partial
- Referee: [Abstract] The Partition Agent's reliance on AST graph similarity to 'reusable design templates' to produce 'semantically meaningful subtrees' whose independent optimization preserves semantics is stated without any supporting evidence, test cases, or discussion of failure modes (e.g., altered signal widths or state-machine semantics).
  Authors: We recognize that the Partition Agent description relies on AST graph similarity without providing concrete evidence, test cases, or analysis of failure modes. The approach uses graph similarity metrics on ASTs to identify subtrees aligned with reusable templates, with the intent of preserving semantics through structural correspondence. We will revise to elaborate on the similarity computation and add a discussion of assumptions and potential edge cases such as changes in signal semantics or state machines.
  Revision: yes
Circularity Check
No circularity: framework proposal with asserted properties but no self-referential derivation or fitted predictions.
Full rationale
The paper describes a three-agent LLM framework for RTL optimization. The Partition Agent uses AST graph similarity to templates, the Optimization Agent uses multi-modal RAG, and the Reconstruction Agent uses logic-aware ordering and Graph-RAG to 'ensure global functional equivalence.' This is a methodological claim about component behavior rather than a derivation chain, equation, or prediction that reduces to its own inputs by construction. No equations, fitted parameters, self-citations as load-bearing premises, or uniqueness theorems appear in the abstract or described structure. The equivalence guarantee is presented as an outcome of the agent design, not derived from or equivalent to prior fitted results within the paper. The work is therefore self-contained as an engineering proposal without the circular patterns enumerated.