CPPL: A Circuit Prompt Programming Language
Pith reviewed 2026-05-20 00:41 UTC · model grok-4.3
The pith
CPPL routes LLM hardware outputs through a JSON IR and compiler checks to raise functional correctness over direct Verilog generation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that mediating LLM circuit generation through CPPL, which combines a Python frontend DSL with a JSON-based circuit IR, turns an error-prone text-generation task into a statically checkable compiler-frontend problem. The resulting designs are validated for hierarchy and port bindings, have their widths inferred automatically, and lower to CIRCT. On the RTLLM benchmark they exhibit improved functional correctness, and CIRCT optimization passes reduce post-synthesis AIG node counts.
What carries the argument
CPPL IR, a JSON-based circuit representation that encodes modules, operations, and connections in a form LLMs can produce and a compiler can validate before lowering to CIRCT.
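To make the idea of a JSON circuit IR concrete, here is a minimal sketch of a document an LLM might emit and a structural check over it. The field names (modules, ports, ops, connections) and the checker functions are assumptions made for illustration; they are not taken from the CPPL repository or its schema.

```python
import json

# Hypothetical IR document. The field names are illustrative assumptions,
# not the schema published with CPPL.
IR_TEXT = """
{
  "modules": [
    {
      "name": "Adder",
      "ports": [
        {"name": "a", "dir": "in", "width": 8},
        {"name": "b", "dir": "in", "width": 8},
        {"name": "y", "dir": "out", "width": 8}
      ],
      "ops": [
        {"result": "sum", "kind": "add", "operands": ["a", "b"]}
      ],
      "connections": [
        {"from": "sum", "to": "y"}
      ]
    }
  ]
}
"""


def check_module(module):
    """Return a list of human-readable structural errors for one module."""
    name = module["name"]
    errors = []
    inputs = {p["name"] for p in module["ports"] if p["dir"] == "in"}
    outputs = {p["name"] for p in module["ports"] if p["dir"] == "out"}

    # Simplifying assumption: ops are listed in definition-before-use order.
    defined = set(inputs)
    for op in module["ops"]:
        for operand in op["operands"]:
            if operand not in defined:
                errors.append(f"{name}: undefined operand '{operand}'")
        defined.add(op["result"])

    # Every connection must start at a known signal and end at an output port.
    driven = set()
    for conn in module["connections"]:
        src, dst = conn["from"], conn["to"]
        if src not in defined:
            errors.append(f"{name}: connection source '{src}' is undefined")
        if dst not in outputs:
            errors.append(f"{name}: '{dst}' is not an output port")
        driven.add(dst)

    # Every declared output must be driven by some connection.
    for port in sorted(outputs - driven):
        errors.append(f"{name}: output '{port}' is never driven")
    return errors


def check_ir(text):
    """Parse an IR document and collect errors from all modules."""
    doc = json.loads(text)
    errors = []
    for module in doc["modules"]:
        errors.extend(check_module(module))
    return errors


print(check_ir(IR_TEXT))
```

Because the input is ordinary JSON, malformed output fails early at json.loads, and anything that parses can be rejected with a specific message rather than a downstream synthesis error.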
If this is right
- LLM outputs become subject to automatic checks for port bindings, hierarchy consistency, and width inference before any synthesis occurs.
- Designs reach a standard compiler infrastructure that applies existing optimization passes without further manual intervention.
- Post-synthesis hardware metrics improve because CIRCT can operate on the lowered representation rather than on raw LLM text.
- The generation task for the LLM is simplified to producing JSON that matches a documented schema instead of raw RTL syntax.
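The checks listed above can be sketched in a few lines. The width rules below (equal-width operands for binary ops, a 1-bit result for comparisons, a 1-bit selector for mux) are plausible assumptions rather than the paper's actual inference rules, and infer_widths and check_instance are hypothetical helper names.

```python
class WidthError(Exception):
    """Raised when an operation's operand widths are inconsistent."""


def infer_widths(port_widths, ops):
    """Propagate widths from declared ports through a list of ops.

    Assumed rules (not the paper's): binary ops need equal-width operands
    and keep that width, comparisons yield 1 bit, and mux takes a 1-bit
    selector followed by two equal-width branches.
    """
    widths = dict(port_widths)
    for op in ops:
        result, kind = op["result"], op["kind"]
        missing = [n for n in op["operands"] if n not in widths]
        if missing:
            raise WidthError(f"{result}: unknown operands {missing}")
        args = [widths[n] for n in op["operands"]]
        if kind in ("add", "sub", "and", "or", "xor"):
            if len(args) != 2 or args[0] != args[1]:
                raise WidthError(f"{result}: '{kind}' needs two equal widths, got {args}")
            widths[result] = args[0]
        elif kind in ("eq", "lt"):
            if len(args) != 2 or args[0] != args[1]:
                raise WidthError(f"{result}: '{kind}' needs two equal widths, got {args}")
            widths[result] = 1
        elif kind == "mux":
            if len(args) != 3 or args[0] != 1 or args[1] != args[2]:
                raise WidthError(f"{result}: bad mux operand widths {args}")
            widths[result] = args[1]
        else:
            raise WidthError(f"{result}: unsupported op kind '{kind}'")
    return widths


def check_instance(callee_inputs, bindings, widths):
    """Check that an instance binds every callee input with a matching width."""
    errors = []
    for port, width in callee_inputs.items():
        if port not in bindings:
            errors.append(f"port '{port}' is unbound")
        elif widths.get(bindings[port]) != width:
            errors.append(
                f"port '{port}' expects {width} bits, got {widths.get(bindings[port])}"
            )
    for port in bindings:
        if port not in callee_inputs:
            errors.append(f"no such port '{port}'")
    return errors


ports = {"a": 8, "b": 8, "sel": 1}
ops = [
    {"result": "sum", "kind": "add", "operands": ["a", "b"]},
    {"result": "is_eq", "kind": "eq", "operands": ["a", "b"]},
    {"result": "out", "kind": "mux", "operands": ["is_eq", "sum", "a"]},
]
widths = infer_widths(ports, ops)
print(widths)
print(check_instance({"x": 8, "en": 1}, {"x": "is_eq"}, widths))
```

The point of the sketch is that the LLM never writes a width on an operation: every width is derived from the declared ports, so a mismatch surfaces as a compiler diagnostic before synthesis.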
Where Pith is reading between the lines
- The same mediation pattern could be reused for LLM generation of code in other compiler-based domains where syntax and constraint enforcement are currently brittle.
- Because the IR is JSON, incremental or partial module generation becomes easier to integrate into larger design flows.
- Fine-tuning or few-shot prompting targeted at the CPPL schema could be tested as a direct way to raise correctness further without altering the compiler.
Load-bearing premise
The CPPL IR and its compiler can capture every necessary circuit structure and constraint that an LLM might intend without leaving validation gaps or restricting the range of expressible designs.
What would settle it
A rerun of the RTLLM benchmark in which CPPL-generated designs show no gain in functional correctness over direct Verilog generation would demonstrate that the compiler-mediated path does not improve reliability.
Original abstract
Large language models (LLMs) have shown promise in register-transfer level (RTL) design automation, but direct RTL generation remains difficult to validate, optimize, and integrate with compiler-based hardware design flows. Hardware compiler infrastructures such as CIRCT provide typed intermediate representations, legality checks, and optimization passes, yet current LLMs struggle to emit raw compiler IR because of MLIR syntax, SSA discipline, dialect-specific operations, and strict width constraints. This paper presents CPPL, a compiler-mediated design framework that turns LLM-assisted hardware generation into a statically checkable frontend problem rather than an unconstrained RTL text-generation task. CPPL combines a Python frontend DSL for declaring module interfaces and hierarchy with CPPL IR, a JSON-based circuit IR designed to expose compiler-visible structure while remaining accessible to LLMs. The compiler infers operation widths from declared module ports, validates generated IR, checks hierarchy and port bindings, and deterministically lowers the result to CIRCT for synthesizable Verilog generation. On the RTLLM benchmark, CPPL improves functional correctness over direct Verilog and direct CIRCT IR generation, while CIRCT optimization reduces post-synthesis AIG node counts. These results show that a compiler-mediated interface can make LLM-assisted hardware design more reliable, analyzable, and amenable to backend optimization. CPPL is available at https://github.com/SawyDust1228/CPPL.
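The deterministic lowering step the abstract describes can be pictured as a pure function from validated IR to compiler text. The sketch below emits text shaped like CIRCT's hw and comb dialects for a module that uses only two-operand ops. The input layout and the lower_module helper are assumptions for illustration, and the emitted text has not been checked against circt-opt; CPPL's real lowering may differ.

```python
def lower_module(module, widths):
    """Emit hw/comb-style text for one module.

    Sketch only: handles two-operand comb ops (add, sub, and, or, xor)
    and assumes every output port has exactly one driver.
    """
    ports = []
    for port in module["ports"]:
        if port["dir"] == "in":
            ports.append(f"in %{port['name']} : i{port['width']}")
        else:
            ports.append(f"out {port['name']} : i{port['width']}")
    port_list = ", ".join(ports)
    lines = [f"hw.module @{module['name']}({port_list}) {{"]

    # One comb operation per IR op, typed with the inferred width.
    for op in module["ops"]:
        operands = ", ".join(f"%{name}" for name in op["operands"])
        width = widths[op["result"]]
        lines.append(f"  %{op['result']} = comb.{op['kind']} {operands} : i{width}")

    # Outputs are emitted in port declaration order, so the text is stable.
    outputs = [p for p in module["ports"] if p["dir"] == "out"]
    drivers = {c["to"]: c["from"] for c in module["connections"]}
    if outputs:
        values = ", ".join(f"%{drivers[p['name']]}" for p in outputs)
        types = ", ".join(f"i{p['width']}" for p in outputs)
        lines.append(f"  hw.output {values} : {types}")
    lines.append("}")
    return "\n".join(lines)


# Hypothetical validated module and its inferred widths.
ADDER = {
    "name": "Adder",
    "ports": [
        {"name": "a", "dir": "in", "width": 8},
        {"name": "b", "dir": "in", "width": 8},
        {"name": "y", "dir": "out", "width": 8},
    ],
    "ops": [{"result": "sum", "kind": "add", "operands": ["a", "b"]}],
    "connections": [{"from": "sum", "to": "y"}],
}
ADDER_WIDTHS = {"a": 8, "b": 8, "sum": 8}

print(lower_module(ADDER, ADDER_WIDTHS))
```

Since the function has no randomness and iterates in declaration order, the same IR always yields the same text, which is the property that lets the LLM's nondeterminism stop at the JSON boundary.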
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents CPPL, a compiler-mediated framework for LLM-assisted RTL design. It combines a Python DSL for declaring module interfaces and hierarchy with a JSON-based CPPL IR that exposes structure for static checks. The compiler performs width inference from ports, hierarchy/port validation, and deterministic lowering to CIRCT, from which synthesizable Verilog is generated. On the RTLLM benchmark the approach is reported to improve functional correctness relative to direct Verilog generation and direct CIRCT-IR generation, while CIRCT passes further reduce post-synthesis AIG node counts. The implementation is released at https://github.com/SawyDust1228/CPPL.
Significance. If the empirical claims hold, the work demonstrates that routing LLM output through a typed, compiler-visible IR can turn an unconstrained text-generation task into a statically checkable frontend problem, improving reliability and enabling downstream optimization. The public release of the code and benchmark harness constitutes a concrete strength for reproducibility.
major comments (2)
- [Evaluation] Evaluation section: the abstract states that CPPL improves functional correctness on RTLLM, yet the manuscript supplies no description of the experimental protocol (prompt templates, number of LLM calls per task, temperature, decoding strategy, definition of 'functional correctness,' or statistical measures). Without these details the central empirical claim cannot be assessed.
- [§3] §3 (CPPL IR and compiler): the claim that the JSON IR plus width-inference and port-binding checks 'can accept LLM outputs while preserving all needed structure' is load-bearing for the reliability argument, but the manuscript does not demonstrate that the deliberately simplified IR can express parameterized widths, conditional generation, or non-static bindings that appear in realistic designs outside the RTLLM suite.
minor comments (2)
- [Abstract] Abstract: the statement that 'CIRCT optimization reduces post-synthesis AIG node counts' should be accompanied by the specific passes used and the magnitude of the reduction.
- [§3] Notation: the manuscript should clarify whether the JSON schema for CPPL IR is formally specified or only illustrated by examples.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, indicating where we agree and the revisions we will make to improve clarity and completeness.
Point-by-point responses
- Referee: [Evaluation] Evaluation section: the abstract states that CPPL improves functional correctness on RTLLM, yet the manuscript supplies no description of the experimental protocol (prompt templates, number of LLM calls per task, temperature, decoding strategy, definition of 'functional correctness,' or statistical measures). Without these details the central empirical claim cannot be assessed.
  Authors: We agree that the current manuscript lacks sufficient detail on the experimental protocol, which is required to allow readers to assess and reproduce the functional correctness results. In the revised manuscript we will expand the Evaluation section with a full description of the prompt templates, the number of LLM calls per task, temperature and decoding settings, the precise definition of functional correctness used, and statistical measures including variance or confidence intervals across runs.
  Revision: yes
- Referee: [§3] §3 (CPPL IR and compiler): the claim that the JSON IR plus width-inference and port-binding checks 'can accept LLM outputs while preserving all needed structure' is load-bearing for the reliability argument, but the manuscript does not demonstrate that the deliberately simplified IR can express parameterized widths, conditional generation, or non-static bindings that appear in realistic designs outside the RTLLM suite.
  Authors: We acknowledge that the manuscript does not provide explicit demonstrations or examples of parameterized widths, conditional generation, or non-static bindings beyond the RTLLM benchmark. The CPPL IR was intentionally kept minimal to target the structure present in the evaluated tasks while remaining LLM-friendly. In the revision we will add a dedicated limitations paragraph in §3 clarifying the current scope of the IR, noting these features as areas for future extension, and qualifying the reliability claims to the RTLLM setting rather than asserting general coverage of all realistic designs.
  Revision: partial
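The protocol details requested in the first exchange (number of LLM calls per task, a definition of functional correctness, confidence intervals) could be reported along the following lines. The review does not say which metric the paper uses; the sketch assumes the unbiased pass@k estimator that is common in code-generation evaluation (Chen et al., 2021) plus a percentile bootstrap over tasks. The task names and counts are made up for illustration.

```python
import math
import random
import statistics


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate from n samples of which c passed the testbench."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # math.comb(n - c, k) is 0 when fewer than k samples failed.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-task scores."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    )
    low = means[int(alpha / 2 * n_resamples)]
    high = means[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


# Made-up (samples, passes) counts per task, for illustration only.
results = {"task_a": (10, 9), "task_b": (10, 3), "task_c": (10, 0)}
scores = [pass_at_k(n, c, k=1) for n, c in results.values()]
mean_score = statistics.fmean(scores)
ci_low, ci_high = bootstrap_ci(scores)
print(f"pass@1 = {mean_score:.3f}, 95% CI [{ci_low:.3f}, {ci_high:.3f}]")
```

Reporting n, k, the sampling temperature, and the interval alongside each headline number would let a reader judge whether the gap between CPPL and direct Verilog generation exceeds run-to-run noise.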
Circularity Check
No circularity; empirical evaluation on external benchmark
Full rationale
The paper introduces CPPL as a Python DSL and JSON IR frontend that compiles to CIRCT for Verilog generation. Its central claims are empirical improvements in functional correctness on the public RTLLM benchmark and AIG node count reductions via CIRCT passes. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text. The compiler's width inference, validation, and lowering steps are described as deterministic and external to the LLM outputs, with no reduction of results to self-defined inputs. The approach relies on independent infrastructure (CIRCT) and benchmark data rather than any self-referential derivation.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: LLMs can reliably generate structured JSON following the CPPL IR schema when prompted appropriately.
- standard math: CIRCT provides correct optimization and Verilog lowering passes for the generated IR.
invented entities (1)
- CPPL IR (JSON-based circuit representation): no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: CPPL IR uses a JSON schema for modules/ports/ops, width inference rules (T-BIN, T-MUX, T-INST, etc.), and deterministic lowering to CIRCT for Verilog.
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Passage: Structural semantics preservation theorem for module hierarchy and port bindings.
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.