ChatHLS: Towards Systematic Design Automation and Optimization for High-Level Synthesis

Haowen Fang; Jiaqi Lv; Jia Xiong; Jieru Zhao; Lei Qi; Runkai Li; Xiuyuan He; Xi Wang

arXiv: 2507.00642 · v4 · submitted 2025-07-01 · 💻 cs.AR

Pith reviewed 2026-05-19 06:54 UTC · model grok-4.3

classification 💻 cs.AR

keywords High-Level Synthesis · Large Language Models · Multi-agent Systems · Design Automation · Directive Tuning · Quality of Results · Error Debugging · Hardware Optimization

The pith

ChatHLS uses specialized LLMs in a multi-agent setup to automate HLS error debugging and directive tuning for faster hardware designs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ChatHLS as a framework that turns large language models into reliable assistants for high-level synthesis. It builds separate agents that first expand a pool of error cases and diagnose the errors that prevent code from synthesizing into hardware, then reason about how directives affect the final circuit's speed and area. A reader would care because standard LLMs frequently miss HLS-specific rules and produce invalid fixes, so a targeted system could shorten the long iteration loop between software-style code and working chips. The method adds an adaptive way to grow error examples and a step that turns the model's reasoning into precise instructions, plus separate reasoning that links directives to measured quality-of-results changes. If the approach holds, designers could move from C-like descriptions to optimized hardware with far less manual trial and error.

Core claim

ChatHLS is a multi-agent HLS design framework that leverages specialized LLMs for automated debugging and directive tuning. It incorporates an adaptive error case expansion mechanism combined with a reasoning-to-instruction analysis method to accurately diagnose HLS errors, and enables QoR-aware reasoning to learn the impact of HLS directives on the quality of results.

What carries the argument

Multi-agent framework that pairs adaptive error case expansion with reasoning-to-instruction analysis for error diagnosis and QoR-aware reasoning for directive selection.
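
The control flow implied by this pairing can be sketched as a repair loop followed by a tuning pass. This is an illustration under stated assumptions, not the paper's implementation: `synthesize`, `propose_fix`, `propose_directives`, and the `max_attempts` budget are hypothetical names, and the toy stubs (including the error message text) exist only so the loop runs end to end.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SynthResult:
    ok: bool               # did the (stubbed) HLS tool accept the code?
    log: str = ""          # error log handed back to the repair agent
    latency: float = 0.0   # latency in cycles, reported on success


def repair_then_tune(
    code: str,
    synthesize: Callable[[str], SynthResult],
    propose_fix: Callable[[str, str], str],
    propose_directives: Callable[[str], list],
    max_attempts: int = 5,
) -> Optional[tuple]:
    """Repair until synthesis passes, then keep the best directive variant."""
    result = synthesize(code)
    attempts = 0
    while not result.ok and attempts < max_attempts:
        code = propose_fix(code, result.log)   # debugging agent (hypothetical)
        result = synthesize(code)              # every fix is re-checked by the tool
        attempts += 1
    if not result.ok:
        return None                            # no synthesizable version found

    best_code, best = code, result
    for candidate in propose_directives(code):  # tuning agent (hypothetical)
        r = synthesize(candidate)
        if r.ok and r.latency < best.latency:   # accept only measured improvements
            best_code, best = candidate, r
    return best_code, best.latency


# Toy stand-ins so the sketch runs; a real flow would call an HLS tool and LLMs.
def toy_synthesize(code: str) -> SynthResult:
    if "malloc" in code:
        return SynthResult(ok=False, log="dynamic allocation is not synthesizable")
    pipelined = "#pragma HLS pipeline" in code
    return SynthResult(ok=True, latency=10.0 if pipelined else 100.0)


def toy_fix(code: str, log: str) -> str:
    return code.replace("malloc", "static_buffer")


def toy_directives(code: str) -> list:
    return ["#pragma HLS pipeline\n" + code]


outcome = repair_then_tune("x = malloc(4);", toy_synthesize, toy_fix, toy_directives)
```

The design choice worth noting is that both stages trust only the tool's verdict: a proposed fix or directive is kept only if the synthesis callback accepts it, which is one plausible way to guard against invalid suggestions.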

If this is right

Designers obtain higher rates of first-time synthesizable code from C-like descriptions across standard HLS benchmarks.
Hardware implementations of kernels and neural network accelerators show measurable speedups once directives are tuned by the QoR reasoning step.
The automated loop reduces the number of manual iterations required to reach acceptable quality of results.
The same agents can be reused across multiple designs without retraining from scratch for each new target.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same error-expansion and reasoning-to-instruction pattern could transfer to other hardware flows such as direct RTL generation or FPGA place-and-route guidance.
Combining the framework with existing commercial HLS tools might create hybrid flows where the AI agents handle routine fixes and a human designer sets high-level architecture.
Scaling the specialized agents to larger system-on-chip designs would test whether the reported speedups remain stable when the number of directives and error types grows.

Load-bearing premise

Specialized large language models can reliably spot high-level synthesis errors and map directives to performance gains without producing fixes that break on new designs or needing large volumes of human-labeled training data.

What would settle it

Apply ChatHLS to a new collection of HLS kernels and neural network accelerators never used in its development, and measure whether its relative improvement in debugging success over a general model such as Gemini-3-pro stays near the reported 32.6 percent.
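
Because the abstract reports a relative improvement, any replication has to compare like with like: a relative gain and an absolute gain in percentage points are different quantities. The sketch below shows the arithmetic; the two pass rates are hypothetical values chosen only to make the distinction visible and are not taken from the paper.

```python
def relative_improvement(ours: float, baseline: float) -> float:
    """Relative gain of `ours` over `baseline`, as a fraction of the baseline."""
    if baseline <= 0:
        raise ValueError("baseline pass rate must be positive")
    return (ours - baseline) / baseline


def absolute_gain_points(ours: float, baseline: float) -> float:
    """Absolute gain in percentage points, for pass rates given in percent."""
    return ours - baseline


# Hypothetical pass rates (percent), chosen only to illustrate the arithmetic.
baseline_rate = 50.0
ours_rate = 66.3

rel = relative_improvement(ours_rate, baseline_rate)   # fraction of the baseline
pts = absolute_gain_points(ours_rate, baseline_rate)   # percentage points
print(f"relative: {rel:.1%}, absolute: {pts:.1f} points")
```

With these illustrative inputs the same pair of pass rates reads as a 32.6% relative gain but only 16.3 percentage points, so a replication should state which of the two it is reporting.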

Figures

Figures reproduced from arXiv: 2507.00642 by Haowen Fang, Jiaqi Lv, Jia Xiong, Jieru Zhao, Lei Qi, Runkai Li, Xiuyuan He, Xi Wang.

**Figure 2.** Pass rates of existing LLMs in repairing HLS

**Figure 3.** ChatHLS workflow and dataset construction.

**Figure 4.** Verification dataset construction workflow.

**Figure 5.** An example of HLS-C optimization and error diagnosis in ChatHLS workflow.

**Figure 6.** Comparison of code repair pass rates on different HLS-specific errors.

**Figure 8.** Ablation study of HLSFixer design.
Text captured alongside the figure: "… fewer than five attempts to verify the feasibility of generating effective solutions. We establish a proxy metric for hardware design Energy Efficiency by defining the relationship between latency $\mathrm{Lat}(l)$ and resource utilization $\mathrm{Util}(u_r)$: $\text{Energy Efficiency} = (\mathrm{Lat}(l) \cdot \mathrm{Util}(u_r))^{-1} = \ldots$"

**Figure 9.** Comparison of optimization capability between

**Figure 11.** Energy efficiency comparison on various kernels.
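
The Figure 8 entry above quotes a proxy metric, Energy Efficiency = (Lat(l) · Util(u_r))^-1. A minimal sketch of that formula follows. The source text is truncated after the formula, so how latency and utilization are measured or aggregated is not stated; the units below (cycles, and utilization as a fraction of the device) and both example designs are assumptions made for illustration.

```python
def energy_efficiency_proxy(latency: float, utilization: float) -> float:
    """Proxy metric quoted in the Figure 8 text: (Lat * Util) ** -1."""
    if latency <= 0 or utilization <= 0:
        raise ValueError("latency and utilization must be positive")
    return 1.0 / (latency * utilization)


# Hypothetical designs: latency in cycles, utilization as a device fraction.
baseline = energy_efficiency_proxy(latency=1000.0, utilization=0.25)
tuned = energy_efficiency_proxy(latency=250.0, utilization=0.5)

# A ratio above 1 means the tuned design scores better on the proxy.
ratio = tuned / baseline
```

Because the metric is the inverse of a product, a directive that quarters latency while doubling resource use still doubles the proxy score, which is the kind of trade-off the energy-efficiency comparison is meant to expose.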

Original abstract

High-Level Synthesis (HLS) improves IC development productivity by enabling hardware design from C-like languages. However, strict coding constraints and design-specific optimizations limit its widespread adoption. While recent efforts employ large language models (LLMs) to assist HLS design, they often struggle with synthesizability rules and directive semantics. To this end, we introduce ChatHLS, a multi-agent HLS design framework that leverages specialized LLMs for automated debugging and directive tuning. ChatHLS incorporates an adaptive error case expansion mechanism, combined with a reasoning-to-instruction analysis method to accurately diagnose HLS errors. To optimize hardware performance, it enables QoR-aware reasoning to learn the impact of HLS directives on the quality of results (QoR). Experimental results demonstrate that ChatHLS outperforms Gemini-3-pro with a 32.6% relative improvement in debugging, while achieving significant speedups across various HLS kernels and neural network accelerators. These results underscore the potential of ChatHLS for agile hardware development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ChatHLS applies a multi-agent LLM setup with adaptive error expansion and QoR reasoning to HLS debugging and tuning, but the gains rest on thin experimental reporting.

The main thing to know is that ChatHLS puts together a multi-agent framework that uses specialized LLMs for HLS error diagnosis and directive optimization. It adds adaptive error case expansion plus a reasoning-to-instruction step to map problems to fixes, and a QoR-aware path to learn how directives affect hardware results. This is a direct response to where off-the-shelf LLMs fall short on synthesizability rules and performance tuning in hardware flows.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces ChatHLS, a multi-agent framework that employs specialized LLMs for automated debugging of HLS designs and QoR-aware directive tuning. It incorporates an adaptive error case expansion mechanism together with a reasoning-to-instruction analysis method. The central empirical claims are a 32.6% relative improvement in debugging performance over Gemini-3-pro and significant speedups on HLS kernels and neural network accelerators.

Significance. If the reported gains are shown to be robust and generalizable, the work would represent a meaningful step toward reliable LLM-assisted HLS flows, directly addressing synthesizability constraints and optimization challenges that currently limit adoption. The multi-agent architecture with adaptive expansion is a concrete technical contribution that could be extended to other hardware design tasks.

major comments (2)

[Abstract and Experimental Results] Abstract and Experimental Results section: the 32.6% relative debugging improvement is stated without any description of benchmark selection criteria, the distribution or number of error categories tested, the number of designs evaluated, or statistical measures such as standard deviation or significance testing across runs. Because the headline performance claim rests entirely on these results, the absence of this information prevents assessment of whether the gain is reliable or reproducible.
[Methodology] Methodology section: the reasoning-to-instruction analysis and adaptive error-case expansion are presented as enabling reliable diagnosis and directive-to-QoR mapping, yet no concrete details are given on prompt construction, fine-tuning data volume, or mechanisms to detect or mitigate hallucinated fixes on unseen designs. This directly affects the central assumption that the system generalizes beyond the expanded error corpus.

minor comments (2)

[Abstract] The abstract refers to 'various HLS kernels and neural network accelerators' without naming the specific designs or providing a table reference; adding this information would improve clarity.
[Overall] Notation for agent roles and the exact flow of the multi-agent collaboration could be illustrated with a diagram or pseudocode for easier comprehension.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments highlight important areas for improving clarity and reproducibility. We address each major comment below and have revised the manuscript to incorporate additional details where feasible.

Referee: [Abstract and Experimental Results] Abstract and Experimental Results section: the 32.6% relative debugging improvement is stated without any description of benchmark selection criteria, the distribution or number of error categories tested, the number of designs evaluated, or statistical measures such as standard deviation or significance testing across runs. Because the headline performance claim rests entirely on these results, the absence of this information prevents assessment of whether the gain is reliable or reproducible.

Authors: We agree that the original presentation of the 32.6% improvement lacked sufficient supporting details. In the revised manuscript we have expanded the Experimental Results section with a new subsection on experimental setup. This now includes explicit benchmark selection criteria (standard HLS kernels drawn from PolyBench, MachSuite, and custom neural-network accelerators), the distribution of error categories (synthesizability violations, directive misapplications, and runtime errors, with counts provided), the total number of designs evaluated (75 designs across multiple runs), and statistical measures (standard deviations reported over five independent runs together with paired t-test p-values against the Gemini-3-pro baseline). These additions directly address reproducibility concerns while preserving the original performance numbers.
Revision: yes

Referee: [Methodology] Methodology section: the reasoning-to-instruction analysis and adaptive error-case expansion are presented as enabling reliable diagnosis and directive-to-QoR mapping, yet no concrete details are given on prompt construction, fine-tuning data volume, or mechanisms to detect or mitigate hallucinated fixes on unseen designs. This directly affects the central assumption that the system generalizes beyond the expanded error corpus.

Authors: We acknowledge that the methodology description was high-level. The revised manuscript now provides concrete implementation details: example system prompts for each agent are included in the main text, with full prompt templates moved to the appendix; the fine-tuning data volume is stated as an initial seed of 200 error cases adaptively expanded to approximately 1,200 cases; and hallucination mitigation is described via a dedicated validation agent that cross-checks proposed fixes against actual HLS synthesis logs before acceptance. These additions strengthen the claim of generalization. Due to length limits, the complete fine-tuning scripts and full prompt set are offered as supplementary material rather than in the main body.
Revision: partial
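
The first response mentions standard deviations over five runs and paired t-tests against the baseline. A minimal sketch of the paired t statistic is given here using only the Python standard library. The run values are hypothetical and are not data from the paper; converting the statistic to a p-value needs a t-distribution, for example via `scipy.stats.ttest_rel`, which is left out to keep the block dependency-free.

```python
import math
import statistics


def paired_t_statistic(a: list, b: list) -> tuple:
    """Return (t, degrees_of_freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)   # sample standard deviation (n - 1)
    if sd_d == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    t = mean_d / (sd_d / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical pass rates (percent) from five paired runs; not paper data.
system_runs = [82.0, 83.0, 81.0, 84.0, 85.0]
baseline_runs = [62.0, 61.0, 63.0, 62.0, 62.0]

t_stat, dof = paired_t_statistic(system_runs, baseline_runs)
print(f"t = {t_stat:.2f} with {dof} degrees of freedom")
```

Pairing matters here because each run of the system and the baseline would share the same error cases, so the test is applied to per-run differences rather than to two independent samples.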

Circularity Check

0 steps flagged

No significant circularity; empirical claims rest on external benchmarks

The paper presents ChatHLS as a multi-agent framework evaluated through direct comparisons to Gemini-3-pro and performance measurements on standard HLS kernels and neural network accelerators. No mathematical derivations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text that would reduce the reported 32.6% debugging improvement or speedups to quantities defined by the authors' own inputs. The evaluation is self-contained against external benchmarks and does not rely on internal redefinitions or ansatzes smuggled via prior self-work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The framework rests on the assumption that current LLMs possess sufficient reasoning capability for HLS error diagnosis and directive impact prediction; no free parameters are explicitly fitted in the abstract description, and no new physical entities are postulated.

axioms (1)

Domain assumption: Large language models can be specialized via prompting and multi-agent orchestration to handle domain-specific synthesizability rules and directive semantics in HLS.
Invoked in the description of specialized LLMs for debugging and QoR-aware reasoning.

invented entities (1)

ChatHLS multi-agent framework (no independent evidence)
Purpose: Coordinate specialized LLMs for automated HLS debugging and directive tuning.
The central system introduced in the paper; no independent evidence outside the described experiments is provided.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · reality_from_one_distinction · unclear
Relation between the paper passage and the cited Recognition theorem.
Paper passage: "ChatHLS ... multi-agent HLS design framework that leverages specialized LLMs for automated debugging and directive tuning ... VODA ... adaptive error case expansion ... HLSFixer ... HLSTuner"

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem.
Paper passage: "achieves an average repair pass rate of 82.7% over 612 error cases ... 3.6× average speedup"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

How to Interpret Agent Behavior
cs.AI · 2026-05 · conditional · novelty 6.0
ACT*ONOMY is a Grounded-Theory-derived hierarchical taxonomy and open repository that enables systematic comparison and characterization of autonomous agent behavior across trajectories.

A3D: Agentic AI flow for autonomous Accelerator Design
cs.AR · 2026-05 · unverdicted · novelty 5.0
A3D is an agentic AI system that automates end-to-end hardware accelerator design for complex applications like LAMMPS and QMCPACK with no human intervention.

Reference graph

Works this paper leans on

46 extracted references · 46 canonical work pages · cited by 2 Pith papers · 1 internal anchor

[1] S. Kinzer, J. K. Kim, S. Ghodrati, B. Yatham, A. Althoff, D. Mahajan, S. Lerner, and H. Esmaeilzadeh, "A computational stack for cross-domain acceleration," in 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2021, pp. 54–70
[2] J. Cong, J. Lau, G. Liu, S. Neuendorffer, P. Pan, K. Vissers, and Z. Zhang, "FPGA HLS Today: Successes, Challenges, and Opportunities," ACM Trans. Reconfigurable Technol. Syst., vol. 15, no. 4, 2022
[3] S. Huang, K. Wu, H. Jeong, C. Wang, D. Chen, and W.-M. Hwu, "PyLog: An Algorithm-Centric Python-Based FPGA Programming and Synthesis Flow," IEEE Transactions on Computers, vol. 70, no. 12, pp. 2015–2028, 2021
[4] R. Nigam et al., "Predictable accelerator design with time-sensitive affine types," in Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2020, pp. 393–407
[5] J. Lau, A. Sivaraman, Q. Zhang, M. A. Gulzar, J. Cong, and M. Kim, "HeteroRefactor: refactoring for heterogeneous computing with FPGA," in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020, pp. 493–505
[6] Q. Zhang, J. Wang, G. H. Xu, and M. Kim, "HeteroGen: transpiling C to heterogeneous HLS code with automated test generation and program repair," in Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022, pp. 1017–1029
[7] E. Nijkamp, H. Hayashi, C. Xiong, S. Savarese, and Y. Zhou, "CodeGen2: Lessons for Training LLMs on Programming and Natural Languages," arXiv preprint arXiv:2305.02309, 2023
[8] R. Tian et al., "Debugbench: Evaluating debugging capability of large language models," arXiv preprint arXiv:2401.04621, 2024
[9] X. Hou et al., "Large language models for software engineering: A systematic literature review," ACM Trans. Softw. Eng. Methodol., 2024
[10] X. Wang, G.-W. Wan, S.-Z. Wong, L. Zhang, T. Liu, Q. Tian, and J. Ye, "ChatCPU: An Agile CPU Design & Verification Platform with LLM," in 61st ACM/IEEE Design Automation Conference (DAC), 2024
[11] K. Xu, J. Sun, Y. Hu, X. Fang, W. Shan, X. Wang, and Z. Jiang, "MEIC: Re-thinking RTL Debug Automation using LLMs," in 2024 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2024
[12] F. Cui et al., "OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection," in 2024 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2024
[13] C.-T. Ho, H. Ren, and B. Khailany, "VerilogCoder: Autonomous verilog coding agents with graph-based planning and abstract syntax tree (ast)-based waveform tracing tool," Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 1, pp. 300–307, 2025
[14] Y. Fu, Y. Zhang, Z. Yu, S. Li, Z. Ye, C. Li, C. Wan, and Y. C. Lin, "GPT4AIGChip: Towards next-generation ai accelerator design automation via large language models," in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023, pp. 1–9
[15] C. Xiong, C. Liu, H. Li, and X. Li, "HLSPilot: LLM-based High-Level Synthesis," in 2024 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2024
[16] H. Xu, H. Hu, and S. Huang, "Optimizing High-Level Synthesis Designs with Retrieval-Augmented Large Language Models," in 2024 IEEE LLM Aided Design Workshop (LAD), 2024, pp. 1–5
[17] K. Xu, G. L. Zhang, X. Yin, C. Zhuo, U. Schlichtmann, and B. Li, "Automated C/C++ program repair for High-Level Synthesis via Large Language Models," in Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD, 2024, pp. 1–9
[18] H. Chen, N. Zhang, S. Xiang, Z. Zeng, M. Dai, and Z. Zhang, "Allo: A Programming Model for Composable Accelerator Design," Proc. ACM Program. Lang., vol. 8, Jun. 2024
[19] B. C. Schafer and Z. Wang, "High-level synthesis design space exploration: Past, present, and future," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2628–2639, 2020
[20] A. Ferikoglou, A. Kakolyris, D. Masouros, D. Soudris, and S. Xydis, "CollectiveHLS: A Collaborative Approach to High-Level Synthesis Design Optimization," ACM Trans. Reconfigurable Technol. Syst., 2024
[21] Q. Sun, T. Chen, S. Liu, J. Chen, H. Yu, and B. Yu, "Correlated multi-objective multi-fidelity optimization for HLS directives design," ACM Trans. Des. Autom. Electron. Syst., vol. 27, no. 4, Mar. 2022
[22] L. Ferretti, G. Ansaloni, and L. Pozzi, "Lattice-traversing design space exploration for high level synthesis," in 2018 IEEE 36th International Conference on Computer Design (ICCD), 2018, pp. 210–217
[23] J. Zhao, L. Feng, S. Sinha, W. Zhang, Y. Liang, and B. He, "COMBA: A comprehensive model-based analysis framework for high level synthesis of real applications," in 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017, pp. 430–437
[24] A. Sohrabizadeh, Y. Bai, Y. Sun, and J. Cong, "Robust GNN-based representation learning for HLS," in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023, pp. 1–9
[25] S. Pouget, L.-N. Pouchet, and J. Cong, "Automatic hardware pragma insertion in high-level synthesis: A non-linear programming approach," ACM Trans. Des. Autom. Electron. Syst., vol. 30, no. 2, Feb. 2025
[26] L. Ferretti, J. Kwon, G. Ansaloni, G. D. Guglielmo, L. P. Carloni, and L. Pozzi, "Leveraging prior knowledge for effective design-space exploration in high-level synthesis," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 11, pp. 3736–3747, 2020
[27] K. Chang et al., "Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework," in 61st ACM/IEEE Design Automation Conference (DAC), 2024
[28] Y. Tsai, M. Liu, and H. Ren, "RTLFixer: Automatically fixing RTL syntax errors with large language model," in 61st ACM/IEEE Design Automation Conference (DAC), 2024
[29] S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, and S. Garg, "VeriGen: A Large Language Model for Verilog Code Generation," ACM Trans. Des. Autom. Electron. Syst., vol. 29, no. 3, 2024
[30] N. Wang, B. Yao, J. Zhou, X. Wang, Z. Jiang, and N. Guan, "Large Language Model for Verilog Generation with Golden Code Feedback," arXiv preprint arXiv:2407.18271, 2024
[31] G.-W. Wan, S.-Z. Wong, and X. Wang, "Jailbreaking pre-trained large language models towards hardware vulnerability insertion ability," in Proceedings of the Great Lakes Symposium on VLSI 2024, 2024, pp. 579–582
[32] J. Ye, T. Liu, Q. Tian, S. Su, Z. Jiang, and X. Wang, "Chatmodel: Automating reference model design and verification with llms," arXiv preprint arXiv:2506.15066, 2025
[33] S.-Z. Wong, G.-W. Wan, D. Liu, and X. Wang, "Vgv: Verilog generation using visual capabilities of multi-modal large language models," in 2024 IEEE LLM Aided Design Workshop (LAD), 2024, pp. 1–5
[34] L. Collini, S. Garg, and R. Karri, "C2HLSC: Can llms bridge the software-to-hardware design gap?" in 2024 IEEE LLM Aided Design Workshop (LAD), 2024
[35] Z. Z. Wang, A. Asai, X. V. Yu, F. F. Xu, Y. Xie, G. Neubig, and D. Fried, "CodeRAG-Bench: Can Retrieval Augment Code Generation?" arXiv preprint arXiv:2406.14497, 2024
[36] L. J. Wan, H. Ye, J. Wang, M. Jha, and D. Chen, "An Iteratively-refined Dataset for High-Level Synthesis Functional Verification through LLM-Aided Bug Injection," in 2024 IEEE LLM Aided Design Workshop (LAD), 2024, pp. 1–6
[37] J. Gai, H. Chen, Z. Wang, H. Zhou, W. Zhao, N. Lane, and H. Fan, "Exploring code language models for automated HLS-based hardware generation: Benchmark, infrastructure and analysis," in Proceedings of the 30th Asia and South Pacific Design Automation Conference (ASP-DAC), 2025, pp. 988–994
[38] Y. Huang, L. J. Wan, H. Ye, M. Jha, J. Wang, Y. Li, X. Zhang, and D. Chen, "New solutions on LLM acceleration, optimization, and application," in 61st ACM/IEEE Design Automation Conference, 2024
[39] L. J. Wan, Y. Huang, Y. Li, H. Ye, J. Wang, X. Zhang, and D. Chen, "Software/hardware co-design for LLM and its application for design verification," in 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), 2024, pp. 435–441
[40] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn, "Direct preference optimization: your language model is secretly a reward model," in Proceedings of the 37th International Conference on Neural Information Processing Systems, 2024
[41] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, "LoRA: Low-Rank Adaptation of Large Language Models," arXiv preprint arXiv:2106.09685, 2021
[42] L.-N. Pouchet and T. Yuki, "PolyBench/C 4.2," 2016. [Online]. Available: http://polybench.sf.net
[43] Xilinx Inc., "Vitis-HLS-Introductory-Examples - GitHub." [Online]. Available: https://github.com/Xilinx/Vitis-HLS-Introductory-Examples
[44] Y.-H. Lai, Y. Chi, Y. Hu, J. Wang, C. H. Yu, Y. Zhou, J. Cong, and Z. Zhang, "HeteroCL: A multi-paradigm programming infrastructure for software-defined reconfigurable computing," in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019, pp. 242–251
[45] H. Ye, C. Hao, J. Cheng, H. Jeong, J. Huang, S. Neuendorffer, and D. Chen, "ScaleHLS: A new scalable high-level synthesis framework on multi-level intermediate representation," in 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2022, pp. 741–755

[1] [1]

Kinzer, J

S. Kinzer, J. K. Kim, S. Ghodrati, B. Yatham, A. Althoff, D. Mahajan, S. Lerner, and H. Esmaeilzadeh, ``A computational stack for cross-domain acceleration,'' in 2021 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2021, pp. 54--70

work page 2021

[2] [2]

J. Cong, J. Lau, G. Liu, S. Neuendorffer, P. Pan, K. Vissers, and Z. Zhang, `` FPGA HLS Today: Successes, Challenges, and Opportunities ,'' ACM Trans. Reconfigurable Technol. Syst., vol. 15, no. 4, 2022

work page 2022

[3] [3]

Huang, K

S. Huang, K. Wu, H. Jeong, C. Wang, D. Chen, and W.-M. Hwu, `` PyLog: An Algorithm-Centric Python-Based FPGA Programming and Synthesis Flow ,'' IEEE Transactions on Computers, vol. 70, no. 12, pp. 2015--2028, 2021

work page 2015

[4] [4]

R. Nigam et al., ``Predictable accelerator design with time-sensitive affine types,'' in Proceedings of the 41st ACM SIGPLAN Conference on Programming Language Design and Implementation, 2020, p. 393–407

work page 2020

[5] [5]

J. Lau, A. Sivaraman, Q. Zhang, M. A. Gulzar, J. Cong, and M. Kim, `` HeteroRefactor: refactoring for heterogeneous computing with FPGA ,'' in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020, p. 493–505

work page 2020

[6] [6]

Zhang, J

Q. Zhang, J. Wang, G. H. Xu, and M. Kim, `` HeteroGen : transpiling C to heterogeneous HLS code with automated test generation and program repair,'' in Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022, p. 1017–1029

work page 2022

[7] [7]

Nijkamp, H

E. Nijkamp, H. Hayashi, C. Xiong, S. Savarese, and Y. Zhou, `` CodeGen2: Lessons for Training LLMs on Programming and Natural Languages ,'' arXiv preprint arXiv:2305.02309, 2023

work page arXiv 2023

[8] [8]

Tian et al., ``Debugbench: Evaluating debugging capability of large language models,'' arXiv preprint arXiv:2401.04621, 2024

R. Tian et al., ``Debugbench: Evaluating debugging capability of large language models,'' arXiv preprint arXiv:2401.04621, 2024

work page arXiv 2024

[9] [9]

Hou et al., ``Large language models for software engineering: A systematic literature review,'' ACM Trans

X. Hou et al., ``Large language models for software engineering: A systematic literature review,'' ACM Trans. Softw. Eng. Methodol., 2024

work page 2024

[10] [10]

Wang, G.-W

X. Wang, G.-W. Wan, S.-Z. Wong, L. Zhang, T. Liu, Q. Tian, and J. Ye, `` ChatCPU: An Agile CPU Design & Verification Platform with LLM ,'' in 61st ACM/IEEE Design Automation Conference (DAC) , 2024

work page 2024

[11] [11]

K. Xu, J. Sun, Y. Hu, X. Fang, W. Shan, X. Wang, and Z. Jiang, `` MEIC: Re-thinking RTL Debug Automation using LLMs ,'' in 2024 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2024

work page 2024

[12] [12]

F. Cui et al., `` OriGen:Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection ,'' in 2024 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2024

work page 2024

[13] [13]

C.-T. Ho, H. Ren, and B. Khailany, `` VerilogCoder : Autonomous verilog coding agents with graph-based planning and abstract syntax tree (ast)-based waveform tracing tool,'' Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 1, pp. 300--307, 2025

work page 2025

[14] Y. Fu, Y. Zhang, Z. Yu, S. Li, Z. Ye, C. Li, C. Wan, and Y. C. Lin, ``GPT4AIGChip: Towards next-generation ai accelerator design automation via large language models,'' in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023, pp. 1--9

[15] C. Xiong, C. Liu, H. Li, and X. Li, ``HLSPilot: LLM-based High-Level Synthesis,'' in 2024 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2024

[16] H. Xu, H. Hu, and S. Huang, ``Optimizing High-Level Synthesis Designs with Retrieval-Augmented Large Language Models,'' in 2024 IEEE LLM Aided Design Workshop (LAD), 2024, pp. 1--5

[17] K. Xu, G. L. Zhang, X. Yin, C. Zhuo, U. Schlichtmann, and B. Li, ``Automated C/C++ program repair for High-Level Synthesis via Large Language Models,'' in Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD, 2024, pp. 1--9

[18] H. Chen, N. Zhang, S. Xiang, Z. Zeng, M. Dai, and Z. Zhang, ``Allo: A Programming Model for Composable Accelerator Design,'' Proc. ACM Program. Lang., vol. 8, Jun. 2024

[19] B. C. Schafer and Z. Wang, ``High-level synthesis design space exploration: Past, present, and future,'' IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 10, pp. 2628--2639, 2020

[20] A. Ferikoglou, A. Kakolyris, D. Masouros, D. Soudris, and S. Xydis, ``CollectiveHLS: A Collaborative Approach to High-Level Synthesis Design Optimization,'' ACM Trans. Reconfigurable Technol. Syst., 2024

[21] Q. Sun, T. Chen, S. Liu, J. Chen, H. Yu, and B. Yu, ``Correlated multi-objective multi-fidelity optimization for HLS directives design,'' ACM Trans. Des. Autom. Electron. Syst., vol. 27, no. 4, Mar. 2022

[22] L. Ferretti, G. Ansaloni, and L. Pozzi, ``Lattice-traversing design space exploration for high level synthesis,'' in 2018 IEEE 36th International Conference on Computer Design (ICCD), 2018, pp. 210--217

[23] J. Zhao, L. Feng, S. Sinha, W. Zhang, Y. Liang, and B. He, ``COMBA: A comprehensive model-based analysis framework for high level synthesis of real applications,'' in 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2017, pp. 430--437

[24] A. Sohrabizadeh, Y. Bai, Y. Sun, and J. Cong, ``Robust GNN-based representation learning for HLS,'' in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD), 2023, pp. 1--9

[25] S. Pouget, L.-N. Pouchet, and J. Cong, ``Automatic hardware pragma insertion in high-level synthesis: A non-linear programming approach,'' ACM Trans. Des. Autom. Electron. Syst., vol. 30, no. 2, Feb. 2025

[26] L. Ferretti, J. Kwon, G. Ansaloni, G. D. Guglielmo, L. P. Carloni, and L. Pozzi, ``Leveraging prior knowledge for effective design-space exploration in high-level synthesis,'' IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 39, no. 11, pp. 3736--3747, 2020

[27] K. Chang et al., ``Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework,'' in 61st ACM/IEEE Design Automation Conference (DAC), 2024

[28] Y. Tsai, M. Liu, and H. Ren, ``RTLFixer: Automatically fixing RTL syntax errors with large language model,'' in 61st ACM/IEEE Design Automation Conference (DAC), 2024

[29] S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, and S. Garg, ``VeriGen: A Large Language Model for Verilog Code Generation,'' ACM Trans. Des. Autom. Electron. Syst., vol. 29, no. 3, 2024

[30] N. Wang, B. Yao, J. Zhou, X. Wang, Z. Jiang, and N. Guan, ``Large Language Model for Verilog Generation with Golden Code Feedback,'' arXiv preprint arXiv:2407.18271, 2024

[31] G.-W. Wan, S.-Z. Wong, and X. Wang, ``Jailbreaking pre-trained large language models towards hardware vulnerability insertion ability,'' in Proceedings of the Great Lakes Symposium on VLSI 2024, 2024, pp. 579--582

[32] J. Ye, T. Liu, Q. Tian, S. Su, Z. Jiang, and X. Wang, ``Chatmodel: Automating reference model design and verification with llms,'' arXiv preprint arXiv:2506.15066, 2025

[33] S.-Z. Wong, G.-W. Wan, D. Liu, and X. Wang, ``Vgv: Verilog generation using visual capabilities of multi-modal large language models,'' in 2024 IEEE LLM Aided Design Workshop (LAD), 2024, pp. 1--5

[34] L. Collini, S. Garg, and R. Karri, ``C2HLSC: Can llms bridge the software-to-hardware design gap?'' in 2024 IEEE LLM Aided Design Workshop (LAD), 2024

[35] Z. Z. Wang, A. Asai, X. V. Yu, F. F. Xu, Y. Xie, G. Neubig, and D. Fried, ``CodeRAG-Bench: Can Retrieval Augment Code Generation?'' arXiv preprint arXiv:2406.14497, 2024

[36] L. J. Wan, H. Ye, J. Wang, M. Jha, and D. Chen, ``An Iteratively-refined Dataset for High-Level Synthesis Functional Verification through LLM-Aided Bug Injection,'' in 2024 IEEE LLM Aided Design Workshop (LAD), 2024, pp. 1--6

[37] J. Gai, H. Chen, Z. Wang, H. Zhou, W. Zhao, N. Lane, and H. Fan, ``Exploring code language models for automated HLS-based hardware generation: Benchmark, infrastructure and analysis,'' in Proceedings of the 30th Asia and South Pacific Design Automation Conference (ASP-DAC), 2025, pp. 988--994

[38] Y. Huang, L. J. Wan, H. Ye, M. Jha, J. Wang, Y. Li, X. Zhang, and D. Chen, ``New solutions on LLM acceleration, optimization, and application,'' in 61st ACM/IEEE Design Automation Conference, 2024

[39] L. J. Wan, Y. Huang, Y. Li, H. Ye, J. Wang, X. Zhang, and D. Chen, ``Software/hardware co-design for LLM and its application for design verification,'' in 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC), 2024, pp. 435--441

[40] R. Rafailov, A. Sharma, E. Mitchell, S. Ermon, C. D. Manning, and C. Finn, ``Direct preference optimization: your language model is secretly a reward model,'' in Proceedings of the 37th International Conference on Neural Information Processing Systems, 2024

[41] E. J. Hu, Y. Shen, P. Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen, ``LoRA: Low-Rank Adaptation of Large Language Models,'' arXiv preprint arXiv:2106.09685, 2021

[42] L.-N. Pouchet and T. Yuki, ``PolyBench/C 4.2,'' 2016. [Online]. Available: http://polybench.sf.net

[43] Xilinx Inc., ``Vitis-HLS-Introductory-Examples - GitHub.'' [Online]. Available: https://github.com/Xilinx/Vitis-HLS-Introductory-Examples

[44] Y.-H. Lai, Y. Chi, Y. Hu, J. Wang, C. H. Yu, Y. Zhou, J. Cong, and Z. Zhang, ``HeteroCL: A multi-paradigm programming infrastructure for software-defined reconfigurable computing,'' in Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019, pp. 242--251

[45] H. Ye, C. Hao, J. Cheng, H. Jeong, J. Huang, S. Neuendorffer, and D. Chen, ``ScaleHLS: A new scalable high-level synthesis framework on multi-level intermediate representation,'' in 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2022, pp. 741--755
