AgRefactor: Self-Evolving Agentic Workflow for HLS Compatibility and Performance

Jason Cong; Yang Zou; Yizhou Sun; Zijian Ding

arxiv: 2606.30949 · v1 · pith:7X66SRR5new · submitted 2026-06-29 · 💻 cs.AI · cs.AR

Pith reviewed 2026-07-01 01:27 UTC · model grok-4.3

classification 💻 cs.AI cs.AR

keywords: High-Level Synthesis, HLS refactoring, multi-agent workflow, self-evolving memory, LLM agents, code transformation, hardware acceleration, pragma optimization

The pith

AgRefactor uses a self-evolving memory in a multi-agent LLM workflow to refactor software into HLS-compatible code and achieve speedups over prior tools.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents AgRefactor as an LLM-based multi-agent workflow designed to convert ordinary software into code that High-Level Synthesis tools can turn into hardware. The system adds a memory component that stores and reuses knowledge from earlier refactoring jobs and mixes LLM rewrites with existing automated tools to limit expense. Tests cover eleven real programs that are five to ten times longer than those examined in earlier studies. When the agents also optimize for performance, the resulting hardware achieves a 6.51x geometric-mean speedup over the state-of-the-art pragma-tuning tool, and a 1.20x speedup over optimized open-source designs while using less than 20% extra resources.

Core claim

AgRefactor is an LLM-based multi-agent workflow for refactoring software into HLS-compatible programs that incorporates a self-evolving memory system accumulating factual and strategic knowledge across tasks and integrates automated refactoring tools to balance LLM-driven rewrites with efficient transformations. On 9 out of 11 challenging real-world benchmarks 5-10x longer than prior cases, it outperforms or matches state-of-the-art automated refactoring tools and a strong LLM baseline; further agentic performance optimization yields a 6.51x geometric mean speedup over the SoTA pragma tuning tool and a 1.20x speedup over optimized open-source designs with less than 20% extra resources.
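The headline 6.51x figure is a geometric mean across benchmarks. As a minimal sketch of how such an aggregate is computed, the Python below averages per-benchmark speedups in log space. The function name and the sample numbers are assumptions made for illustration; they are not the paper's data or its evaluation scripts.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    if not values:
        raise ValueError("need at least one value")
    if any(v <= 0 for v in values):
        raise ValueError("speedups must be positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-benchmark speedups (baseline latency / new latency).
# Illustrative numbers only; they are NOT results from the paper.
speedups = [2.0, 8.0, 4.0]
gm = geometric_mean(speedups)              # cube root of 2 * 8 * 4 = 64
arithmetic = sum(speedups) / len(speedups)  # larger, pulled up by the 8.0
```

The geometric mean is the usual choice for ratios because a single large speedup moves it less than it would move an arithmetic mean.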

What carries the argument

The self-evolving memory system that accumulates and retrieves factual and strategic knowledge across tasks to improve robustness and efficiency on unseen programs.

If this is right

Refactoring becomes practical for programs five to ten times longer than those handled by earlier automated or LLM methods.
Computational cost drops because agents can delegate many edits to existing automated refactoring tools instead of calling the LLM for every change.
Hardware designs produced from the refactored code run substantially faster after the agents apply performance-directed transformations.
The entire process remains fully automatic and produces designs that use less than 20 percent extra resources compared with optimized open-source designs.
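The cost argument in the list above rests on trying cheap tool-based transformations before paying for an LLM rewrite. The sketch below shows one way such a dispatch could look. Everything in it is hypothetical: `static_alloc_pass`, `llm_rewrite_stub`, and `refactor` are illustrative names, not AgRefactor's actual interfaces, and the string replacement only stands in for a real source-to-source tool.

```python
from typing import Callable, List, Optional, Tuple

# A tool pass returns rewritten source, or None when it does not apply.
ToolPass = Callable[[str], Optional[str]]


def static_alloc_pass(source: str) -> Optional[str]:
    """Hypothetical tool pass: handles only sources that call malloc."""
    if "malloc(" not in source:
        return None
    return source.replace("malloc(", "pool_alloc(")


def llm_rewrite_stub(source: str) -> str:
    """Stand-in for an LLM call; a real agent would query a model here."""
    return "/* rewritten by LLM */ " + source


def refactor(
    source: str,
    tool_passes: List[Tuple[str, ToolPass]],
    llm_rewrite: Callable[[str], str],
) -> Tuple[str, str]:
    """Try each tool pass in order; fall back to the LLM if none applies."""
    for name, tool in tool_passes:
        rewritten = tool(source)
        if rewritten is not None:
            return rewritten, name
    return llm_rewrite(source), "llm"


passes = [("static_alloc", static_alloc_pass)]
by_tool = refactor("int *p = malloc(64);", passes, llm_rewrite_stub)
by_llm = refactor("std::map<int, int> m;", passes, llm_rewrite_stub)
```

Here the `malloc` case is resolved by the tool pass, while the `std::map` case falls through to the LLM stub, so the model is only called for constructs the tools cannot handle.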

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The memory mechanism could be reused in other iterative code-transformation domains such as porting between different high-level languages or frameworks.
Combining the workflow with static analysis or formal verification passes might reduce the chance that refactored code contains subtle functional errors.
The reported speedups suggest the same agent structure could be applied to generate accelerator code for domains beyond HLS, such as GPU kernel tuning.

Load-bearing premise

The self-evolving memory system accumulates and retrieves factual and strategic knowledge across tasks, improving robustness and efficiency on unseen programs.

What would settle it

A new collection of long real-world programs on which AgRefactor with the memory system fails to match or exceed the performance of the same workflow without memory or the prior SoTA tools.

Figures

Figures reproduced from arXiv: 2606.30949 by Jason Cong, Yang Zou, Yizhou Sun, Zijian Ding.

**Figure 2.** Figure 2 summarizes the results. Although both HeteroRefactor and HLSRewriter perform well on their own benchmarks, they failed on many more of the larger benchmarks, such as libjpeg-turbo. Beyond the known limitation of handling external libraries such as STL containers (e.g., “std::set” and “std::map”), HeteroRefactor also shows limitations when dealing with common pointer operations. In …

**Figure 3.** Overview of AGREFACTOR. Given a C/C++ program and a user-specified top-level function, the framework automatically produces a synthesizable HLS implementation by progressing through identifying, planning, refactoring, and fixing stages, while continuously updating a long-term memory bank. Once refactored, the synthesizable code and its testbench are forwarded to a performance optimization agent. Situated i…

**Figure 4.** Messages passed between agents. (Adjacent body text from Section B, "Long-term Memory for HLS Refactoring", was captured with the caption:) To enable self-evolution, AGREFACTOR accumulates successful and unsuccessful trials as queryable knowledge. Each memory entry is defined as a tuple (p_i, I_i, s_i, c_i), where p_i is the initial program, I_i lists the identified incompatible constructs, s_i is the refactoring strategy, and c_i is the generalized critique generated by the Analy…

**Figure 5.** Performance improvement over optimized open-source designs under …
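The Figure 4 caption describes each memory entry as a tuple of initial program, incompatible constructs, refactoring strategy, and critique. A minimal sketch of that record and of a lookup over it follows. The field names, the Jaccard-overlap ranking, and the sample entries are assumptions made for illustration; the retrieval mechanism the paper actually uses is not described in the material above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class MemoryEntry:
    program: str                  # p_i: the initial program
    incompatible: FrozenSet[str]  # I_i: identified incompatible constructs
    strategy: str                 # s_i: the refactoring strategy
    critique: str                 # c_i: the generalized critique


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Overlap between two construct sets, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve(bank: List[MemoryEntry], constructs: Iterable[str], k: int = 1):
    """Return the k entries whose construct sets best match the query.

    Jaccard ranking is an assumed stand-in, not the paper's retriever.
    """
    query = frozenset(constructs)
    ranked = sorted(
        bank, key=lambda e: jaccard(e.incompatible, query), reverse=True
    )
    return ranked[:k]


# Hypothetical memory bank; strategies and critiques are made up.
bank = [
    MemoryEntry("prog_a.cpp", frozenset({"std::map", "std::set"}),
                "replace containers with fixed-size arrays",
                "bound the container size first"),
    MemoryEntry("prog_b.c", frozenset({"malloc"}),
                "use a static memory pool",
                "check the maximum allocation size"),
]
hits = retrieve(bank, {"std::map", "recursion"})
```

A query for a program that uses `std::map` and recursion ranks the container entry first, which is the kind of reuse across tasks that the caption calls queryable knowledge.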

Original abstract

High-Level Synthesis (HLS) provides a fast path from concepts to silicon, but converting real-world software into synthesizable HLS code remains challenging due to restrictive language support and the gap between software and hardware programming practices. Existing automated and LLM-based refactoring approaches partially address this problem, yet they often lack flexibility, struggle to scale, and incur high computational costs. We introduce AgRefactor, an LLM-based multi-agent workflow for refactoring software into HLS-compatible programs. AgRefactor incorporates a self-evolving memory system that accumulates and retrieves factual and strategic knowledge across tasks, improving robustness and efficiency on unseen programs. To reduce cost and enhance scalability, it integrates automated refactoring tools, enabling agents to balance LLM-driven rewrites with efficient tool-based transformations. On 9 out of 11 challenging real-world benchmarks, which are 5-10x longer than the most complex cases studied in prior work, AgRefactor outperforms or matches the state-of-the-art automated refactoring tool and a strong LLM-based baseline built on the same framework backbone. Further agentic performance optimization yields a 6.51x geometric mean speedup over the SoTA pragma tuning tool and a 1.20x speedup over optimized open-source designs with less than 20% extra resources. AgRefactor is fully-automated and open-sourced.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

AgRefactor combines multi-agent LLMs with self-evolving memory and hybrid tool use for HLS refactoring and claims solid gains on longer benchmarks, but the abstract alone leaves the experimental support thin.

The main point is a multi-agent workflow that adds self-evolving memory to refactor software into HLS form. It reports handling 9 of 11 real benchmarks that are 5-10x longer than prior work, matching or beating both an automated tool and an LLM baseline on the same backbone, then delivering a 6.51x geometric mean speedup after further optimization.

The paper does a couple of things cleanly. Mixing LLM rewrites with existing automated refactoring tools is a practical way to control cost and scale. The self-evolving memory that accumulates factual and strategic knowledge across tasks is the clearest addition relative to the baselines mentioned. Using longer, real-world cases instead of short synthetic ones is also a step in the right direction.

The soft spots sit in the evidence. The abstract states the speedups and success rates but supplies no methods details, result tables, variance numbers, or failure analysis. Without those it is hard to tell how much the memory component actually drives the gains versus other design choices, or whether the agent runs are stable. The full manuscript would need to show the memory retrieval mechanics and the exact benchmark characteristics before the numbers can be taken at face value.

This work is aimed at teams doing custom accelerator design or embedded hardware who want to cut down on manual HLS porting. Readers working on agentic code transformation or HLS tooling would find the workflow and the benchmark set useful to examine.

I would send it to peer review. The problem is real, the hybrid setup is worth checking in detail, and the length of the benchmarks gives it enough substance to justify referee time even if revisions are needed on the experimental reporting.

Referee Report

2 major / 0 minor

Summary. The paper introduces AgRefactor, an LLM-based multi-agent workflow for refactoring software into HLS-compatible programs. It incorporates a self-evolving memory system for accumulating and retrieving knowledge across tasks, integrates automated refactoring tools to balance LLM rewrites with tool-based transformations, and claims to outperform or match SoTA automated and LLM baselines on 9 out of 11 challenging real-world benchmarks (5-10x longer than prior work), while delivering a 6.51x geometric mean speedup over SoTA pragma tuning and 1.20x over optimized open-source designs with <20% extra resources. The system is fully automated and open-sourced.

Significance. If the performance claims hold under rigorous evaluation, the work would advance automated HLS refactoring by demonstrating scalable agentic workflows that combine LLM flexibility with traditional tools and self-evolving memory, addressing key limitations in cost, scalability, and robustness for long programs. The open-sourcing supports reproducibility and community follow-up.

major comments (2)

Abstract: the central performance claims (outperformance on 9/11 benchmarks, 6.51x and 1.20x speedups) are presented without any accompanying experimental setup, benchmark descriptions, baseline implementations, statistical details, or resource measurements, rendering the claims unverifiable from the provided material.
Abstract: the description of the self-evolving memory system and its integration with automated tools is stated at a high level only, with no details on memory structure, retrieval mechanisms, or how they contribute to the reported robustness gains on unseen programs.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their review and address the two major comments on the abstract below. The full manuscript provides the requested details in dedicated sections.

Referee: Abstract: the central performance claims (outperformance on 9/11 benchmarks, 6.51x and 1.20x speedups) are presented without any accompanying experimental setup, benchmark descriptions, baseline implementations, statistical details, or resource measurements, rendering the claims unverifiable from the provided material.

Authors: Abstracts are intentionally concise per standard academic practice and do not contain full experimental details. The complete manuscript supplies all requested information: benchmark descriptions and lengths (5-10x longer than prior work) appear in Section 4.1, baseline implementations in Section 4.2, experimental setup and statistical details in Sections 4.3 and 5, and resource measurements in Section 5.2 plus associated tables. The claims are therefore verifiable from the full paper. We see no need to expand the abstract, as doing so would violate length conventions without adding value. revision: no

Referee: Abstract: the description of the self-evolving memory system and its integration with automated tools is stated at a high level only, with no details on memory structure, retrieval mechanisms, or how they contribute to the reported robustness gains on unseen programs.

Authors: The abstract summarizes contributions at the conventional high level. Full technical details are provided in the manuscript body: memory structure is described in Section 3.2, retrieval mechanisms in Section 3.3, integration with automated tools in Section 3.4, and contributions to robustness on unseen programs (including ablation studies) in Section 5.3. These sections explain knowledge accumulation across tasks and resulting efficiency/robustness gains. No abstract revision is warranted. revision: no

Circularity Check

0 steps flagged

No significant circularity detected

The paper presents an empirical engineering system (AgRefactor multi-agent workflow with self-evolving memory) and reports benchmark performance results. No equations, derivations, predictions from first principles, or mathematical claims appear in the provided abstract or described full text. All load-bearing elements are implementation descriptions and experimental outcomes on real-world HLS benchmarks, with no self-definitional loops, fitted inputs renamed as predictions, or self-citation chains reducing a central result to its own inputs. The self-evolving memory is presented as an architectural feature whose benefits are measured empirically rather than derived tautologically. This is a standard non-circular empirical paper.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, background axioms, or newly postulated entities.

Reference graph

Works this paper leans on

33 extracted references · 10 canonical work pages · 5 internal anchors

[1]

High-level synthesis for FPGAs: From prototyping to deployment,

J. Cong, B. Liu, S. Neuendorffer, J. Noguera, K. Vissers, and Z. Zhang, “High-level synthesis for FPGAs: From prototyping to deployment,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 4, pp. 473–491, 2011

2011
[2]

FPGA HLS today: successes, challenges, and opportunities,

J. Cong, J. Lau, G. Liu, S. Neuendorffer, P. Pan, K. Vissers, and Z. Zhang, “FPGA HLS today: successes, challenges, and opportunities,” ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol. 15, no. 4, pp. 1–42, 2022

2022
[3]

ScaleHLS: A new scalable high-level synthesis framework on multi-level intermediate representation,

H. Ye, C. Hao, J. Cheng, H. Jeong, J. Huang, S. Neuendorffer, and D. Chen, “ScaleHLS: A new scalable high-level synthesis framework on multi-level intermediate representation,” in 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 2022, pp. 741–755

2022
[4]

Stream-HLS: Towards automatic dataflow acceleration,

S. Basalama and J. Cong, “Stream-HLS: Towards automatic dataflow acceleration,” arXiv e-prints, pp. arXiv–2501, 2025

2025
[5]

HIDA: A hierarchical dataflow compiler for high-level synthesis,

H. Ye, H. Jun, and D. Chen, “HIDA: A hierarchical dataflow compiler for high-level synthesis,” in Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 2024, pp. 215–230

2024
[6]

A unified framework for automated code transformation and pragma insertion,

S. Pouget, L.-N. Pouchet, and J. Cong, “A unified framework for automated code transformation and pragma insertion,” in Proceedings of the 2025 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2025, pp. 187–198

2025
[7]

Allo: A programming model for composable accelerator design,

H. Chen, N. Zhang, S. Xiang, Z. Zeng, M. Dai, and Z. Zhang, “Allo: A programming model for composable accelerator design,” Proceedings of the ACM on Programming Languages, vol. 8, no. PLDI, pp. 593–620, 2024

2024
[8]

Heterorefactor: refactoring for heterogeneous computing with fpga,

J. Lau, A. Sivaraman, Q. Zhang, M. A. Gulzar, J. Cong, and M. Kim, “Heterorefactor: refactoring for heterogeneous computing with fpga,” in Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020, pp. 493–505

2020
[9]

Heterogen: transpiling c to heterogeneous hls code with automated test generation and program repair,

Q. Zhang, J. Wang, G. H. Xu, and M. Kim, “Heterogen: transpiling c to heterogeneous hls code with automated test generation and program repair,” in Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2022, pp. 1017–1029

2022
[10]

C2HLSC: Can LLMs bridge the software-to-hardware design gap?

L. Collini, S. Garg, and R. Karri, “C2HLSC: Can LLMs bridge the software-to-hardware design gap?” arXiv preprint arXiv:2406.09233, 2024

work page arXiv 2024
[11]

Hardware acceleration of complex HEP algorithms with HLS and FPGAs: Methodology and preliminary implementation,

A. Wojenski, H. Zbroszczyk, M. Kruszewski, P. Szymanski, E. Wawrzyn, D. Wielanek, W. Zabolotny, D. Pawlowska, and T. Gniazdowski, “Hardware acceleration of complex HEP algorithms with HLS and FPGAs: Methodology and preliminary implementation,” Computer Physics Communications, vol. 295, p. 108997, 2024

2024
[12]

Hlsrewriter: Efficient refactoring and optimization of c/c++ code with llms for high-level synthesis,

K. Xu, G. L. Zhang, X. Yin, C. Zhuo, U. Schlichtmann, and B. Li, “Hlsrewriter: Efficient refactoring and optimization of c/c++ code with llms for high-level synthesis,” ACM Transactions on Design Automation of Electronic Systems, 2025

2025
[13]

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W.-t. Yih, T. Rocktäschel, S. Riedel, and D. Kiela, “Retrieval-augmented generation for knowledge-intensive NLP tasks,” arXiv preprint arXiv:2005.11401, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2020
[14]

MemGPT: Towards LLMs as Operating Systems

C. Packer, S. Wooders, K. Lin, V. Fang, S. G. Patil, I. Stoica, and J. E. Gonzalez, “Memgpt: Towards llms as operating systems,” arXiv preprint arXiv:2310.08560, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[15]

Hipporag: Neurobiologically inspired long-term memory for large language models,

B. Jiménez Gutiérrez, Y. Shu, Y. Gu, M. Yasunaga, and Y. Su, “Hipporag: Neurobiologically inspired long-term memory for large language models,” arXiv preprint arXiv:2405.14831, 2024

work page arXiv 2024
[16]

Agent Workflow Memory

Z. Z. Wang, J. Mao, D. Fried, and G. Neubig, “Agent workflow memory,” arXiv preprint arXiv:2409.07429, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[17]

MemoryLLM: Towards self-updatable large language models,

Y. Wang, Y. Gao, X. Chen, H. Jiang, S. Li, J. Yang, Q. Yin, Z. Li, X. Li, B. Yin, J. Shang, and J. McAuley, “MemoryLLM: Towards self-updatable large language models,” arXiv preprint arXiv:2402.04624, 2024

work page arXiv 2024
[18]

A-MEM: Agentic Memory for LLM Agents

W. Xu, Z. Liang, K. Mei, H. Gao, J. Tan, and Y. Zhang, “A-MEM: Agentic memory for LLM agents,” arXiv preprint arXiv:2502.12110, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[19]

AlphaEvolve: A coding agent for scientific and algorithmic discovery

A. Novikov, N. Vũ, M. Eisenberger, E. Dupont, P.-S. Huang, A. Z. Wagner, S. Shirobokov, B. Kozlovskii, F. J. Ruiz, A. Mehrabian et al., “Alphaevolve: A coding agent for scientific and algorithmic discovery,” arXiv preprint arXiv:2506.13131, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[20]

AutoDSE: Enabling software programmers to design efficient FPGA accelerators,

A. Sohrabizadeh, C. H. Yu, M. Gao, and J. Cong, “AutoDSE: Enabling software programmers to design efficient FPGA accelerators,” ACM Trans. Des. Autom. Electron. Syst., vol. 27, no. 4, Feb. 2022. [Online]. Available: https://doi.org/10.1145/3494534

work page doi:10.1145/3494534 2022
[21]

Leetcode,

LeetCode, “Leetcode,” 2025, accessed: 2025-10-02. [Online]. Available: https://leetcode.com/

2025
[22]

libsodium,

F. Denis and the libsodium contributors, “libsodium,” 2025, accessed: 2025-10-02. [Online]. Available: https://doc.libsodium.org/

2025
[23]

minimap2,

H. Li, “minimap2,” 2025, accessed: 2025-10-02. [Online]. Available: https://github.com/lh3/minimap2

2025
[24]

libjpeg-turbo,

The libjpeg-turbo Project, “libjpeg-turbo,” 2025, accessed: 2025-10-02. [Online]. Available: https://libjpeg-turbo.org/

2025
[25]

Av1 reference codec (libaom),

Alliance for Open Media, “Av1 reference codec (libaom),” 2025, accessed: 2025-10-02. [Online]. Available: https://aomedia.googlesource.com/aom/

2025
[26]

LightningSimV2: Faster and scalable simulation for high-level synthesis via graph compilation and optimization,

R. Sarkar, R. Paul, and C. C. Hao, “LightningSimV2: Faster and scalable simulation for high-level synthesis via graph compilation and optimization,” in 2024 IEEE 32nd Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM). IEEE, 2024, pp. 104–114

2024
[27]

Holistic Optimization Framework for FPGA Accelerators,

S. Pouget, M. Lo, L.-N. Pouchet, and J. Cong, “Holistic Optimization Framework for FPGA Accelerators,” ACM Transactions on Design Automation of Electronic Systems, vol. 31, no. 1, pp. 1–37, 2025

2025
[28]

Ag2: Open-source agentos for ai agents,

C. Wang, Q. Wu, and the AG2 Community, “Ag2: Open-source agentos for ai agents,” 2024, available at https://docs.ag2.ai/. [Online]. Available: https://github.com/ag2ai/ag2

2024
[29]

Sentencetransformers,

UKPLab, “Sentencetransformers,” 2025, accessed: 2025-10-02. [Online]. Available: https://github.com/UKPLab/sentence-transformers

2025
[30]

Soda: Stencil with optimized dataflow architecture,

Y. Chi, J. Cong, P. Wei, and P. Zhou, “Soda: Stencil with optimized dataflow architecture,” in Proceedings of the International Conference on Computer-Aided Design, 2018, pp. 1–8

2018
[31]

HLSFactory: A framework empowering high-level synthesis datasets for machine learning and beyond,

S. Abi-Karam, R. Sarkar, A. Seigler, S. Lowe, Z. Wei, H. Chen, N. Rao, L. John, A. Arora, and C. Hao, “HLSFactory: A framework empowering high-level synthesis datasets for machine learning and beyond,” in Proceedings of the 2024 ACM/IEEE International Symposium on Machine Learning for CAD, 2024, pp. 1–9

2024
[32]

Vitis Libraries,

AMD/Xilinx, “Vitis Libraries,” https://github.com/Xilinx/Vitis_Libraries, 2024

2024
[33]

GPT-5 model family,

OpenAI, “GPT-5 model family,” OpenAI, 2025, accessed: 2025-2026. [Online]. Available: https://openai.com/gpt-5/

APPENDIX

AGREFACTOR is publicly available at https://github.com/Williamzou0123/AgRefactor

This appendix presents supplementary studies that complement the main evaluation: the effect of self-accumulating memory across training epochs, the contrib...

work page arXiv 2025
