
arxiv: 2506.08332 · v3 · submitted 2025-06-10 · 💻 cs.AI

ORFS-agent: Tool-Using Agents for Chip Design Optimization

Pith reviewed 2026-05-19 11:19 UTC · model grok-4.3

classification 💻 cs.AI
keywords LLM agents · chip design optimization · parameter tuning · RTL-to-layout flows · Bayesian optimization · multi-objective optimization · tool-using agents · open-source EDA

The pith

An LLM agent tunes parameters of open-source chip design flows more efficiently than Bayesian optimization, improving wirelength and effective clock period by up to 1.0% and 1.3% while using 40% fewer iterations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents ORFS-agent, an LLM-based iterative agent that automates parameter tuning for register-transfer level to physical layout workflows in integrated circuit design. It shows that frontier thinking-model backends can explore high-dimensional configuration spaces adaptively, delivering up to 1.0% better geometric-mean normalized wirelength, 1.3% better effective clock period, and 2.7% better co-optimization objectives than OR-AutoTuner across six benchmarks on ASAP7 and SKY130HD while using 40% fewer iterations. The agent remains modular and model-agnostic, requiring no fine-tuning, and supports natural-language objectives for trading off metrics. This approach matters because even small parameter changes in these flows can produce large effects on final power, performance, and area, and current optimization methods consume substantial compute resources.

Core claim

ORFS-agent is an LLM-based iterative optimization agent that automates parameter tuning in open-source hardware design flows by adaptively exploring configurations; across six benchmarks it improves geometric-mean normalized wirelength by up to 1.0%, effective clock period by 1.3%, and co-optimization objectives by 2.7% over OR-AutoTuner while using 40% fewer iterations, with open-weight models staying within 0.24% of the best closed model and optional retrieval tools accelerating early convergence without changing final endpoints.
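The headline figures in the core claim are geometric means of per-benchmark metrics normalized against a baseline. A minimal sketch of that aggregation follows, assuming each metric is normalized by dividing by the baseline's value on the same benchmark; this page does not state the paper's exact normalization, and the function names and sample values here are illustrative only.

```python
import math

def geomean_normalized(agent, baseline):
    """Geometric mean of agent/baseline ratios over the baseline's benchmarks."""
    ratios = [agent[name] / baseline[name] for name in baseline]
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

def percent_improvement(agent, baseline):
    """Positive when the agent's lower-is-better metric beats the baseline."""
    return (1.0 - geomean_normalized(agent, baseline)) * 100.0

# Made-up wirelength values for illustration only (not results from the paper).
baseline_wl = {"bench_a": 100.0, "bench_b": 200.0}
agent_wl = {"bench_a": 99.0, "bench_b": 198.0}
print(round(percent_improvement(agent_wl, baseline_wl), 2))
```

Using a geometric rather than arithmetic mean keeps one large benchmark from dominating the aggregate, since every benchmark contributes through its ratio to the baseline.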

What carries the argument

The ORFS-agent, an LLM-driven iterative loop that selects parameter configurations, invokes evaluation tools in the RTL-to-layout flow, and reasons over results to refine the next configuration without any model fine-tuning.
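The loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' implementation: `run_flow` is a hypothetical stand-in for a full flow run, `propose` replaces the LLM's reasoning step with a random perturbation of the best configuration so far, and the parameter names `core_util` and `clock_period` are chosen only to echo the example parameters mentioned in the Figure 2 excerpt.

```python
import random

def run_flow(config):
    """Hypothetical stand-in for an RTL-to-layout run; lower score is better.
    A real setup would invoke the design flow and parse its reported metrics."""
    return (config["core_util"] - 55) ** 2 / 100 + (config["clock_period"] - 1.2) ** 2 * 10

def propose(history, rng, batch_size):
    """Stand-in for the LLM step: perturb the best configuration seen so far."""
    best_config, _ = min(history, key=lambda item: item[1])
    return [
        {
            "core_util": min(90, max(20, best_config["core_util"] + rng.randint(-5, 5))),
            "clock_period": max(0.5, best_config["clock_period"] + rng.uniform(-0.1, 0.1)),
        }
        for _ in range(batch_size)
    ]

def optimize(initial, iterations=10, batch_size=4, seed=0):
    """Iterate: propose a batch, evaluate it, keep all results as history."""
    rng = random.Random(seed)
    history = [(initial, run_flow(initial))]
    for _ in range(iterations):
        for config in propose(history, rng, batch_size):
            history.append((config, run_flow(config)))
    return min(history, key=lambda item: item[1])

best_config, best_score = optimize({"core_util": 40, "clock_period": 1.5})
print(best_config, best_score)
```

The point of the structure is that the proposer sees the full evaluation history on every iteration; swapping the random perturbation for a model call would leave the rest of the loop unchanged, which is the modularity the paper emphasizes.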

If this is right

  • Thinking-model backends improve wirelength, effective clock period, and co-optimization objectives by up to 7.5%, 3.1%, and 4.0%, respectively, over the earlier Sonnet 3.5 backend.
  • Open-weight models enable private deployment while remaining competitive.
  • Natural-language instructions allow explicit trade-offs between metrics in multi-objective settings.
  • Retrieval tools speed early progress but do not raise final performance ceilings.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same agent structure could be tested on other high-dimensional engineering optimization tasks such as compiler flag tuning or process-parameter selection in manufacturing.
  • Checkpoint-aligned reasoning traces could support human review or automated auditing of optimization decisions in regulated design environments.
  • If the pattern holds across model generations, the need for bespoke surrogate models in many EDA flows may decrease.

Load-bearing premise

LLM reasoning combined with tool use will consistently yield better or more resource-efficient parameter choices than Bayesian optimization in the high-dimensional configuration space of open-source RTL-to-layout flows.

What would settle it

A controlled comparison on additional unseen benchmarks or with different LLMs showing no reduction in iterations or no improvement in wirelength, clock period, or co-optimization metrics relative to OR-AutoTuner would falsify the central performance claim.

Figures

Figures reproduced from arXiv: 2506.08332 by Amur Ghose, Andrew B. Kahng, Sayak Kundu, Zhiang Wang.

Figure 1: ORFS-agent integrated with OpenROAD-flow-scripts (ORFS); metrics gathered via METRICS2.1 [22].

Figure 2: Overall ORFS-agent flowchart. (Adjacent source text: "… specify the maximum runtime for each batch. Below, we show an example scenario with K = 25, TIMEOUT = 30 minutes and parameters (Core Util, Clock Period) that can be integrated, along with possible function-calling tools. Consider the j-th iteration. Assume that we target the routed wirelength and take the CTS wirelength as a surrogate (recalling that the CTS stage is earlier …")

Figure 3: Optimization trajectory of ASAP7-IBEX (co-optimization, 600 …

Figure 4: Correlation of metrics for ASAP7-IBEX: wirelength ( …

Figure 5: Pareto frontier of ECP vs. WL under 4% tolerance, relative to the baseline. The blue line represents the case with 4 parameters and no tool use, and the red line represents the case with 12 parameters and tool use. (Adjacent source text: "Appendix D, Ablations for Data Leakage, Confidence and Robustness. In this appendix, we analyze statistical significance in Table XV, and study prompt-level ablations in Tables XVI and XVII. …")
Original abstract

Machine learning has been widely used to optimize complex engineering workflows across numerous domains. In integrated circuit design, modern flows (e.g., register-transfer level to physical layout) involve extensive configuration via thousands of parameters, and small changes can have large downstream impacts on design performance, power, and area. Recent advances in Large Language Models (LLMs) offer new opportunities for learning and reasoning within such high-dimensional optimization tasks. In this work, we introduce ORFS-agent, an LLM-based iterative optimization agent that automates parameter tuning in an open-source hardware design flow. ORFS-agent adaptively explores parameter configurations, demonstrating improvements over standard Bayesian optimization approaches in terms of resource efficiency and final design metrics. Across six benchmarks on ASAP7 and SKY130HD, thinking-model backends (Sonnet 4.6 [69] and Kimi K2.5 [28]) improve the geometric-mean normalized wirelength, effective clock period, and co-optimization objectives by up to 1.0%, 1.3%, and 2.7% over OR-AutoTuner while using 40% fewer iterations; the open-weight Kimi K2.5 remains within 0.24% of Sonnet 4.6, enabling private deployment. Relative to the earlier Sonnet 3.5 backend, these thinking models improve the same objectives by up to 7.5%, 3.1%, and 4.0%. Optional retrieval tools accelerate early convergence but do not improve final endpoints. By following natural language objectives to trade off certain metrics for others, ORFS-agent demonstrates a flexible and interpretable framework for multi-objective and constrained optimization. Crucially, ORFS-agent is modular and model-agnostic, and can be plugged into any frontier LLM without any further fine-tuning. We also report checkpoint-aligned trajectories and reasoning summaries that document the agent's decision process.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces ORFS-agent, an LLM-based iterative optimization agent for automating parameter tuning in open-source RTL-to-physical-layout flows. It claims that tool-using agents powered by advanced thinking-model LLMs (Sonnet 4.6 and Kimi K2.5) outperform Bayesian optimization baselines (OR-AutoTuner) across six benchmarks on ASAP7 and SKY130HD, delivering geometric-mean improvements of up to 1.0% in normalized wirelength, 1.3% in effective clock period, and 2.7% in co-optimization objectives while using 40% fewer iterations. The framework is modular and model-agnostic, supports natural-language multi-objective trade-offs, and supplies checkpoint-aligned trajectories plus reasoning summaries.

Significance. If the empirical gains are shown to be statistically reliable, the work would offer a concrete demonstration that general-purpose LLMs can be applied to high-dimensional engineering optimization in electronic design automation without domain-specific fine-tuning. The explicit provision of reasoning traces and the near-parity of an open-weight model with a closed model are genuine strengths that enhance interpretability and practical deployability.

major comments (1)
  1. [§5] §5 (Experimental Results) and associated tables/figures: the headline geometric-mean improvements (1.0%/1.3%/2.7%) and 40% iteration reduction are reported as single point estimates per benchmark without standard deviations, confidence intervals, or statistical significance tests across replicate runs. Because both Bayesian optimization and LLM sampling are stochastic and early parameter choices can produce divergent local optima, these point values alone do not establish that the observed deltas exceed run-to-run noise.
minor comments (2)
  1. [§3] The distinction between 'thinking-model backends' and standard models is referenced via citations but lacks a concise operational definition or prompt-template excerpt that would allow readers to reproduce the exact reasoning style.
  2. [Figures 4-6 and Tables 2-3] Figure and table captions would benefit from explicit indication of which data series correspond to Sonnet 4.6 versus Kimi K2.5 to reduce cross-referencing effort.
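The major comment asks for variability estimates across replicate runs. One common way to provide them is a percentile bootstrap over per-seed improvements; the sketch below assumes a handful of paired replicate runs exist and uses made-up numbers, so it illustrates the kind of analysis requested rather than anything reported in the paper.

```python
import random
import statistics

def bootstrap_ci(deltas, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of paired deltas."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(deltas, k=len(deltas)))
        for _ in range(n_resamples)
    )
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Hypothetical per-seed improvements (%) of the agent over the baseline.
deltas = [1.2, 0.4, -0.3, 0.9, 1.5, 0.1, 0.8, -0.1]
lo, hi = bootstrap_ci(deltas)
print(f"mean={statistics.fmean(deltas):.2f}%, 95% CI=({lo:.2f}, {hi:.2f})")
```

If the resulting interval excludes zero, the improvement is unlikely to be run-to-run noise at the chosen level; with only a few replicates the interval will be wide, which is itself useful to report.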

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback on the experimental reporting. We address the concern regarding the statistical reliability of the reported results below.

Point-by-point responses
  1. Referee: [§5] §5 (Experimental Results) and associated tables/figures: the headline geometric-mean improvements (1.0%/1.3%/2.7%) and 40% iteration reduction are reported as single point estimates per benchmark without standard deviations, confidence intervals, or statistical significance tests across replicate runs. Because both Bayesian optimization and LLM sampling are stochastic and early parameter choices can produce divergent local optima, these point values alone do not establish that the observed deltas exceed run-to-run noise.

    Authors: We agree that the results are presented as single-point estimates and that the lack of standard deviations, confidence intervals, or formal statistical tests across replicate runs is a limitation, particularly given the stochastic nature of both Bayesian optimization and LLM sampling. Each full optimization trajectory requires multiple executions of the complete RTL-to-GDSII flow, making replicate runs across all six benchmarks computationally prohibitive within our experimental budget. The improvements were nevertheless observed consistently in direction across all benchmarks and objectives. In the revised manuscript we will add an explicit discussion in §5 acknowledging the single-run design, the sources of stochasticity, and the resulting limitation on claims of statistical significance. We will also attempt to obtain limited replicate data for at least one benchmark to provide preliminary variance estimates if resources allow.

    Revision: partial

Circularity Check

0 steps flagged

No significant circularity; empirical results stand alone


The paper reports direct experimental outcomes from running ORFS-agent trajectories on six fixed benchmarks (ASAP7 and SKY130HD), measuring wirelength, clock period, and co-optimization objectives against the OR-AutoTuner baseline. These are observed point estimates from tool-using LLM sessions, not quantities derived from equations, fitted parameters, or self-referential definitions. No load-bearing self-citations, ansatzes, or uniqueness theorems underpin the central claims; the evaluation is self-contained against external benchmarks and does not reduce any result to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is empirical and does not introduce mathematical derivations, free parameters fitted to target results, or new postulated entities; it relies on standard LLM capabilities and existing design tools.

pith-pipeline@v0.9.0 · 5886 in / 1274 out tokens · 57622 ms · 2026-05-19T11:19:01.108706+00:00 · methodology



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Bridging the Last Mile of Circuit Design: PostEDA-Bench, a Hierarchical Benchmark for PPA Convergence and DRC Fixing

    cs.AR · 2026-05 · unverdicted · novelty 7.0

    PostEDA-Bench shows LLM agents succeed reasonably on basic DRC and single-objective PPA tasks but struggle on practical DRC reasoning (best 36.66% success) and multi-objective PPA (best 20% success).

Reference graph

Works this paper leans on

85 extracted references · 85 canonical work pages · cited by 1 Pith paper · 9 internal anchors

  [1] A. Agnesina, K. Chang and S. K. Lim, “Parameter Optimization of VLSI Placement Through Deep Reinforcement Learning,” TCAD, 42 (4) (2022), pp. 1295–1308
  [2] A. Agnesina, P. Rajvanshi, T. Yang, G. Pradipta, A. Jiao, B. Keller et al., “AutoDMP: Automated DREAMPlace-based Macro Placement,” Proc. ISPD, 2023, pp. 149–157
  [3] T. Ajayi, D. Blaauw, T.-B. Chan, C.-K. Cheng, V. A. Chhabria, D. K. Choo, et al., “OpenROAD: Toward a Self-Driving, Open-Source Digital Layout Implementation Tool Chain,” Proc. Gov. Microcircuit Applications and Critical Technology Conf., 2019, pp. 1105–1110
  [4] Z. Xiao, X. He, H. Wu, B. Yu and Y. Guo, “EDA-Copilot: A RAG-Powered Intelligent Assistant for EDA Tools,” TODAES (2025)
  [5] Y. Bengio, A. Lodi and A. Prouvost, “Machine Learning for Combinatorial Optimization: A Methodological Tour d’Horizon,” Eur. J. Oper. Res., 290 (2) (2021), pp. 405–421
  [6] E. K. Burke, M. Gendreau, M. Hyde, G. Kendall, G. Ochoa, E. Özcan and R. Qu, “Hyper-Heuristics: A Survey of the State of the Art,” J. Oper. Res. Soc., 64 (12) (2013), pp. 1695–1724
  [7] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan and P. Dhariwal et al., “Language Models Are Few-Shot Learners,” Advances in Neural Information Processing Systems, 33 (2020), pp. 1877–1901
  [8] A. B. Chowdhury, M. Romanelli, B. Tan, R. Karri and S. Garg, “Retrieval-Guided Reinforcement Learning for Boolean Circuit Minimization,” Proc. ICLR, 2024
  [9] M. Chen, J. Tworek, H. Jun, Q. Yuan, H. P. de Oliveira Pinto and J. Kaplan et al., “Evaluating Large Language Models Trained on Code,” arXiv preprint arXiv:2107.03374, 2021
  [10] C.-K. Cheng, A. B. Kahng, S. Kundu, Y. Wang and Z. Wang, “Assessment of Reinforcement Learning for Macro Placement,” Proc. ISPD, 2023, pp. 158–166
  [11] N. Dahad, “AI Agents Will Work With AI Agents for Chip Design in 2025,” EE Times, 20 Dec. 2024
  [12] I. Drori, S. Zhang, R. Shuttleworth, L. Tang, A. Lu and E. Ke et al., “A Neural Network Solves, Explains, and Generates University Math Problems by Program Synthesis and Few-Shot Learning at Human Level,” arXiv preprint arXiv:2112.15594, 2022
  [13] T. Elsken, J. H. Metzen and F. Hutter, “Neural Architecture Search: A Survey,” J. Mach. Learn. Res., 20 (55) (2019), pp. 1–91
  [14] S. Falkner, A. Klein and F. Hutter, “BOHB: Robust and Efficient Hyperparameter Optimization at Scale,” Proc. ICML, 2018, pp. 1437–1446
  [15] Y. Ge, W. Hua, K. Mei, J. Ji, J. Tan and S. Xu et al., “OpenAGI: When LLM Meets Domain Experts,” arXiv preprint arXiv:2304.04370, 2023
  [16] Q. Guo, R. Wang, J. Guo, B. Li, K. Song, X. Tan et al., “Connecting Large Language Models With Evolutionary Algorithms Yields Powerful Prompt Optimizers,” arXiv preprint arXiv:2309.08532, 2023
  [17] S. Hooker, “The Hardware Lottery,” arXiv preprint arXiv:2009.06489, 2020
  [18] Q. Huang, J. Vora, P. Liang and J. Leskovec, “Benchmarking Large Language Models as AI Research Agents,” Advances in Neural Information Processing Systems, 2023
  [19] T. Ifargan, L. Hafner, M. Kern, O. Alcalay and R. Kishony, “Autonomous LLM-Driven Research—From Data to Human-Verifiable Research Papers,” NEJM AI, 2 (1) (2025), pp. AIoa2400555
  [20] C. E. Jimenez, J. Yang, A. Wettig, S. Yao, K. Pei, O. Press and K. Narasimhan, “SWE-bench: Can Language Models Resolve Real-World GitHub Issues?,” arXiv preprint arXiv:2310.06770, 2023
  [21] J. Jung, A. B. Kahng, S. Kundu, Z. Wang and D. Yoon, “IEEE CEDA DATC Emerging Foundations in IC Physical Design and MLCAD Research,” Proc. ICCAD, 2023, pp. 1–7
  [22] J. Jung, A. B. Kahng, S. Kim and R. Varadarajan, “METRICS2.1 and Flow Tuning in the IEEE CEDA Robust Design Flow and OpenROAD,” Proc. ICCAD, 2021, pp. 1–9
  [23] A. B. Kahng, J. Lienig, I. L. Markov and J. Hu, “VLSI Physical Design: From Graph Partitioning to Timing Closure,” Springer, 2011
  [24] A. B. Kahng, “Machine Learning Applications in Physical Design: Recent Results and Directions,” Proc. ISPD, 2018, pp. 68–73
  [25] A. B. Kahng, “A Mixed Open-Source and Proprietary EDA Commons for Education and Prototyping,” Proc. ICCAD, 2022, pp. 1–6
  [26] A. Kaintura, P. R., S. Luar and I. I. Almeida, “ORAssistant: A Custom RAG-Based Conversational Assistant for OpenROAD,” arXiv preprint arXiv:2410.03845, 2024
  [27] K. Kandasamy, G. Dasarathy, J. B. Oliva, J. Schneider and B. Póczos, “Multi-Fidelity Bayesian Optimisation With Continuous Approximations,” Proc. ICML, 2017, pp. 1799–1808
  [28] O. Khattab, A. Singhvi, P. Maheshwari, Z. Zhang, K. Santhanam and S. Vardhamanan et al., “DSPy: Compiling Declarative Language Model Calls Into Self-Improving Pipelines,” arXiv preprint arXiv:2310.03714, 2023
  [29] L. Kotthoff, “Algorithm Selection for Combinatorial Search Problems: A Survey,” Data Mining and Constraint Programming, Springer, 2016, pp. 149–190
  [30] P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin and N. Goyal et al., “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks,” Advances in Neural Information Processing Systems, 33 (2020), pp. 9459–9474
  [31] Y. Lin, S. Dhar, W. Li, H. Ren, B. Khailany and D. Z. Pan, “DREAMPlace: Deep Learning Toolkit–Enabled GPU Acceleration for Modern VLSI Placement,” Proc. DAC, 2019, pp. 1–6
  [32] T. Liu, N. Astorga, N. Seedat and M. van der Schaar, “Large Language Models to Enhance Bayesian Optimization,” arXiv preprint arXiv:2402.03921, 2024
  [33] H. Liu, K. Simonyan and Y. Yang, “DARTS: Differentiable Architecture Search,” arXiv preprint arXiv:1806.09055, 2018
  [34] M. Liu, T.-D. Ene, R. Kirby, C. Cheng, N. Pinckney, R. Liang et al., “ChipNeMo: Domain-Adapted LLMs for Chip Design,” arXiv preprint arXiv:2311.00176, 2024
  [35] S. Liu, C. Gao and Y. Li, “Large Language Model Agent for Hyper-Parameter Optimization,” arXiv preprint arXiv:2402.01881, 2024
  [36] M. Liu, N. R. Pinckney, B. Khailany and H. Ren, “VeriSeek: Reinforcement Learning With Golden Code Feedback for Verilog Generation,” arXiv preprint arXiv:2407.18271, 2024
  [37] Y.-C. Lu, J. Lee, A. Agnesina, K. Samadi, and S. K. Lim, “GAN-CTS: a Generative Adversarial Framework for Clock Tree Prediction and Optimization,” Proc. ICCAD, 2019, pp. 1–8
  [38] N. Mazyavkina, S. Sviridov, S. Ivanov and E. Burnaev, “Reinforcement Learning for Combinatorial Optimization: A Survey,” Comput. & Oper. Res., 134 (2021), p. 105400
  [39] A. Mirhoseini, A. Goldie, M. Yazgan, J. W. Jiang, E. Songhori, S. Wang et al., “A Graph Placement Methodology for Fast Chip Design,” Nature, 594 (2021), pp. 207–212
  [40] A. Mishchenko, S. Chatterjee and R. Brayton, “DAG-Aware AIG Rewriting: A Fresh Look at Combinational Logic Synthesis,” Proc. DAC, 2006, pp. 532–535
  [41] L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. Wainwright and P. Mishkin et al., “Training Language Models to Follow Instructions With Human Feedback,” Advances in Neural Information Processing Systems, 35 (2022), pp. 27730–27744
  [42] L. Pan, A. Albalak, X. Wang and W. Yang, “Logic-LM: Empowering Large Language Models With Symbolic Solvers for Faithful Logical Reasoning,” arXiv preprint arXiv:2305.12295, 2023
  [43] G. Pasandi, K. Kunal, V. Tej, K. Shan, H. Sun, S. Jain et al., “JARVIS: A Multi-Agent Code Assistant for High-Quality EDA Script Generation,” arXiv preprint arXiv:2505.14978, 2025
  [44] S. G. Patil, T. Zhang, X. Wang and J. E. Gonzalez, “Gorilla: Large Language Model Connected With Massive APIs,” arXiv preprint arXiv:2305.15334, 2023
  [45] Y. Pu, Z. He, T. Qiu, H. Wu, and B. Yu, “Customized Retrieval Augmented Generation and Benchmarking for EDA Tool Documentation QA,” Proc. ICCAD, 2024, pp. 1–9
  [46] Y. Qin, S. Liang, Y. Ye, K. Zhu, L. Yan and Y. Lu et al., “ToolLLM: Facilitating Large Language Models to Master 16,000+ Real-World APIs,” arXiv preprint arXiv:2307.16789, 2023
  [47] N. Rai, “Enhancing Electronic Design Automation With Large Language Models: A Taxonomy, Analysis and Opportunities,” TechRxiv preprint, 2025
  [48] U. Sharma, B.-Y. Wu, S. R. D. Kankipati, V. A. Chhabria, and A. Rovinski, “OpenROAD-Assistant: An Open-Source Large Language Model for Physical Design Tasks,” Proc. MLCAD, pp. 1–7, 2024
  [49] R. Shu, J. Wang and W. Li, “MetaBO: Meta-Learning for Fast Bayesian Optimization in Chip Design,” TCAD, 2024, Early Access
  [50] L. Shi, M. Kazda, B. Sears, N. Shropshire and R. Puri, “Ask-EDA: A Design Assistant Empowered With LLM, Hybrid RAG and Abbreviation De-Hallucination,” arXiv preprint arXiv:2406.06575, 2024
  [51] G. Sung, “AI Agents: AutoGPT Architecture & Breakdown,” Medium, 2023, https://medium.com/@georgesung/ai-agents-autogpt-architecture-breakdown-ba37d60db944
  [52] T. H. Trinh, Y. Wu, Q. V. Le, H. He and T. Luong, “Solving Olympiad Geometry Without Human Demonstrations,” Nature, 625 (7995) (2024), pp. 476–482
  [53] L. Wang, C. Ma, X. Feng, Z. Zhang, H. Yang, J. Zhang et al., “A Survey on Large Language Model Based Autonomous Agents,” Frontiers of Computer Science, 18 (6) (2024), p. 186345
  [54] J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter and F. Xia et al., “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models,” arXiv preprint arXiv:2201.11903, 2022
  [55] N. H. E. Weste and D. M. Harris, “CMOS VLSI Design: A Circuits and Systems Perspective,” Addison-Wesley, 2011
  [56] H. Wu, Z. He, X. Zhang, X. Yao, S. Zheng, H. Zheng and B. Yu, “ChatEDA: A Large Language Model–Powered Autonomous Agent for EDA,” TCAD, 2024
  [57] Y.-H. Wu, Y. Lee and E. Lee, “Hardware Design and Verification With Large Language Models,” Electronics, 14 (1) (2024), pp. 120
  [58] H. Wu, H. Zheng, Z. He and B. Yu, “Divergent Thoughts Toward One Goal: LLM-Based Multi-Agent Collaboration System for Electronic Design Automation,” arXiv preprint arXiv:2502.10857, 2025
  [59] T. Xie, F. Zhou, Z. Cheng, C. Xiong and T. Yu, “OpenAgents: An Open Platform for Language Agents in the Wild,” arXiv preprint arXiv:2310.10634, 2023
  [60] L. Xu, F. Hutter, H. H. Hoos and K. Leyton-Brown, “SATzilla: Portfolio-Based Algorithm Selection for SAT,” J. Artif. Intell. Res., 32 (2008), pp. 565–606
  [61] C. Yang, X. Wang, Y. Lu, H. Liu, Q. V. Le, D. Zhou and X. Chen, “Large Language Models as Optimizers,” arXiv preprint arXiv:2309.03409, 2023
  [62] Y. Yin, Y. Wang, B. Xu and P. Li, “ADO-LLM: Analog Design Bayesian Optimization With In-Context Learning of Large Language Models,” Proc. ICCAD, 2024
  [63] Z. He and B. Yu, “Large Language Models for EDA: Future or Mirage?,” Proc. ISPD, 2024
  [64] S. Zhou, F. F. Xu, H. Zhu and G. Neubig, “WebArena: A Realistic Web Environment for Building Autonomous Agents,” Proc. ICLR, 2024
  [65] R. Zhong, X. Du, S. Kai, Z. Tang, S. Xu and H.-L. Zhen et al., “LLM4EDA: Emerging Progress in Large Language Models for Electronic Design Automation,” arXiv preprint arXiv:2401.12224v1, 2023
  [66] “AlphaCode 2,” TechCrunch coverage, 2023, https://techcrunch.com/2023/12/06/deepmind-unveils-alphacode-2-powered-by-gemini/
  [67] Anthropic, “Introducing Claude 4,” https://www.anthropic.com/news/claude-4
  [68] ASAP7 PDK and Cell Libraries Repo, https://github.com/The-OpenROAD-Project/asap7
  [69] Atopile, https://www.atopile.io/
  [70] “Cerebrus,” Cadence Intelligent Chip Explorer, https://www.cadence.com/en_US/home/tools/digital-design-and-signoff/soc-implementation-and-floorplanning/cerebrus-intelligent-chip-explorer.html
  [71] ChipAgents, https://chipagents.ai/
  [72] “Configurable Graph-Based Task Solving With the MARCO Multi-AI Agent Framework for Chip Design,” NVIDIA Developer Blog, 2025, https://developer.nvidia.com/
  [73] “Devin: The First AI Software Engineer,” Cognition Labs, 2024, https://devin.ai
  [74] Diode Computers, https://www.ycombinator.com/companies/diode-computers-inc
  [75] “DSO.ai,” Synopsys, https://www.synopsys.com/ai/ai-powered-eda/dso-ai.html
  [76] ISLAD, https://www.islad.org/
  [77] “LLAMBO,” https://github.com/tennisonliu/LLAMBO
  [78] MLCAD, https://mlcad.org/symposium/2025/
  [79] “OpenROAD-Flow-Scripts,” https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts, commit hash: ce8d36a
  [80] “ORFS-Agent Code,” https://github.com/ABKGroup/ORFS-Agent

Showing first 80 references.