MAGE: A Multi-Agent Engine for Automated RTL Code Generation
7 Pith papers cite this work.
years: 2026 (7)
representative citing papers
TimingLLM uses a fine-tuned LLM to generate structural timing cues from Verilog, followed by a retrieval-augmented regressor with a learned steering vector, to predict WNS and TNS with R values of 0.91 and 0.97, respectively.
citing papers explorer
- RTL-BenchLS: A Large-Scale Benchmark for RTL Reasoning and Generation with Large Language Models
RTL-BenchLS supplies a large-scale formally verified benchmark and three novel tasks that expose low performance of frontier LLMs on realistic RTL reasoning and generation.
- CHIA: An open-source framework for principled, agentic AI-driven hardware/software co-design research
CHIA introduces a framework for building and deploying agentic AI co-design flows as CHIA loops with tool nodes, reliability mechanisms, and five case-study demonstrations.
- CASS-RTL: Correctness-Aware Subspace Steering for RTL Generation with LLMs
CASS-RTL identifies correctness-linked attention heads, builds a steering subspace from them, and applies a geometry-aware intervention that raises pass@1/5/10 accuracy 10-20% on VerilogEval and 5% on CVDP across multiple LLMs without retraining or extra labels.
- Verilog-Evolve: Feedback-Driven and Skill-Evolving Verilog Generation
Verilog-Evolve uses executable feedback from simulation, synthesis, timing, and GEMM metrics to refine LLM-generated Verilog and evolves skills across tasks, improving functional success and downstream hardware quality on VerilogEval and mixed-precision GEMM benchmarks.
- RTL-BenchMT: Dynamic Maintenance of RTL Generation Benchmark Through Agent-Assisted Analysis and Revision
RTL-BenchMT is an agent-assisted framework for dynamically maintaining RTL generation benchmarks by fixing flaws and reducing overfitting in LLM-based EDA applications.
- Agentic Hardware Design as Repository-Level Code Evolution
HORIZON applies repository-level self-evolution to hardware design artifacts and reports 100% completion on ChipBench, RTLLM, Verilog-Eval, and nine CVDP categories using a hands-free agent loop.