Benchmarking large language models for automated Verilog RTL code generation

2022 · arXiv 2212.11140

3 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

HarmChip: Evaluating Hardware Security Centric LLM Safety via Jailbreak Benchmarking

cs.CR · 2026-04-18 · unverdicted · novelty 7.0

HarmChip is a new benchmark exposing an alignment paradox where LLMs refuse legitimate hardware security queries but comply with semantically disguised malicious requests.

RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation

cs.SE · 2026-04-27 · unverdicted · novelty 6.0

RefEvo achieves 95% pass rate on 20 hardware modules for SystemC reference model generation using dynamic multi-agent planning, co-evolutionary verification, and spec anchoring, with 71% token reduction.

Configuration Over Selection: Hyperparameter Sensitivity Exceeds Model Differences in Open-Source LLMs for RTL Generation

cs.AR · 2026-04-18 · unverdicted · novelty 6.0

Hyperparameter configuration in open-source LLMs for RTL generation produces up to 25.5% intra-model pass-rate variation on VerilogEval and RTLLM, exceeding inter-model spreads by 5x with near-zero correlation in optimal settings across benchmarks.

citing papers explorer

Showing 3 of 3 citing papers.

HarmChip: Evaluating Hardware Security Centric LLM Safety via Jailbreak Benchmarking

cs.CR · 2026-04-18 · unverdicted · none · ref 9

RefEvo: Agentic Design with Co-Evolutionary Verification for Agile Reference Model Generation

cs.SE · 2026-04-27 · unverdicted · none · ref 14

Configuration Over Selection: Hyperparameter Sensitivity Exceeds Model Differences in Open-Source LLMs for RTL Generation

cs.AR · 2026-04-18 · unverdicted · none · ref 1
