LLM evaluation for RTL generation identifies three performance tiers with frontier models reaching high synthesis quality and reveals systematic failure differences between proprietary and open models.
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers:
TestGeneralizer generalizes an initial test into a set of executable tests covering more diverse scenarios, delivering +31.66% mutation-based and +23.08% LLM-assessed scenario coverage gains over ChatTester on 12 open-source Java projects.
Citing papers explorer
- Synthesis-in-the-Loop Evaluation of LLMs for RTL Generation: Quality, Reliability, and Failure Modes
LLM evaluation for RTL generation identifies three performance tiers with frontier models reaching high synthesis quality and reveals systematic failure differences between proprietary and open models.
- Generalizing Test Cases for Comprehensive Test Scenario Coverage
TestGeneralizer generalizes an initial test into a set of executable tests covering more diverse scenarios, delivering +31.66% mutation-based and +23.08% LLM-assessed scenario coverage gains over ChatTester on 12 open-source Java projects.