Beyond Synthetic Benchmarks. arXiv:2510.26130, October

Musfiqur Rahman, SayedHassan Khatoonabadi, Emad Shihab · 2025 · arXiv 2510.26130

3 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

ClassEval-Pro: A Cross-Domain Benchmark for Class-Level Code Generation

cs.SE · 2026-04-29 · unverdicted · novelty 7.0

ClassEval-Pro benchmark shows frontier LLMs achieve at most 45.6% Pass@1 on class-level code tasks, with logic errors (56%) and dependency errors (38%) as dominant failure modes.

VeriGraphi: A Multi-Agent Framework of Hierarchical RTL Generation for Large Hardware Designs

cs.AR · 2026-04-16 · unverdicted · novelty 6.0

VeriGraphi introduces a knowledge-graph-anchored multi-agent pipeline that produces reliable hierarchical synthesizable Verilog for complex designs such as RISC-V processors.

Compiled AI: Deterministic Code Generation for LLM-Based Workflow Automation

cs.SE · 2026-04-06 · unverdicted · novelty 4.0

Compiled AI generates deterministic code artifacts from LLMs in a one-time compilation step, enabling reliable workflow execution with zero runtime tokens after break-even.

citing papers explorer

ClassEval-Pro: A Cross-Domain Benchmark for Class-Level Code Generation
cs.SE · 2026-04-29 · unverdicted · none · ref 30

VeriGraphi: A Multi-Agent Framework of Hierarchical RTL Generation for Large Hardware Designs
cs.AR · 2026-04-16 · unverdicted · none · ref 2

Compiled AI: Deterministic Code Generation for LLM-Based Workflow Automation
cs.SE · 2026-04-06 · unverdicted · none · ref 14
