RealBench is a repo-level code generation benchmark pairing UML diagrams with natural language requirements, revealing that LLMs perform significantly worse on realistic repo-level tasks than existing benchmarks suggest.
2017. Unified Modeling Language (UML) Version 2.5.1
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: CONDITIONAL (1)
Representative citing papers
RealBench: A Repo-Level Code Generation Benchmark Aligned with Real-World Software Development Practices