IndustryCode is the first multi-domain, multi-language benchmark for industrial code generation, with the top model (Claude 4.5 Opus) reaching 68.1% accuracy on sub-problems and 42.5% on main problems.
In reasoning-enhanced models, the generation of intermediate thoughts activates broad engineering heuristics that can conflict with local requirements.
1 Pith paper cites this work (field: cs.SE; year: 2026; verdict: CONDITIONAL).
Representative citing paper:
IndustryCode: A Benchmark for Industry Code Generation