An empirical study of 86,156 test patches from five AI agents finds 80.2% lack strong oracle signals, with strong oracles linked to higher merge rates (OR=1.28) after regression controls.
Replication package for “All Smoke, No Alarm: Oracle Signals in Agent-Authored Test Code”
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.SE (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
All Smoke, No Alarm: Oracle Signals in Agent-Authored Test Code
An empirical study of 86,156 test patches from five AI agents finds 80.2% lack strong oracle signals, with strong oracles linked to higher merge rates (OR=1.28) after regression controls.