LLM-generated unit tests with retrieval-augmented context detect faults in 69% of real Python bugs versus 17.2% for general-purpose human-written tests, with similar coverage levels.
How students unit test: Perceptions, practices, and pitfalls,
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.SE 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
LLM vs. Human Unit Tests: Fault Detection on Real Python Bugs
LLM-generated unit tests with retrieval-augmented context detect faults in 69% of real Python bugs versus 17.2% for general-purpose human-written tests, with similar coverage levels.