LibEvoBench benchmark shows LLMs are version-oblivious on evolving APIs: providing documentation helps, but specifying the version does not.
6 Pith papers cite this work.
representative citing papers
Randomized experiment finds AI draft assistance raises feedback provision by teaching assistants 10.8 percentage points without harming quality.
Hot fixes show urgency patterns with reduced collaboration and testing, differing from regular fixes, and human versus AI agents display over 10 distinct repair behaviors in large-scale GitHub data.
StarCoder2-15B matches or beats CodeLlama-34B on code tasks despite being smaller, and StarCoder2-3B outperforms prior 15B models, with open weights and exact training data identifiers released.
Empirical evaluation shows that code generated by all seven tested LLMs contains vulnerabilities, most of them of critical or high severity.
A research roadmap analyzing the current state of search-based software engineering with foundation models, outlining challenges and directions across three integration aspects.
citing papers explorer
- Search-Based Software Engineering and AI Foundation Models: Current Landscape and Future Roadmap