AIDev is a new open dataset of 456k AI-agent pull requests showing agents submit code faster than humans but with lower acceptance rates and simpler changes.
Hassan, Gustavo A
8 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
fields
cs.SE 8roles
background 3polarities
background 3representative citing papers
AgenticFlict is a public dataset of 29K+ textual merge conflicts from AI agent PRs, collected via merge simulation on 107K processed PRs and showing a 27.67% conflict rate with variation across agents.
AI coding agents produce pull requests with substantially more commits and slightly higher description-to-diff similarity than human developers, based on analysis of 29,095 merged PRs.
Analysis of 9,799 human-reviewed agentic PRs shows only 35.7% of rejections reflect clear agent failures, with 31.2% due to workflow constraints and 33.1% lacking clear rationale, plus notable interaction differences across agents.
AI-native software ecosystems exhibit emergent behaviors best explained by complex adaptive systems theory, requiring new ecosystem-level monitoring and seven testable propositions that may extend or replace Lehman's laws.
IACDM is an 8-phase methodology using external verification agents and three pillars to close the verification gap in stochastic LLM-based software development.
A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.
A rapid review of fairness in LLM-enabled multi-agent systems for the software development lifecycle concludes that the field lacks standardized evaluations, broad coverage, and effective governance, leaving it unprepared for deployable fair systems.
citing papers explorer
-
The Rise of AI Teammates in Software Engineering (SE) 3.0: How Autonomous Coding Agents Are Reshaping Software Engineering
AIDev is a new open dataset of 456k AI-agent pull requests showing agents submit code faster than humans but with lower acceptance rates and simpler changes.
-
AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub
AgenticFlict is a public dataset of 29K+ textual merge conflicts from AI agent PRs, collected via merge simulation on 107K processed PRs and showing a 27.67% conflict rate with variation across agents.
-
How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests
AI coding agents produce pull requests with substantially more commits and slightly higher description-to-diff similarity than human developers, based on analysis of 29,095 merged PRs.
-
Why Are Agentic Pull Requests Merged or Rejected? An Empirical Study
Analysis of 9,799 human-reviewed agentic PRs shows only 35.7% of rejections reflect clear agent failures, with 31.2% due to workflow constraints and 33.1% lacking clear rationale, plus notable interaction differences across agents.
-
More Is Different: Toward a Theory of Emergence in AI-Native Software Ecosystems
AI-native software ecosystems exhibit emergent behaviors best explained by complex adaptive systems theory, requiring new ecosystem-level monitoring and seven testable propositions that may extend or replace Lehman's laws.
-
IACDM: Interactive Adversarial Convergence Development Methodology -- A Structured Framework for AI-Assisted Software Development
IACDM is an 8-phase methodology using external verification agents and three pillars to close the verification gap in stochastic LLM-based software development.
-
From Cool Demos to Production-Ready FMware: Core Challenges and a Technology Roadmap
A semi-structured thematic synthesis identifies core challenges in FM selection, alignment, prompting, orchestration, testing, deployment, and cross-cutting concerns like observability for production-ready FMware.
-
Fairness in Multi-Agent Systems for Software Engineering: An SDLC-Oriented Rapid Review
A rapid review of fairness in LLM-enabled multi-agent systems for the software development lifecycle concludes that the field lacks standardized evaluations, broad coverage, and effective governance, leaving it unprepared for deployable fair systems.