Title resolution pending

Kosei Horikawa, Hao Li, Yutaro Kashiwa, Bram Adams, Hajimu Iida, Ahmed E · 2025 · arXiv 2511.04824

11 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

A Dataset of Agentic AI Coding Tool Configurations

cs.SE · 2026-05-08 · accept · novelty 8.0

A publicly released dataset of 15,591 configuration artifacts for five agentic AI coding tools, drawn from 4,738 GitHub repositories along with associated files and AI-co-authored commits.

Foundation Models as Oracles for Refactoring Correctness Detection

cs.SE · 2026-05-03 · unverdicted · novelty 7.0

Foundation models serve as effective oracles for detecting refactoring correctness issues in Java programs, achieving up to 93.8% accuracy in zero-shot evaluations on 226 real bugs.

Do AI Coding Agents Log Like Humans? An Empirical Study

cs.SE · 2026-04-10 · unverdicted · novelty 7.0

AI agents modify logging less often than humans in 58.4% of repositories but produce higher log density when they change it; explicit logging instructions are rare (4.7%) and ignored 67% of the time, with humans performing 72.5% of post-generation log repairs.

AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub

cs.SE · 2026-04-04 · accept · novelty 7.0

AgenticFlict is a public dataset of 29K+ textual merge conflicts from AI agent PRs, collected via merge simulation on 107K processed PRs and showing a 27.67% conflict rate with variation across agents.

How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests

cs.SE · 2026-01-24 · unverdicted · novelty 7.0

AI coding agents produce pull requests with substantially more commits and slightly higher description-to-diff similarity than human developers, based on analysis of 29,095 merged PRs.

To What Extent Does Agent-generated Code Require Maintenance? An Empirical Study

cs.SE · 2026-05-07 · unverdicted · novelty 6.0 · 2 refs

AI-generated code requires less maintenance than human-written code, mostly involving feature additions by humans rather than bug fixes.

"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution

cs.SE · 2026-05-21 · unverdicted · novelty 5.0

Empirical study finds coding agents produce fewer and less intense tangled refactorings than humans on Multi-SWE-bench; a refactoring-aware refinement improves compilability from 19.34% to 38.33% and resolves 2.79% more issues.

KISS Sorcar: A Stupidly-Simple General-Purpose and Software Engineering AI Assistant

cs.SE · 2026-04-26 · unverdicted · novelty 5.0

KISS Sorcar introduces a simple layered agent framework and VS Code IDE that reaches 62.2% pass rate on Terminal Bench 2.0 by combining ReAct execution, summarization-based continuation, parallel tools, persistent history, and git worktree isolation while self-validating outputs.

Scaling Human-AI Coding Collaboration Requires a Governable Consensus Layer

cs.SE · 2026-04-20 · unverdicted · novelty 5.0

Agentic Consensus replaces code as the main artifact with a typed property graph world model that maintains commitments and evidence through synchronization operators, shifting evaluation to alignment fidelity and consensus entropy.

From Industry Claims to Empirical Reality: An Empirical Study of Code Review Agents in Pull Requests

cs.SE · 2026-04-03 · conditional · novelty 5.0

Code review agents achieve 45.20% merge rate on PRs versus 68.37% for humans, with 60.2% of agent-only closed PRs showing 0-30% signal quality.

Quality and Security Signals in AI-Generated Python Refactoring Pull Requests

cs.SE · 2026-05-20 · unverdicted · novelty 4.0

Empirical analysis of AI refactoring PRs shows quality attribute improvements in 22.5% of cases, new Pylint issues in 24.17%, and Bandit findings in 4.7%, yet developer acceptance is 73.5%.

citing papers explorer

A Dataset of Agentic AI Coding Tool Configurations · cs.SE · 2026-05-08 · accept · none · ref 9
Foundation Models as Oracles for Refactoring Correctness Detection · cs.SE · 2026-05-03 · unverdicted · none · ref 62
Do AI Coding Agents Log Like Humans? An Empirical Study · cs.SE · 2026-04-10 · unverdicted · none · ref 7
AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub · cs.SE · 2026-04-04 · accept · none · ref 33
How AI Coding Agents Modify Code: A Large-Scale Study of GitHub Pull Requests · cs.SE · 2026-01-24 · unverdicted · none · ref 19
To What Extent Does Agent-generated Code Require Maintenance? An Empirical Study · cs.SE · 2026-05-07 · unverdicted · none · ref 28 · 2 links
"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution · cs.SE · 2026-05-21 · unverdicted · none · ref 20
KISS Sorcar: A Stupidly-Simple General-Purpose and Software Engineering AI Assistant · cs.SE · 2026-04-26 · unverdicted · none · ref 10
Scaling Human-AI Coding Collaboration Requires a Governable Consensus Layer · cs.SE · 2026-04-20 · unverdicted · none · ref 12
From Industry Claims to Empirical Reality: An Empirical Study of Code Review Agents in Pull Requests · cs.SE · 2026-04-03 · conditional · none · ref 8
Quality and Security Signals in AI-Generated Python Refactoring Pull Requests · cs.SE · 2026-05-20 · unverdicted · none · ref 17
