Case study of 18,020 Kubernetes PRs shows label-diff congruence is prevalent and stable, with higher congruence linked to fewer review participants among core developers and more among one-time contributors.
2013.Applied logistic regression
3 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.SE 3years
2026 3verdicts
UNVERDICTED 3representative citing papers
Empirical study finds coding agents produce fewer and less intense tangled refactorings than humans on Multi-SWE-bench; a refactoring-aware refinement improves compilability from 19.34% to 38.33% and resolves 2.79% more issues.
High AUC does not guarantee a defect prediction model outperforms random guessing on true and false positive rates for all thresholds.
citing papers explorer
-
Efficiency for Experts, Visibility for Newcomers: A Case Study of Label-Code Alignment in Kubernetes
Case study of 18,020 Kubernetes PRs shows label-diff congruence is prevalent and stable, with higher congruence linked to fewer review participants among core developers and more among one-time contributors.
-
"Refactoring Runaway": Understanding and Mitigating Tangled Refactorings in Coding Agents for Issue Resolution
Empirical study finds coding agents produce fewer and less intense tangled refactorings than humans on Multi-SWE-bench; a refactoring-aware refinement improves compilability from 19.34% to 38.33% and resolves 2.79% more issues.
-
Evaluating Software Defect Prediction Models via the Area Under the ROC Curve Can Be Misleading
High AUC does not guarantee a defect prediction model outperforms random guessing on true and false positive rates for all thresholds.