On learning meaningful code changes via neural machine translation
3 Pith papers cite this work. Polarity classification is still indexing.
Citation summary
Fields: cs.SE (3)
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
Roles: background (1)
Polarities: background (1)
citing papers explorer
- RAG-Reflect: Agentic Retrieval-Augmented Generation with Reflections for Comment-Driven Code Maintenance on Stack Overflow
  RAG-Reflect achieves F1=0.78 on valid comment-edit prediction using retrieval-augmented reasoning and self-reflection, outperforming baselines and approaching fine-tuned models without retraining.
- Where did we fail? -- Reproducing build failures in embedded open source software
  PhantomRun standardizes CI build log retrieval and reproduction for embedded systems, reconstructing 91.8% of 4628 failing runs while preserving outcomes in 98% of cases.
- What Makes Software Bugs Escape Testing? Evidence from a Large-Scale Empirical Study
  Post-release defects concentrate in older, frequently modified high-churn components and require longer and more complex fixes than pre-release defects.