A new deforking map, p2PFull, for WoC V2604 collapses raw repositories into projects via shared-commit groups and Louvain clustering, recovering cross-forge fork families at 99.01% agreement with GitHub's graph.
German, and Daniela Damian
14 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.SE (14)
roles: background (1)
polarities: background (1)
representative citing papers
AgenticFlict is a public dataset of 29K+ textual merge conflicts from AI agent PRs, collected via merge simulation on 107K processed PRs and showing a 27.67% conflict rate with variation across agents.
Within-reviewer analysis of 11,429 reviews shows AI code approval rising from 30.1% to 36.8% with experience, with reduced inline comments and increased latency, consistent with habituation.
Empirical analysis of 2,984 dormant-revived scientific OSS projects shows fixed inactivity thresholds are insufficient for classifying abandonment, with lifecycle archetypes providing better discrimination.
Empirical analysis of 4,707 MoltBook posts shows AI-only technical discourse focuses on security, trust, and abstract topics while lacking concrete runtime and project details found in human GitHub discussions.
A paraphrase-robust duplicate-step detector for Gherkin BDD suites, built on a new 1.1M-step public corpus, reports F1 scores up to 0.906 and estimates 893k eliminable step occurrences corpus-wide.
A composable DSL for describing sampling workflows on code repositories enables explicit specification and statistical reasoning about the generalizability of empirical software engineering findings.
Empirical study of 1,454 Java OSS projects finds weak correlations among quality assurance practices and greater intensity in mature projects for ASAT and code review but not CI.
UNICS pre-trains on a pseudocode dataset for cross-lingual logic then applies multi-task transfer learning with hard-positive mining and dynamic hard-negative sampling to reach claimed SOTA on multilingual code-search benchmarks.
Pre-registered protocol for a GitHub mining study examining distribution, evolution, Bounded Context violations, and maintenance links of Domain-Driven Design tactical patterns in open-source projects.
ML-specific code smells occur 41-94 times less often than general Python smells in 279 projects; they are associated with commit frequency and domain, whereas general smells show no such associations, and neither kind is associated with most other project characteristics.
Study of 362 Java projects finds MySQL and PostgreSQL dominate relational use while Redis and MongoDB lead non-relational, with frequent multi-DBM co-use and ORM mediation.
Developers most frequently reference the full Log4j migration guide in pull request descriptions (82.81% of cases) and continue consulting it during post-update maintenance tasks.
Specificity and Context predict actionable code generation while Verification predicts adoption and Context predicts integration depth in LLM-assisted PR workflows.
citing papers explorer
State-Of-The-Practice in Quality Assurance in Java-Based Open Source Software Development