2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.SE (2)
Representative citing papers
- Cache-Related Smells in GitLab CI/CD: Comprehensive Catalog, Automated Detection, and Empirical Evidence
  A catalog of ten cache smells in GitLab CI/CD, an automated detector achieving 0.98 F1, and empirical evidence that the smells appear in 89% of 228 mature open-source projects.
- Can Language Models Go Beyond Coding? Assessing the Capability of Language Models to Build Real-World Systems
  Build-bench is the first architecture-aware benchmark that evaluates LLMs on repairing cross-ISA build failures via iterative tool-augmented reasoning, with the best model reaching 63.19% success.