Duet instrumentation uses LLM-driven code analysis to instrument performance-relevant changes between two app versions, detecting regressions at up to 5x lower severity than standard duet benchmarks in a testbed evaluation.
In: Proceedings of the 15th International Conference on Mining Software Repositories (MSR). pp. 181–191. ACM, New York, NY, USA (2018)
4 Pith papers cite this work.
Citation roles: background (1)
Citation polarities: background (1)
Representative citing papers
A controlled benchmark on 2040 problems reveals poor generalization and high interference in model editing for API updates in code LLMs, with many successes being workarounds rather than true migrations.
CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.
Hidden dependencies and component variants in SBOMs cause inconsistent vulnerability reporting and VEX handling across scanners.
citing papers explorer
- Duet instrumentation: An Agentic Approach to Improving Sensitivity in Cloud Service Benchmarking
Duet instrumentation uses LLM-driven code analysis to instrument performance-relevant changes between two app versions, detecting regressions at up to 5x lower severity than standard duet benchmarks in a testbed evaluation.
- Understanding Robustness of Model Editing in Code LLMs
A controlled benchmark on 2040 problems reveals poor generalization and high interference in model editing for API updates in code LLMs, with many successes being workarounds rather than true migrations.
- CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation
CodeXGLUE supplies a standardized collection of 10 code-related tasks, 14 datasets, an evaluation platform, and BERT-, GPT-, and encoder-decoder-style baselines.
- Hidden Dependencies and Component Variants in SBOM-Based Software Composition Analysis
Hidden dependencies and component variants in SBOMs cause inconsistent vulnerability reporting and VEX handling across scanners.