MR-Coupler leverages functional coupling analysis and LLMs to generate valid metamorphic test cases for over 90% of tasks while detecting 44% of real bugs, outperforming baselines by 64.90% in validity and 36.56% in false-alarm reduction.
4 Pith papers cite this work.
Citation roles: background (2)
Citing papers

- MR-Coupler: Automated Metamorphic Test Generation via Functional Coupling Analysis
  MR-Coupler leverages functional coupling analysis and LLMs to generate valid metamorphic test cases for over 90% of tasks while detecting 44% of real bugs, outperforming baselines by 64.90% in validity and 36.56% in false-alarm reduction.
- iCoRe: An Iterative Correlation-Aware Retriever for Bug Reproduction Test Generation
  iCoRe improves Fail-to-Pass rates to 42.0% and 52.8% on two bug reproduction benchmarks by using correlation-aware iterative retrieval instead of standard semantic or BM25 methods.
- MR-Adopt: Automatic Deduction of Input Transformation Function for Metamorphic Testing
  MR-Adopt deduces input transformations from hard-coded MR test cases using LLMs, data-flow refinement, and output-relation selection to enable reuse with new source inputs.
- Benchmark Data Contamination of Large Language Models: A Survey
  A survey reviewing benchmark data contamination in LLMs, its impact on evaluation, and alternative assessment approaches.