MR-Scout: Automated Synthesis of Metamorphic Relations from Existing Test Cases

Congying Xu; Hengcheng Zhu; Jiarong Wu; Shing-Chi Cheung; Valerio Terragni

arxiv: 2304.07548 · v5 · submitted 2023-04-15 · 💻 cs.SE

Pith reviewed 2026-05-24 09:25 UTC · model grok-4.3

keywords: metamorphic testing · metamorphic relations · test case synthesis · automated test generation · software testing · open source projects · test coverage · mutation testing

The pith

MR-Scout automatically turns developer test cases into reusable metamorphic relations that generate new tests for similar programs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MR-Scout to mine test cases that already encode metamorphic relations from open-source projects and convert those relations into parameterized, reusable methods. These codified relations are then filtered for quality before being applied to generate additional tests. Experiments show the method located over 11,000 such relations across 701 projects, with more than 97 percent proving high quality. When the relations are used to create new tests, line coverage rises 13.52 percent and mutation scores rise 9.42 percent even on programs that already possess developer-written tests. A separate study finds that 55.76 to 76.92 percent of the relations are readily understandable by developers.

Core claim

MR-Scout discovers MR-encoded test cases in existing OSS test suites, synthesizes the embedded relations into codified parameterized methods, discards low-quality ones, and shows that the retained relations can be applied to other programs sharing similar functionality to produce new tests that measurably raise line coverage and mutation scores.
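To make "codified parameterized method" concrete, here is a minimal sketch. It assumes the push/pop relation that the paper illustrates with a stack test, and it uses `java.util.ArrayDeque` as a stand-in for the class under test. The class and method names (`PushPopMrSketch`, `pushThenPopReturnsX`, `countSatisfied`) are hypothetical and are not MR-Scout output.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical illustration; names are not taken from MR-Scout's output.
class PushPopMrSketch {

    // Developer-style test with a hard-coded input: the relation is checked only for 3.
    static void pushPopTest() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(3);
        if (stack.pop() != 3) {
            throw new AssertionError("pop did not return the pushed element");
        }
    }

    // Codified MR: the same relation with the hard-coded values lifted into parameters.
    // IF x is pushed onto the stack, THEN the next pop must return x.
    static void pushThenPopReturnsX(Deque<Integer> stack, Integer x) {
        stack.push(x);
        Integer popped = stack.pop();
        if (!x.equals(popped)) {
            throw new AssertionError("pushed " + x + " but popped " + popped);
        }
    }

    // Drives the codified MR with many inputs and returns how many satisfied it.
    static int countSatisfied(List<Integer> inputs) {
        int satisfied = 0;
        for (Integer x : inputs) {
            // Arbitrary prior state, chosen only for illustration.
            Deque<Integer> stack = new ArrayDeque<>(Arrays.asList(7, 8));
            try {
                pushThenPopReturnsX(stack, x);
                satisfied++;
            } catch (AssertionError violated) {
                // Relation violated for this input; not counted.
            }
        }
        return satisfied;
    }

    public static void main(String[] args) {
        pushPopTest();
        System.out.println(countSatisfied(Arrays.asList(-1, 0, 42)));
    }
}
```

Once the inputs are parameters, any input generator can feed the method, which is what lets one relation act as an oracle for many tests.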

What carries the argument

The pipeline that identifies MR-encoded test cases (MTCs), synthesizes them into codified MRs, and filters them according to their effectiveness at generating new test cases.
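The filtering step can be pictured with a small sketch. It assumes that a codified MR signals a violation by throwing `AssertionError` and that an MR is kept when its alarm rate over valid inputs stays within a tolerance. The tolerance `maxAlarmRate`, the class name, and the two toy integer relations are all hypothetical; the paper's actual filtering criterion is not reproduced here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of a quality filter; the threshold and names are assumptions.
class MrQualityFilterSketch {

    // Fraction of valid inputs on which the codified MR raises an AssertionError.
    static <T> double alarmRate(Consumer<T> codifiedMr, List<T> validInputs) {
        if (validInputs.isEmpty()) {
            throw new IllegalArgumentException("need at least one valid input");
        }
        int alarms = 0;
        for (T input : validInputs) {
            try {
                codifiedMr.accept(input);
            } catch (AssertionError alarm) {
                alarms++;
            }
        }
        return (double) alarms / validInputs.size();
    }

    // Keep an MR only if its alarm rate does not exceed the assumed tolerance.
    static <T> boolean keep(Consumer<T> codifiedMr, List<T> validInputs, double maxAlarmRate) {
        return alarmRate(codifiedMr, validInputs) <= maxAlarmRate;
    }

    // Toy relation that holds for every int: abs(x) == abs(-x).
    static void absIsSymmetric(Integer x) {
        if (Math.abs(x) != Math.abs(-x)) {
            throw new AssertionError("abs not symmetric for " + x);
        }
    }

    // Toy over-general relation: 2 * x > x fails for x <= 0.
    static void doublingIncreases(Integer x) {
        if (!(2 * x > x)) {
            throw new AssertionError("doubling did not increase " + x);
        }
    }

    public static void main(String[] args) {
        List<Integer> inputs = Arrays.asList(-2, 0, 1, 5);
        System.out.println(keep(MrQualityFilterSketch::absIsSymmetric, inputs, 0.0));
        System.out.println(keep(MrQualityFilterSketch::doublingIncreases, inputs, 0.0));
    }
}
```

An alarm on a valid input means either the program is faulty or the relation is too general, so a filter of this kind can only be trusted when the program it runs against is assumed correct.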

If this is right

Codified MRs extracted from one program can be reused to test other programs with overlapping functionality.
Tests generated from the codified MRs raise line coverage by 13.52 percent and mutation score by 9.42 percent on programs that already have developer tests.
Over 97 percent of the synthesized relations pass quality checks for automated test generation.
Between 55.76 and 76.92 percent of the codified MRs are considered easily comprehensible by developers.
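The reuse claim in the first point can be sketched as follows. The example assumes the two "programs" are `java.util.ArrayDeque` and `java.util.LinkedList`, which both implement `Deque`; the class name and helper methods are hypothetical and only illustrate applying one codified relation to several implementations.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.LinkedList;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: one codified relation checked against several implementations.
class MrReuseSketch {

    // Codified MR: IF x is pushed, THEN the next pop returns x.
    static boolean pushPopHolds(Deque<String> stack, String x) {
        stack.push(x);
        return x.equals(stack.pop());
    }

    // Runs the same relation on a fresh instance of every target implementation.
    static Map<String, Boolean> checkAll(Map<String, Supplier<Deque<String>>> targets, String x) {
        Map<String, Boolean> verdicts = new LinkedHashMap<>();
        for (Map.Entry<String, Supplier<Deque<String>>> target : targets.entrySet()) {
            verdicts.put(target.getKey(), pushPopHolds(target.getValue().get(), x));
        }
        return verdicts;
    }

    // Two standard-library classes standing in for programs with overlapping functionality.
    static Map<String, Supplier<Deque<String>>> defaultTargets() {
        Map<String, Supplier<Deque<String>>> targets = new LinkedHashMap<>();
        targets.put("ArrayDeque", ArrayDeque::new);
        targets.put("LinkedList", LinkedList::new);
        return targets;
    }

    public static void main(String[] args) {
        System.out.println(checkAll(defaultTargets(), "a"));
    }
}
```

Here the shared `Deque` interface makes the transfer trivial; programs without a common interface would need some mapping between their methods before the relation could be applied.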

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

A shared repository of such codified relations could serve as a starting library for metamorphic testing across many projects.
The technique might surface implementation differences between programs that claim the same functionality.
The same mining approach could be tried on other forms of implicit test knowledge beyond metamorphic relations.

Load-bearing premise

Metamorphic relations discovered in test cases written for one program can be safely transferred to other programs that merely share similar functionality without introducing false positives or missing domain-specific constraints.

What would settle it

A case in which a codified MR, when used to generate tests for a similar program, either accepts an implementation that violates the intended relation or rejects a correct implementation.
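One shape such a case could take is sketched below. `BoundedStack` is a hypothetical target whose specification is assumed to say that `push` is silently ignored once the stack is full; it is not taken from the paper or from any real project. The transferred push/pop relation then rejects an implementation that is correct with respect to its own specification.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical counterexample; BoundedStack and its specification are assumptions.
class TransferCounterexampleSketch {

    // A bounded stack that, by its own (assumed) specification,
    // silently ignores push() once it is full.
    static final class BoundedStack {
        private final Deque<Integer> items = new ArrayDeque<>();
        private final int capacity;

        BoundedStack(int capacity) {
            this.capacity = capacity;
        }

        void push(int x) {
            if (items.size() < capacity) {
                items.push(x);
            }
        }

        int pop() {
            return items.pop();
        }
    }

    // Transferred MR: IF x is pushed, THEN the next pop returns x.
    static boolean pushPopHolds(BoundedStack stack, int x) {
        stack.push(x);
        return stack.pop() == x;
    }

    public static void main(String[] args) {
        BoundedStack roomy = new BoundedStack(2);
        roomy.push(1);
        // The relation holds while there is room for the new element.
        System.out.println(pushPopHolds(roomy, 9));

        BoundedStack full = new BoundedStack(1);
        full.push(1);
        // The push is ignored, so the relation raises a false alarm.
        System.out.println(pushPopHolds(full, 9));
    }
}
```

The relation is only valid for this target under the extra precondition that the stack is not full, which is exactly the kind of domain-specific constraint the premise above worries about.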

Figures

Figures reproduced from arXiv: 2304.07548 by Congying Xu, Hengcheng Zhu, Jiarong Wu, Shing-Chi Cheung, Valerio Terragni.

**Figure 1.** A test case crafted from com.itextpdf.layout.renderer.TextRendererTest in project iText. Underlying MR: IF text2 = text1.setBold() THEN text1.width() ≤ text2.width(). Excerpt from the surrounding text: An advantage of MT is that an MR can serve as an oracle that is applicable to many test inputs. …

**Figure 2.** A test case crafted from com.conversantmedia.util.concurrent.ConcurrentStackTest in project Disruptor. Underlying MR: x = stack.push(x).pop(), i.e., IF an element x is pushed onto a stack and the stack subsequently pops off the top element, THEN the element x should be the one popped.

**Figure 4.** Overview of MR-Scout. Excerpt from the surrounding text: … ⟨x1, x2⟩ have the relation x2.receObj = push(x1) (R_i), THEN the output relation pop(x2) = x1.arg (R_o) is expected to be satisfied. In this test case, x1.receObj and x1.arg are implemented with stack1 and 3, and the invocation push(x1) is implemented as stack1.push(3). …

**Figure 5.** Procedure of MR-Scout operating on the MTC simulateWidth().

**Figure 6.** Illustration of constructing a codified MR.

**Figure 7.** Distribution of 11,350 MR-encoded test cases (MTCs) in 701 projects w.r.t. the number and percentage. Extracted panel labels: (a) Size of involved MI (|MI|): |MI|=2 64.13%, |MI|>2 35.87%; (b) Existence of IT when |MI|=2: w/ IT 27.80%, w/o IT 72.20%.

**Figure 8.** Distribution of 21,574 MR instances w.r.t. the size of involved method invocations (|MI|) and the existence of an input transformation (IT). Excerpt from the surrounding text: The distribution of 11,350 MTCs in the 701 projects varies significantly, ranging from a single MTC to 500 MTCs. …

**Figure 10.** Distribution of generated valid inputs. Excerpt from the surrounding text (§4.3.2, Result): Out of 71 MR-Scout output MRs, we found that 97.18% (69) of MRs are high-quality and even applicable to all valid inputs. Two codified MRs are low-quality. 16 (out of 24) valid inputs of the two codified MRs result in AssertionError alarms. After manually analyzing, we found that the 2 codified MRs are indeed of low quality. …

**Figure 11.** Enhancement of test adequacy by codified-MR-based test suites […]

**Figure 12.** Comparison of covered and killed mutants by developer-written […]

**Figure 13.** Comprehensibility scores of 52 MR-Scout synthesized MRs (Score: 1. very difficult, 2. difficult, 3. easy, 4. very easy to understand). Excerpt from the surrounding text (§5.1, Threats to Validity): We have identified potential threats to the validity of our experiments and have taken measures to mitigate them. Subjectivity in Human Judgment. The evaluation of precision (RQ1) and comprehensibility (RQ4) depends on human judgment. …

Abstract

Metamorphic Testing (MT) alleviates the oracle problem by defining oracles based on metamorphic relations (MRs) that govern multiple related inputs and their outputs. However, designing MRs is challenging, as it requires domain-specific knowledge. This hinders the widespread adoption of MT. We observe that developer-written test cases can embed domain knowledge that encodes MRs. Such encoded MRs could be synthesized for testing not only their original programs but also other programs that share similar functionalities. In this paper, we propose MR-Scout to automatically synthesize MRs from test cases in open-source software (OSS) projects. MR-Scout first discovers MR-encoded test cases (MTCs), then synthesizes the encoded MRs into parameterized methods (called codified MRs), and finally filters out MRs that demonstrate poor quality for new test case generation. MR-Scout discovered over 11,000 MTCs from 701 OSS projects. Experimental results show that over 97% of codified MRs are of high quality for automated test case generation, demonstrating the practical applicability of MR-Scout. Furthermore, codified-MR-based tests effectively enhance the test adequacy of programs with developer-written tests, leading to 13.52% and 9.42% increases in line coverage and mutation score, respectively. Our qualitative study shows that 55.76% to 76.92% of codified MRs are easily comprehensible for developers.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

MR-Scout mines test cases from 701 projects to extract and codify metamorphic relations as reusable generators, with reported lifts in coverage and mutation score. The pipeline first identifies test cases that already encode MRs, turns them into parameterized methods, and drops the ones that perform poorly when used to generate fresh tests. They surface over 11,000 such cases and show that more than 97 percent of the codified versions pass the quality filter. When the survivors are applied to programs that already have developer tests, line coverage rises 13.52 percent and mutation score rises 9.42 percent. A qualitative check finds that 55.76 to 76.92 percent of them are easy for developers to understand. That combination of scale and measured improvement is the concrete contribution. The work turns an observation about embedded domain knowledge into something that can be run at volume and evaluated directly.

The softer spot is the transfer step. The central results come from applying the extracted relations to new programs that share similar functionality, yet the evaluation judges success by the same coverage and mutation numbers used in the filter. There is no separate, independent check shown in the abstract that the relation actually holds on the target implementation rather than simply producing tests that improve the metrics. If domain differences cause the relation to be violated, the generated oracles could be invalid even while the numbers look better. The full paper needs to make the quality metric and the target-program selection explicit so readers can judge how much that risk was measured.

This is for people working on metamorphic testing or automated test generation. Anyone already using or building MRs would find the mined examples and the synthesis approach useful. The empirical scale and the direct measurements are solid enough to send to peer review; the transferability question is worth a close look in revision but does not sink the core claim.

Referee Report

3 major / 2 minor

Summary. The paper presents MR-Scout, a technique to mine metamorphic relations (MRs) encoded in existing developer-written test cases from 701 OSS projects (yielding >11,000 MR-encoded test cases). It codifies these into parameterized methods, applies a quality filter for new test generation, and reports that >97% of the resulting codified MRs are high-quality; tests derived from them improve line coverage by 13.52% and mutation score by 9.42% on programs that already have developer tests. A qualitative study finds 55.76–76.92% of the MRs comprehensible to developers.

Significance. If the transferability and quality claims hold under independent validation, the work offers a scalable, artifact-driven route to MR acquisition that could materially increase adoption of metamorphic testing. The scale of the mining study and the inclusion of a developer-comprehensibility assessment are concrete strengths that distinguish it from purely synthetic MR generators.

major comments (3)

[Abstract and §5, evaluation] The 97% 'high-quality' figure, the 13.52% coverage gain, and the 9.42% mutation-score gain are presented without an explicit, independent oracle or validity check that the synthesized MR actually holds on the target programs rather than merely producing additional passing tests; coverage and mutation metrics alone cannot distinguish a sound MR from one that silently accepts incorrect behavior on the new implementation.
[§4.2, transfer step] The criterion used to decide that a target program 'shares similar functionality' with the source of an MTC is not formalized, so it is impossible to assess whether domain-specific constraints present in the original tests but absent from the target are being violated by the transferred MR.
[§5.1, experimental design] The paper does not describe the baseline MR generators, the statistical tests applied to the coverage/mutation deltas, or the sampling procedure for the programs used in the transfer experiment; without these details the quantitative claims cannot be reproduced or compared.

minor comments (2)

[§3] The definition of 'codified MR' (parameterized method) should be accompanied by a small illustrative example in §3 so readers can see the exact syntactic form that is later filtered and reused.
[§5] Table or figure captions in the evaluation section should explicitly state the number of programs, number of MRs, and number of generated tests underlying each reported percentage.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight important aspects of evaluation validity, transfer criteria, and experimental reproducibility. We address each major comment below and indicate planned revisions.

Referee: [Abstract and §5, evaluation] The 97% 'high-quality' figure, the 13.52% coverage gain, and the 9.42% mutation-score gain are presented without an explicit, independent oracle or validity check that the synthesized MR actually holds on the target programs rather than merely producing additional passing tests; coverage and mutation metrics alone cannot distinguish a sound MR from one that silently accepts incorrect behavior on the new implementation.

Authors: We acknowledge the distinction between utility (measured via coverage/mutation gains) and semantic soundness of transferred MRs. Our quality filter verifies that codified MRs generate passing tests on source programs, and gains are observed on targets with similar functionality. However, we agree these metrics do not independently confirm the MR holds for the target. We will revise §5 to explicitly define the quality criteria, clarify that coverage/mutation serve as proxies for utility rather than soundness, and add a limitations discussion with suggestions for future oracle-based validation.
Revision: partial.

Referee: [§4.2, transfer step] The criterion used to decide that a target program 'shares similar functionality' with the source of an MTC is not formalized, so it is impossible to assess whether domain-specific constraints present in the original tests but absent from the target are being violated by the transferred MR.

Authors: The transfer relies on a heuristic matching of method signatures (names and parameter types) between source and target. We agree this is not formally defined, which limits assessment of constraint preservation. We will revise §4.2 to formalize the similarity criterion, state its assumptions explicitly, and discuss potential risks regarding domain-specific constraints.
Revision: yes.

Referee: [§5.1, experimental design] The paper does not describe the baseline MR generators, the statistical tests applied to the coverage/mutation deltas, or the sampling procedure for the programs used in the transfer experiment; without these details the quantitative claims cannot be reproduced or compared.

Authors: These details were inadvertently omitted. We will revise §5.1 to describe the baseline MR generators, the statistical tests used for the deltas, and the sampling procedure for the transfer experiment programs, enabling reproducibility and comparison.
Revision: yes.
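The second response describes the similarity criterion only as a heuristic matching of method names and parameter types. One possible reading of that heuristic is sketched below with Java reflection. The `Sig` representation, the class name, and the choice of `ArrayDeque` and `ArrayList` as candidate targets are assumptions made for illustration, not details confirmed by the paper.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of name-and-parameter-type matching; not the paper's implementation.
class SignatureMatchSketch {

    // A method signature as a name plus parameter types (assumed representation).
    static final class Sig {
        final String name;
        final Class<?>[] paramTypes;

        Sig(String name, Class<?>... paramTypes) {
            this.name = name;
            this.paramTypes = paramTypes;
        }
    }

    // True if the target class exposes a public method for every required signature.
    static boolean matches(Class<?> target, List<Sig> required) {
        for (Sig sig : required) {
            try {
                target.getMethod(sig.name, sig.paramTypes);
            } catch (NoSuchMethodException missing) {
                return false;
            }
        }
        return true;
    }

    // Signatures invoked by a push/pop relation; Object.class because generics are erased.
    static List<Sig> pushPopSignatures() {
        return Arrays.asList(new Sig("push", Object.class), new Sig("pop"));
    }

    public static void main(String[] args) {
        System.out.println(matches(ArrayDeque.class, pushPopSignatures()));
        System.out.println(matches(ArrayList.class, pushPopSignatures()));
    }
}
```

A match on names and types says nothing about behaviour: two classes can share signatures and still differ in semantics, which is why the referee asks for the criterion and its assumptions to be stated explicitly.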

Circularity Check

0 steps flagged

No significant circularity; empirical measurements are independent of the synthesis method

The paper describes an empirical pipeline that extracts MTCs from OSS test suites, codifies MRs, filters them by a quality check for new test generation, and then measures line coverage and mutation score improvements on target programs. These percentages (97% high-quality, +13.52% coverage, +9.42% mutation) are presented as direct experimental outcomes rather than quantities defined in terms of the MR-Scout algorithm itself. No equations, fitted parameters, or self-citation chains are invoked to derive the central results; the evaluation relies on external program executions and standard coverage/mutation tools. The transferability claim is an empirical observation, not a self-referential definition.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review yields limited visibility into modeling choices. The approach implicitly assumes that test cases contain extractable MRs and that a quality filter can be defined without circular dependence on the target programs. No explicit free parameters, new entities, or non-standard axioms are stated.

axioms (1)

Domain assumption: Developer-written test cases encode domain-specific metamorphic relations that can be extracted and reused across programs with similar functionality.
Stated in the second sentence of the abstract as the foundational observation.

Reference graph

Works this paper leans on

61 extracted references · 61 canonical work pages

[1]

John Ahlgren, Maria Eugenia Berezin, Kinga Bojarczuk, Elena Dulskyte, Inna Dvortsova, Johann George, Natalija Gucevska, Mark Harman, Maria Lomeli, Erik Meijer, Silvia Sapora, and Justin Spahr-Summers. 2021. Testing Web Enabled Simulation at Scale Using Metamorphic Testing. In 43rd IEEE/ACM International Conference on Software Engineering: Software Enginee...

work page doi:10.1109/icse-seip52600.2021.00023 2021
[2]

Leonhard Applis, Annibale Panichella, and Arie van Deursen. 2021. Assessing Robustness of ML-Based Program Analysis Tools using Metamorphic Program Transformations. In36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021, Melbourne, Australia, November 15-19, 2021 . IEEE, 1377–1381. https://doi.org/10.1109/ ASE51524.2021.9678706

work page arXiv 2021
[3]

Andrea Arcuri and Lionel C. Briand. 2014. A Hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. Softw. Test. Verification Reliab. 24, 3 (2014), 219–250. https://doi.org/10.1002/STVR.1486

work page doi:10.1002/stvr.1486 2014
[4]

Jon Ayerdi, Valerio Terragni, Aitor Arrieta, Paolo Tonella, Goiuria Sagardui, and Maite Arratibel. 2021. Generating metamorphic relations for cyber-physical systems with genetic programming: an industrial case study. InESEC/FSE ’21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Gr...

work page doi:10.1145/3468264.3473920 2021
[5]

Jon Ayerdi, Valerio Terragni, Aitor Arrieta, Paolo Tonella, Goiuria Sagardui, and Maite Arratibel. 2022. Evolutionary generation of metamorphic relations for cyber-physical systems. In GECCO ’22: Genetic and Evolutionary Computation Conference, Companion Volume, Boston, Massachusetts, USA, July 9 - 13, 2022 , Jonathan E. Fieldsend and Markus Wagner (Eds.)...

work page doi:10.1145/3520304.3534077 2022
[6]

Ernst, Mauro Pezzè, and Antonio Carzaniga

Arianna Blasi, Alessandra Gorla, Michael D. Ernst, Mauro Pezzè, and Antonio Carzaniga. 2021. MeMo: Automatically identifying metamorphic relations in Javadoc comments for test automation. J. Syst. Softw. 181 (2021), 111041. https://doi.org/10.1016/j.jss.2021.111041

work page doi:10.1016/j.jss.2021.111041 2021
[7]

Hudson Borges and Marco Túlio Valente. 2018. What’s in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform. J. Syst. Softw. 146 (2018), 112–129. https://doi.org/10.1016/j.jss.2018.09.016

work page doi:10.1016/j.jss.2018.09.016 2018
[8]

Cristian Cadar and Koushik Sen. 2013. Symbolic execution for software testing: three decades later. Commun. ACM 56, 2 (2013), 82–90. https://doi.org/10.1145/2408776.2408795

work page doi:10.1145/2408776.2408795 2013
[9]

Jialun Cao, Meiziniu Li, Yeting Li, Ming Wen, Shing-Chi Cheung, and Haiming Chen. 2022. SemMT: A Semantic-Based Testing Approach for Machine Translation Systems. ACM Trans. Softw. Eng. Methodol. 31, 2 (2022), 34e:1–34e:36. https://doi.org/10.1145/3490488

work page doi:10.1145/3490488 2022
[10]

T. Y. Chen, S. C. Cheung, and S. M. Yiu. 1998. Metamorphic Testing: A New Approach for Generating Next Test Cases . Technical Report. Technical Report HKUST-CS98-01, Department of Computer Science, The Hong Kong University of Science and Technology

work page 1998
[11]

Tsong Yueh Chen, Fei-Ching Kuo, Huai Liu, Pak-Lok Poon, Dave Towey, T. H. Tse, and Zhi Quan Zhou. 2018. Metamorphic Testing: A Review of Challenges and Opportunities. ACM Comput. Surv. 51, 1 (2018), 4:1–4:27. https: //doi.org/10.1145/3143561 ACM Trans. Softw. Eng. Methodol., Vol. 1, No. 1, Article 1. Publication date: March 2024. 1:26 Congying Xu, Valerio...

work page doi:10.1145/3143561 2018
[12]

Tsong Yueh Chen, Pak-Lok Poon, and Xiaoyuan Xie. 2016. METRIC: METamorphic Relation Identification based on the Category-choice framework. J. Syst. Softw. 116 (2016), 177–190. https://doi.org/10.1016/j.jss.2015.07.037

work page doi:10.1016/j.jss.2015.07.037 2016
[13]

Valle-Gómez, Inmaculada Medina-Bulo, and José Raúl Romero

Pedro Delgado-Pérez, Aurora Ramírez, Kevin J. Valle-Gómez, Inmaculada Medina-Bulo, and José Raúl Romero. 2023. InterEvo-TR: Interactive Evolutionary Test Generation With Readability Assessment. IEEE Trans. Software Eng. 49, 4 (2023), 2580–2596. https://doi.org/10.1109/TSE.2022.3227418

work page doi:10.1109/tse.2022.3227418 2023
[14]

Donaldson

Alastair F. Donaldson. 2019. Metamorphic testing of Android graphics drivers. In Proceedings of the 4th International Workshop on Metamorphic Testing, MET@ICSE 2019, Montreal, QC, Canada, May 26, 2019 , Xiaoyuan Xie, Pak-Lok Poon, and Laura L. Pullum (Eds.). IEEE / ACM, 1. https://doi.org/10.1109/MET.2019.00008

work page doi:10.1109/met.2019.00008 2019
[15]

Donaldson and Andrei Lascu

Alastair F. Donaldson and Andrei Lascu. 2016. Metamorphic testing for (graphics) compilers. In Proceedings of the 1st International Workshop on Metamorphic Testing, MET@ICSE 2016, Austin, Texas, USA, May 16, 2016 . ACM, 44–47. https://doi.org/10.1145/2896971.2896978

work page doi:10.1145/2896971.2896978 2016
[16]

EvoSuite. 2023. EvoSuite. Retrieved August 20, 2023 from https://www.evosuite.org/

work page 2023
[17]

Gordon Fraser and Andrea Arcuri. 2011. EvoSuite: automatic test suite generation for object-oriented software. In SIGSOFT/FSE’11 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19) and ESEC’11: 13th European Software Engineering Conference (ESEC-13), Szeged, Hungary, September 5-9, 2011 , Tibor Gyimóthy and Andreas Zeller (Eds.)...

work page doi:10.1145/2025113.2025179 2011
[18]

Gordon Fraser and Andrea Arcuri. 2013. EvoSuite: On the Challenges of Test Case Generation in the Real World. In Sixth IEEE International Conference on Software Testing, Verification and Validation, ICST 2013, Luxembourg, Luxembourg, March 18-22, 2013. IEEE Computer Society, 362–369. https://doi.org/10.1109/ICST.2013.51

work page doi:10.1109/icst.2013.51 2013
[19]

Gordon Fraser and Andrea Arcuri. 2013. Whole Test Suite Generation. IEEE Trans. Software Eng. 39, 2, 276–291. https://doi.org/10.1109/TSE.2012.14

work page doi:10.1109/tse.2012.14 2013
[20]

Gordon Fraser and Andreas Zeller. 2011. Generating parameterized unit tests. In Proceedings of the 20th International Symposium on Software Testing and Analysis, ISSTA 2011, Toronto, ON, Canada, July 17-21, 2011 , Matthew B. Dwyer and Frank Tip (Eds.). ACM, 364–374. https://doi.org/10.1145/2001420.2001464

work page doi:10.1145/2001420.2001464 2011
[21]

Alessio Gambi, Gunel Jahangirova, Vincenzo Riccio, and Fiorella Zampetti. 2022. SBST Tool Competition 2022. In 15th IEEE/ACM International Workshop on Search-Based Software Testing, SBST@ICSE 2022, Pittsburgh, PA, USA, May 9, 2022 . IEEE, 25–32. https://doi.org/10.1145/3526072.3527538

work page doi:10.1145/3526072.3527538 2022
[22]

GitHub. 2023. GitHub. Retrieved August 20, 2023 from https://github.com/

work page 2023
[23]

Grammarly. 2023. Grammarly. Retrieved August 20, 2023 from http://grammarly.com

work page 2023
[24]

Mark Harman, Yue Jia, and Yuanyuan Zhang. 2015. Achievements, Open Problems and Challenges for Search Based Software Testing. In 8th IEEE International Conference on Software Testing, Verification and Validation, ICST 2015, Graz, Austria, April 13-17, 2015. IEEE Computer Society, 1–12. https://doi.org/10.1109/ICST.2015.7102580

work page doi:10.1109/icst.2015.7102580 2015
[25]

N Alan Heckert, James J Filliben, C M Croarkin, B Hembree, William F Guthrie, P Tobias, and J Prinz. 2002. Handbook 151: Nist/sematech e-handbook of statistical methods. (2002)

work page 2002
[26]

Kaifeng Huang, Bihuan Chen, Congying Xu, Ying Wang, Bowen Shi, Xin Peng, Yijian Wu, and Yang Liu. 2022. Characterizing usages, updates and risks of third-party libraries in Java projects. Empir. Softw. Eng. 27, 4 (2022), 90. https://doi.org/10.1007/s10664-022-10131-8

work page doi:10.1007/s10664-022-10131-8 2022
[27]

Gunel Jahangirova, David Clark, Mark Harman, and Paolo Tonella. 2016. Test oracle assessment and improvement. In Proceedings of the 25th International Symposium on Software Testing and Analysis, ISSTA 2016, Saarbrücken, Germany, July 18-20, 2016, Andreas Zeller and Abhik Roychoudhury (Eds.). ACM, 247–258. https://doi.org/10.1145/2931037.2931062

work page doi:10.1145/2931037.2931062 2016
[28]

Junit. 2023. Junit4. Retrieved August 20, 2023 from https://junit.org/junit4/javadoc/4.13/org/junit/Assert.html

work page 2023
[29]

Junit. 2023. Junit5. Retrieved August 20, 2023 from https://junit.org/junit5/

work page 2023
[30]

Junit. 2023. Junit5 Assertions. Retrieved August 20, 2023 from https://junit.org/junit5/docs/5.0.3/api/org/junit/jupiter/ api/Assertions.html

work page 2023
[31]

Alexander Kampmann and Andreas Zeller. 2019. Carving parameterized unit tests. InProceedings of the 41st International Conference on Software Engineering: Companion Proceedings, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019, Joanne M. Atlee, Tevfik Bultan, and Jon Whittle (Eds.). IEEE / ACM, 248–249. https://doi.org/10.1109/ICSE-COMPANION.2019. 00098

work page doi:10.1109/icse-companion.2019 2019
[32]

Upulee Kanewala and James M. Bieman. 2013. Using machine learning techniques to detect metamorphic relations for programs without test oracles. In IEEE 24th International Symposium on Software Reliability Engineering, ISSRE 2013, Pasadena, CA, USA, November 4-7, 2013 . IEEE Computer Society, 1–10. https://doi.org/10.1109/ISSRE.2013.6698899

work page doi:10.1109/issre.2013.6698899 2013
[33]

Bieman, and Asa Ben-Hur

Upulee Kanewala, James M. Bieman, and Asa Ben-Hur. 2016. Predicting metamorphic relations for testing scientific software: a machine learning approach using graph kernels. Softw. Test. Verification Reliab. 26, 3 (2016), 245–269. https://doi.org/10.1002/stvr.1594

work page doi:10.1002/stvr.1594 2016
[34]

Yun Lin, You Sheng Ong, Jun Sun, Gordon Fraser, and Jin Song Dong. 2021. Graph-based seed object synthesis for search-based unit testing. In ESEC/FSE ’21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, August 23-28, 2021 , Diomidis Spinellis, Georgios Gousios, ACM Trans. So...

work page doi:10.1145/3468264.3468619 2021
[35]

Porter, Gudjon Magnusson, and Christoph Schulze

Mikael Lindvall, Adam A. Porter, Gudjon Magnusson, and Christoph Schulze. 2017. Metamorphic Model-Based Testing of Autonomous Systems. In 2nd IEEE/ACM International Workshop on Metamorphic Testing, MET@ICSE 2017, Buenos Aires, Argentina, May 22, 2017 . IEEE Computer Society, 35–41. https://doi.org/10.1109/MET.2017.6

work page doi:10.1109/met.2017.6 2017
[36]

Haoyang Ma, Qingchao Shen, Yongqiang Tian, Junjie Chen, and Shing-Chi Cheung. 2023. Fuzzing Deep Learning Compilers with HirGen. , 248–260 pages. https://doi.org/10.1145/3597926.3598053

work page doi:10.1145/3597926.3598053 2023
[37]

Pingchuan Ma, Shuai Wang, and Jin Liu. 2020. Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020 , Christian Bessiere (Ed.). ijcai.org, 458–465. https://doi.org/10.24963/ijcai.2020/64

work page doi:10.24963/ijcai.2020/64 2020
[38]

OpenAI. 2023. ChatGPT. Retrieved August 20, 2023 from https://openai.com/blog/chatgpt

work page 2023
[39]

Oracle. 2023. Java Language Specification. Retrieved August 20, 2023 from https://docs.oracle.com/javase/specs/

work page 2023
[40]

Carlos Pacheco and Michael D. Ernst. 2007. Randoop: feedback-directed random testing for Java. In Companion to the 22nd Annual ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages, and Applications, OOPSLA 2007, October 21-25, 2007, Montreal, Quebec, Canada , Richard P. Gabriel, David F. Bacon, Cristina Videira Lopes, and Guy L. Steel...

work page doi:10.1145/1297846.1297902 2007
[41]

Matteo Paltenghi and Michael Pradel. 2023. MorphQ: Metamorphic Testing of the Qiskit Quantum Computing Platform. In 45th IEEE/ACM International Conference on Software Engineering, ICSE 2023, Melbourne, Australia, May 14-20, 2023 . IEEE, 2413–2424. https://doi.org/10.1109/ICSE48619.2023.00202

work page doi:10.1109/icse48619.2023.00202 2023
[42]

PITest. 2023. PITest. Retrieved August 20, 2023 from https://pitest.org/

work page 2023
[43]

Kun Qiu, Zheng Zheng, Tsong Yueh Chen, and Pak-Lok Poon. 2022. Theoretical and Empirical Analyses of the Effectiveness of Metamorphic Relation Composition. IEEE Trans. Software Eng. 48, 3 (2022), 1001–1017. https: //doi.org/10.1109/TSE.2020.3009698

work page doi:10.1109/tse.2020.3009698 2022
[44]

John A Rice. 2006. Mathematical statistics and data analysis . Cengage Learning

work page 2006
[45]

Sergio Segura, Amador Durán, Javier Troya, and Antonio Ruiz Cortés. 2017. A Template-Based Approach to Describing Metamorphic Relations. In 2nd IEEE/ACM International Workshop on Metamorphic Testing, MET@ICSE 2017, Buenos Aires, Argentina, May 22, 2017 . IEEE Computer Society, 3–9. https://doi.org/10.1109/MET.2017.3

work page doi:10.1109/met.2017.3 2017
[46]

Sergio Segura, Gordon Fraser, Ana Belén Sánchez, and Antonio Ruiz Cortés. 2016. A Survey on Metamorphic Testing. IEEE Trans. Software Eng. 42, 9 (2016), 805–824. https://doi.org/10.1109/TSE.2016.2532875

work page doi:10.1109/tse.2016.2532875 2016
[47]

Sergio Segura, José Antonio Parejo, Javier Troya, and Antonio Ruiz Cortés. 2018. Metamorphic testing of RESTful web APIs. In Proceedings of the 40th International Conference on Software Engineering, ICSE 2018, Gothenburg, Sweden, May 27 - June 03, 2018 , Michel Chaudron, Ivica Crnkovic, Marsha Chechik, and Mark Harman (Eds.). ACM, 882. https://doi.org/10....

work page doi:10.1145/3180155.3182528 2018
[48]

Chang-Ai Sun, An Fu, Pak-Lok Poon, Xiaoyuan Xie, Huai Liu, and Tsong Yueh Chen. 2021. METRIC$^{+}$: A Metamorphic Relation Identification Technique Based on Input Plus Output Domains. IEEE Trans. Software Eng. 47, 9 (2021), 1764–1785. https://doi.org/10.1109/TSE.2019.2934848

[49]

Chang-Ai Sun, Yiqiang Liu, Zuoyi Wang, and W. K. Chan. 2016. 𝜇MT: a data mutation directed metamorphic relation acquisition methodology. In Proceedings of the 1st International Workshop on Metamorphic Testing, MET@ICSE 2016, Austin, Texas, USA, May 16, 2016. ACM, 12–18. https://doi.org/10.1145/2896971.2896974

[50]

Valerio Terragni, Gunel Jahangirova, Paolo Tonella, and Mauro Pezzè. 2020. Evolutionary improvement of assertion oracles. In ESEC/FSE ’20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Virtual Event, USA, November 8-13, 2020, Prem Devanbu, Myra B. Cohen, and Thomas Zimmermann (Eds.). ACM...

work page doi:10.1145/3368089.3409758 2020
[51]

TestNG. 2023. TestNG. Retrieved August 20, 2023 from https://testng.org/doc/

[52]

Suresh Thummalapenta, Madhuri R. Marri, Tao Xie, Nikolai Tillmann, and Jonathan de Halleux. 2011. Retrofitting Unit Tests for Parameterized Unit Testing. In Fundamental Approaches to Software Engineering - 14th International Conference, FASE 2011, Held as Part of the Joint European Conferences on Theory and Practice of Software, ETAPS 2011, Saarbrücken, G...

work page doi:10.1007/978-3-642-19811-3_21 2011
[53]

Yongqiang Tian, Shiqing Ma, Ming Wen, Yepang Liu, Shing-Chi Cheung, and Xiangyu Zhang. 2021. To what extent do DNN-based image classification models make unreliable inferences? Empir. Softw. Eng. 26, 4 (2021), 84. https://doi.org/10.1007/s10664-021-09985-1

[54]

MR-Scout. 2023. MR-Scout. Retrieved August 20, 2023 from https://mr-scout.github.io

[55]

Shuai Wang and Zhendong Su. 2020. Metamorphic Object Insertion for Testing Object Detection Systems. (2020), 1053–1065. https://doi.org/10.1145/3324884.3416584

[56]

Ying Wang, Bihuan Chen, Kaifeng Huang, Bowen Shi, Congying Xu, Xin Peng, Yijian Wu, and Yang Liu. 2020. An Empirical Study of Usages, Updates and Risks of Third-Party Libraries in Java Projects. In IEEE International Conference on Software Maintenance and Evolution, ICSME 2020, Adelaide, Australia, September 28 - October 2, 2020. IEEE, 35–45. ACM Trans. S...

work page doi:10.1109/icsme46990.2020.00014 2020
[57]

Dongwei Xiao, Zhibo Liu, Yuanyuan Yuan, Qi Pang, and Shuai Wang. 2022. Metamorphic Testing of Deep Learning Compilers. Proc. ACM Meas. Anal. Comput. Syst. 6, 1 (2022), 15:1–15:28. https://doi.org/10.1145/3508035

[58]

Bo Zhang, Hongyu Zhang, Junjie Chen, Dan Hao, and Pablo Moscato. 2019. Automatic Discovery and Cleansing of Numerical Metamorphic Relations. In 2019 IEEE International Conference on Software Maintenance and Evolution, ICSME 2019, Cleveland, OH, USA, September 29 - October 4, 2019. IEEE, 235–245. https://doi.org/10.1109/ICSME.2019.00035

[59]

Jie Zhang, Junjie Chen, Dan Hao, Yingfei Xiong, Bing Xie, Lu Zhang, and Hong Mei. 2014. Search-based inference of polynomial metamorphic relations. In ACM/IEEE International Conference on Automated Software Engineering, ASE ’14, Vasteras, Sweden - September 15 - 19, 2014, Ivica Crnkovic, Marsha Chechik, and Paul Grünbacher (Eds.). ACM, 701–712. https://d...

work page doi:10.1145/2642937.2642994 2014
[60]

Zhi Quan Zhou, Liqun Sun, Tsong Yueh Chen, and Dave Towey. 2020. Metamorphic Relations for Enhancing System Understanding and Use. IEEE Trans. Software Eng. 46, 10 (2020), 1120–1154. https://doi.org/10.1109/TSE.2018.2876433

[61]

Hengcheng Zhu, Lili Wei, Ming Wen, Yepang Liu, Shing-Chi Cheung, Qin Sheng, and Cui Zhou. 2020. MockSniffer: Characterizing and Recommending Mocking Decisions for Unit Tests. In 35th IEEE/ACM International Conference on Automated Software Engineering, ASE 2020, Melbourne, Australia, September 21-25, 2020. IEEE, 436–447. https://doi.org/10.1145/3324884.3...

work page doi:10.1145/3324884.3416539 2020

[1]

John Ahlgren, Maria Eugenia Berezin, Kinga Bojarczuk, Elena Dulskyte, Inna Dvortsova, Johann George, Natalija Gucevska, Mark Harman, Maria Lomeli, Erik Meijer, Silvia Sapora, and Justin Spahr-Summers. 2021. Testing Web Enabled Simulation at Scale Using Metamorphic Testing. In 43rd IEEE/ACM International Conference on Software Engineering: Software Enginee...

work page doi:10.1109/icse-seip52600.2021.00023 2021

[2]

Leonhard Applis, Annibale Panichella, and Arie van Deursen. 2021. Assessing Robustness of ML-Based Program Analysis Tools using Metamorphic Program Transformations. In 36th IEEE/ACM International Conference on Automated Software Engineering, ASE 2021, Melbourne, Australia, November 15-19, 2021. IEEE, 1377–1381. https://doi.org/10.1109/ASE51524.2021.9678706

[3]

Andrea Arcuri and Lionel C. Briand. 2014. A Hitchhiker’s guide to statistical tests for assessing randomized algorithms in software engineering. Softw. Test. Verification Reliab. 24, 3 (2014), 219–250. https://doi.org/10.1002/STVR.1486

[4]

Jon Ayerdi, Valerio Terragni, Aitor Arrieta, Paolo Tonella, Goiuria Sagardui, and Maite Arratibel. 2021. Generating metamorphic relations for cyber-physical systems with genetic programming: an industrial case study. In ESEC/FSE ’21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Gr...

work page doi:10.1145/3468264.3473920 2021

[5]

Jon Ayerdi, Valerio Terragni, Aitor Arrieta, Paolo Tonella, Goiuria Sagardui, and Maite Arratibel. 2022. Evolutionary generation of metamorphic relations for cyber-physical systems. In GECCO ’22: Genetic and Evolutionary Computation Conference, Companion Volume, Boston, Massachusetts, USA, July 9 - 13, 2022, Jonathan E. Fieldsend and Markus Wagner (Eds.)...

work page doi:10.1145/3520304.3534077 2022

[6]

Arianna Blasi, Alessandra Gorla, Michael D. Ernst, Mauro Pezzè, and Antonio Carzaniga. 2021. MeMo: Automatically identifying metamorphic relations in Javadoc comments for test automation. J. Syst. Softw. 181 (2021), 111041. https://doi.org/10.1016/j.jss.2021.111041

[7]

Hudson Borges and Marco Túlio Valente. 2018. What’s in a GitHub Star? Understanding Repository Starring Practices in a Social Coding Platform. J. Syst. Softw. 146 (2018), 112–129. https://doi.org/10.1016/j.jss.2018.09.016

[8]

Cristian Cadar and Koushik Sen. 2013. Symbolic execution for software testing: three decades later. Commun. ACM 56, 2 (2013), 82–90. https://doi.org/10.1145/2408776.2408795

[9]

Jialun Cao, Meiziniu Li, Yeting Li, Ming Wen, Shing-Chi Cheung, and Haiming Chen. 2022. SemMT: A Semantic-Based Testing Approach for Machine Translation Systems. ACM Trans. Softw. Eng. Methodol. 31, 2 (2022), 34e:1–34e:36. https://doi.org/10.1145/3490488

[10]

T. Y. Chen, S. C. Cheung, and S. M. Yiu. 1998. Metamorphic Testing: A New Approach for Generating Next Test Cases. Technical Report HKUST-CS98-01, Department of Computer Science, The Hong Kong University of Science and Technology.

[11]

Tsong Yueh Chen, Fei-Ching Kuo, Huai Liu, Pak-Lok Poon, Dave Towey, T. H. Tse, and Zhi Quan Zhou. 2018. Metamorphic Testing: A Review of Challenges and Opportunities. ACM Comput. Surv. 51, 1 (2018), 4:1–4:27. https://doi.org/10.1145/3143561 ACM Trans. Softw. Eng. Methodol., Vol. 1, No. 1, Article 1. Publication date: March 2024. 1:26 Congying Xu, Valerio...

[12]

Tsong Yueh Chen, Pak-Lok Poon, and Xiaoyuan Xie. 2016. METRIC: METamorphic Relation Identification based on the Category-choice framework. J. Syst. Softw. 116 (2016), 177–190. https://doi.org/10.1016/j.jss.2015.07.037

[13]

Pedro Delgado-Pérez, Aurora Ramírez, Kevin J. Valle-Gómez, Inmaculada Medina-Bulo, and José Raúl Romero. 2023. InterEvo-TR: Interactive Evolutionary Test Generation With Readability Assessment. IEEE Trans. Software Eng. 49, 4 (2023), 2580–2596. https://doi.org/10.1109/TSE.2022.3227418

[14]

Alastair F. Donaldson. 2019. Metamorphic testing of Android graphics drivers. In Proceedings of the 4th International Workshop on Metamorphic Testing, MET@ICSE 2019, Montreal, QC, Canada, May 26, 2019, Xiaoyuan Xie, Pak-Lok Poon, and Laura L. Pullum (Eds.). IEEE / ACM, 1. https://doi.org/10.1109/MET.2019.00008

[15]

Alastair F. Donaldson and Andrei Lascu. 2016. Metamorphic testing for (graphics) compilers. In Proceedings of the 1st International Workshop on Metamorphic Testing, MET@ICSE 2016, Austin, Texas, USA, May 16, 2016. ACM, 44–47. https://doi.org/10.1145/2896971.2896978

[16]

EvoSuite. 2023. EvoSuite. Retrieved August 20, 2023 from https://www.evosuite.org/

[17]

Gordon Fraser and Andrea Arcuri. 2011. EvoSuite: automatic test suite generation for object-oriented software. In SIGSOFT/FSE’11 19th ACM SIGSOFT Symposium on the Foundations of Software Engineering (FSE-19) and ESEC’11: 13th European Software Engineering Conference (ESEC-13), Szeged, Hungary, September 5-9, 2011, Tibor Gyimóthy and Andreas Zeller (Eds.)...

work page doi:10.1145/2025113.2025179 2011

[18]

Gordon Fraser and Andrea Arcuri. 2013. EvoSuite: On the Challenges of Test Case Generation in the Real World. In Sixth IEEE International Conference on Software Testing, Verification and Validation, ICST 2013, Luxembourg, Luxembourg, March 18-22, 2013. IEEE Computer Society, 362–369. https://doi.org/10.1109/ICST.2013.51

[19]

Gordon Fraser and Andrea Arcuri. 2013. Whole Test Suite Generation. IEEE Trans. Software Eng. 39, 2, 276–291. https://doi.org/10.1109/TSE.2012.14

[20]

Gordon Fraser and Andreas Zeller. 2011. Generating parameterized unit tests. In Proceedings of the 20th International Symposium on Software Testing and Analysis, ISSTA 2011, Toronto, ON, Canada, July 17-21, 2011, Matthew B. Dwyer and Frank Tip (Eds.). ACM, 364–374. https://doi.org/10.1145/2001420.2001464

[21]

Alessio Gambi, Gunel Jahangirova, Vincenzo Riccio, and Fiorella Zampetti. 2022. SBST Tool Competition 2022. In 15th IEEE/ACM International Workshop on Search-Based Software Testing, SBST@ICSE 2022, Pittsburgh, PA, USA, May 9, 2022. IEEE, 25–32. https://doi.org/10.1145/3526072.3527538

[22]

GitHub. 2023. GitHub. Retrieved August 20, 2023 from https://github.com/

[23]

Grammarly. 2023. Grammarly. Retrieved August 20, 2023 from http://grammarly.com

[24]

Mark Harman, Yue Jia, and Yuanyuan Zhang. 2015. Achievements, Open Problems and Challenges for Search Based Software Testing. In 8th IEEE International Conference on Software Testing, Verification and Validation, ICST 2015, Graz, Austria, April 13-17, 2015. IEEE Computer Society, 1–12. https://doi.org/10.1109/ICST.2015.7102580

[25]

N Alan Heckert, James J Filliben, C M Croarkin, B Hembree, William F Guthrie, P Tobias, and J Prinz. 2002. Handbook 151: NIST/SEMATECH e-Handbook of Statistical Methods.

[26]

Kaifeng Huang, Bihuan Chen, Congying Xu, Ying Wang, Bowen Shi, Xin Peng, Yijian Wu, and Yang Liu. 2022. Characterizing usages, updates and risks of third-party libraries in Java projects. Empir. Softw. Eng. 27, 4 (2022), 90. https://doi.org/10.1007/s10664-022-10131-8

[27]

Gunel Jahangirova, David Clark, Mark Harman, and Paolo Tonella. 2016. Test oracle assessment and improvement. In Proceedings of the 25th International Symposium on Software Testing and Analysis, ISSTA 2016, Saarbrücken, Germany, July 18-20, 2016, Andreas Zeller and Abhik Roychoudhury (Eds.). ACM, 247–258. https://doi.org/10.1145/2931037.2931062

[28]

Junit. 2023. Junit4. Retrieved August 20, 2023 from https://junit.org/junit4/javadoc/4.13/org/junit/Assert.html

[29]

Junit. 2023. Junit5. Retrieved August 20, 2023 from https://junit.org/junit5/

[30]

Junit. 2023. Junit5 Assertions. Retrieved August 20, 2023 from https://junit.org/junit5/docs/5.0.3/api/org/junit/jupiter/api/Assertions.html

[31]

Alexander Kampmann and Andreas Zeller. 2019. Carving parameterized unit tests. In Proceedings of the 41st International Conference on Software Engineering: Companion Proceedings, ICSE 2019, Montreal, QC, Canada, May 25-31, 2019, Joanne M. Atlee, Tevfik Bultan, and Jon Whittle (Eds.). IEEE / ACM, 248–249. https://doi.org/10.1109/ICSE-COMPANION.2019.00098

[32]

Upulee Kanewala and James M. Bieman. 2013. Using machine learning techniques to detect metamorphic relations for programs without test oracles. In IEEE 24th International Symposium on Software Reliability Engineering, ISSRE 2013, Pasadena, CA, USA, November 4-7, 2013. IEEE Computer Society, 1–10. https://doi.org/10.1109/ISSRE.2013.6698899

[33]

Upulee Kanewala, James M. Bieman, and Asa Ben-Hur. 2016. Predicting metamorphic relations for testing scientific software: a machine learning approach using graph kernels. Softw. Test. Verification Reliab. 26, 3 (2016), 245–269. https://doi.org/10.1002/stvr.1594

[34]

Yun Lin, You Sheng Ong, Jun Sun, Gordon Fraser, and Jin Song Dong. 2021. Graph-based seed object synthesis for search-based unit testing. In ESEC/FSE ’21: 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, August 23-28, 2021, Diomidis Spinellis, Georgios Gousios, ACM Trans. So...

work page doi:10.1145/3468264.3468619 2021

[35]

Mikael Lindvall, Adam A. Porter, Gudjon Magnusson, and Christoph Schulze. 2017. Metamorphic Model-Based Testing of Autonomous Systems. In 2nd IEEE/ACM International Workshop on Metamorphic Testing, MET@ICSE 2017, Buenos Aires, Argentina, May 22, 2017. IEEE Computer Society, 35–41. https://doi.org/10.1109/MET.2017.6

[36]

Haoyang Ma, Qingchao Shen, Yongqiang Tian, Junjie Chen, and Shing-Chi Cheung. 2023. Fuzzing Deep Learning Compilers with HirGen. 248–260. https://doi.org/10.1145/3597926.3598053

[37]

Pingchuan Ma, Shuai Wang, and Jin Liu. 2020. Metamorphic Testing and Certified Mitigation of Fairness Violations in NLP Models. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI 2020, Christian Bessiere (Ed.). ijcai.org, 458–465. https://doi.org/10.24963/ijcai.2020/64

[38]

OpenAI. 2023. ChatGPT. Retrieved August 20, 2023 from https://openai.com/blog/chatgpt

[39]

Oracle. 2023. Java Language Specification. Retrieved August 20, 2023 from https://docs.oracle.com/javase/specs/

