Minimum Complete MR Subsets under Semantic-Mutation Fault Models: A Support-Set Domination Boundary

Jie Liu; Meng Li; Shiyu Yan; Xiaohua Yang

arxiv: 2606.08269 · v1 · pith:4TFG2QIEnew · submitted 2026-06-06 · 💻 cs.SE · cs.DS

Meng Li, Xiaohua Yang, Jie Liu, Shiyu Yan

Pith reviewed 2026-06-27 19:15 UTC · model grok-4.3

keywords metamorphic testing · mutant selection · support-set domination boundary · kill-signature heterogeneity · minimum complete MR subsets · semantic mutation fault models · set cover equivalence

The pith

Kill-signature heterogeneity draws a support-set domination boundary that decides when class-level abstraction suffices for minimum complete MR subsets or when mutant-level minimization is required.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper asks when minimum complete evidence in metamorphic testing demands real mutant-level MR subset selection instead of ordinary fault-class counting. It defines a layer-relative completeness criterion over an admitted mutant-draw coverage universe and derives a support-set domination boundary governed by kill-signature heterogeneity. This boundary produces a scoped fault-signature kernel that cleanly separates the MR-specific minimization question from coarser class-level abstraction. If the boundary holds, testers can safely use class abstraction in homogeneous kill-signature regimes and must perform mutant-level minimization only when heterogeneity exceeds the boundary. The Min-MR-Complete problem is shown to be equivalent to set cover over the chosen coverage universe.

Core claim

The central result is a support-set domination boundary that states when class-level abstraction is safe and when mutant-level MR minimization is necessary. The boundary is governed by kill-signature heterogeneity, which yields a scoped fault-signature kernel and separates the MR-specific question from ordinary fault-class counting. The resulting Min-MR-Complete problem is Set-Cover-equivalent over the selected coverage universe, giving NP-hardness, the classical logarithmic approximation boundary, a greedy approximation, an exact ILP formulation, and an SMS-rank upper bound.
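For orientation, the textbook set-cover integer program that such an equivalence would typically yield is sketched below. The notation is an assumption for illustration and is not taken from the paper: $R$ is the candidate MR set, $U$ is the admitted mutant-draw coverage universe, $K(r) \subseteq U$ is the set of coverage elements killed by MR $r$, and $x_r$ indicates whether $r$ is selected.

```latex
\begin{aligned}
\min \quad & \sum_{r \in R} x_r \\
\text{s.t.} \quad & \sum_{r \in R \,:\, u \in K(r)} x_r \;\ge\; 1 && \forall u \in U, \\
& x_r \in \{0, 1\} && \forall r \in R.
\end{aligned}
```

Each constraint requires that every element of the universe is killed by at least one selected MR; the objective counts the selected MRs. The paper's own ILP may differ in detail.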

What carries the argument

The support-set domination boundary, which uses kill-signature heterogeneity to separate safe class-level abstraction from cases requiring mutant-level MR minimization.
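The abstract does not give the formal definition of kill-signature heterogeneity, so the following is only a minimal sketch of one plausible reading: a mutant's kill signature is the set of MRs that kill it, and a fault class is heterogeneous when its mutants do not all share one signature. The function names, the `kill_matrix` layout, and the toy data are all hypothetical.

```python
from collections import defaultdict

def kill_signatures(kill_matrix):
    """Map each mutant to the frozenset of MRs that kill it (assumed signature notion)."""
    return {mutant: frozenset(mrs) for mutant, mrs in kill_matrix.items()}

def heterogeneous_classes(kill_matrix, fault_class):
    """Return the fault classes whose mutants do not all share one kill signature."""
    signatures_by_class = defaultdict(set)
    for mutant, signature in kill_signatures(kill_matrix).items():
        signatures_by_class[fault_class[mutant]].add(signature)
    return {cls for cls, sigs in signatures_by_class.items() if len(sigs) > 1}

# Hypothetical toy data: four mutants, three MRs, two fault classes.
kill_matrix = {
    "m1": {"MR1", "MR2"},
    "m2": {"MR1", "MR2"},
    "m3": {"MR1"},
    "m4": {"MR3"},
}
fault_class = {
    "m1": "sign-flip",
    "m2": "sign-flip",
    "m3": "off-by-one",
    "m4": "off-by-one",
}

print(heterogeneous_classes(kill_matrix, fault_class))  # {'off-by-one'}
```

Under this reading, the `sign-flip` class could be represented by a single mutant without losing information, while `off-by-one` could not, because no single MR kills both of its mutants.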

If this is right

The Min-MR-Complete problem inherits NP-hardness and the classical logarithmic approximation bound from set cover.
A greedy algorithm provides the standard logarithmic approximation for finding minimum complete MR subsets.
An exact integer linear programming formulation solves the minimization exactly over the coverage universe.
Artifact lanes supply lane-local minimization and audit evidence, kept separate from the route witnesses.
Route witnesses instantiate both collapse and non-collapse regimes for the boundary theorem.
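The greedy consequence listed above can be sketched with the classical set-cover heuristic. This is a minimal illustration under assumed data structures, not the paper's implementation: `universe` stands in for the admitted coverage universe and `kills` maps each MR name to the elements it kills; both names and the toy instance are hypothetical.

```python
def greedy_mr_cover(universe, kills):
    """Greedy set cover: repeatedly pick the MR that kills the most uncovered elements.

    universe: set of coverage elements.
    kills: dict mapping MR name -> set of elements that MR kills.
    Returns the chosen MRs in selection order.
    Raises ValueError if the MRs cannot cover the universe.
    """
    uncovered = set(universe)
    chosen = []
    while uncovered:
        # sorted() makes tie-breaking deterministic (alphabetical by MR name).
        best = max(sorted(kills), key=lambda r: len(kills[r] & uncovered))
        gain = kills[best] & uncovered
        if not gain:
            raise ValueError("universe cannot be covered by the given MRs")
        chosen.append(best)
        uncovered -= gain
    return chosen

# Hypothetical toy instance with five coverage elements.
universe = {"u1", "u2", "u3", "u4", "u5"}
kills = {
    "MR1": {"u1", "u2", "u3"},
    "MR2": {"u3", "u4"},
    "MR3": {"u4", "u5"},
    "MR4": {"u1", "u5"},
}

print(greedy_mr_cover(universe, kills))  # ['MR1', 'MR3']
```

The heuristic carries the standard logarithmic approximation guarantee for set cover; it does not promise a minimum subset on every instance.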

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Tools could monitor kill-signature heterogeneity on the fly and switch abstraction levels only when the boundary is crossed.
The scoped fault-signature kernel idea may apply to other selection problems where signatures determine when coarse categories lose completeness.
The separation of route witnesses from population experiments suggests a template for validating boundary claims without statistical pooling.

Load-bearing premise

The layer-relative completeness criterion over an admitted mutant-draw coverage universe accurately captures the real requirements for minimum complete evidence.

What would settle it

A concrete counter-example set of mutants and MRs in which kill-signature heterogeneity fails to produce the predicted domination boundary under the layer-relative completeness criterion.

Original abstract

This paper asks when MR-subset selection is a real mutant-level requirement for minimum complete evidence in metamorphic testing rather than a coarse fault-class counting artifact. We define a layer-relative completeness criterion over an admitted mutant-draw coverage universe. The central result is a support-set domination boundary: it states when class-level abstraction is safe and when mutant-level MR minimization is necessary. The boundary is governed by kill-signature heterogeneity, which yields a scoped fault-signature kernel and separates the MR-specific question from ordinary fault-class counting. The resulting Min-MR-Complete problem is Set-Cover-equivalent over the selected coverage universe, giving NP-hardness, the classical logarithmic approximation boundary, a greedy approximation, an exact ILP formulation, and an SMS-rank upper bound that is not a lower bound or tight predictor. Artifact lanes provide lane-local minimization and audit evidence; separately, route witnesses instantiate both collapse and non-collapse regimes for the boundary theorem and are not pooled as population-level experiments. Other MR-class-proxy rows remain intermediate signals rather than route-admitted witness evidence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The support-set domination boundary is the actual new piece, but it is scoped to an admitted mutant-draw universe whose match to real completeness needs is untested.

The paper introduces a support-set domination boundary that marks when kill-signature heterogeneity makes class-level abstraction unsafe and forces mutant-level MR minimization for complete evidence.

It does the reduction to set cover cleanly. Once Min-MR-Complete is cast that way, the NP-hardness, logarithmic approximation bound, greedy algorithm, and ILP formulation follow directly. The route witnesses show both collapse and non-collapse cases without pooling them into aggregate statistics, and the artifact lanes separate local minimization from audit evidence.
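Because the problem reduces to set cover, tiny instances can be solved exactly by enumeration, which gives a reference point for checking any heuristic. The sketch below is illustrative only: the function name, the `kills` mapping, and the toy instance are hypothetical, and the search is exponential in the number of MRs.

```python
from itertools import combinations

def min_complete_subsets(universe, kills):
    """Return every minimum-size MR subset whose kill sets cover the universe.

    universe: set of coverage elements.
    kills: dict mapping MR name -> set of elements that MR kills.
    Returns a list of sets of MR names, or [] if no complete subset exists.
    """
    names = sorted(kills)
    target = set(universe)
    # Try subset sizes in increasing order; the first size with a hit is minimum.
    for size in range(len(names) + 1):
        hits = [
            set(combo)
            for combo in combinations(names, size)
            if set().union(*(kills[r] for r in combo)) >= target
        ]
        if hits:
            return hits
    return []

# Hypothetical toy instance: two distinct minimum complete subsets exist.
universe = {"u1", "u2", "u3", "u4"}
kills = {
    "MR1": {"u1", "u2"},
    "MR2": {"u3", "u4"},
    "MR3": {"u2", "u3"},
    "MR4": {"u1", "u4"},
}

print(min_complete_subsets(universe, kills))
```

On this instance the enumeration finds two minimum subsets of size two, which shows that a minimum complete MR subset need not be unique.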

The soft spot is the modeling premise. The boundary and the claimed separation from ordinary fault-class counting are derived inside a layer-relative completeness criterion over an explicitly admitted mutant-draw coverage universe. If practical requirements include interactions or oracles outside that universe, the boundary stays inside the chosen model rather than applying more generally. The SMS-rank bound is stated as neither lower nor tight, which limits its usefulness. Derivation steps for the central theorem are not visible in the abstract.

This is for people already working on metamorphic testing and precise mutant-level test selection. A reader inside that niche who wants a formal way to decide when class abstraction is safe will get a usable reduction and boundary. Outside that sub-area the paper is too narrow to matter.

The reduction itself is standard once granted, but the boundary is a new distinction worth checking. I would send it to peer review.

Referee Report

3 major / 3 minor

Summary. The paper defines a layer-relative completeness criterion over an admitted mutant-draw coverage universe for metamorphic testing. Its central result is a support-set domination boundary governed by kill-signature heterogeneity that separates cases where class-level abstraction is safe from those requiring mutant-level MR minimization. Min-MR-Complete is shown equivalent to Set-Cover over this universe, yielding NP-hardness, the classical logarithmic approximation bound, a greedy algorithm, an exact ILP formulation, and an SMS-rank upper bound (explicitly not a lower bound or tight predictor). Artifact lanes supply lane-local minimization evidence; route witnesses illustrate both collapse and non-collapse regimes for the boundary.

Significance. If the modeling premise holds, the work supplies a precise boundary distinguishing MR-specific minimization from ordinary fault-class counting, together with standard algorithmic consequences of the Set-Cover reduction. The explicit route witnesses for both regimes and the separation of artifact lanes from population-level pooling constitute concrete, falsifiable illustrations rather than pooled experiments.

major comments (3)

[Abstract / central result] Abstract and central result: the derivation steps establishing that the support-set domination boundary is governed by kill-signature heterogeneity (and yields a scoped fault-signature kernel) are not shown, so it is impossible to verify that the boundary theorem follows from the layer-relative completeness criterion rather than being stipulated by it.
[Layer-relative completeness criterion] Layer-relative completeness criterion (defined over the admitted mutant-draw coverage universe): this premise is load-bearing for the claim that the boundary separates MR-specific questions from fault-class counting, yet no external benchmark, independent validation data, or comparison against practical metamorphic-testing requirements (e.g., unmodeled interactions or integration faults) is supplied; the boundary therefore risks being scoped only to the paper's chosen modeling choices.
[Support-set domination boundary] Support-set domination boundary definition: the boundary is introduced in terms of the paper's own kill-signature heterogeneity and coverage universe; without an independent notion of minimum-complete evidence, the separation result is at risk of circularity with the modeling assumptions.

minor comments (3)

The SMS-rank upper bound is stated to be neither a lower bound nor a tight predictor; a short clarifying sentence on its intended diagnostic use would help readers.
New terms (scoped fault-signature kernel, support-set domination boundary) should receive explicit first-use definitions before being used in the central claim.
The abstract states that other MR-class-proxy rows remain intermediate signals; a brief note on why they are excluded from route-admitted witness evidence would improve clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major comment point by point below, providing clarifications on the derivations and scope of the modeling framework. Where the presentation can be strengthened without altering the core results, we indicate revisions.

Referee: [Abstract / central result] Abstract and central result: the derivation steps establishing that the support-set domination boundary is governed by kill-signature heterogeneity (and yields a scoped fault-signature kernel) are not shown, so it is impossible to verify that the boundary theorem follows from the layer-relative completeness criterion rather than being stipulated by it.

Authors: The full manuscript derives the boundary theorem from the layer-relative completeness criterion by establishing that kill-signature heterogeneity over the coverage universe determines the point at which class-level support sets dominate mutant-level ones, producing the scoped fault-signature kernel. The steps appear after the criterion definition. To make the logical flow immediately verifiable, we will insert a concise derivation outline into the abstract and add a short explanatory paragraph in the introduction of the revised version.

revision: yes

Referee: [Layer-relative completeness criterion] Layer-relative completeness criterion (defined over the admitted mutant-draw coverage universe): this premise is load-bearing for the claim that the boundary separates MR-specific questions from fault-class counting, yet no external benchmark, independent validation data, or comparison against practical metamorphic-testing requirements (e.g., unmodeled interactions or integration faults) is supplied; the boundary therefore risks being scoped only to the paper's chosen modeling choices.

Authors: The layer-relative completeness criterion is presented explicitly as a modeling premise over the admitted mutant-draw coverage universe; the paper makes no claim of external empirical validation against unmodeled interactions or integration faults. The separation between MR-specific minimization and fault-class counting is shown theoretically within this universe via the route witnesses. We will add an explicit limitations paragraph in the discussion section stating the modeling scope and noting that practical metamorphic testing may involve additional factors outside the current framework.

revision: partial

Referee: [Support-set domination boundary] Support-set domination boundary definition: the boundary is introduced in terms of the paper's own kill-signature heterogeneity and coverage universe; without an independent notion of minimum-complete evidence, the separation result is at risk of circularity with the modeling assumptions.

Authors: The minimum-complete evidence notion is defined first and independently via the layer-relative completeness criterion over the coverage universe. The support-set domination boundary is then derived from that criterion by applying the kill-signature heterogeneity metric. This ordering ensures the separation result is not circular. We will insert a brief clarifying sentence in the boundary definition section of the revised manuscript to highlight the logical precedence.

revision: yes

Circularity Check

1 step flagged

Support-set domination boundary and fault-signature kernel defined within paper's own kill-signature heterogeneity and layer-relative completeness criterion

specific steps

self-definitional [Abstract / Central result]
"We define a layer-relative completeness criterion over an admitted mutant-draw coverage universe. The central result is a support-set domination boundary: it states when class-level abstraction is safe and when mutant-level MR minimization is necessary. The boundary is governed by kill-signature heterogeneity, which yields a scoped fault-signature kernel and separates the MR-specific question from ordinary fault-class counting."

The boundary is claimed as a derived result that separates MR-specific questions from fault-class counting, but it is explicitly governed by kill-signature heterogeneity inside the completeness criterion and coverage universe that the paper itself defines and admits. The separation and kernel are therefore equivalent to the modeling premises by construction, with no independent external validation or reduction shown.

full rationale

The paper defines its layer-relative completeness criterion and admitted mutant-draw coverage universe, then presents the support-set domination boundary as governed by kill-signature heterogeneity within that same framework. The separation of MR-specific minimization from fault-class counting is thereby a direct consequence of these modeling choices rather than an independent derivation from external evidence. This constitutes a self-definitional reduction with no equations or external benchmarks shown to break the loop. The result is scoped to the paper's admitted universe by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 3 invented entities

The paper rests on several newly introduced modeling constructs whose independent grounding is not supplied in the abstract; the set-cover equivalence is asserted but its mapping details are not shown.

axioms (1)

domain assumption: Layer-relative completeness criterion over the admitted mutant-draw coverage universe correctly models minimum complete evidence requirements.
Invoked to define when the domination boundary applies; stated in the abstract as the foundation for the central result.

invented entities (3)

support-set domination boundary (no independent evidence)
purpose: Determines when class-level MR abstraction is safe versus when mutant-level minimization is required.
Newly defined construct that separates MR-specific questions from fault-class counting; no external validation mentioned.

kill-signature heterogeneity (no independent evidence)
purpose: Governs the boundary and yields the scoped fault-signature kernel.
Introduced as the governing quantity; treated as observable within the coverage universe but without independent measurement protocol.

scoped fault-signature kernel (no independent evidence)
purpose: Separates the MR-specific minimization question from ordinary fault-class counting.
Derived entity whose existence depends on the heterogeneity measure; no external corroboration supplied.

pith-pipeline@v0.9.1-grok · 5721 in / 1558 out tokens · 41384 ms · 2026-06-27T19:15:29.672137+00:00 · methodology

