Delta Debugging in the Absence of Test Oracles Through Metamorphic Testing
Pith reviewed 2026-07-02 08:28 UTC · model grok-4.3
The pith
Delta debugging can minimize inputs for programs without test oracles by using metamorphic testing to check property preservation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DDMT redesigns the test function inside delta debugging. It establishes metamorphic relations that decide whether a candidate input still preserves the original property, then substitutes the resulting oracle-independent function for the usual test function, so the entire reduction procedure runs without access to any test oracle.
What carries the argument
The metamorphic testing-based test function that replaces the original test function and decides property preservation by checking relations between the original input and each reduced candidate.
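The substitution can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: `buggy_max` is a hypothetical program under test, the reversal relation is an MR chosen for this toy, and treating "MR violated" as "property preserved" is one plausible reading of how such a test function could work. The `ddmin` routine is a compact version of the classic minimizing delta debugging loop and never looks at expected outputs.

```python
from typing import Callable, List


def buggy_max(xs: List[int]) -> int:
    """Hypothetical program under test: a max() that wrongly skips the first element."""
    return max(xs[1:]) if len(xs) > 1 else xs[0]


def mr_test(candidate: List[int]) -> bool:
    """Oracle-free test function (assumed design, not taken from the paper).

    MR used here: reversing the input must not change the maximum.
    Returns True when the relation is violated, which this sketch treats
    as "the candidate still preserves the property of interest".
    """
    if not candidate:
        return False
    return buggy_max(candidate) != buggy_max(candidate[::-1])


def ddmin(items: List[int], test: Callable[[List[int]], bool]) -> List[int]:
    """Classic ddmin: shrink items while test(items) stays True."""
    if not test(items):
        raise ValueError("the initial input must satisfy the test function")
    n = 2  # current granularity
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if test(subset):          # reduce to a single chunk
                items, n, reduced = subset, 2, True
                break
            if test(complement):      # reduce to everything except one chunk
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):       # finest granularity reached
                break
            n = min(len(items), n * 2)
    return items


failing_input = [9, 3, 1, 4, 1, 5, 2, 6]
reduced = ddmin(failing_input, mr_test)
print(reduced)  # [9, 3]
```

The only oracle-specific piece of a delta debugging setup is the `test` argument, so swapping in `mr_test` leaves the reduction loop itself untouched.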
If this is right
- DDMT applies delta debugging to oracle-deficient programs where output correctness cannot be checked directly.
- Reduction effectiveness measured by final input size is often preserved or improved compared with standard delta debugging.
- Query efficiency, measured by the number of test queries issued during reduction, is often preserved or improved.
- Proper configuration choices yield performance gains over the compared delta debugging approaches.
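The two measurements named in this list, final input size and query count, can be collected with a thin wrapper around any test function. The sketch below is an assumption-laden illustration: `CountingTest`, `reduction_ratio`, and the simple `greedy_reduce` stand-in reducer are hypothetical names introduced here, and the toy property ("the input still contains both 3 and 7") is not from the paper.

```python
from typing import Callable, List


class CountingTest:
    """Wraps a test function and counts how many times it is queried."""

    def __init__(self, test: Callable[[List[int]], bool]) -> None:
        self.test = test
        self.queries = 0

    def __call__(self, candidate: List[int]) -> bool:
        self.queries += 1
        return self.test(candidate)


def reduction_ratio(original: List[int], reduced: List[int]) -> float:
    """Fraction of the original input that was removed."""
    return 1 - len(reduced) / len(original)


def greedy_reduce(items: List[int], test: Callable[[List[int]], bool]) -> List[int]:
    """Stand-in reducer (not ddmin): drop one element at a time while test holds."""
    items = list(items)
    i = 0
    while i < len(items):
        candidate = items[:i] + items[i + 1:]
        if candidate and test(candidate):
            items = candidate
        else:
            i += 1
    return items


# Toy property, assumed for illustration only.
counted = CountingTest(lambda xs: 3 in xs and 7 in xs)
original = list(range(10))
result = greedy_reduce(original, counted)

print(result)                                     # [3, 7]
print(counted.queries)                            # 10
print(f"{reduction_ratio(original, result):.0%}")  # 80%
```

Because the wrapper is independent of the reducer, the same counter could be placed around an oracle-based and an MR-based test function to compare them on equal terms.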
Where Pith is reading between the lines
- Teams working on programs whose outputs resist direct checking could define domain-specific metamorphic relations and immediately gain automated input minimization.
- The same substitution pattern might let other search-based debugging techniques operate without oracles.
- If metamorphic relations are incomplete for some property, the reduced input may still contain irrelevant parts that a perfect oracle would have removed.
Load-bearing premise
Metamorphic relations can be written so the resulting test function correctly decides whether a reduced input still exhibits the target property.
What would settle it
A reduced input produced by DDMT that no longer exhibits the target property even though the metamorphic relation reports preservation, or an input that does preserve the property but that the relation rejects.
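On subjects where a ground-truth oracle happens to be available, both kinds of disagreement can be searched for directly. The following sketch assumes a hypothetical program `buggy_dedupe`, a reference `dedupe` acting as the oracle, and an MR chosen for this toy (appending a copy of the input must not change the deduplicated result). None of these names come from the paper.

```python
from itertools import product
from typing import List


def dedupe(xs: List[int]) -> List[int]:
    """Reference behaviour (the oracle): keep the first occurrence of each value."""
    seen, out = set(), []
    for x in xs:
        if x not in seen:
            seen.add(x)
            out.append(x)
    return out


def buggy_dedupe(xs: List[int]) -> List[int]:
    """Hypothetical program under test: only removes adjacent duplicates."""
    out: List[int] = []
    for x in xs:
        if not out or out[-1] != x:
            out.append(x)
    return out


def oracle_says_failing(xs: List[int]) -> bool:
    return buggy_dedupe(xs) != dedupe(xs)


def mr_says_failing(xs: List[int]) -> bool:
    # Assumed MR: dedupe(xs + xs) == dedupe(xs) for a correct program.
    return buggy_dedupe(xs + xs) != buggy_dedupe(xs)


false_preserve = []  # MR reports the property, the oracle disagrees
false_reject = []    # oracle reports the property, the MR misses it
for length in range(1, 5):
    for combo in product(range(3), repeat=length):
        xs = list(combo)
        truth, verdict = oracle_says_failing(xs), mr_says_failing(xs)
        if verdict and not truth:
            false_preserve.append(xs)
        if truth and not verdict:
            false_reject.append(xs)

print(len(false_preserve), len(false_reject))
print([0, 1] in false_preserve)  # True
```

In this toy the relation never misses an input the oracle flags, but it does flag inputs such as `[0, 1]` whose own output is correct, because the bug only shows up on the follow-up input. That is exactly the kind of mismatch the settling condition describes.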
Original abstract
Delta debugging provides an automatic way to minimize a program input while preserving a certain property. However, its effectiveness fundamentally relies on the availability of test oracles to determine whether a reduced input still preserves the specific property. Consequently, the oracle problem substantially limits the applicability of existing delta debugging techniques, particularly for oracle-deficient programs where output correctness cannot be directly determined. To address this problem, this paper proposes a novel approach, DDMT, to enhance the applicability of delta debugging, especially facilitating its application to oracle-deficient programs. Our key insight is to redesign an oracle-independent test function and incorporate it into the reduction procedure of delta debugging such that the property-preservation validation can be accomplished without requiring a test oracle. To this end, DDMT employs the technique of metamorphic testing, which is a property-based and oracle-independent testing method. It establishes a metamorphic testing-based test function, using it as a replacement for the original test function adopted by delta debugging. The experiments evaluate DDMT on 66 subjects across both oracle-available and oracle-deficient scenarios, with different delta debugging approaches. The results positively confirm that DDMT can enhance the applicability of delta debugging while often preserving or improving reduction effectiveness and query efficiency. Furthermore, compared to the relevant delta debugging approaches, DDMT is also able to achieve performance improvements with proper configurations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes DDMT, a technique that substitutes a metamorphic-testing-based test function for the oracle-dependent test predicate in delta debugging. This allows input minimization to proceed in oracle-deficient settings by using user-defined metamorphic relations to check property preservation. Experiments on 66 subjects across oracle-available and oracle-deficient scenarios with multiple delta-debugging variants are reported to show that DDMT extends applicability while often preserving or improving reduction effectiveness and query efficiency.
Significance. If the metamorphic relations can be shown to faithfully proxy the target properties, the work would meaningfully broaden delta debugging to the large class of programs where oracles are unavailable, addressing a well-known practical barrier. The approach is conceptually direct and the experimental scale (66 subjects) is reasonable, but the absence of any demonstration that the chosen MRs correctly encode the intended properties limits the strength of the claimed improvements.
major comments (2)
- [Abstract] The central claim that DDMT 'establishes a metamorphic testing-based test function' as a replacement rests on the unexamined premise that suitable metamorphic relations exist and correctly decide property preservation for arbitrary oracle-deficient properties; no argument, construction procedure, or validation that the MR returns true exactly when the reduced input still satisfies the original property is supplied.
- [Abstract, experimental claim] The statement that DDMT 'often preserv[es] or improv[es] reduction effectiveness and query efficiency' on 66 subjects is presented without any metrics, statistical tests, or discussion of confounds, so the quantitative support for the central claim cannot be evaluated.
minor comments (1)
- The abstract would be clearer if it briefly indicated how the metamorphic relations are constructed or selected for the evaluated subjects.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major comment below.
- Referee: [Abstract] The central claim that DDMT 'establishes a metamorphic testing-based test function' as a replacement rests on the unexamined premise that suitable metamorphic relations exist and correctly decide property preservation for arbitrary oracle-deficient properties; no argument, construction procedure, or validation that the MR returns true exactly when the reduced input still satisfies the original property is supplied.
  Authors: DDMT is designed as a framework that integrates user-defined metamorphic relations (MRs) into delta debugging, following the standard practice in metamorphic testing where MRs are constructed from domain knowledge of the target property. The paper's contribution is the redesign of the test function and its embedding in the reduction algorithm rather than a general procedure for constructing MRs for arbitrary properties (which would be infeasible without property-specific insight). We acknowledge that the abstract does not sufficiently qualify this assumption. We will revise the abstract to state explicitly that DDMT's correctness depends on the fidelity of the supplied MRs and that the work assumes users provide appropriate relations based on their understanding of the property.
  Revision: yes
- Referee: [Abstract, experimental claim] The statement that DDMT 'often preserv[es] or improv[es] reduction effectiveness and query efficiency' on 66 subjects is presented without any metrics, statistical tests, or discussion of confounds, so the quantitative support for the central claim cannot be evaluated.
  Authors: The abstract is intended as a concise summary; the full experimental section reports concrete metrics (reduction ratios, query counts), applies statistical tests such as the Wilcoxon signed-rank test for significance, and discusses potential confounds including subject diversity and MR selection. To address the concern directly, we will revise the abstract to include representative quantitative results and a brief reference to the statistical analysis performed.
  Revision: yes
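The rebuttal names the Wilcoxon signed-rank test as the significance check for paired per-subject measurements. As a rough sketch of what such a comparison involves, the code below computes the statistic and an exact two-sided p-value using only the standard library. The function name `wilcoxon_signed_rank` is introduced here for illustration, and the two size lists are made-up placeholder numbers that exercise the code; they are not results from the paper.

```python
from itertools import product
from typing import List, Tuple


def wilcoxon_signed_rank(xs: List[float], ys: List[float]) -> Tuple[float, float]:
    """Exact two-sided Wilcoxon signed-rank test for small paired samples.

    Zero differences are dropped and tied magnitudes receive average ranks.
    Returns (W, p) where W = min(W+, W-).
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)
    # Enumerate every sign assignment under the null hypothesis (2**n cases).
    total = sum(ranks)
    hits = 0
    for signs in product((0, 1), repeat=n):
        wp = sum(r for r, s in zip(ranks, signs) if s)
        if min(wp, total - wp) <= w:
            hits += 1
    return w, hits / 2 ** n


# Placeholder data for illustration only: final input sizes per subject.
baseline_sizes = [40, 35, 52, 18, 27, 61, 33, 45]
ddmt_sizes = [38, 35, 47, 19, 21, 50, 30, 41]

w, p = wilcoxon_signed_rank(baseline_sizes, ddmt_sizes)
print(f"W = {w}, p = {p}")  # W = 1.0, p = 0.03125
```

The exhaustive enumeration is only practical for a small number of pairs; for larger samples, a statistics library with a normal approximation would be the usual choice.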
Circularity Check
No circularity; direct methodological substitution evaluated empirically.
The paper proposes DDMT by replacing delta debugging's oracle-dependent test function with a metamorphic-testing-based predicate. No equations, fitted parameters, self-citations, or uniqueness theorems appear in the provided text. The central claim rests on the empirical results across 66 subjects rather than reducing by construction to its own definitions or inputs. This is a standard non-circular presentation of a technique substitution.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Metamorphic relations can be defined to capture the property-preservation decision without an oracle.