DRReduce: Enhancing Syntax-Guided Program Reduction with Dependency Reconstruction

Peng Liang; Qiong Feng; Wei Song; Xiaotian Ma; Yongqiang Tian

arxiv: 2605.19412 · v1 · pith:EWUU4HMRnew · submitted 2026-05-19 · 💻 cs.SE · cs.PL

DRReduce: Enhancing Syntax-Guided Program Reduction with Dependency Reconstruction

Qiong Feng , Xiaotian Ma , Yongqiang Tian , Wei Song , Peng Liang This is my paper

Pith reviewed 2026-05-20 04:44 UTC · model grok-4.3

classification 💻 cs.SE cs.PL

keywords program reductionsyntax-guided reductiondependency reconstructiontest case minimizationsoftware debugginglanguage-agnostic toolsC programsJava programs

0 comments

The pith

DRReduce repairs broken semantic dependencies after syntax deletions to boost acceptance rates and shrink programs more effectively than pure syntax-guided reducers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Program reduction simplifies large failure-inducing code into minimal test cases, but syntax-guided tools like Perses often reject deletions that break semantic dependencies, forcing costly backtracking. DRReduce adds a lightweight semantic layer that builds a dependency graph from the input program and reconstructs dependencies after each deletion to keep intermediate versions valid for the property checker. This hybrid approach then hands off to standard syntax-guided minimization. The result is substantially smaller final programs and shorter runtimes than other language-agnostic methods, while matching the quality of language-specific tools such as CReduce without any hardcoded language rules. Readers care because the technique makes general, grammar-based reduction practical for debugging across many languages.

Core claim

DRReduce constructs a semantic dependency graph from the input program, performs semantically coherent deletions with dependency reconstruction to preserve the semantic validity of intermediate programs, and delegates further minimization to a syntax-guided reducer. On real-world bug-triggering programs this yields average size reductions of 51.9 percent over Perses, 14.9 percent over WDD, and 19.8 percent over CDD while finishing faster on most cases; it produces results comparable to CReduce and Latra yet runs 3.3 times faster than CReduce and 1.2 times faster than Latra on average.

What carries the argument

Dependency reconstruction, which repairs program dependencies broken by a deletion in order to preserve the semantic validity of intermediate programs and increase the acceptance rate of the property checker.

If this is right

DRReduce produces final reduced programs that are on average more than 50 percent smaller than those from Perses.
The method finishes reduction faster than Perses, WDD, and CDD on the majority of tested programs.
Dependency reconstruction cuts the number of property-checker queries by roughly 80 percent and total reduction time by about 59 percent.
DRReduce reaches reduction quality comparable to language-specific tools while remaining fully language-agnostic.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same dependency-reconstruction idea could be inserted into other syntax-driven transformation tasks such as code simplification or refactoring pipelines.
Extending the dependency-graph construction to additional languages would directly test whether the efficiency gains generalize beyond C and Java.
If the overhead of graph construction remains low on very large inputs, the technique might enable reduction of programs that current syntax-only reducers cannot finish in reasonable time.

Load-bearing premise

A lightweight semantic dependency graph can be built from the input program and dependency reconstruction can reliably restore semantic validity after deletions without introducing new failures or excessive overhead.

What would settle it

Applying DRReduce and Perses to the same collection of real bug-triggering programs and observing that DRReduce does not produce smaller final token counts or require fewer property-checker queries on average would falsify the performance claims.

Figures

Figures reproduced from arXiv: 2605.19412 by Peng Liang, Qiong Feng, Wei Song, Xiaotian Ma, Yongqiang Tian.

**Figure 2.** Figure 2: The overall reduction result of WDD and DRReduce. effectiveness and efficiency compared to other language-agnostic methods, it still exhibits two key limitations on this example. Limitation 1: WDD generates invalid intermediate results, which limits efficiency. During node deletion, operations must often follow a strict order: removing elements out of sequence can violate syntactic or semantic dependencies… view at source ↗

**Figure 5.** Figure 5: DRReduce’s overall workflow a clean, semantically coherent core, Stage 3 encounters far fewer query invocations and converges faster than it would on the original input. The two stages are complementary, but the majority of DRReduce’s effectiveness gain over prior reducers comes from Stage 2’s dependency reconstruction— the core contribution of this work — which performs reductions that no purely syntactic… view at source ↗

**Figure 6.** Figure 6: Ablation study of dependency reconstruction across all 28 bug-triggering programs (w: with dependency [PITH_FULL_IMAGE:figures/full_fig_p017_6.png] view at source ↗

read the original abstract

Program reduction is a technique for simplifying large, failure-inducing programs into minimal reproducible test cases. Language-specific tools such as CReduce achieve strong performance by leveraging deep semantic knowledge of C/C++, but are tightly coupled to a single language family. Language-agnostic reducers such as Perses address this by applying syntax-guided search across any grammar, yet share a fundamental limitation: deleting a node or subtree in isolation often breaks semantic coherence causing the property checker to reject the deletion and forcing the reducer to backtrack, limiting overall reduction effectiveness and efficiency. In this paper, we propose DRReduce, a framework that bridges this gap by augmenting language-agnostic syntactic reduction with a lightweight semantic layer: dependency reconstruction, which repairs program dependencies broken by a deletion in order to preserve the semantic validity of intermediate programs and increase the acceptance rate of the property checker. DRReduce constructs a semantic dependency graph from the input program, performs semantically coherent deletions with dependency reconstruction, and delegates further minimization to a syntax-guided reducer. We implement DRReduce for C and Java and evaluate it on real-world bug-triggering programs. Compared to SOTA syntax-guided reducers, DRReduce achieves average size reductions of 51.9%, 14.9%, and 19.8% over Perses, WDD, and CDD respectively, while completing reduction faster on the majority of programs. Compared to language-specific tools, DRReduce achieves results comparable to CReduce and Latra without any language-specific transformation rules, at 3.3x and 1.2x higher efficiency than CReduce and Latra on average, respectively. An ablation study confirms that dependency reconstruction reduces query invocations by 80.2%, reduction time by 58.7%, and final token count by over 55.1%.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

DRReduce adds dependency reconstruction to syntax-guided reducers to raise deletion acceptance rates, with decent reported gains over Perses but thin evaluation details and an open question on whether reconstructions preserve the original bug trigger.

read the letter

DRReduce layers dependency reconstruction on syntax-guided program reduction. It builds a semantic dependency graph from the input program and repairs dependencies broken by deletions. This keeps intermediate programs valid for the property checker and allows more aggressive reduction without as many rejections. The paper evaluates this on C and Java programs. It reports average size reductions of 51.9% over Perses, 14.9% over WDD, and 19.8% over CDD, plus faster run times on most cases. It also matches the results of language-specific tools like CReduce and Latra but with better efficiency and no need for custom rules. The ablation study shows the reconstruction step cuts query count by 80%, time by 59%, and final size by 55%. This approach directly tackles a known limit in tools like Perses: isolated deletions often break semantics even when syntax is okay. Adding a repair mechanism is a clear, incremental step that stays mostly language-agnostic. The evaluation could be stronger. The reported gains are averages, but there is no information on the number of test programs, how they were selected, or any measure of variance or statistical significance. That makes it difficult to judge how consistent the improvements are. The stress-test concern also matters: reconstruction might fix dependencies in a way that removes the original bug trigger. The paper should demonstrate that the reduced programs still reproduce the reported failure rather than just passing the checker for a different reason. This paper is for researchers and practitioners working on automated testing and debugging tools. The idea is practical and the comparisons are broad enough to be worth discussing. It deserves peer review so the community can examine the implementation and push on the evaluation gaps.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces DRReduce, a framework that augments language-agnostic syntax-guided program reduction with dependency reconstruction. It constructs a semantic dependency graph from the input program, performs deletions followed by reconstruction to repair broken dependencies and preserve semantic validity of intermediate programs, then delegates further minimization to a syntax-guided reducer. The authors claim average size reductions of 51.9%, 14.9%, and 19.8% over Perses, WDD, and CDD respectively, faster completion on most programs, and results comparable to CReduce and Latra (at 3.3x and 1.2x higher efficiency) without language-specific rules; an ablation study reports 80.2% fewer queries, 58.7% less time, and over 55.1% smaller final token count.

Significance. If the results hold and reductions preserve original failure-inducing behavior, the work would be significant for bridging syntax-guided and language-specific reduction without custom rules, potentially improving minimal test-case generation for debugging across languages. The ablation directly attributes gains to the reconstruction component.

major comments (2)

[Evaluation] Evaluation section: the abstract and reported results give concrete percentage improvements and ablation numbers, yet supply no details on the number of subject programs, selection criteria, variance, or statistical tests. This makes it impossible to assess whether the 51.9%, 14.9%, and 19.8% gains are robust.
[Dependency reconstruction description] Dependency reconstruction description (abstract and §3): the central claim rests on reconstruction increasing property-checker acceptance while preserving the original bug-triggering condition. No section verifies that reconstructed programs still reproduce the reported failure; if reconstruction alters data/control flow enough to eliminate the trigger, the checker may accept a non-triggering program, producing invalid reductions.

minor comments (1)

[Abstract] The abstract could more explicitly state the exact number and characteristics of the real-world bug-triggering programs used in the evaluation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve clarity and completeness.

read point-by-point responses

Referee: [Evaluation] Evaluation section: the abstract and reported results give concrete percentage improvements and ablation numbers, yet supply no details on the number of subject programs, selection criteria, variance, or statistical tests. This makes it impossible to assess whether the 51.9%, 14.9%, and 19.8% gains are robust.

Authors: We agree that the evaluation section would benefit from additional details to allow readers to assess robustness. The current manuscript reports results on real-world bug-triggering programs but does not explicitly state the exact count, selection criteria, variance, or statistical tests. In the revised version we will add these: the number of programs, how they were selected, standard deviations or ranges for the reduction metrics, and results of statistical significance tests such as the Wilcoxon signed-rank test. revision: yes
Referee: [Dependency reconstruction description] Dependency reconstruction description (abstract and §3): the central claim rests on reconstruction increasing property-checker acceptance while preserving the original bug-triggering condition. No section verifies that reconstructed programs still reproduce the reported failure; if reconstruction alters data/control flow enough to eliminate the trigger, the checker may accept a non-triggering program, producing invalid reductions.

Authors: We appreciate this concern regarding the validity of reductions. By design, DRReduce applies the property checker (which verifies reproduction of the original failure) after each reconstruction step; only programs that pass the checker are accepted for further reduction. This ensures that accepted programs continue to trigger the reported failure. However, the manuscript does not contain an explicit verification subsection or additional results confirming this for all cases. We will revise §3 to describe the post-reconstruction checker invocation more clearly and add a short verification paragraph in the evaluation section reporting that all final reduced programs reproduce the original failures. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on external baselines

full rationale

The paper proposes DRReduce as a syntactic reducer augmented by dependency reconstruction and validates it solely via direct experimental comparison against independent external tools (Perses, WDD, CDD, CReduce, Latra) on real-world bug-triggering programs. No equations, fitted parameters, or predictions are defined inside the paper and then re-used as outputs; the acceptance-rate and size-reduction numbers are measured quantities, not tautological re-statements of the method itself. The dependency-reconstruction step is described as an algorithmic addition whose benefit is quantified by ablation and cross-tool runs rather than by self-definition or self-citation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical software-engineering contribution rather than a mathematical derivation. No free parameters, axioms, or invented entities are introduced in the abstract.

pith-pipeline@v0.9.0 · 5863 in / 1216 out tokens · 37561 ms · 2026-05-20T04:44:13.993628+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

DRReduce constructs a semantic dependency graph... performs semantically coherent deletions with dependency reconstruction... delegates further minimization to a syntax-guided reducer

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages

[1]

Stefanos Chaliasos, Thodoris Sotiropoulos, Diomidis Spinellis, Arthur Gervais, Benjamin Livshits, and Dimitris Mitropoulos. 2022. Finding typing compiler bugs. InProceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI). ACM, 183–198

work page 2022
[2]

GNU Compiler Collection Contributors. 2026. Reporting Bugs. https://gcc.gnu.org/bugs/

work page 2026
[3]

Qiong Feng, Xiaotian Ma, Ziyuan Feng, Marat Akhin, Wei Song, and Peng Liang. 2025. Finding Compiler Bugs through Cross-Language Code Generator and Differential Testing.Proceedings of the ACM on Programming Languages9, OOPSLA2 (2025), 2843–2869

work page 2025
[4]

Golnaz Gharachorlu and Nick Sumner. 2021. Leveraging models to reduce test cases in software repositories. In Proceedings of the 18th IEEE/ACM International Conference on Mining Software Repositories (MSR). IEEE, 230–241

work page 2021
[5]

JetBrains. 2024. Program Structure Interface. https://plugins.jetbrains.com/docs/intellij/psi.html

work page 2024
[6]

Christian Gram Kalhauge and Jens Palsberg. 2019. Binary reduction of dependency graphs. InProceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 556–566

work page 2019
[7]

Christian Gram Kalhauge and Jens Palsberg. 2021. Logical bytecode reduction. InProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI). ACM, 1003–1016

work page 2021
[8]

Christopher Lidbury, Andrei Lascu, Nathan Chong, and Alastair F Donaldson. 2015. Many-core compiler fuzzing. ACM SIGPLAN Notices50, 6 (2015), 65–76

work page 2015
[9]

LLVM Project. 2026. How to Submit a Bug Report. https://llvm.org/docs/HowToSubmitABug.html

work page 2026
[10]

Ghassan Misherghi and Zhendong Su. 2006. HDD: Hierarchical delta debugging. InProceedings of the 28th International Conference on Software Engineering (ICSE). ACM, 142–151

work page 2006
[11]

Loi Ngo Duc Nguyen, Tahiatul Islam, Theron Wang, Sam Lenz, and Martin Kellogg. 2025. Static Program Reduction via Type-Directed Slicing.Proceedings of the ACM on Software Engineering2, ISSTA (2025), 2068–2090

work page 2025
[12]

Diego Trevi no Ferrer. 2019. LLVM-Reduce for testcase reduction. https://llvm.org/devmtg/2019-10/talk-abstracts. html#tech22

work page 2019
[13]

John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-case reduction for C compiler bugs. InProceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 335–346

work page 2012
[14]

Luyao Ren, Xing Zhang, Ziyue Hua, Yanyan Jiang, Xiao He, Yingfei Xiong, and Tao Xie. 2025. Validity-Preserving Delta Debugging via Generator Trace Reduction.ACM Transactions on Software Engineering and Methodology34, 3 (2025), 1–33

work page 2025
[15]

Daniil Stepanov, Marat Akhin, and Mikhail Belyaev. 2019. ReduKtor: How We Stopped Worrying About Bugs in Kotlin Compiler. InProceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 317–326

work page 2019
[16]

Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding compiler bugs via live code mutation. InProceedings of the 31st ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM, 849–863

work page 2016
[17]

Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided program reduction. InProceedings of the 40th International Conference on Software Engineering (ICSE). ACM, 361–371

work page 2018
[18]

Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang. 2021. Probabilistic Delta Debugging. InProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 881–892. 22 Feng et al

work page 2021
[19]

Ohlsson, Björn Regnell, and Anders Wesslén

Claes Wohlin, Per Runeson, Martin Höst, Magnus C. Ohlsson, Björn Regnell, and Anders Wesslén. 2012.Experimentation in Software Engineering. Springer

work page 2012
[20]

Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. Boosting Program Reduction with the Missing Piece of Syntax-Guided Transformation.Proceedings of the ACM on Programming Languages9, OOPSLA2 (2025), 86–112

work page 2025
[21]

Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, Jiarui Zhang, Puzhuo Liu, Yu Jiang, and Chengnian Sun. 2025. T-rec: Fine-grained language-agnostic program reduction guided by lexical syntax.ACM Transactions on Software Engineering and Methodology34, 2 (2025), 1–31

work page 2025
[22]

Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, Gaosen Zhao, Yu Jiang, and Chengnian Sun. 2023. Pushing the limit of 1-minimality of language-agnostic program reduction.Proceedings of the ACM on Programming Languages7, OOPSLA1 (2023), 636–664

work page 2023
[23]

Zhenyang Xu, Yiran Wang, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. Latra: A Template-Based Language-Agnostic Transformation Framework for Effective Program Reduction. InProceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2274–2285

work page 2025
[24]

Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 283–294

work page 2011
[25]

Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and isolating failure-inducing input.IEEE Transactions on Software Engineering28, 2 (2002), 183–200

work page 2002
[26]

Jiang Zhang, Shuai Wang, Manuel Rigger, Pinjia He, and Zhendong Su. 2021. SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs. InProceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Association, 479–494

work page 2021
[27]

Mengxiao Zhang, Yongqiang Tian, Zhenyang Xu, Yiwen Dong, Shin Hwei Tan, and Chengnian Sun. 2024. LPR: Large language models-aided program reduction. InProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA). ACM, 261–273

work page 2024
[28]

Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Xinru Cheng, and Chengnian Sun. 2025. Toward a Better Under- standing of Probabilistic Delta Debugging. InProceedings of the 47th International Conference on Software Engineering (ICSE). ACM, 2024–2035

work page 2025
[29]

Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Yu Jiang, and Chengnian Sun. 2023. PPR: Pairwise Program Reduction. InProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 338–349

work page 2023
[30]

Xintong Zhou, Zhenyang Xu, Mengxiao Zhang, Yongqiang Tian, and Chengnian Sun. 2025. WDD: Weighted Delta Debugging. InProceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE). IEEE, 1592– 1603

work page 2025

[1] [1]

Stefanos Chaliasos, Thodoris Sotiropoulos, Diomidis Spinellis, Arthur Gervais, Benjamin Livshits, and Dimitris Mitropoulos. 2022. Finding typing compiler bugs. InProceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI). ACM, 183–198

work page 2022

[2] [2]

GNU Compiler Collection Contributors. 2026. Reporting Bugs. https://gcc.gnu.org/bugs/

work page 2026

[3] [3]

Qiong Feng, Xiaotian Ma, Ziyuan Feng, Marat Akhin, Wei Song, and Peng Liang. 2025. Finding Compiler Bugs through Cross-Language Code Generator and Differential Testing.Proceedings of the ACM on Programming Languages9, OOPSLA2 (2025), 2843–2869

work page 2025

[4] [4]

Golnaz Gharachorlu and Nick Sumner. 2021. Leveraging models to reduce test cases in software repositories. In Proceedings of the 18th IEEE/ACM International Conference on Mining Software Repositories (MSR). IEEE, 230–241

work page 2021

[5] [5]

JetBrains. 2024. Program Structure Interface. https://plugins.jetbrains.com/docs/intellij/psi.html

work page 2024

[6] [6]

Christian Gram Kalhauge and Jens Palsberg. 2019. Binary reduction of dependency graphs. InProceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 556–566

work page 2019

[7] [7]

Christian Gram Kalhauge and Jens Palsberg. 2021. Logical bytecode reduction. InProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI). ACM, 1003–1016

work page 2021

[8] [8]

Christopher Lidbury, Andrei Lascu, Nathan Chong, and Alastair F Donaldson. 2015. Many-core compiler fuzzing. ACM SIGPLAN Notices50, 6 (2015), 65–76

work page 2015

[9] [9]

LLVM Project. 2026. How to Submit a Bug Report. https://llvm.org/docs/HowToSubmitABug.html

work page 2026

[10] [10]

Ghassan Misherghi and Zhendong Su. 2006. HDD: Hierarchical delta debugging. InProceedings of the 28th International Conference on Software Engineering (ICSE). ACM, 142–151

work page 2006

[11] [11]

Loi Ngo Duc Nguyen, Tahiatul Islam, Theron Wang, Sam Lenz, and Martin Kellogg. 2025. Static Program Reduction via Type-Directed Slicing.Proceedings of the ACM on Software Engineering2, ISSTA (2025), 2068–2090

work page 2025

[12] [12]

Diego Trevi no Ferrer. 2019. LLVM-Reduce for testcase reduction. https://llvm.org/devmtg/2019-10/talk-abstracts. html#tech22

work page 2019

[13] [13]

John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-case reduction for C compiler bugs. InProceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 335–346

work page 2012

[14] [14]

Luyao Ren, Xing Zhang, Ziyue Hua, Yanyan Jiang, Xiao He, Yingfei Xiong, and Tao Xie. 2025. Validity-Preserving Delta Debugging via Generator Trace Reduction.ACM Transactions on Software Engineering and Methodology34, 3 (2025), 1–33

work page 2025

[15] [15]

Daniil Stepanov, Marat Akhin, and Mikhail Belyaev. 2019. ReduKtor: How We Stopped Worrying About Bugs in Kotlin Compiler. InProceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 317–326

work page 2019

[16] [16]

Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding compiler bugs via live code mutation. InProceedings of the 31st ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM, 849–863

work page 2016

[17] [17]

Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided program reduction. InProceedings of the 40th International Conference on Software Engineering (ICSE). ACM, 361–371

work page 2018

[18] [18]

Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang. 2021. Probabilistic Delta Debugging. InProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 881–892. 22 Feng et al

work page 2021

[19] [19]

Ohlsson, Björn Regnell, and Anders Wesslén

Claes Wohlin, Per Runeson, Martin Höst, Magnus C. Ohlsson, Björn Regnell, and Anders Wesslén. 2012.Experimentation in Software Engineering. Springer

work page 2012

[20] [20]

Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. Boosting Program Reduction with the Missing Piece of Syntax-Guided Transformation.Proceedings of the ACM on Programming Languages9, OOPSLA2 (2025), 86–112

work page 2025

[21] [21]

Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, Jiarui Zhang, Puzhuo Liu, Yu Jiang, and Chengnian Sun. 2025. T-rec: Fine-grained language-agnostic program reduction guided by lexical syntax.ACM Transactions on Software Engineering and Methodology34, 2 (2025), 1–31

work page 2025

[22] [22]

Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, Gaosen Zhao, Yu Jiang, and Chengnian Sun. 2023. Pushing the limit of 1-minimality of language-agnostic program reduction.Proceedings of the ACM on Programming Languages7, OOPSLA1 (2023), 636–664

work page 2023

[23] [23]

Zhenyang Xu, Yiran Wang, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. Latra: A Template-Based Language-Agnostic Transformation Framework for Effective Program Reduction. InProceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2274–2285

work page 2025

[24] [24]

Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 283–294

work page 2011

[25] [25]

Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and isolating failure-inducing input.IEEE Transactions on Software Engineering28, 2 (2002), 183–200

work page 2002

[26] [26]

Jiang Zhang, Shuai Wang, Manuel Rigger, Pinjia He, and Zhendong Su. 2021. SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs. InProceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Association, 479–494

work page 2021

[27] [27]

Mengxiao Zhang, Yongqiang Tian, Zhenyang Xu, Yiwen Dong, Shin Hwei Tan, and Chengnian Sun. 2024. LPR: Large language models-aided program reduction. InProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA). ACM, 261–273

work page 2024

[28] [28]

Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Xinru Cheng, and Chengnian Sun. 2025. Toward a Better Under- standing of Probabilistic Delta Debugging. InProceedings of the 47th International Conference on Software Engineering (ICSE). ACM, 2024–2035

work page 2025

[29] [29]

Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Yu Jiang, and Chengnian Sun. 2023. PPR: Pairwise Program Reduction. InProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 338–349

work page 2023

[30] [30]

Xintong Zhou, Zhenyang Xu, Mengxiao Zhang, Yongqiang Tian, and Chengnian Sun. 2025. WDD: Weighted Delta Debugging. InProceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE). IEEE, 1592– 1603

work page 2025