DRReduce: Enhancing Syntax-Guided Program Reduction with Dependency Reconstruction
Pith reviewed 2026-05-20 04:44 UTC · model grok-4.3
The pith
DRReduce repairs broken semantic dependencies after syntax deletions to boost acceptance rates and shrink programs more effectively than pure syntax-guided reducers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DRReduce constructs a semantic dependency graph from the input program, performs semantically coherent deletions with dependency reconstruction to preserve the semantic validity of intermediate programs, and delegates further minimization to a syntax-guided reducer. On real-world bug-triggering programs this yields average size reductions of 51.9 percent over Perses, 14.9 percent over WDD, and 19.8 percent over CDD while finishing faster on most cases; it produces results comparable to CReduce and Latra yet runs 3.3 times faster than CReduce and 1.2 times faster than Latra on average.
What carries the argument
Dependency reconstruction, which repairs program dependencies broken by a deletion in order to preserve the semantic validity of intermediate programs and increase the acceptance rate of the property checker.
If this is right
- DRReduce produces final reduced programs that are on average more than 50 percent smaller than those from Perses.
- The method finishes reduction faster than Perses, WDD, and CDD on the majority of tested programs.
- Dependency reconstruction cuts the number of property-checker queries by roughly 80 percent and total reduction time by about 59 percent.
- DRReduce reaches reduction quality comparable to language-specific tools while remaining fully language-agnostic.
Where Pith is reading between the lines
- The same dependency-reconstruction idea could be inserted into other syntax-driven transformation tasks such as code simplification or refactoring pipelines.
- Extending the dependency-graph construction to additional languages would directly test whether the efficiency gains generalize beyond C and Java.
- If the overhead of graph construction remains low on very large inputs, the technique might enable reduction of programs that current syntax-only reducers cannot finish in reasonable time.
Load-bearing premise
A lightweight semantic dependency graph can be built from the input program and dependency reconstruction can reliably restore semantic validity after deletions without introducing new failures or excessive overhead.
What would settle it
Applying DRReduce and Perses to the same collection of real bug-triggering programs and observing that DRReduce does not produce smaller final token counts or require fewer property-checker queries on average would falsify the performance claims.
Figures
read the original abstract
Program reduction is a technique for simplifying large, failure-inducing programs into minimal reproducible test cases. Language-specific tools such as CReduce achieve strong performance by leveraging deep semantic knowledge of C/C++, but are tightly coupled to a single language family. Language-agnostic reducers such as Perses address this by applying syntax-guided search across any grammar, yet share a fundamental limitation: deleting a node or subtree in isolation often breaks semantic coherence causing the property checker to reject the deletion and forcing the reducer to backtrack, limiting overall reduction effectiveness and efficiency. In this paper, we propose DRReduce, a framework that bridges this gap by augmenting language-agnostic syntactic reduction with a lightweight semantic layer: dependency reconstruction, which repairs program dependencies broken by a deletion in order to preserve the semantic validity of intermediate programs and increase the acceptance rate of the property checker. DRReduce constructs a semantic dependency graph from the input program, performs semantically coherent deletions with dependency reconstruction, and delegates further minimization to a syntax-guided reducer. We implement DRReduce for C and Java and evaluate it on real-world bug-triggering programs. Compared to SOTA syntax-guided reducers, DRReduce achieves average size reductions of 51.9%, 14.9%, and 19.8% over Perses, WDD, and CDD respectively, while completing reduction faster on the majority of programs. Compared to language-specific tools, DRReduce achieves results comparable to CReduce and Latra without any language-specific transformation rules, at 3.3x and 1.2x higher efficiency than CReduce and Latra on average, respectively. An ablation study confirms that dependency reconstruction reduces query invocations by 80.2%, reduction time by 58.7%, and final token count by over 55.1%.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces DRReduce, a framework that augments language-agnostic syntax-guided program reduction with dependency reconstruction. It constructs a semantic dependency graph from the input program, performs deletions followed by reconstruction to repair broken dependencies and preserve semantic validity of intermediate programs, then delegates further minimization to a syntax-guided reducer. The authors claim average size reductions of 51.9%, 14.9%, and 19.8% over Perses, WDD, and CDD respectively, faster completion on most programs, and results comparable to CReduce and Latra (at 3.3x and 1.2x higher efficiency) without language-specific rules; an ablation study reports 80.2% fewer queries, 58.7% less time, and over 55.1% smaller final token count.
Significance. If the results hold and reductions preserve original failure-inducing behavior, the work would be significant for bridging syntax-guided and language-specific reduction without custom rules, potentially improving minimal test-case generation for debugging across languages. The ablation directly attributes gains to the reconstruction component.
major comments (2)
- [Evaluation] Evaluation section: the abstract and reported results give concrete percentage improvements and ablation numbers, yet supply no details on the number of subject programs, selection criteria, variance, or statistical tests. This makes it impossible to assess whether the 51.9%, 14.9%, and 19.8% gains are robust.
- [Dependency reconstruction description] Dependency reconstruction description (abstract and §3): the central claim rests on reconstruction increasing property-checker acceptance while preserving the original bug-triggering condition. No section verifies that reconstructed programs still reproduce the reported failure; if reconstruction alters data/control flow enough to eliminate the trigger, the checker may accept a non-triggering program, producing invalid reductions.
minor comments (1)
- [Abstract] The abstract could more explicitly state the exact number and characteristics of the real-world bug-triggering programs used in the evaluation.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback. We address each major comment below and will revise the manuscript to improve clarity and completeness.
read point-by-point responses
-
Referee: [Evaluation] Evaluation section: the abstract and reported results give concrete percentage improvements and ablation numbers, yet supply no details on the number of subject programs, selection criteria, variance, or statistical tests. This makes it impossible to assess whether the 51.9%, 14.9%, and 19.8% gains are robust.
Authors: We agree that the evaluation section would benefit from additional details to allow readers to assess robustness. The current manuscript reports results on real-world bug-triggering programs but does not explicitly state the exact count, selection criteria, variance, or statistical tests. In the revised version we will add these: the number of programs, how they were selected, standard deviations or ranges for the reduction metrics, and results of statistical significance tests such as the Wilcoxon signed-rank test. revision: yes
-
Referee: [Dependency reconstruction description] Dependency reconstruction description (abstract and §3): the central claim rests on reconstruction increasing property-checker acceptance while preserving the original bug-triggering condition. No section verifies that reconstructed programs still reproduce the reported failure; if reconstruction alters data/control flow enough to eliminate the trigger, the checker may accept a non-triggering program, producing invalid reductions.
Authors: We appreciate this concern regarding the validity of reductions. By design, DRReduce applies the property checker (which verifies reproduction of the original failure) after each reconstruction step; only programs that pass the checker are accepted for further reduction. This ensures that accepted programs continue to trigger the reported failure. However, the manuscript does not contain an explicit verification subsection or additional results confirming this for all cases. We will revise §3 to describe the post-reconstruction checker invocation more clearly and add a short verification paragraph in the evaluation section reporting that all final reduced programs reproduce the original failures. revision: yes
Circularity Check
No circularity: empirical claims rest on external baselines
full rationale
The paper proposes DRReduce as a syntactic reducer augmented by dependency reconstruction and validates it solely via direct experimental comparison against independent external tools (Perses, WDD, CDD, CReduce, Latra) on real-world bug-triggering programs. No equations, fitted parameters, or predictions are defined inside the paper and then re-used as outputs; the acceptance-rate and size-reduction numbers are measured quantities, not tautological re-statements of the method itself. The dependency-reconstruction step is described as an algorithmic addition whose benefit is quantified by ablation and cross-tool runs rather than by self-definition or self-citation chains.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/AbsoluteFloorClosure.leanreality_from_one_distinction unclear?
unclearRelation between the paper passage and the cited Recognition theorem.
DRReduce constructs a semantic dependency graph... performs semantically coherent deletions with dependency reconstruction... delegates further minimization to a syntax-guided reducer
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Stefanos Chaliasos, Thodoris Sotiropoulos, Diomidis Spinellis, Arthur Gervais, Benjamin Livshits, and Dimitris Mitropoulos. 2022. Finding typing compiler bugs. InProceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI). ACM, 183–198
work page 2022
-
[2]
GNU Compiler Collection Contributors. 2026. Reporting Bugs. https://gcc.gnu.org/bugs/
work page 2026
-
[3]
Qiong Feng, Xiaotian Ma, Ziyuan Feng, Marat Akhin, Wei Song, and Peng Liang. 2025. Finding Compiler Bugs through Cross-Language Code Generator and Differential Testing.Proceedings of the ACM on Programming Languages9, OOPSLA2 (2025), 2843–2869
work page 2025
-
[4]
Golnaz Gharachorlu and Nick Sumner. 2021. Leveraging models to reduce test cases in software repositories. In Proceedings of the 18th IEEE/ACM International Conference on Mining Software Repositories (MSR). IEEE, 230–241
work page 2021
-
[5]
JetBrains. 2024. Program Structure Interface. https://plugins.jetbrains.com/docs/intellij/psi.html
work page 2024
-
[6]
Christian Gram Kalhauge and Jens Palsberg. 2019. Binary reduction of dependency graphs. InProceedings of the 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 556–566
work page 2019
-
[7]
Christian Gram Kalhauge and Jens Palsberg. 2021. Logical bytecode reduction. InProceedings of the 42nd ACM SIGPLAN International Conference on Programming Language Design and Implementation (PLDI). ACM, 1003–1016
work page 2021
-
[8]
Christopher Lidbury, Andrei Lascu, Nathan Chong, and Alastair F Donaldson. 2015. Many-core compiler fuzzing. ACM SIGPLAN Notices50, 6 (2015), 65–76
work page 2015
-
[9]
LLVM Project. 2026. How to Submit a Bug Report. https://llvm.org/docs/HowToSubmitABug.html
work page 2026
-
[10]
Ghassan Misherghi and Zhendong Su. 2006. HDD: Hierarchical delta debugging. InProceedings of the 28th International Conference on Software Engineering (ICSE). ACM, 142–151
work page 2006
-
[11]
Loi Ngo Duc Nguyen, Tahiatul Islam, Theron Wang, Sam Lenz, and Martin Kellogg. 2025. Static Program Reduction via Type-Directed Slicing.Proceedings of the ACM on Software Engineering2, ISSTA (2025), 2068–2090
work page 2025
-
[12]
Diego Trevi no Ferrer. 2019. LLVM-Reduce for testcase reduction. https://llvm.org/devmtg/2019-10/talk-abstracts. html#tech22
work page 2019
-
[13]
John Regehr, Yang Chen, Pascal Cuoq, Eric Eide, Chucky Ellison, and Xuejun Yang. 2012. Test-case reduction for C compiler bugs. InProceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 335–346
work page 2012
-
[14]
Luyao Ren, Xing Zhang, Ziyue Hua, Yanyan Jiang, Xiao He, Yingfei Xiong, and Tao Xie. 2025. Validity-Preserving Delta Debugging via Generator Trace Reduction.ACM Transactions on Software Engineering and Methodology34, 3 (2025), 1–33
work page 2025
-
[15]
Daniil Stepanov, Marat Akhin, and Mikhail Belyaev. 2019. ReduKtor: How We Stopped Worrying About Bugs in Kotlin Compiler. InProceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 317–326
work page 2019
-
[16]
Chengnian Sun, Vu Le, and Zhendong Su. 2016. Finding compiler bugs via live code mutation. InProceedings of the 31st ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications (OOPSLA). ACM, 849–863
work page 2016
-
[17]
Chengnian Sun, Yuanbo Li, Qirun Zhang, Tianxiao Gu, and Zhendong Su. 2018. Perses: Syntax-guided program reduction. InProceedings of the 40th International Conference on Software Engineering (ICSE). ACM, 361–371
work page 2018
-
[18]
Guancheng Wang, Ruobing Shen, Junjie Chen, Yingfei Xiong, and Lu Zhang. 2021. Probabilistic Delta Debugging. InProceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 881–892. 22 Feng et al
work page 2021
-
[19]
Ohlsson, Björn Regnell, and Anders Wesslén
Claes Wohlin, Per Runeson, Martin Höst, Magnus C. Ohlsson, Björn Regnell, and Anders Wesslén. 2012.Experimentation in Software Engineering. Springer
work page 2012
-
[20]
Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. Boosting Program Reduction with the Missing Piece of Syntax-Guided Transformation.Proceedings of the ACM on Programming Languages9, OOPSLA2 (2025), 86–112
work page 2025
-
[21]
Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, Jiarui Zhang, Puzhuo Liu, Yu Jiang, and Chengnian Sun. 2025. T-rec: Fine-grained language-agnostic program reduction guided by lexical syntax.ACM Transactions on Software Engineering and Methodology34, 2 (2025), 1–31
work page 2025
-
[22]
Zhenyang Xu, Yongqiang Tian, Mengxiao Zhang, Gaosen Zhao, Yu Jiang, and Chengnian Sun. 2023. Pushing the limit of 1-minimality of language-agnostic program reduction.Proceedings of the ACM on Programming Languages7, OOPSLA1 (2023), 636–664
work page 2023
-
[23]
Zhenyang Xu, Yiran Wang, Yongqiang Tian, Mengxiao Zhang, and Chengnian Sun. 2025. Latra: A Template-Based Language-Agnostic Transformation Framework for Effective Program Reduction. InProceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering (ASE). IEEE, 2274–2285
work page 2025
-
[24]
Xuejun Yang, Yang Chen, Eric Eide, and John Regehr. 2011. Finding and understanding bugs in C compilers. In Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). ACM, 283–294
work page 2011
-
[25]
Andreas Zeller and Ralf Hildebrandt. 2002. Simplifying and isolating failure-inducing input.IEEE Transactions on Software Engineering28, 2 (2002), 183–200
work page 2002
-
[26]
Jiang Zhang, Shuai Wang, Manuel Rigger, Pinjia He, and Zhendong Su. 2021. SANRAZOR: Reducing Redundant Sanitizer Checks in C/C++ Programs. InProceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation (OSDI). USENIX Association, 479–494
work page 2021
-
[27]
Mengxiao Zhang, Yongqiang Tian, Zhenyang Xu, Yiwen Dong, Shin Hwei Tan, and Chengnian Sun. 2024. LPR: Large language models-aided program reduction. InProceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA). ACM, 261–273
work page 2024
-
[28]
Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Xinru Cheng, and Chengnian Sun. 2025. Toward a Better Under- standing of Probabilistic Delta Debugging. InProceedings of the 47th International Conference on Software Engineering (ICSE). ACM, 2024–2035
work page 2025
-
[29]
Mengxiao Zhang, Zhenyang Xu, Yongqiang Tian, Yu Jiang, and Chengnian Sun. 2023. PPR: Pairwise Program Reduction. InProceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE). ACM, 338–349
work page 2023
-
[30]
Xintong Zhou, Zhenyang Xu, Mengxiao Zhang, Yongqiang Tian, and Chengnian Sun. 2025. WDD: Weighted Delta Debugging. InProceedings of the 47th IEEE/ACM International Conference on Software Engineering (ICSE). IEEE, 1592– 1603
work page 2025
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.