pith. sign in

arxiv: 2403.06568 · v2 · submitted 2024-03-11 · 💻 cs.AI

Better Understandings and Configurations in MaxSAT Local Search Solvers via Anytime Performance Analysis

Pith reviewed 2026-05-24 02:55 UTC · model grok-4.3

classification 💻 cs.AI
keywords MaxSATlocal searchanytime performanceECDFautomatic configurationhyperparameter optimizationstochastic solversconvergence analysis
0
0 comments X

The pith

Tracking full convergence with ECDFs lets automatic configurators find better parameter settings for MaxSAT local search solvers than final solution quality alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Traditional MaxSAT solver assessments measure only the quality of the best solution reached inside a fixed time budget. This paper shows that Empirical Cumulative Distribution Functions applied to entire convergence traces across many instances and budgets expose how solver advantages shift with allowed runtime. It further demonstrates that feeding this anytime performance measure into an automatic configurator produces superior parameter settings compared with optimizing for end-of-run solution fitness.

Core claim

Empirical Cumulative Distribution Functions can be used to compare MaxSAT stochastic local search solvers' anytime performance across multiple problem instances and various time budgets. The assessment reveals distinctions in solvers' performance and displays that the (dis)advantages of solvers adjust along different running times. This quantitative assessment of anytime performance can guide machines, i.e., automatic configurators, to search for better parameter settings, with SMAC achieving better results when using anytime performance as the cost function.

What carries the argument

Empirical Cumulative Distribution Functions (ECDF) computed over solver convergence traces and used directly as the cost function inside hyperparameter optimization.

If this is right

  • Solver performance distinctions appear that final-solution metrics alone do not capture.
  • Automatic configurators locate better parameter settings when the objective is anytime performance rather than best-found fitness.
  • Solver (dis)advantages vary systematically with the allowed running time.
  • The same ECDF-based cost can be substituted into other hyperparameter tools for local search MaxSAT solvers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach supplies a practical way to trade off solution quality against speed when designing or tuning stochastic solvers.
  • Applying the same ECDF cost to configuration of local search methods on related problems such as MaxCSP could be tested directly.
  • Collecting longer traces on harder instances would reveal whether the observed time-dependent advantages persist or invert at larger scales.

Load-bearing premise

The chosen MaxSAT instances, time budgets, and ECDF aggregation produce a cost function whose superiority generalizes beyond the tested cases.

What would settle it

Re-run the configuration experiment on a disjoint collection of MaxSAT instances never seen during tuning; if the anytime-based configuration no longer outperforms the final-quality configuration on those instances, the claim fails.

Figures

Figures reproduced from arXiv: 2403.06568 by Chuan Luo, Furong Ye, Shaowei Cai.

Figure 1
Figure 1. Figure 1: Heatmap illustrating the scores for individual instances. Each row represents an instance, while each column represents [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Given a set Φ of solutions plotted by red stars, we present on the right subfigure the number of instances in Φ that are not better than the corresponding ϕo for each t, and the values normalized by the size of Φ are ECDFs. Because ECDF is with a ratio scale, we can evaluate the ECDFs of a solver A for a set of optimization times t to mea￾sure A’s anytime performance. ECDFs can also be combined [PITH_FULL… view at source ↗
Figure 3
Figure 3. Figure 3: Aggregated ECDFs of each solver for different sets [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Heatmap illustrating aggregated ECDFs for individual instances. It follows the layout of Figure 1, but the color depicts [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Aggregated ECDFs of the configurations obtained [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
Figure 1
Figure 1. Figure 1: Heatmap of scores of four solvers on groups of instances. [PITH_FULL_IMAGE:figures/full_fig_p014_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Heatmap of ECDFs of four solvers on groups of instances. [PITH_FULL_IMAGE:figures/full_fig_p015_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: ECDFs of the configurations obtained by tuning [PITH_FULL_IMAGE:figures/full_fig_p018_3.png] view at source ↗
read the original abstract

Though numerous solvers have been proposed for the MaxSAT problem, and the benchmark environment such as MaxSAT Evaluations provides a platform for the comparison of the state-of-the-art solvers, existing assessments were usually evaluated based on the quality, e.g., fitness, of the best-found solutions obtained within a given running time budget. However, concerning solely the final obtained solutions regarding specific time budgets may restrict us from comprehending the behavior of the solvers along the convergence process. This paper demonstrates that Empirical Cumulative Distribution Functions can be used to compare MaxSAT stochastic local search solvers' anytime performance across multiple problem instances and various time budgets. The assessment reveals distinctions in solvers' performance and displays that the (dis)advantages of solvers adjust along different running times. This work also exhibits that the quantitative and high variance assessment of anytime performance can guide machines, i.e., automatic configurators, to search for better parameter settings. Our experimental results show that the hyperparameter optimization tool, i.e., SMAC, can achieve better parameter settings of solvers when using the anytime performance as the cost function, compared to using the metrics based on the fitness of the best-found solutions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper argues that standard MaxSAT solver evaluations based solely on final solution fitness within fixed time budgets miss important distinctions in solver behavior during search. It proposes using Empirical Cumulative Distribution Functions (ECDF) aggregated over multiple instances and time budgets to quantify anytime performance, shows that this reveals time-dependent (dis)advantages among stochastic local search solvers, and reports that configuring solvers with SMAC using an ECDF-based cost function yields better parameter settings than using best-found fitness as the cost.

Significance. If the empirical superiority holds under proper controls, the work would strengthen the case for anytime metrics in automatic configuration of heuristic solvers, moving beyond single-point fitness to capture convergence dynamics. The ECDF approach itself is a standard tool in other domains and its application here is a clear methodological contribution that could be adopted more widely if the configuration results generalize.

major comments (2)
  1. [Experimental results / abstract] The central claim that SMAC produces superior configurations with the ECDF cost function (abstract and experimental results section) rests on an empirical comparison whose details are insufficient: no information is given on the number of independent SMAC runs, statistical significance tests between the two cost functions, variance across runs, or benchmark selection criteria.
  2. [Experimental results] The comparison between ECDF and fitness-based cost functions does not state whether the MaxSAT instances (or time budgets) used during SMAC configuration are disjoint from those used to evaluate the final configured solvers. Without a hold-out set, any reported advantage could result from overfitting to the specific benchmark collection rather than from a property of the ECDF metric.
minor comments (2)
  1. [Introduction] The paper should include references to prior uses of ECDF for anytime analysis in SAT or other combinatorial search domains to better situate the contribution.
  2. [Figures] Figure captions and axis labels for the ECDF plots should explicitly state the aggregation method (e.g., how multiple instances and time budgets are combined) for reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the referee's thorough review and constructive feedback. We provide point-by-point responses to the major comments and outline the revisions we intend to make.

read point-by-point responses
  1. Referee: [Experimental results / abstract] The central claim that SMAC produces superior configurations with the ECDF cost function (abstract and experimental results section) rests on an empirical comparison whose details are insufficient: no information is given on the number of independent SMAC runs, statistical significance tests between the two cost functions, variance across runs, or benchmark selection criteria.

    Authors: We concur that more details are required for the empirical comparison. The revised manuscript will specify the number of independent SMAC runs, include statistical significance tests between the configurations obtained with the two cost functions, report the variance across runs, and detail the benchmark selection criteria. These changes will be made in the experimental results section to strengthen the central claim. revision: yes

  2. Referee: [Experimental results] The comparison between ECDF and fitness-based cost functions does not state whether the MaxSAT instances (or time budgets) used during SMAC configuration are disjoint from those used to evaluate the final configured solvers. Without a hold-out set, any reported advantage could result from overfitting to the specific benchmark collection rather than from a property of the ECDF metric.

    Authors: We agree that the lack of a hold-out set is a limitation in the current experimental design. The manuscript does not indicate disjoint sets for configuration and evaluation. To address this, we will revise the experimental setup to include a hold-out set of MaxSAT instances and report the results of the configuration comparison on this unseen set. This will help demonstrate whether the advantage generalizes beyond the training benchmarks. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical comparison of cost functions with no derivations or self-referential reductions.

full rationale

The paper contains no equations, derivations, or mathematical claims that could reduce to inputs by construction. The central result is an empirical observation that SMAC produces better solver configurations when the cost function is an ECDF-based anytime performance metric rather than final fitness. This is a direct experimental comparison inside an off-the-shelf configurator; no fitted parameters are redefined as predictions, no self-citations are load-bearing for a uniqueness theorem, and no ansatz is smuggled. The absence of any derivation chain means the patterns in the circularity checklist do not apply. Generalization concerns (e.g., benchmark overlap) are validity issues, not circularity.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central empirical claim depends on the representativeness of the chosen benchmarks and time budgets plus the assumption that ECDF aggregation captures the performance aspects relevant to users.

free parameters (1)
  • Solver hyperparameters tuned by SMAC
    The parameters whose settings are optimized; their values are discovered by the configurator rather than fixed in advance.
axioms (1)
  • domain assumption The MaxSAT instances and time budgets used in the experiments are representative of practical use cases.
    The superiority claim for the ECDF cost function is only meaningful if the test set reflects real workloads.

pith-pipeline@v0.9.0 · 5733 in / 1224 out tokens · 22948 ms · 2026-05-24T02:55:18.125820+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages · 1 internal anchor

  1. [1]

    SCIP: solving constraint integer programs

    Tobias Achterberg. SCIP: solving constraint integer programs. Mathematical Programming Computation , 1(1):1--41, 2009

  2. [2]

    Stochastic local search for partial max-sat: an experimental evaluation

    Haifa Hamad AlKasem and Mohamed El Bachir Menai. Stochastic local search for partial max-sat: an experimental evaluation. Artificial Intelligence Review , 54(4):2525--2566, 2021

  3. [3]

    Maxsat evaluation 2023

    Jeremias Berg, Matti J \"a rvisalo, Ruben Martins, and Andreas Niskanen. Maxsat evaluation 2023. Department of Computer Science Series of Publications B , B-2023-2, 2023

  4. [4]

    Old techniques in new ways: Clause weighting, unit propagation and hybridization for maximum satisfiability

    Shaowei Cai and Zhendong Lei. Old techniques in new ways: Clause weighting, unit propagation and hybridization for maximum satisfiability. Artificial Intelligence , 287:103354, 2020

  5. [5]

    Tailoring local search for partial maxsat

    Shaowei Cai, Chuan Luo, John Thornton, and Kaile Su. Tailoring local search for partial maxsat. In Carla E. Brodley and Peter Stone, editors, Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, July 27 -31, 2014, Qu \' e bec City, Qu \' e bec, Canada , pages 2623--2629. AAAI Press, 2014

  6. [6]

    From decimation to local search and back: A new approach to maxsat

    Shaowei Cai, Chuan Luo, and Haochen Zhang. From decimation to local search and back: A new approach to maxsat. In Carles Sierra, editor, Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI 2017, Melbourne, Australia, August 19-25, 2017 , pages 571--577. ijcai.org, 2017

  7. [7]

    Nuwls: Improving local search for (weighted) partial maxsat by new weighting techniques

    Yi Chu, Shaowei Cai, and Chuan Luo. Nuwls: Improving local search for (weighted) partial maxsat by new weighting techniques. In Brian Williams, Yiling Chen, and Jennifer Neville, editors, Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteent...

  8. [8]

    IOHexperimenter : Benchmarking platform for iterative optimization heuristics

    Jacob de Nobel, Furong Ye, Diederick Vermetten, Hao Wang, Carola Doerr, and Thomas B \"a ck. IOHexperimenter : Benchmarking platform for iterative optimization heuristics. Evolutionary Computation , pages 1--6, 2023

  9. [9]

    Hall, Pietro S

    George T. Hall, Pietro S. Oliveto, and Dirk Sudholt. On the impact of the performance metric on efficient algorithm configuration. Artificial Intelligence , 303:103629, 2022

  10. [10]

    COCO: Performance Assessment

    Nikolaus Hansen, Anne Auger, Dimo Brockhoff, Dejan Tusar, and Tea Tusar. COCO: performance assessment. CoRR , abs/1605.03560, 2016

  11. [11]

    Anytime performance assessment in blackbox optimization benchmarking

    Nikolaus Hansen, Anne Auger, Dimo Brockhoff, and Tea Tusar. Anytime performance assessment in blackbox optimization benchmarking. IEEE Transactions on Evolutionary Computation , 26(6):1293--1305, 2022

  12. [12]

    Large neighbourhood search for anytime maxsat solving

    Randy Hickey and Fahiem Bacchus. Large neighbourhood search for anytime maxsat solving. In Luc De Raedt, editor, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022 , pages 1818--1824. ijcai.org, 2022

  13. [13]

    Solving (weighted) partial maxsat by dynamic local search for SAT

    Zhendong Lei and Shaowei Cai. Solving (weighted) partial maxsat by dynamic local search for SAT . In J \' e r \^ o me Lang, editor, Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI 2018, July 13-19, 2018, Stockholm, Sweden , pages 1346--1352. ijcai.org, 2018

  14. [14]

    SMAC3: A versatile bayesian optimization package for hyperparameter optimization

    Marius Lindauer, Katharina Eggensperger, Matthias Feurer, Andr \' e Biedenkapp, Difan Deng, Carolin Benjamins, Tim Ruhkopf, Ren \' e Sass, and Frank Hutter. SMAC3: A versatile bayesian optimization package for hyperparameter optimization. Journal of Machine Learning Research , 23:54:1--54:9, 2022

  15. [15]

    Automatically improving the anytime behaviour of optimisation algorithms

    Manuel L \' o pez - Ib \' a \ n ez and Thomas St \" u tzle. Automatically improving the anytime behaviour of optimisation algorithms. European Journal Of Operational Research , 235(3):569--582, 2014

  16. [16]

    The irace package: Iterated racing for automatic algorithm configuration

    Manuel L \'o pez-Ib \'a \ n ez, J \'e r \'e mie Dubois-Lacoste, Leslie P \'e rez C \'a ceres, Mauro Birattari, and Thomas St \"u tzle. The irace package: Iterated racing for automatic algorithm configuration. Operations Research Perspectives , 3:43--58, 2016

  17. [17]

    IOHanalyzer : Detailed performance analyses for iterative optimization heuristics

    Hao Wang, Diederick Vermetten, Furong Ye, Carola Doerr, and Thomas B \" a ck. IOHanalyzer : Detailed performance analyses for iterative optimization heuristics. ACM Transactions on Evolutionary Learning and Optimization , 2(1):3:1--3:29, 2022

  18. [18]

    Automated configuration of genetic algorithms by tuning for anytime performance

    Furong Ye, Carola Doerr, Hao Wang, and Thomas B \" a ck. Automated configuration of genetic algorithms by tuning for anytime performance. IEEE Transactions on Evolutionary Computation , 26(6):1526--1538, 2022

  19. [19]

    Bandmaxsat: A local search maxsat solver with multi-armed bandit

    Jiongzhi Zheng, Kun He, Jianrong Zhou, Yan Jin, Chu - Min Li, and Felip Many \` a . Bandmaxsat: A local search maxsat solver with multi-armed bandit. In Luc De Raedt, editor, Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, IJCAI 2022, Vienna, Austria, 23-29 July 2022 , pages 1901--1907. ijcai.org, 2022

  20. [20]

    Farsighted probabilistic sampling: A general strategy for boosting local search maxsat solvers

    Jiongzhi Zheng, Kun He, and Jianrong Zhou. Farsighted probabilistic sampling: A general strategy for boosting local search maxsat solvers. In Brian Williams, Yiling Chen, and Jennifer Neville, editors, Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2...

  21. [21]

    write newline

    " write newline "" before.all 'output.state := FUNCTION fin.entry add.period write newline FUNCTION new.block output.state before.all = 'skip after.block 'output.state := if FUNCTION new.sentence output.state after.block = 'skip output.state before.all = 'skip after.sentence 'output.state := if if FUNCTION not #0 #1 if FUNCTION and 'skip pop #0 if FUNCTIO...