Tractable Maximization of Budgeted Phylogenetic Diversity on Networks Utilizing Node Scanwidth
Pith reviewed 2026-05-25 03:05 UTC · model grok-4.3
The pith
Node scanwidth enables O*(2^nsw B^2) algorithms for budgeted PD on networks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Node scanwidth decompositions enable dynamic programming to optimize two budgeted PD variants on networks in O*(2^nsw B^2) time and to compute PD scores for the third variant in O*(3^nsw) time, together with the first exact methods to compute the scanwidth itself.
What carries the argument
Node scanwidth (nsw), a measure of the tree-likeness of a phylogenetic network used as the width of a decomposition for dynamic programming.
If this is right
- Two budgeted PD variants become solvable via DP when nsw is bounded.
- PD scores for the harder variant are computable in O*(3^nsw) time.
- Node scanwidth can be computed exactly for the first time using data reduction, DP and ILP.
- The methods handle heterogeneous costs on networks with several hundred taxa.
Where Pith is reading between the lines
- If many empirical phylogenetic networks have small node scanwidth, the algorithms could enable practical conservation analyses involving reticulation.
- The same parameterization may extend to other selection or diversity problems on networks.
- Pairing exact nsw computation with the PD solver yields an end-to-end pipeline for real datasets.
Load-bearing premise
The input networks admit a node scanwidth decomposition whose width is small enough to be practical and the three problem variants are correctly captured by the DP recurrences.
What would settle it
On a small network, compare the maximum PD value returned by the DP against the value from exhaustive enumeration of all budget-feasible subsets; any mismatch falsifies the correctness of the recurrences.
read the original abstract
Identifying a subset of taxa that maximizes Phylogenetic Diversity (PD) is a cornerstone of quantitative conservation planning. Traditionally, PD is defined over a phylogenetic tree in which leaves resemble present-day taxa and the branch lengths capture the estimated evolutionary distinctiveness. While PD maximization is computationally tractable on trees with unit costs, the problem becomes NP-hard when transitioning to phylogenetic networks or to budgeted versions in which protecting taxa incurs non-homogeneous costs. In this paper, we address these two challenges by providing definitions and a comprehensive analysis of three distinct variants of budgeted PD on networks. We conduct our study through the lens of a small structural parameter, node scanwidth (nsw), which measures the "tree-likeness" of a phylogenetic network. We show that two of the considered variants can be optimized in O*(2^nsw B^2) time, where B is the budget. For the computationally harder, third variant, we provide an algorithm to compute PD scores in O*(3^nsw) time. We further contribute the first exact algorithms to compute node scanwidth, recognizing that the utility of algorithms based on nsw depends on the ability to compute nsw and its corresponding decomposition. Our approaches integrate data reduction rules, dynamic programming, and an Integer Linear Programming formulation. We validate our theoretical results through extensive experiments on highly reticulated, simulated networks containing several hundred taxa, using heterogeneous costs. Our implementation computes PD scores and optimal nsw in fractions of a second, even on the most challenging instances. Furthermore, our budgeted optimization algorithms significantly outperform existing benchmarks for computing PD on networks, which were previously limited to unit-cost scenarios. The software makes analyses even on networks with a thousand taxa tracta...
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript defines three variants of budgeted phylogenetic diversity (PD) maximization on phylogenetic networks, introduces node scanwidth (nsw) as a structural parameter, and gives dynamic programming algorithms running in O*(2^{nsw} B^2) time for two variants and O*(3^{nsw}) time to compute PD scores for the third. It also supplies the first exact algorithms for computing nsw (via data reduction, DP, and ILP) and reports experiments on simulated reticulate networks with hundreds of taxa showing small nsw values, fast runtimes, and improvement over unit-cost baselines.
Significance. If the claims hold, the work is significant for providing the first parameterized, practical algorithms for budgeted PD on networks. The explicit DP tables, nice-decomposition correctness arguments, and exact nsw computation directly address the NP-hardness barrier while demonstrating tractability on instances with hundreds of taxa; the reproducible experimental validation on heterogeneous costs further strengthens the contribution to computational phylogenetics.
Simulated Author's Rebuttal
We thank the referee for the positive review, detailed summary of our contributions, and recommendation to accept. There are no major comments requiring a point-by-point response.
Circularity Check
No significant circularity identified
full rationale
The derivation chain consists of an explicit definition of node scanwidth (nsw) as a width parameter on phylogenetic networks, followed by new dynamic-programming recurrences whose state spaces are bounded by 2^nsw or 3^nsw and whose correctness is argued by standard induction on nice decompositions. The claimed running times follow directly from the size of those tables; the algorithms for computing nsw itself are presented as independent contributions using data reduction and ILP. No step reduces a claimed result to a fitted quantity, a self-citation chain, or a redefinition of the input; the central claims remain independent of the paper's own outputs.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Dynamic programming over a node scanwidth decomposition correctly solves the budgeted PD variants when the decomposition is given.
Reference graph
Works this paper leans on
-
[1]
Jones, Mark and Schestag, Jannik , booktitle =. 2023 , organization =
work page 2023
-
[2]
van Iersel, Leo and Jones, Mark and Schestag, Jannik and Scornavacca, Celine and Weller, Mathias , booktitle =. 2025 , organization =
work page 2025
-
[3]
Average-Tree Phylogenetic Diversity Parameterized by Scanwidth and Invisibility
van Iersel, Leo and Jones, Mark and Schestag, Jannik and Scornavacca, Celine and Weller, Mathias , year =. arXiv , jorunal =:2604.27745 , TODOdoi =
work page internal anchor Pith review Pith/arXiv arXiv
-
[4]
Jones, Mark and Schestag, Jannik , booktitle =. 2025 , organization =
work page 2025
-
[5]
bioRxiv: 10.1101/2025.11.14.688467 , doi =
Holtgrefe, Niels and van Iersel, Leo and Meuwese, Ruben and Murakami, Yukihiro and Schestag, Jannik , year =. bioRxiv: 10.1101/2025.11.14.688467 , doi =
-
[6]
Komusiewicz, Christian and Schestag, Jannik , booktitle =. 2024 , pages =
work page 2024
- [7]
-
[8]
arXiv , jorunal =:2602.06903 , TODOdoi =
Jannik Schestag and Norbert Zeh , year =. arXiv , jorunal =:2602.06903 , TODOdoi =
-
[9]
Niels Holtgrefe and Jannik Schestag and Norbert Zeh , booktitle =. 2026 , organization =. arXiv , jorunal =:2602.12959 , TODOdoi =
-
[10]
Komusiewicz, Christian and Schestag, Jannik , journal =. 2025 , publisher =
work page 2025
- [11]
-
[12]
and Matzke, Nicholas and Tomiya, Susumu and others , journal =
Barnosky, Anthony D. and Matzke, Nicholas and Tomiya, Susumu and others , journal =. 2011 , publisher =
work page 2011
-
[13]
and Bouchet, Philippe and Fontaine, Beno
Cowie, Robert H. and Bouchet, Philippe and Fontaine, Beno. Biological Reviews , volume =. 2022 , publisher =
work page 2022
-
[14]
Vane-Wright, Richard I. and Humphries, Christopher J. and Williams, Paul H. , journal =. 1991 , publisher =
work page 1991
-
[15]
Margules, Christopher R. and Pressey, Robert L. , journal =. 2000 , publisher =
work page 2000
-
[16]
Sarkar, Sahotra and Pressey, Robert L. and Faith, Daniel P. and others , journal =. 2006 , publisher =
work page 2006
- [17]
- [18]
- [19]
-
[20]
Minh, Bui Quang and Klaere, Steffen and von Haeseler, Arndt , journal =. 2006 , month =
work page 2006
- [21]
- [22]
-
[23]
Bordewich, Magnus and Semple, Charles and Wicke, Kristina , journal =. 2022 , publisher =
work page 2022
- [24]
- [25]
- [26]
-
[27]
Proceedings of the 39th annual ACM Symposium on Theory of Computing (STOC 2007) , pages =
Bj. Proceedings of the 39th annual ACM Symposium on Theory of Computing (STOC 2007) , pages =. 2007 , doi =
work page 2007
-
[28]
and Kowalik, Lukasz and Lokshtanov, Daniel and Marx, D
Cygan, Marek and Fomin, Fedor V. and Kowalik, Lukasz and Lokshtanov, Daniel and Marx, D. 2015 , doi =
work page 2015
-
[29]
Concrete mathematics: a foundation for computer science , author =. 1994 , url =
work page 1994
-
[30]
Journal of Computer and System Sciences , doi=
Holtgrefe, Niels and van Iersel, Leo and Jones, Mark , title =. Journal of Computer and System Sciences , doi=
- [31]
-
[32]
and Solis-Lemus, Claudia and Heath, Tracy A
Justison, Joshua A. and Solis-Lemus, Claudia and Heath, Tracy A. , journal =. 2023 , publisher =
work page 2023
- [33]
- [34]
-
[35]
Koster, Arie M.C.A. and Bodlaender, Hans L. and Van Hoesel, Stan P.M. , journal=. 2001 , publisher=
work page 2001
-
[36]
and Kaski, Petteri and Komusiewicz, Christian and Rosamond, Frances A
Dell, Holger and Husfeldt, Thore and Jansen, Bart M.P. and Kaski, Petteri and Komusiewicz, Christian and Rosamond, Frances A. , booktitle =. 2017 , organization=
work page 2017
-
[37]
BMC Evolutionary Biology , volume =
Koblm. BMC Evolutionary Biology , volume =. 2007 , publisher =
work page 2007
-
[38]
Huson, Daniel H and Rupp, Regula and Scornavacca, Celine , year =
-
[39]
Moulton, Vincent and Semple, Charles and Steel, Mike , journal =. 2007 , publisher =
work page 2007
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.