The parental parsimony problem on binary, tree-child phylogenetic networks
Pith reviewed 2026-06-26 03:07 UTC · model grok-4.3
The pith
The parental parsimony score problem admits a constant-factor approximation on binary semi-simplex tree-child phylogenetic networks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We provide the first constant-factor approximation for the PPS on arbitrary but fixed leaf labels and a class of networks on which the PPS remains NP-hard, namely binary, semi-simplex, tree-child phylogenetic networks. Furthermore, we introduce a novel exact solution algorithm for the PPS on binary, tree-child phylogenetic networks and analyze its performance on simulated data.
What carries the argument
The parental parsimony score (PPS) that extends a fixed leaf labelling to a most-parsimonious tree inside the network while accounting for reticulation events.
If this is right
- PPS on binary semi-simplex tree-child networks can be solved to within a constant factor of optimality in polynomial time.
- An exact algorithm computes optimal solutions for the PPS restricted to binary tree-child networks.
- The exact algorithm's practical performance is measurable on simulated phylogenetic network data.
Where Pith is reading between the lines
- The semi-simplex restriction may be the precise boundary that separates constant-factor approximability from inapproximability inside the tree-child class.
- Hybrid exact-plus-approximation schemes become feasible once the exact solver is limited to the tree-child case and the approximation handles the semi-simplex case.
- The same algorithmic pattern could be tested on other parsimony variants that incorporate allopolyploidy or incomplete lineage sorting.
Load-bearing premise
The input networks belong to the restricted class of binary, semi-simplex, tree-child phylogenetic networks.
What would settle it
A family of binary semi-simplex tree-child networks on which every polynomial-time algorithm for PPS returns solutions whose cost exceeds any fixed multiple of the optimum would falsify the constant-factor claim.
Figures
read the original abstract
Phylogenetic reconstruction is one of the major challenges in computational biology. Among existing reconstruction methods for phylogenetic networks, an important subtask emerges in extending a leaf-labelling on a phylogenetic network to determine a most parsimonious tree inside the network. There exist different variants of this subtask depending on the biological model assumptions for which distinct evolutionary phenomena are captured by the network. In this article we assume that next to hybridization or recombination events, also allopolyploidy or incomplete lineage sorting are present. Then, finding the most parsimonious tree inside the network is called the parental parsimony score problem (PPS), a NP-hard combinatorial optimization problem. We provide the first constant-factor approximation for the PPS on arbitrary but fixed leaf labels and a class of networks on which the PPS remains NP-hard, namely binary, semi-simplex, tree-child phylogenetic networks. Furthermore, we introduce a novel exact solution algorithm for the PPS on binary, tree-child phylogenetic networks and analyze its performance on simulated data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript studies the parental parsimony score (PPS) problem on phylogenetic networks under models that include hybridization, recombination, allopolyploidy and incomplete lineage sorting. It proves NP-hardness of PPS on the class of binary, semi-simplex, tree-child networks, gives the first constant-factor approximation algorithm for the problem with arbitrary but fixed leaf labels on this class, presents a novel exact algorithm for the PPS on binary tree-child networks, and reports an experimental evaluation of the exact algorithm on simulated data.
Significance. If the stated approximation ratio is constant and the exact algorithm is correct, the results supply the first non-trivial algorithmic guarantees for an NP-hard variant of the PPS on a natural, restricted network class where hardness is also established. The empirical analysis of the exact method on simulated instances further supports practical utility in phylogenetic reconstruction.
minor comments (2)
- [Abstract] The abstract states that a constant-factor approximation is provided but does not name the factor; the introduction or the statement of the main theorem should give the explicit ratio.
- The precise definition of the semi-simplex property and its role in both the hardness proof and the approximation should be recalled or cross-referenced at the beginning of the algorithmic sections.
Simulated Author's Rebuttal
We thank the referee for their positive summary of our work on the parental parsimony score problem, for recognizing its significance, and for recommending minor revision. No major comments were listed in the report.
Circularity Check
No significant circularity; algorithmic result is self-contained
full rationale
The paper states an algorithmic result: a constant-factor approximation for PPS on binary, semi-simplex, tree-child networks (where NP-hardness also holds) plus an exact algorithm for binary tree-child networks. The abstract and claim formulation contain no equations, fitted parameters, predictions, or self-citations that reduce the central claim to its own inputs by construction. The network class restriction is explicitly part of the stated result rather than an unacknowledged circular premise. No load-bearing step matches any enumerated circularity pattern.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Branching rules revisited
Achterberg, T., Koch, T., Martin, A., 2005. Branching rules revisited. Operations Research Letters 33, 42–54
2005
-
[2]
Finding cuts in the TSP (A preliminary report)
Applegate, D., Bixby, R., Chvátal, V., Cook, W., 1995. Finding cuts in the TSP (A preliminary report). Technical Report 95-05. DIMACS
1995
-
[3]
Inferring phylogenies
Felsenstein, J., 2004. Inferring phylogenies. Sinauer associates Sunderland, MA
2004
-
[4]
On computing the maximum parsimon score of a phylogenetic network
Fischer, M., van Iersel, L., Kelk, S., Scornavacca, C., 2015. On computing the maximum parsimon score of a phylogenetic network. SIAM Journal on Discrete Mathematics 29, 559–585
2015
-
[5]
Towarddefiningthecourseofevolution:Minimumchangeforaspecifiedtreetopology
Fitch,W.M.,1971. Towarddefiningthecourseofevolution:Minimumchangeforaspecifiedtreetopology. SystematicZoology20,406–416
1971
-
[6]
A 2-approximation algorithm for the softwired parsimony problem on binary, tree-child phylogenetic networks
Frohn, M., Kelk, S., 2025. A 2-approximation algorithm for the softwired parsimony problem on binary, tree-child phylogenetic networks. Annals of Operations Research 345, 125–145
2025
-
[7]
A dichotomy law for certain classes of phylogenetic networks
Fuchs, M., Steel, M., 2025. A dichotomy law for certain classes of phylogenetic networks. Bulletin of Mathematical Biology 87, 153
2025
-
[8]
Phylogenetic networks from multi-labelled trees
Huber, K., Moulton, V., 2006. Phylogenetic networks from multi-labelled trees. Journal of Mathematical Biology 52, 613–632
2006
-
[9]
Folding and unfolding phylogenetic trees and networks
Huber, K., Moulton, V., Steel, M., Wu, T., 2016. Folding and unfolding phylogenetic trees and networks. Journal of Mathematical Biology 73, 1761–1780
2016
-
[10]
Apracticalfixed-parameteralgorithmforconstructingtree-childnetworks from multiple binary trees
vanIersel,L.,Janssen,R.,Jones,M.,Murakami,Y.,Zeh,N.,2022. Apracticalfixed-parameteralgorithmforconstructingtree-childnetworks from multiple binary trees. Algorithmica 84, 917–960
2022
-
[11]
Maximum likelihood of phylogenetic networks
Jin, G., Nakhleh, L., Snir, S., Tuller, T., 2006. Maximum likelihood of phylogenetic networks. Bioinformatics 22, 2604–2611
2006
-
[12]
Approximationalgorithmsforclassificationproblemswithpairwiserelationships:MetriclabelingandMarkov random fields
Kleinberg,J.,Tardos,E.,2002. Approximationalgorithmsforclassificationproblemswithpairwiserelationships:MetriclabelingandMarkov random fields. Journal of the ACM 49, 616–639. Frohnet al.:Preprint submitted to ElsevierPage 18 of 19 The PPS on binary, tree-child networks
2002
-
[13]
Classes of explicit phylogenetic networks and their biological and mathematical significance
Kong, S., Pons, J.C., Kubatko, L., Wicke, K., 2022. Classes of explicit phylogenetic networks and their biological and mathematical significance. Journal of Mathematical Biology 84
2022
-
[14]
NetRAX:accurateandfastmaximumlikelihoodphylogenetic network inference
Lutteropp,S.,Scornavacca,C.,Kozlov,A.M.,Morel,B.,Stamatakis,A.,2022. NetRAX:accurateandfastmaximumlikelihoodphylogenetic network inference. Bioinformatics 38, 3725–3733
2022
-
[15]
Reconstructing phylogenetic networks using maximum parsimony, in: 2005 IEEE Computational Systems Bioinformatics Conference (CSB’05), IEEE
Nakhleh, L., Jin, G., Zhao, F., Mellor-Crummey, J., 2005. Reconstructing phylogenetic networks using maximum parsimony, in: 2005 IEEE Computational Systems Bioinformatics Conference (CSB’05), IEEE. pp. 93–102
2005
-
[16]
Reconstructing recombination network from sequence data: the small parsimony problem
Nguyen, C.T., Nguyen, N.B., Sung, W.K., Zhang, L., 2007. Reconstructing recombination network from sequence data: the small parsimony problem. IEEE/ACM Transactions on Computational Biology and Bioinformatics 4, 394–402
2007
-
[17]
Machine learning augmented branch and bound for mixed integer linear programming
Scavuzzo, L., Aardal, K., Lodi, A., Yorke-Smith, N., 2024. Machine learning augmented branch and bound for mixed integer linear programming. Mathematical Programming , 1–44
2024
-
[18]
Schmidt, H., Raphael, B.J., 2025. The tree labeling polytope: a unified approach to ancestral reconstruction problems, in: International Conference on Research in Computational Molecular Biology, Stringer Nature Switzerland. pp. 273–276
2025
-
[19]
Ngesh: a Python library for synthetic phylogenetic data
Tresoldi, T., 2021. Ngesh: a Python library for synthetic phylogenetic data. Journal of Open Source Software 6, 3173
2021
-
[20]
Improved maximum parsimony models for phylogenetic networks
Van Iersel, L., Jones, M., Scornavacca, C., 2018. Improved maximum parsimony models for phylogenetic networks. Systematic biology 67, 518–542
2018
-
[21]
Locating a tree in a phylogenetic network
Van Iersel, L., Semple, C., Steel, M., 2010. Locating a tree in a phylogenetic network. Information Processing Letters 110, 1037–1043
2010
-
[22]
Polyploidyinfungi:evolutionafterwhole-genomeduplication
Warren,A.,Marullo,P.,2012. Polyploidyinfungi:evolutionafterwhole-genomeduplication. ProceedingsoftheRoyalSocietyB:Biological Sciences 279, 2497–2509
2012
-
[23]
Parsimoniousinferenceofhybridizationinthepresenceofincompletelineagesorting
Y.,Y.,M.,B.R.,Nakhleh,L.,2013. Parsimoniousinferenceofhybridizationinthepresenceofincompletelineagesorting. SystematicBiology 62, 738–751. Frohnet al.:Preprint submitted to ElsevierPage 19 of 19
2013
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.