pith. sign in

arxiv: 2605.23126 · v1 · pith:U3QI4S5Tnew · submitted 2026-05-22 · 🧬 q-bio.PE

Asymptotic Counting of Binary Phylogenetic Networks

Pith reviewed 2026-05-25 02:55 UTC · model grok-4.3

classification 🧬 q-bio.PE
keywords phylogenetic networksbinary networksreticulationsasymptotic enumerationtree-child networksedge insertion
0
0 comments X

The pith

When k=o(sqrt n), binary phylogenetic networks with k reticulations on n taxa number asymptotically binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n}.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives an asymptotic formula for counting binary phylogenetic networks that allow reticulations to grow slowly with the number of taxa. It starts from known counts for the stricter tree-child subclass and shows that the extra networks introduced by general binary structures contribute only lower-order terms. A reader would care because these networks represent possible evolutionary histories that include hybridization or horizontal transfer, so an explicit leading-term count gives a concrete size for the space of such histories. The derivation relies on edge-insertion constructions to control local configurations and on subtracting the exceptional cases from the tree-child total.

Core claim

Using edge insertion, the authors analyze local structures that affect constructions of binary phylogenetic networks. By bounding the contribution of networks with exceptional local configurations and combining these bounds with known asymptotic formulas for tree-child networks, they show that when k=o(sqrt n), the number of binary phylogenetic networks with k reticulations on n taxa is asymptotic to binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n}.

What carries the argument

Edge-insertion constructions that generate networks while bounding the contribution of exceptional local configurations, subtracted from the tree-child asymptotic count.

If this is right

  • The leading term is the same as the tree-child count multiplied by the binomial factor binom(n,k).
  • Most binary phylogenetic networks with o(sqrt n) reticulations are tree-child.
  • Enumeration algorithms can safely ignore or correct for a vanishing fraction of non-tree-child structures in this range.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same edge-insertion technique might be refined to obtain asymptotics for k up to n^{1/3} or higher by tightening the exceptional-configuration bounds.
  • The formula supplies a natural null model for statistical tests that ask whether an observed set of networks is larger or smaller than expected under uniform sampling.
  • If the asymptotic continues to hold for k linear in n, it would imply that the total number of binary networks grows like the number of labeled trees times an exponential factor in k.

Load-bearing premise

The contribution of networks containing exceptional local configurations can be bounded so tightly that it does not affect the leading asymptotic term.

What would settle it

Exact enumeration of all binary phylogenetic networks for n=30 and k=4, compared against the numerical value of the claimed asymptotic expression.

Figures

Figures reproduced from arXiv: 2605.23126 by Hao Yu, Louxin Zhang.

Figure 1
Figure 1. Figure 1: (a) A binary phylogenetic network N on one taxon a. (b) All tree components of N. (c) The component graph of N. (d) The unique reticulation cluster of N [PITH_FULL_IMAGE:figures/full_fig_p003_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of edge insertion in a network. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Special cases. (a) A highly symmetric network, which can only be generated by [PITH_FULL_IMAGE:figures/full_fig_p006_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: A binary phylogenetic network (BPN) N on taxa a and b (left). Deleting (p, r1) and contracting p and r1 yields the same BPN (right) as deleting (q, r1) and contracting q and r1. Conversely, N can be obtained from the right BPN by a unique edge insertion. as deleting eb and contracting pb and rb. Conversely, N can be obtained from N′ by inserting an edge between a unique pair of edges, as illustrated in Fig… view at source ↗
Figure 5
Figure 5. Figure 5: Another binary phylogenetic network (left) that can be obtained by edge insertion [PITH_FULL_IMAGE:figures/full_fig_p008_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: (a) (Type-1 motif) Two reticulation nodes r1 and r2 have the same parents. (b) (Type-2 motif) A tree node p is a common parent of two reticulation nodes r1 and r2. The other parents of r1 and r2 are distinct, and r2 is a parent of another reticulation node r3. The node r3 may or may not be a child of r1. (c) (Type-3 motif) A tree node p is a common parent of two reticulation nodes r1 and r2, the parent of … view at source ↗
Figure 6
Figure 6. Figure 6: Theorem 1. Let N be a phylogenetic network with k reticulations that form c clusters, where c ≤ k. If N cannot be obtained in k + c distinct ways by inserting an edge onto an edge between a tree node and a reticulation node in a network with k − 1 reticulation nodes, then N must contain one of the subgraphs shown in [PITH_FULL_IMAGE:figures/full_fig_p010_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Two methods for constructing BPNs that contain an induced subgraph of type-5 [PITH_FULL_IMAGE:figures/full_fig_p014_7.png] view at source ↗
read the original abstract

Phylogenetic networks provide a general framework for modeling reticulate evolutionary processes such as hybridization, recombination, and horizontal gene transfer. In this paper, we study the asymptotic counting of binary phylogenetic networks with $k$ reticulations on $n$ taxa, where $k$ is allowed to grow with $n$. Using edge insertion, we analyze the local structures that affect the number of possible constructions of such networks. By bounding the contribution of networks with exceptional local configurations and combining these bounds with known asymptotic formulas for tree-child networks, we show that, when $k=o(\sqrt n)$, the number of binary phylogenetic networks with $k$ reticulations on $n$ taxa is asymptotic to \[ \binom{n}{k}2^{n+k-1/2}n^{n+k-1}e^{-n}. \]

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that when k=o(√n), the number of binary phylogenetic networks with k reticulations on n taxa is asymptotic to binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n}. The derivation proceeds by using edge-insertion constructions to enumerate and bound the contribution of networks containing exceptional local configurations, then subtracting this contribution from the known asymptotic count of tree-child networks to obtain the stated leading term.

Significance. If the error bounds on exceptional configurations hold uniformly in the stated regime, the result would extend the known tree-child asymptotics to the full class of binary phylogenetic networks for sub-square-root reticulation counts. This supplies a precise combinatorial count that could support downstream statistical or algorithmic work on reticulate evolution models.

major comments (2)
  1. [edge-insertion analysis of exceptional configurations] The section deriving the bound on networks with exceptional local configurations (via edge insertion): the argument must establish that this subtracted term is o( binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n} ) uniformly for all k=o(√n). The provided sketch controls the contribution only up to a factor that may reach exp(Θ(k²/n)), which is not guaranteed to remain o(1) throughout the regime and therefore does not yet support the claimed leading asymptotic.
  2. [main asymptotic statement] The combination step with prior tree-child asymptotics: the error term inherited from the tree-child formula must be shown to remain negligible after subtraction of the exceptional term; no explicit propagation of the o(·) remainder through the subtraction is given.
minor comments (2)
  1. [abstract] The abstract states the result but does not define the precise notion of 'exceptional local configuration'; a short definition or reference to the relevant combinatorial object would improve readability.
  2. [main result] Notation for the main asymptotic expression uses 2^{n+k-1/2} without an explicit statement of whether the square-root factor arises from Stirling approximation or from a separate counting argument.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and for identifying gaps in the error analysis. We address the two major comments below and will revise the manuscript accordingly to make the bounds uniform and the error propagation explicit.

read point-by-point responses
  1. Referee: [edge-insertion analysis of exceptional configurations] The section deriving the bound on networks with exceptional local configurations (via edge insertion): the argument must establish that this subtracted term is o( binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n} ) uniformly for all k=o(√n). The provided sketch controls the contribution only up to a factor that may reach exp(Θ(k²/n)), which is not guaranteed to remain o(1) throughout the regime and therefore does not yet support the claimed leading asymptotic.

    Authors: We agree that the present sketch yields only a multiplicative factor exp(O(k²/n)). Because k = o(√n) implies k²/n → 0, this factor tends to 1 rather than to 0 and therefore does not establish that the exceptional contribution is o of the target asymptotic. In the revision we will replace the sketch with a refined edge-insertion argument that produces a strictly smaller order (for example by extracting an extra negative exponential or by a more precise enumeration of the admissible insertion sites). The new bound will be shown to be o( binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n} ) uniformly throughout the regime. revision: yes

  2. Referee: [main asymptotic statement] The combination step with prior tree-child asymptotics: the error term inherited from the tree-child formula must be shown to remain negligible after subtraction of the exceptional term; no explicit propagation of the o(·) remainder through the subtraction is given.

    Authors: We concur that an explicit propagation of the o(·) remainder is required. The revised manuscript will contain a short lemma that (i) recalls the tree-child asymptotic together with its remainder, (ii) subtracts the (now o(·)) exceptional term, and (iii) verifies that the combined error is still o of the leading term binom(n,k) 2^{n+k-1/2} n^{n+k-1} e^{-n}. The argument will use only the triangle inequality and the fact that both error sources are o of the main expression. revision: yes

Circularity Check

0 steps flagged

No circularity; derivation combines external tree-child asymptotics with independent edge-insertion bounds.

full rationale

The paper obtains the target asymptotic by subtracting (via explicit edge-insertion enumeration) the contribution of networks containing exceptional local configurations from the known tree-child count, then showing the subtracted term is o(1) relative to the main term when k=o(√n). The subtracted term is controlled by a direct combinatorial argument that does not invoke the target formula or any fitted parameter from the present work. The tree-child asymptotics are treated as known external input. No step reduces the claimed leading term to a self-definition, a fitted input renamed as prediction, or a self-citation chain whose validity depends on the present paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review; no free parameters, ad-hoc axioms, or invented entities are visible in the stated claim. The argument rests on standard combinatorial enumeration and previously published tree-child asymptotics.

axioms (1)
  • standard math Standard results on asymptotic enumeration of labeled trees and tree-child networks hold.
    The paper combines its new bounds with these known formulas.

pith-pipeline@v0.9.0 · 5659 in / 1306 out tokens · 28241 ms · 2026-05-25T02:55:49.143152+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

21 extracted references · 21 canonical work pages

  1. [1]

    Comparison of tree-child phylogenetic networks.IEEE/ACM Transactions on Computational Biology and Bioin- formatics, 6(4):552–569, 2009

    Gabriel Cardona, Francesc Rossell´ o, and Gabriel Valiente. Comparison of tree-child phylogenetic networks.IEEE/ACM Transactions on Computational Biology and Bioin- formatics, 6(4):552–569, 2009

  2. [2]

    Enumerative and distributional results for d-combining tree-child networks.Advances in Applied Mathematics, 157:102704, 2024

    Yu-Sheng Chang, Michael Fuchs, Hexuan Liu, Michael Wallner, and Guan-Ru Yu. Enumerative and distributional results for d-combining tree-child networks.Advances in Applied Mathematics, 157:102704, 2024

  3. [3]

    Phylogenetic classification and the universal tree.Science, 284(5423):2124–2128, 1999

    W Ford Doolittle. Phylogenetic classification and the universal tree.Science, 284(5423):2124–2128, 1999

  4. [4]

    Which phylogenetic networks are merely trees with additional arcs?Syst

    Andrew R Francis and Mike Steel. Which phylogenetic networks are merely trees with additional arcs?Syst. Biol., 64(5):768–777, 2015

  5. [5]

    Counting phylogenetic networks with few reticulation vertices: a second approach.Discrete Applied Mathematics, 320:140– 149, 2022

    Michael Fuchs, En-Yu Huang, and Guan-Ru Yu. Counting phylogenetic networks with few reticulation vertices: a second approach.Discrete Applied Mathematics, 320:140– 149, 2022

  6. [6]

    Asymptotic enumeration of normal and hybridization networks via tree decoration.arXiv preprint arXiv:2412.02928, 2024

    Michael Fuchs, Mike Steel, and Qiang Zhang. Asymptotic enumeration of normal and hybridization networks via tree decoration.arXiv preprint arXiv:2412.02928, 2024

  7. [7]

    Locating a tree in a phylogenetic network in quadratic time

    Philippe Gambette, Andreas DM Gunawan, Anthony Labarre, St´ ephane Vialette, and Louxin Zhang. Locating a tree in a phylogenetic network in quadratic time. InProc. Int. Confer. Res. Comput. Mol. Biol., pages 96–107. Springer, 2015

  8. [8]

    A decomposition theorem and two algorithms for reticulation-visible networks.Inform

    Andreas DM Gunawan, Bhaskar DasGupta, and Louxin Zhang. A decomposition theorem and two algorithms for reticulation-visible networks.Inform. and Comput., 252:161–175, 2017

  9. [9]

    MIT press, 2014

    Dan Gusfield.ReCombinatorics: the algorithmics of ancestral recombination graphs and explicit phylogenetic networks. MIT press, 2014

  10. [10]

    The fine structure of galls in phylo- genetic networks.INFORMS J

    Dan Gusfield, Satish Eddhu, and Charles Langley. The fine structure of galls in phylo- genetic networks.INFORMS J. Comput., 16(4):459–469, 2004

  11. [11]

    Huson, Regula Rupp, and Celine Scornavacca.Phylogenetic Networks: Con- cepts, Algorithms and Applications

    Daniel H. Huson, Regula Rupp, and Celine Scornavacca.Phylogenetic Networks: Con- cepts, Algorithms and Applications. Cambridge University Press, 2010

  12. [12]

    Proof of a conjecture on young tableaux with walls.arXiv preprint arXiv:2601.09551, 2026

    Zhicong Lin, Feihu Liu, Jiahang Liu, Jing Liu, and Guoce Xin. Proof of a conjecture on young tableaux with walls.arXiv preprint arXiv:2601.09551, 2026

  13. [13]

    Counting general phylogenetic networks.Australas

    Marefatollah Mansouri. Counting general phylogenetic networks.Australas. J. Combin., 83(1):40–86, 2022. REFERENCES19

  14. [14]

    Counting phylogenetic net- works.Ann

    Colin McDiarmid, Charles Semple, and Dominic Welsh. Counting phylogenetic net- works.Ann. Comb., 19(1):205–224, 2015

  15. [15]

    Miquel Pons and Josep Batle. Combinatorial characterization of a certain class of words and a conjectured connection with general subclasses of phylogenetic tree-child networks.Scientific reports, 11(1):21875, 2021

  16. [16]

    Oxford University Press, U.K., 2003

    Charles Semple and Mike Steel.Phylogenetics. Oxford University Press, U.K., 2003

  17. [17]

    SIAM, 2016

    Mike Steel.Phylogeny: discrete and random processes in evolution. SIAM, 2016

  18. [18]

    Perfect phylogenetic networks with recombination.J

    Lusheng Wang, Kaizhong Zhang, and Louxin Zhang. Perfect phylogenetic networks with recombination.J. Comput. Biol., 8(1):69–78, 2001

  19. [19]

    Exact and asymptotic counts of binary phylogenetic networks with a few reticulation events.Journal of Computa- tional Biology, page 15578666251400786, 2026

    Hao Yu, Michael Fuchs, Guan-Ru Yu, and Louxin Zhang. Exact and asymptotic counts of binary phylogenetic networks with a few reticulation events.Journal of Computa- tional Biology, page 15578666251400786, 2026

  20. [20]

    Clusters, trees, and phylogenetic network classes

    Louxin Zhang. Clusters, trees, and phylogenetic network classes. InBioinformatics and Phylogenetics: Seminal Contributions of Bernard Moret, pages 277–315. Springer, 2019

  21. [21]

    Phylofusion—fast and easy fusion of rooted phylogenetic trees into rooted phylogenetic networks.Systematic Biology, 75(1):88–99, 2026

    Louxin Zhang, Banu Cetinkaya, and Daniel H Huson. Phylofusion—fast and easy fusion of rooted phylogenetic trees into rooted phylogenetic networks.Systematic Biology, 75(1):88–99, 2026. Appendix 20 Proof of the Asymptotic Result for Galled Networks A: Two Lemmas The following result is known in the literature. Lemma A1.Fork >0, ∞X i=0 2i+k i 1 (2i+k)2 2i =...