pith. sign in

arxiv: 2111.10456 · v3 · pith:STLKWZAWnew · submitted 2021-11-19 · 🧬 q-bio.PE · math.CO

A lattice structure for ancestral configurations arising from the relationship between gene trees and species trees

Pith reviewed 2026-05-24 13:07 UTC · model grok-4.3

classification 🧬 q-bio.PE math.CO
keywords ancestral configurationsgene treesspecies treeslabeled historieslattice structuredigraphCartesian productphylogenetics
0
0 comments X

The pith

For matching gene and species tree topologies, paths on the ancestral configuration digraph stand in bijection with labeled histories.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper associates ancestral configurations with nodes of a species tree given a gene tree topology. It equips the set of configurations with a lattice structure whose directed-graph representation is built by iterated Cartesian products when the topologies match. A designated collection of paths through this digraph is shown to be in one-to-one correspondence with the labeled histories of the tree. The bijection supplies closed-form counts of labeled histories for several infinite families of trees. The same lattice construction extends to non-matching pairs and identifies which pairs maximize the number of ancestral configurations for a fixed gene-tree topology.

Core claim

When gene tree topology G equals species tree topology S, a specific set of paths on the digraph of ancestral configurations is in bijection with the set of labeled histories.

What carries the argument

The digraph of ancestral configurations, obtained from the tree topology by iterated Cartesian products of graphs, whose selected paths enumerate labeled histories.

If this is right

  • Closed-form expressions become available for the number of labeled histories in several infinite families of trees.
  • Pairs of topologies (G,S) that attain the maximum number of ancestral configurations for fixed G can be characterized.
  • Enumeration of other combinatorial features of gene-species tree pairs becomes possible through path counting on the same digraphs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The path-counting method may yield polynomial-time algorithms for labeled-history enumeration on trees of moderate size.
  • Analogous lattice constructions could organize other ordered coalescence objects such as ranked tree shapes.
  • The graphical representation may support visualization tools that display coalescence orderings directly on the lattice.

Load-bearing premise

The iterated Cartesian product construction produces a digraph whose vertices and edges correctly encode the ancestral configurations whenever the gene-tree and species-tree topologies are identical.

What would settle it

For any small matching tree pair whose number of labeled histories is already known by direct enumeration, a mismatch between that number and the number of qualifying paths in the constructed digraph.

Figures

Figures reproduced from arXiv: 2111.10456 by Egor Lappo, Noah A. Rosenberg.

Figure 1
Figure 1. Figure 1: Two example realizations of a gene tree G in a species tree S with matching topology. We identify each lineage of G by its immediate descendant node. The internal nodes of the species tree S are represented by horizontal dashed lines. (A) In this realization, the ancestral configuration is {c, d} at node h of S; {e, f} at node i of S; {a, b} at node g of S; {h, e, f} at node ℓ of S; {g, h, i} at node m of … view at source ↗
Figure 2
Figure 2. Figure 2: A visual representation of the partial order relation on ancestral configurations. (A) At the [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: A labeled topology on 5 leaves and its associated Hasse diagram. (A) A labeled topology [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: A balanced labeled topology on 8 leaves and its associated Hasse diagram. (A) A balanced labeled [PITH_FULL_IMAGE:figures/full_fig_p005_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: p-pseudocaterpillar trees Pn for n = 3, 4, 5, 6, 7, 8 [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Hasse diagrams corresponding to p-pseudocaterpillar trees Pn for n = 3, 4, 5, 6, 7, 8. 8.2 Bicaterpillars For our next family, we consider symmetric bicaterpillar trees, defined as trees whose two root subtrees—trees induced by immediate descendants of the root—are caterpillar trees Cn. We denote by Cn,n the symmetric 11 [PITH_FULL_IMAGE:figures/full_fig_p011_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Bicaterpillar trees Cn,n for n = 1, 2, 3, 4, 5, 6 [PITH_FULL_IMAGE:figures/full_fig_p012_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Hasse diagrams corresponding to symmetric bicaterpillar trees [PITH_FULL_IMAGE:figures/full_fig_p012_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Lodgepole trees Ln for n = 1, 2, 3, 4, 5 [PITH_FULL_IMAGE:figures/full_fig_p013_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Hasse diagrams corresponding to lodgepole trees [PITH_FULL_IMAGE:figures/full_fig_p013_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Modified lodgepole trees Mn for n = 1, 2, 3, 4, 5 [PITH_FULL_IMAGE:figures/full_fig_p014_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Hasse diagrams corresponding to modified lodgepole trees [PITH_FULL_IMAGE:figures/full_fig_p014_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Balanced trees Bn for n = 1, 2, 3, 4 [PITH_FULL_IMAGE:figures/full_fig_p015_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: Hasse diagrams corresponding to balanced trees [PITH_FULL_IMAGE:figures/full_fig_p015_14.png] view at source ↗
Figure 15
Figure 15. Figure 15: An example lattice of ancestral configurations in the case of nonmatching [PITH_FULL_IMAGE:figures/full_fig_p017_15.png] view at source ↗
read the original abstract

To a given gene tree topology $G$ and species tree topology $S$ with leaves labeled bijectively from a fixed set $X$, one can associate a set of ancestral configurations, each of which encodes a set of gene lineages that can be found at a given node of the species tree. We introduce a lattice structure on ancestral configurations, studying the directed graphs that provide graphical representations of lattices of ancestral configurations. For a matching gene tree topology and species tree topology $G=S$, we present a method for defining the digraph of ancestral configurations from the tree topology by using iterated cartesian products of graphs. We show that a specific set of paths on the digraph of ancestral configurations is in bijection with the set of labeled histories -- a well-known phylogenetic object that enumerates possible temporal orderings of the coalescences of a tree. For each of a series of tree families, we obtain closed-form expressions for the number of labeled histories by using this bijection to count paths on associated digraphs. Finally, we prove that our lattice construction extends to nonmatching tree pairs, and we use it to characterize pairs $(G,S)$ having the maximal number of ancestral configurations for a fixed $G$. We discuss how the construction provides new methods for performing enumerations of combinatorial aspects of gene and species trees.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript associates ancestral configurations to a gene tree topology G and species tree topology S on the same leaf set. It equips these configurations with a lattice structure whose Hasse diagram is realized as a digraph. For the special case G = S the authors construct the digraph by iterated Cartesian products of smaller graphs derived from the tree topology, prove that a distinguished subset of paths in this digraph is in bijection with the labeled histories of the tree, and obtain closed-form expressions for the number of labeled histories on several infinite families of trees. The lattice construction is then extended to G ≠ S and used to characterize pairs that maximize the number of ancestral configurations for fixed G.

Significance. If the bijection and the Cartesian-product construction are valid, the work supplies a new, graph-theoretic route to enumerating labeled histories—an object central to coalescent theory and phylogenetic combinatorics. The closed forms for concrete tree families and the characterization of maximal (G,S) pairs are concrete, falsifiable contributions. The lattice formalism itself may also furnish new tools for comparing gene and species trees.

major comments (1)
  1. [method for defining the digraph (paragraph following the lattice definition)] The central claim that the iterated Cartesian product yields a digraph whose distinguished paths are in bijection with labeled histories rests on the construction correctly encoding the partial order of coalescences. For unbalanced topologies the product treats sub-configurations as independent factors, yet global coalescence constraints may produce extraneous vertices or omit valid lineage sets. A concrete verification for at least one unbalanced matching pair (e.g., the caterpillar with four or five leaves) is needed to confirm that the bijection survives; the abstract and method description do not supply this check.
minor comments (2)
  1. Notation for the Cartesian product of graphs and for the distinguished path set should be introduced with an explicit small example (e.g., the balanced quartet) before the general statement.
  2. The closed-form expressions are stated for several families; a short table listing the families, the formulas, and the first few numerical values would improve readability.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading of the manuscript and for identifying the need for an explicit check on unbalanced topologies. We address the single major comment below.

read point-by-point responses
  1. Referee: [method for defining the digraph (paragraph following the lattice definition)] The central claim that the iterated Cartesian product yields a digraph whose distinguished paths are in bijection with labeled histories rests on the construction correctly encoding the partial order of coalescences. For unbalanced topologies the product treats sub-configurations as independent factors, yet global coalescence constraints may produce extraneous vertices or omit valid lineage sets. A concrete verification for at least one unbalanced matching pair (e.g., the caterpillar with four or five leaves) is needed to confirm that the bijection survives; the abstract and method description do not supply this check.

    Authors: The recursive definition of the iterated Cartesian product follows the tree hierarchy directly: at each internal node the product is formed from the digraphs of its child subtrees, with edges between factors only when the corresponding coalescences are compatible under the species-tree partial order. This encoding ensures global constraints are respected even when subtrees are unbalanced, because the product order is induced by the ancestor-descendant relations in the full topology rather than by treating factors as fully independent. Nevertheless, we agree that an explicit small-case verification would strengthen the presentation. In the revised manuscript we will insert, immediately after the general construction, a fully worked example for the 5-leaf caterpillar (an unbalanced matching pair). The example will display the digraph, list all distinguished paths, and confirm that their number equals the known count of labeled histories (42) with no extraneous vertices or omitted lineage sets. revision: yes

Circularity Check

0 steps flagged

No circularity: combinatorial bijection and path-counting construction is self-contained

full rationale

The paper defines ancestral configurations via iterated Cartesian products of graphs from the tree topology, then proves a bijection between specific paths on the resulting digraph and labeled histories. Closed-form counts for tree families follow directly from this bijection. No parameters are fitted, no self-citations are load-bearing for the central claim, and no step reduces a derived quantity to its own input by definition. The construction is a standard combinatorial enumeration independent of the target counts.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Relies on standard definitions of lattices, directed graphs, Cartesian products, and tree topologies from combinatorics and phylogenetics; no free parameters or invented entities introduced.

axioms (2)
  • domain assumption Ancestral configurations form a partially ordered set that is a lattice under the natural inclusion or coalescence order.
    Invoked when the lattice structure is introduced on the set of ancestral configurations.
  • standard math Cartesian product of graphs preserves the required path-counting properties for building the digraph from tree topology.
    Used in the method for defining the digraph when G = S.

pith-pipeline@v0.9.0 · 5770 in / 1311 out tokens · 30843 ms · 2026-05-24T13:07:39.350893+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. [1]

    Aho, A. V. and Sloane, N. J. A. 1973. Some doubly exponential sequences, Fibonacci Quarterly 11, 429–437

  2. [2]

    and Rosenberg, N

    Alimpiev, E. and Rosenberg, N. A. 2021. Enumeration of coalescent histories for caterpillar species trees and p-pseudocaterpillar gene trees, Advances in Applied Mathematics 131, 102265

  3. [3]

    Bienvenu, F., Lambert, A., and Steel, M. 2022. Combinatorial and stochastic properties of ranked tree-child networks, Random Structures & Algorithms 60 (4), 653–689

  4. [4]

    Brown, J. K. 1994. Probabilities of evolutionary trees, Systematic Biology 43 (1), 78–91

  5. [5]

    and Plazzotta, G

    Colijn, C. and Plazzotta, G. 2018. A metric on phylogenetic tree shapes, Systematic Biology 67, 113–126

  6. [6]

    Degnan, J. H. and Rosenberg, N. A. 2009. Gene tree discordance, phylogenetic inference and the multispecies coalescent, Trends in Ecology and Evolution 24, 332–340

  7. [7]

    H., Rosenberg, N

    Degnan, J. H., Rosenberg, N. A., and Stadler, T. 2012. The probability distribution of ranked gene trees on a species tree, Mathematical Biosciences 235 (1), 45–55

  8. [8]

    Degnan, J. H. and Salter, L. A. 2005. Gene tree distributions under the coalescent process, Evolution 59 (1), 24–37

  9. [9]

    R., and Rosenberg, N

    Disanto, F., Fuchs, M., Huang, C.-Y., Paningbatan, A. R., and Rosenberg, N. A. 2024. The distributions under two species-tree models of the total number of ancestral configurations for matching gene trees and species trees, Advances in Applied Mathematics 152, 102594

  10. [10]

    R., and Rosenberg, N

    Disanto, F., Fuchs, M., Paningbatan, A. R., and Rosenberg, N. A. 2022. The distributions under two species-tree models of the number of root ancestral configurations for matching gene trees and species trees, Annals of Applied Probability 32 (6), 4426–4458

  11. [11]

    and Rosenberg, N

    Disanto, F. and Rosenberg, N. A. 2015. Coalescent histories for lodgepole species trees, Journal of Computational Biology 22 (10), 918–929

  12. [12]

    and Rosenberg, N

    Disanto, F. and Rosenberg, N. A. 2016. Asymptotic properties of the number of matching coalescent histories for caterpillar-like families of species trees, IEEE/ACM Transactions on Computational Biology and Bioinformatics 13 (5), 913–925

  13. [13]

    and Rosenberg, N

    Disanto, F. and Rosenberg, N. A. 2017. Enumeration of ancestral configurations for matching gene trees and species trees, Journal of Computational Biology 24 (9), 831–850

  14. [14]

    Introduction to Lattice Theory with Computer Science Applications

    Garg, V. K. 2015. “Introduction to Lattice Theory with Computer Science Applications”, John Wiley & Sons, New York. Gr¨ atzer, G. 2011. “Lattice Theory: Foundation”, Springer, Basel

  15. [15]

    Harding, E. 1971. The probabilities of rooted tree-shapes generated by random bifurcation, Advances in Applied Probability 3 (1), 44–77

  16. [16]

    Himwich, Z. M. and Rosenberg, N. A. 2020. Roadblocked monotonic paths and the enumeration of coalescent histories for non-matching caterpillar gene trees and species trees, Advances in Applied Mathematics 113, 101939

  17. [17]

    King, M. C. and Rosenberg, N. A. 2023. A mathematical connection between single-elimination sports tournaments and evolutionary trees, Mathematics Magazine in press

  18. [18]

    and Rosenberg, N

    Mathur, S. and Rosenberg, N. A. 2023. All galls are divided into three or more parts: recursive enumeration of labeled histories for galled trees, Algorithms for Molecular Biology 18, 1

  19. [19]

    A., Bhaskar, A., Disanto, F., and Rosenberg, N

    Palacios, J. A., Bhaskar, A., Disanto, F., and Rosenberg, N. A. 2022. Enumeration of binary trees compatible with a perfect phylogeny, Journal of Mathematical Biology 84 (6), 1–37. 20

  20. [20]

    Rosenberg, N. A. 2007. Counting coalescent histories, Journal of Computational Biology 14, 360–377

  21. [21]

    Rosenberg, N. A. 2013. Coalescent histories for caterpillar-like families, IEEE/ACM Transactions on Computational Biology and Bioinformatics 10 (5), 1253–1262

  22. [22]

    Rosenberg, N. A. 2019. Enumeration of lonely pairs of gene trees and species trees by means of antipodal cherries, Advances in Applied Mathematics 102, 1–17

  23. [23]

    Rosenberg, N. A. 2021. On the Colijn–Plazzotta numbering scheme for unlabeled binary rooted trees, Discrete Applied Mathematics 291, 88–98

  24. [24]

    Sabidussi, G. 1959. Graph multiplication, Mathematische Zeitschrift 72 (1), 446–457

  25. [25]

    S., Lyngso, R., and Hein, J

    Song, Y. S., Lyngso, R., and Hein, J. 2006. Counting all possible ancestral configurations of sample sequences in population genetics, IEEE/ACM Transactions on Computational Biology and Bioinformatics 3 (3), 239–251

  26. [26]

    and Degnan, J

    Stadler, T. and Degnan, J. H. 2012. A polynomial time algorithm for calculating the probability of a ranked gene tree given a species tree, Algorithms for Molecular Biology 7 (1), 7

  27. [27]

    Enumerative Combinatorics, Volume 1

    Stanley, R. P. 2011. “Enumerative Combinatorics, Volume 1”, Cambridge University Press, Cambridge, UK, second edition

  28. [28]

    Phylogeny: Discrete and Random Processes in Evolution

    Steel, M. 2016. “Phylogeny: Discrete and Random Processes in Evolution”, SIAM, Philadelphia

  29. [29]

    and McKenzie, A

    Steel, M. and McKenzie, A. 2001. Properties of phylogenetic trees generated by Yule-type speciation models, Mathematical Biosciences 170 (1), 91–112

  30. [30]

    Probabilistic Structures in Evolution

    Wiehe, T. 2021. Counting, grafting and evolving binary trees. in “Probabilistic Structures in Evolution.” (E. Baake and A. Wakolbinger, eds), p. 427–450, Zurich, EMS Publishing House

  31. [31]

    Wu, Y. 2012. Coalescent-based species tree inference from gene tree topologies under incomplete lineage sorting by maximum likelihood, Evolution 66 (3), 763–775

  32. [32]

    Wu, Y. 2016. An algorithm for computing the gene tree probability under the multispecies coalescent and its application in the inference of population tree, Bioinformatics 32 (12), i225–i233. 21