Inferring DAGs and Phylogenetic Networks from Least Common Ancestors
Pith reviewed 2026-05-18 00:01 UTC · model grok-4.3
The pith
A set of least common ancestor constraints on leaves is realizable by some DAG exactly when the canonical DAG built from their plus-closure realizes them.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Given a collection R of LCA constraints of the form (i,j)<(k,l), the plus-closure R+ is formed by adding all relations that follow from the input. From R+ one constructs a canonical DAG G_R such that R is DAG-realizable precisely when G_R realizes R. The construction is then lifted to phylogenetic networks, yielding a canonical network N_R that is regular (i.e., coincides with the Hasse diagram of its underlying set system). For any DAG-realizable R the classical closure equals the plus-closure.
What carries the argument
The plus-closure R+ of the LCA constraints, from which the canonical DAG G_R is built; realization by G_R is equivalent to existence of any realizing DAG.
If this is right
- Polynomial-time recognition of whether any given collection of LCA constraints admits a DAG realization.
- Explicit construction of a concrete DAG that realizes exactly the given constraints when such a DAG exists.
- A parallel polynomial-time construction that produces a regular canonical phylogenetic network for the same constraints.
- Equivalence between the plus-closure and the set of all LCA relations true in every possible realizing DAG.
Where Pith is reading between the lines
- The method supplies a concrete starting point for inferring reticulate evolutionary histories that trees cannot represent.
- One could apply the same closure-and-canonical-graph technique to hierarchical constraints arising in computer science or database theory.
- Empirical tests on biological sequence data could reveal whether the inferred DAGs expose ancestor relations missed by tree-only methods.
Load-bearing premise
The input collection of LCA constraints is presented in a form that permits direct computation of the plus-closure without undetected inconsistencies in the ancestor order.
What would settle it
A counterexample would be any collection R for which some DAG satisfies all constraints in R yet the explicitly constructed G_R fails to satisfy at least one of them.
Figures
read the original abstract
A least common ancestor (LCA) of two leaves in a directed acyclic graph (DAG) is a vertex that is an ancestor of both leaves and has no proper descendant that is also their common ancestor. LCAs capture hierarchical relationships in rooted trees and, more generally, in DAGs. In 1981, Aho et al. introduced the problem of determining whether a set of pairwise LCA constraints on a set $X$, of the form $(i,j)<(k,l)$ with $i,j,k,l\in X$, can be realized by a rooted tree whose leaf set is $X$, such that whenever $(i,j)<(k,l)$, the LCA of $i,j$ is a descendant of that of $k,l$. They also presented a polynomial-time algorithm, BUILD, to solve this problem. However, many such constraint systems cannot be realized by any tree, prompting the question of whether they can be realized by a more general DAG. We extend Aho et al.'s framework from trees to DAGs, providing both theoretical and algorithmic foundations for reasoning about LCA constraints in this broader setting. Given a collection $R$ of LCA constraints, we define its $+$-closure $R^+$, capturing additional LCA relations implied by $R$. Using $R^+$, we construct a canonical DAG $G_R$ and prove that $R$ is DAG-realizable if and only if it is realized by $G_R$. We further adapt this construction to phylogenetic networks, defining a canonical network $N_R$ and prove that it is regular, i.e., it coincides with the Hasse diagram of its underlying set system. Finally, we show that for any DAG-realizable $R$, its classical closure - comprising all LCA constraints that hold in every DAG realizing $R$ - coincides with its $+$-closure. All constructions are computable in polynomial time, and we provide explicit algorithms for each. All algorithms developed in this paper are implemented in the freely available Python package RealLCA.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript extends Aho et al.'s 1981 BUILD framework from trees to DAGs and phylogenetic networks. Given a collection R of LCA constraints, it defines the +-closure R^+, constructs a canonical DAG G_R, and proves that R is DAG-realizable if and only if G_R realizes it. An analogous construction produces a canonical regular phylogenetic network N_R. The paper further shows that, for any DAG-realizable R, the classical closure coincides with the +-closure. All constructions and decision procedures run in polynomial time and are implemented in the open Python package RealLCA.
Significance. If the central equivalences hold, the work supplies an explicit, constructive characterization of DAG-realizable LCA constraint sets together with a regular-network analogue. The availability of polynomial-time algorithms and reproducible code constitutes a concrete advance over existence-only results and directly generalizes the classic tree case, making the results immediately usable for inference tasks in phylogenetics and ordered set theory.
minor comments (2)
- [Abstract] Abstract: the statement 'R is DAG-realizable if and only if it is realized by G_R' would benefit from an explicit parenthetical reminder that G_R is built from the +-closure R^+.
- [Definition of regularity] Section 2 (or wherever the definition of regularity appears): the claim that N_R 'coincides with the Hasse diagram of its underlying set system' should include a one-sentence reminder of the precise set-system definition used, to avoid any ambiguity with other notions of regularity in the network literature.
Simulated Author's Rebuttal
We thank the referee for their positive and constructive report, including the clear summary of our contributions and the recommendation to accept the manuscript. We are pleased that the referee recognizes the generalization of the Aho et al. BUILD framework, the explicit constructive characterization via the +-closure and canonical DAG/network, the polynomial-time algorithms, and the open-source implementation in RealLCA.
Circularity Check
No significant circularity in the derivation chain
full rationale
The paper defines the +-closure R^+ directly from the input set R of LCA constraints by adding all implied relations, then constructs the canonical DAG G_R explicitly from R^+. It proves the central claim that R is DAG-realizable if and only if G_R realizes it by verifying that the ancestor and LCA relations in G_R coincide exactly with those forced by R^+. An analogous explicit construction yields the regular network N_R. These steps rely on direct construction and equivalence proofs rather than self-referential definitions, fitted parameters, or load-bearing self-citations. The paper further shows that the classical closure coincides with the +-closure and supplies polynomial-time algorithms with an open implementation, making the results independently verifiable and self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Directed acyclic graphs admit a well-defined ancestor relation and least common ancestor operation on leaves.
Forward citations
Cited by 2 Pith papers
-
Inferring Phylogenetic Networks from Required and Forbidden LCA-Constraints
Exact characterizations of realizability for phylogenetic networks from required and forbidden LCA-constraints in three variants, plus polynomial-time algorithms for decision and construction.
-
Inferring Phylogenetic Networks from Required and Forbidden LCA-Constraints
Exact characterizations and polynomial-time algorithms are given for realizing phylogenetic networks from required and forbidden LCA constraints under three variants of avoidance.
Reference graph
Works this paper leans on
-
[1]
SIAM Journal on Computing 1(2):131–137, DOI 10.1137/0201008
Aho A V , Garey MR, Ullman JD (1972) The transitive reduction of a directed graph. SIAM Journal on Computing 1(2):131–137, DOI 10.1137/0201008
-
[2]
SIAM Journal on Computing 10(3):405–421, DOI 10.1137/0210030
Aho A V , Sagiv Y , Szymanski TG, Ullman JD (1981) Inferring a tree from lowest common ancestors with an application to the optimization of relational expressions. SIAM Journal on Computing 10(3):405–421, DOI 10.1137/0210030
-
[3]
Journal of Combinatorial Theory, Series B 96(1):38–49, DOI 10.1016/j.jctb.2005.06.004
Ardila F, Klivans CJ (2006) The bergman complex of a matroid and phylogenetic trees. Journal of Combinatorial Theory, Series B 96(1):38–49, DOI 10.1016/j.jctb.2005.06.004
-
[4]
Ann Comb 8:391–408, DOI 10.1007/s00026-004-0228-0
Baroni M, Semple C, Steel M (2005) A framework for representing reticulate evolution. Ann Comb 8:391–408, DOI 10.1007/s00026-004-0228-0
-
[5]
PhD thesis, University of Canterbury
Bryant D (1997) Building trees, hunting for trees, and comparing trees: theory and methods in phylogenetic analysis. PhD thesis, University of Canterbury
work page 1997
-
[6]
Bryant D, Steel M (1995) Extension operations on sets of leaf-labelled trees. Adv Appl Math 16(4):425–453
work page 1995
-
[7]
Discrete Applied Mathematics 127(2):241–269, DOI 10.1016/S0166-218X(02)00209-3
Caspard N, Monjardet B (2003) The lattices of closure systems, closure operators, and implicational systems on a finite set: a survey. Discrete Applied Mathematics 127(2):241–269, DOI 10.1016/S0166-218X(02)00209-3
-
[8]
Daniel H Huson CS Regula Rupp (2011) Phylogenetic Networks: Concepts, Algorithms and Applications. Cam- bridge University Press
work page 2011
-
[9]
Algorithmica 80(8):2453–2477, DOI 10.1007/s00453-017-0330-4
Deng Y , Fern ´andez-Baca D (2018) Fast compatibility testing for rooted phylogenetic trees. Algorithmica 80(8):2453–2477, DOI 10.1007/s00453-017-0330-4
-
[10]
Theoretical Computer Science 796:129–146, DOI 10.1016/j.tcs.2019.09.003 23
D ¨ocker J, Linz S, Semple C (2019) Displaying trees across two phylogenetic networks. Theoretical Computer Science 796:129–146, DOI 10.1016/j.tcs.2019.09.003 23
-
[11]
Discrete Mathematics & Theoretical Computer Science V ol
Dress A, Huber K, Steel M (2014) A matroid associated with a phylogenetic tree. Discrete Mathematics & Theoretical Computer Science V ol. 16 no. 2, DOI 10.46298/dmtcs.2078
-
[12]
Journal of Theoretical Biology 614:112236, DOI 10.1016/j.jtbi.2025.112236
Francis A (2025) ”Normal” phylogenetic networks may be emerging as the leading class. Journal of Theoretical Biology 614:112236, DOI 10.1016/j.jtbi.2025.112236
-
[13]
BMC Bioinformatics 10(1):152, DOI 10.1186/1471-2105-10-152
Francisco AP, Bugalho M, Ramirez M, Carric ¸o JA (2009) Global optimal eburst analysis of multilocus typing data using a graphic matroid approach. BMC Bioinformatics 10(1):152, DOI 10.1186/1471-2105-10-152
-
[14]
Journal of Mathematical Biology 74(7):1729–1751, DOI 10.1007/s00285-016-1068-3
Gambette P, Huber KT, Kelk S (2017) On the challenge of reconstructing level-1 phylogenetic networks from triplets and clusters. Journal of Mathematical Biology 74(7):1729–1751, DOI 10.1007/s00285-016-1068-3
-
[15]
Gasieniec L, Jansson J, Lingas A, ¨Ostlin A (1997) On the complexity of computing evolutionary trees. In: Jiang T, Lee DT (eds) Computing and Combinatorics, Springer Berlin Heidelberg, Berlin, Heidelberg, pp 134–145, DOI 10.1007/BFb0045080
-
[16]
Journal of Mathematical Biology 78(7):2015–2057, DOI 10.1007/s00285-019-01332-9
Geiß M, Ch ´avez E, Gonz ´alez Laffitte M, L ´opez S ´anchez A, Stadler BMR, Valdivia DI, Hellmuth M, Hern´andez Rosales M, Stadler PF (2019) Best match graphs. Journal of Mathematical Biology 78(7):2015–2057, DOI 10.1007/s00285-019-01332-9
-
[17]
Journal of Mathematical Biology 80(3):865–953, DOI 10.1007/s00285-019-01444-2
Geiß M, Stadler PF, Hellmuth M (2020) Reciprocal best match graphs. Journal of Mathematical Biology 80(3):865–953, DOI 10.1007/s00285-019-01444-2
-
[18]
Journal of Bioinformatics and Computational Biology 10(05):1250013, DOI 10.1142/S0219720012500138
Habib M, To TH (2012) Constructing a minimum phylogenetic network from a dense triplet set. Journal of Bioinformatics and Computational Biology 10(05):1250013, DOI 10.1142/S0219720012500138
-
[19]
Discrete Applied Mathematics 378:584–593, DOI 10.1016/j.dam.2025.08.037
Hellmuth M, Lindeberg A (2026) Characterizing and transforming dags within the I-lca framework. Discrete Applied Mathematics 378:584–593, DOI 10.1016/j.dam.2025.08.037
-
[20]
Theory in Biosciences 142(4):301–358, DOI 10.1007/s12064-023-00398-w
Hellmuth M, Schaller D, Stadler PF (2023) Clustering systems of phylogenetic networks. Theory in Biosciences 142(4):301–358, DOI 10.1007/s12064-023-00398-w
-
[21]
Algorithmica 24(1):1–13, DOI 10.1007/PL00009268
Henzinger MR, King V , Warnow T (1999) Constructing a tree from homeomorphic subtrees, with applications to computational evolutionary biology. Algorithmica 24(1):1–13, DOI 10.1007/PL00009268
-
[22]
Journal of Symbolic Computation 104:142–158, DOI 10.1016/j.jsc.2020.04.012
Hollering B, Sullivant S (2021) Identifiability in phylogenetics using algebraic matroids. Journal of Symbolic Computation 104:142–158, DOI 10.1016/j.jsc.2020.04.012
-
[23]
J ACM 48(4):723–760, DOI 10.1145/502090.502095
Holm J, de Lichtenberg K, Thorup M (2001) Poly-logarithmic deterministic fully-dynamic algorithms for connec- tivity, minimum spanning tree, 2-edge, and biconnectivity. J ACM 48(4):723–760, DOI 10.1145/502090.502095
-
[24]
Huber KT, van Iersel L, Kelk S, Suchecki R (2011) A practical algorithm for reconstructing level-1 phylogenetic networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics 8(3):635–649, DOI 10.1109/ TCBB.2010.17
work page 2011
-
[25]
Algorithmica 60(2):207–235, DOI 10.1007/s00453-009-9333-0
van Iersel L, Kelk S (2011) Constructing the simplest possible phylogenetic network from triplets. Algorithmica 60(2):207–235, DOI 10.1007/s00453-009-9333-0
-
[26]
Journal of Mathe- matical Biology 68(7):1707–1729, DOI 10.1007/s00285-013-0683-5
van Iersel L, Moulton V (2014) Trinets encode tree-child and level-2 phylogenetic networks. Journal of Mathe- matical Biology 68(7):1707–1729, DOI 10.1007/s00285-013-0683-5
-
[27]
Jansson J (2001) On the complexity of inferring rooted evolutionary trees. Electronic Notes in Discrete Mathe- matics 7:50–53, DOI 10.1016/S1571-0653(04)00222-7, brazilian Symposium on Graphs, Algorithms and Com- binatorics
-
[28]
Jansson J, Sung WK (2006) Inferring a level-1 phylogenetic network from a dense set of rooted triplets. Theoret- ical Computer Science 363(1):60–68, DOI 10.1016/j.tcs.2006.06.022, computing and Combinatorics
-
[29]
Algorithmica 43(4):293–307, DOI 10.5555/3118737.3118847
Jansson J, Ng JHK, Sadakane K, Sung WK (2005) Rooted maximum agreement supertrees. Algorithmica 43(4):293–307, DOI 10.5555/3118737.3118847
-
[30]
SIAM Journal on Computing 35(5):1098–1121, DOI 10.1137/S0097539704446529
Jansson J, Nguyen NB, Sung WK (2006) Algorithms for combining rooted triplets into a galled phylogenetic network. SIAM Journal on Computing 35(5):1098–1121, DOI 10.1137/S0097539704446529
-
[31]
SIAM Journal on Computing 41(1):272–291, DOI 10.1137/100811489
Jansson J, Lemence RS, Lingas A (2012) The complexity of inferring a minimally resolved phylogenetic su- pertree. SIAM Journal on Computing 41(1):272–291, DOI 10.1137/100811489
-
[32]
Master’s thesis, University of Canterbury, DOI 10.26021/15536 24
Levy M (2024) Triplet matroids and closure in phylogenetics. Master’s thesis, University of Canterbury, DOI 10.26021/15536 24
-
[33]
Bull Math Biol 87(3):44, DOI 10.1007/s11538-025-01419-z
Lindeberg A, Hellmuth M (2025) Simplifying and characterizing DAGs and phylogenetic networks via least common ancestor constraints. Bull Math Biol 87(3):44, DOI 10.1007/s11538-025-01419-z
-
[34]
Lindeberg A, Alfonsson A, Hellmuth M (2025)RealLCA.https://github.com/AnnaLindeberg/RealLCA (accessed Nov 7, 2025)
work page 2025
-
[35]
Journal of Mathematical Biology 81(4):961–980, DOI 10.1007/s00285-020-01533-7
Linz S, Semple C (2020) Caterpillars on three and four leaves are sufficient to reconstruct binary normal networks. Journal of Mathematical Biology 81(4):961–980, DOI 10.1007/s00285-020-01533-7
-
[36]
Matou ˇsek J, Neˇsetˇril J (2009) Invitation to discrete mathematics. Oxford University Press
work page 2009
-
[37]
Discrete Applied Mathematics 69(1):19–31, DOI 10.1016/0166-218X(95)00074-2
Ng MP, Wormald NC (1996) Reconstruction of rooted trees from subtrees. Discrete Applied Mathematics 69(1):19–31, DOI 10.1016/0166-218X(95)00074-2
-
[38]
PLOS ONE 9(9):1–1, DOI 10.1371/journal.pone.0106531
Poormohammadi H, Eslahchi C, Tusserkani R (2014) Tripnet: A method for constructing rooted phylogenetic networks from rooted triplets. PLOS ONE 9(9):1–1, DOI 10.1371/journal.pone.0106531
-
[39]
Springer Monographs in Mathematics, Springer
Prof Jørgen Bang-Jensen PGZGa (2009) Digraphs: Theory, Algorithms and Applications, 2nd edn. Springer Monographs in Mathematics, Springer
work page 2009
-
[40]
Theo- retical Computer Science 865:63–84, DOI 10.1016/j.tcs.2021.02.037
Schaller D, Stadler PF, Hellmuth M (2021) Complexity of modification problems for best match graphs. Theo- retical Computer Science 865:63–84, DOI 10.1016/j.tcs.2021.02.037
-
[41]
European Journal of Combinatorics 70:384–407, DOI 10.1016/j.ejc.2018.02.013
Seemann CR, Hellmuth M (2018) The matroid structure of representative triple sets and triple-closure computa- tion. European Journal of Combinatorics 70:384–407, DOI 10.1016/j.ejc.2018.02.013
-
[42]
Journal of Mathematical Biology 83(3):28, DOI 10.1007/s00285-021-01654-7
Semple C, Toft G (2021) Trinets encode orchard phylogenetic networks. Journal of Mathematical Biology 83(3):28, DOI 10.1007/s00285-021-01654-7
-
[43]
Algorithms for Molecular Biology 15(1):5, DOI 10.1186/s13015-020-00165-2
Stadler PF, Geiß M, Schaller D, L ´opez S ´anchez A, Gonz ´alez Laffitte M, Valdivia DI, Hellmuth M, Hern´andez Rosales M (2020) From pairs of most similar sequences to phylogenetic best matches. Algorithms for Molecular Biology 15(1):5, DOI 10.1186/s13015-020-00165-2
-
[44]
Steel M (2016) Phylogeny: discrete and random processes in evolution. SIAM
work page 2016
-
[45]
Steven S Skiena; Pemmaraju SV (2003) Computational Discrete Mathematics: Combinatorics and Graph Theory with Mathematica. Cambridge University Press
work page 2003
-
[46]
Information Processing Letters 178:106300, DOI 10.1016/j.ipl.2022.106300
van Iersel L, Kole S, Moulton V , Nipius L (2022) An algorithm for reconstructing level-2 phylogenetic networks from trinets. Information Processing Letters 178:106300, DOI 10.1016/j.ipl.2022.106300
-
[47]
Bulletin of Mathematical Biology 72(2):340–358, DOI 10.1007/s11538-009-9449-z
Willson SJ (2010) Properties of normal phylogenetic networks. Bulletin of Mathematical Biology 72(2):340–358, DOI 10.1007/s11538-009-9449-z
-
[48]
Journal of Combinatorial Opti- mization 8(1):29–39, DOI 10.1023/B:JOCO.0000021936.04215.68 25
Wu BY (2004) Constructing the maximum consensus tree from rooted triples. Journal of Combinatorial Opti- mization 8(1):29–39, DOI 10.1023/B:JOCO.0000021936.04215.68 25
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.