Zombie Compositions in Assembly Algebras and an Upper Bound on the Size of Chemical Space
Pith reviewed 2026-06-26 13:01 UTC · model grok-4.3
The pith
The composition polytope separates unrealisable molecular compositions as zombies and fixes the growth exponent of chemical space at exactly log 2.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For any construction system equipped with a type system and valence bounds the composition polytope P_val subset R^m is defined so that its integer points are precisely the physically realisable compositions. Any composition outside P_val is a zombie: combinatorially valid yet unrealisable. The zombie classification is sound, returning zero false positives, and conservative, so the true infeasibility rate is at least as high as the polytope predicts. In the concrete molecular graph system the resulting growth function satisfies N(a) <= C * 2^a, tightening the earlier exponent 0.73 to the exact value log 2.
What carries the argument
The composition polytope P_val subset R^m, whose lattice points count the feasible compositions under given valence bounds and type system.
If this is right
- The number of feasible compositions in the molecular system is bounded above by a constant times 2 to the assembly index.
- The earlier growth estimate overcounts feasible objects by a factor larger than 10^198 once the assembly index reaches 10.
- The same polytope construction recovers the click-chemistry system as a second case with m=8 and a partition matroid.
- Analytical upper bounds on N(a) follow directly from the design signature (m, nu, n0) via the associated toric ideal and matroid.
Where Pith is reading between the lines
- Lattice-point enumeration inside the polytope could replace exhaustive enumeration when screening candidate molecules at moderate assembly indices.
- Any future refinement of physical constraints can only enlarge the set of zombies, never shrink it, because the current polytope is already conservative.
- The same construction applies verbatim to other typed assembly systems whose building blocks carry explicit valence rules, yielding comparable tightenings without new case-by-case analysis.
Load-bearing premise
The chosen valence bounds and atom-type partition are sufficient to make the polytope contain every physically realisable composition and exclude every unrealisable one.
What would settle it
Discovery of even one molecular composition that lies strictly outside the polytope inequalities yet forms a stable, physically assemblable molecule would falsify the soundness of the zombie classification.
Figures
read the original abstract
In this paper we present construction systems -- tuples $(X, BB, \oplus, \nu)$ comprising objects, building blocks, an assembly operation, and a joining multiplicity -- as a general algebraic framework for studying how complex objects are built from simpler parts. To each construction system we associate a toric ideal, a toric variety, and a matroid, obtaining analytical bounds on the growth function $N(a)$ (the number of objects of construction complexity $\leq a$) purely from the design signature $(m, \nu, n_0)$. For systems equipped with a type system and valence bounds, we define the composition polytope $P_{\mathrm{val}} \subset \mathbb{R}^m$, whose integer points count the feasible compositions. Compositions outside $P_{\mathrm{val}}$ -- termed zombies -- are combinatorially valid but physically unrealisable. We prove that the zombie classification is sound (zero false positives) and conservative: the true infeasibility rate is at least as high as the polytope predicts. Specialising to the molecular graph assembly system of Morales Parra et al. ($m = 19$ bond types, $5$ atom types with valences $1$--$4$), we identify a composition polytope whose lattice points capture the physically realisable compositions, and show that the resulting growth exponent tightens from $0.73$ to the exact value $\log 2 \approx 0.693$. While the difference of $0.037$ appears small, in the doubly-exponential regime it corresponds to a tightening of the bound by a factor exceeding $10^{198}$ at assembly index $10$. The framework also recovers the bioorthogonal click-chemistry system of the author's prior work as a second instance, with $m = 8$ and a partition matroid.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces construction systems as tuples (X, BB, ⊕, ν) and associates to each a toric ideal, toric variety, and matroid to derive analytical bounds on the growth function N(a) from the design signature (m, ν, n0). For systems with type systems and valence bounds it defines the composition polytope P_val whose lattice points count feasible compositions; compositions outside P_val are called zombies. The paper claims to prove that the zombie classification is sound (zero false positives) and conservative, and specialises to the m=19 molecular graph assembly system of Morales Parra et al. to obtain a composition polytope that tightens the growth exponent from 0.73 to exactly log 2 ≈ 0.693. The framework is also shown to recover the bioorthogonal click-chemistry system as a second instance.
Significance. If the soundness and conservativeness proofs hold, the work supplies an algebraic-geometry and matroid-theoretic framework that yields parameter-free upper bounds on the size of chemical space directly from the design signature, with an exact exponent of log 2 for the m=19 case. The claimed factor-of-10^198 tightening at assembly index 10 and the recovery of a prior click-chemistry instance are concrete strengths. The purely analytical character of the bounds (no fitted parameters) is a notable methodological contribution.
major comments (2)
- [Definition of P_val and the soundness/conservativeness proofs] The soundness claim (zero false positives) for the zombie classification rests on the facet inequalities of P_val being necessary conditions implied by the 5 atom types and valence bounds 1–4 under the assembly operation ⊕ in the m=19 system. The derivation via the toric ideal and matroid must be shown to capture every valence-sum and type-count restriction so that no violating composition is assemblable; any omitted constraint would falsify both soundness and the exact log-2 exponent.
- [Specialisation to the molecular graph assembly system] The explicit construction of P_val for the Morales Parra et al. system (m=19) and the subsequent computation that its growth exponent equals log 2 must be verified to include all physical realizability constraints; if the polytope over-approximates the feasible set beyond what the valence bounds enforce, the claimed tightening from 0.73 to log 2 does not follow.
minor comments (1)
- [Abstract] The abstract states that proofs exist but supplies no outline of the key steps that convert the design signature into the polytope inequalities; a one-sentence indication of the toric-ideal/matroid route would improve accessibility.
Simulated Author's Rebuttal
We thank the referee for their careful reading and for highlighting the importance of verifying the soundness proofs and the explicit specialization to the m=19 system. We respond point-by-point below.
read point-by-point responses
-
Referee: The soundness claim (zero false positives) for the zombie classification rests on the facet inequalities of P_val being necessary conditions implied by the 5 atom types and valence bounds 1–4 under the assembly operation ⊕ in the m=19 system. The derivation via the toric ideal and matroid must be shown to capture every valence-sum and type-count restriction so that no violating composition is assemblable; any omitted constraint would falsify both soundness and the exact log-2 exponent.
Authors: The derivation is given in full in Section 3 and Theorem 4.2: the toric ideal I_⊕ is generated by the binomial relations coming from the assembly operation, and the associated matroid M_ν encodes the valence bounds as rank constraints on the type vectors. The facet inequalities of P_val are obtained directly as the linear inequalities defining the matroid polytope intersected with the type-count hyperplanes; the proof shows these are necessary because any assemblable composition must lie in the integer points of this polytope. For the m=19 case the five atom types and valences 1–4 produce exactly the 19 inequalities listed in Table 1, with no further restrictions implied by ⊕. Thus the classification is sound by construction. revision: no
-
Referee: The explicit construction of P_val for the Morales Parra et al. system (m=19) and the subsequent computation that its growth exponent equals log 2 must be verified to include all physical realizability constraints; if the polytope over-approximates the feasible set beyond what the valence bounds enforce, the claimed tightening from 0.73 to log 2 does not follow.
Authors: Section 5.1 gives the explicit vertex and facet description of P_val for the m=19 signature (m=19 bond types, 5 atom types, valences 1–4). The growth exponent is obtained analytically in Theorem 5.3 by computing the volume of the dominant chamber of P_val and applying the standard lattice-point asymptotics for polytopes; this yields precisely log 2 because the binding constraint is the binary-tree-like branching permitted by the valence bounds. All physical realizability constraints from the valence function and assembly operation are incorporated; the polytope is the tightest convex relaxation compatible with the matroid, so the claimed tightening follows directly. revision: no
Circularity Check
No significant circularity; bounds derived analytically from design signature and polytope definition
full rationale
The paper defines the composition polytope P_val from the design signature (m, ν, n0) and proves soundness/conservativeness of the zombie classification directly from the construction system axioms and valence bounds. The growth exponent tightening to log 2 for the m=19 case follows from lattice-point enumeration in the newly constructed polytope rather than from any fitted parameter or self-referential definition. The mention of recovering a prior instance (m=8) is a secondary application and does not bear the load of the central soundness proof or the exponent calculation. No step reduces by construction to its own inputs; the derivation chain remains self-contained against the stated assumptions.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math Toric ideals and matroids associated to a construction system yield analytical bounds on the growth function N(a) directly from the design signature (m, ν, n0)
- domain assumption The integer points inside the composition polytope P_val exactly count the feasible compositions under the given type system and valence bounds
invented entities (2)
-
zombie compositions
no independent evidence
-
composition polytope P_val
no independent evidence
Reference graph
Works this paper leans on
-
[1]
Beck and S
M. Beck and S. Robins.Computing the Continuous Discretely: Integer-Point Enu- meration in Polyhedra. Undergraduate Texts in Mathematics, Springer, 2007
2007
-
[2]
D.A. Cox, J. Little, and D. O’Shea.Ideals, Varieties, and Algorithms: An Introduc- tion to Computational Algebraic Geometry and Commutative Algebra. Undergraduate Texts in Mathematics, Springer, 4th ed., 2015
2015
-
[3]
Dickenstein
A. Dickenstein. Biochemical reaction networks: an invitation for algebraic geome- ters. InAlgebraic and Geometric Methods in Discrete Mathematics, Contemp. Math., vol. 656, pages 65–83, AMS, 2016
2016
-
[4]
Drton, B
M. Drton, B. Sturmfels, and S. Sullivant.Lectures on Algebraic Statistics. Oberwol- fach Seminars, Birkh¨ auser, 2009
2009
-
[5]
Eisenbud.Commutative Algebra with a View Toward Algebraic Geometry
D. Eisenbud.Commutative Algebra with a View Toward Algebraic Geometry. Grad- uate Texts in Mathematics, vol. 150, Springer, 1995
1995
-
[6]
Erd˝ os and T
P. Erd˝ os and T. Gallai. Graphs with prescribed degrees of vertices (in Hungarian). Matematikai Lapok, 11:264–274, 1960
1960
-
[7]
Feinberg.Foundations of Chemical Reaction Network Theory
M. Feinberg.Foundations of Chemical Reaction Network Theory. Applied Mathemat- ical Sciences, vol. 202, Springer, 2019
2019
-
[8]
E. Fel´ ıu and A. Shiu. From chemical reaction networks to algebraic and polyhedral geometry—and back again. InVarieties, Polyhedra, Computation, EMS Series of Congress Reports, European Mathematical Society, 2025. arXiv:2501.06354
arXiv 2025
-
[9]
Flajolet and R
P. Flajolet and R. Sedgewick.Analytic Combinatorics. Cambridge University Press, 2009
2009
-
[10]
Fulton.Introduction to Toric Varieties
W. Fulton.Introduction to Toric Varieties. Annals of Mathematics Studies, vol. 131, Princeton University Press, 1993
1993
-
[11]
Gelfand, M.M
I.M. Gelfand, M.M. Kapranov, and A.V. Zelevinsky.Discriminants, Resultants, and Multidimensional Determinants. Mathematics: Theory & Applications, Birkh¨ auser, 1994
1994
-
[12]
Harary and E.M
F. Harary and E.M. Palmer.Graphical Enumeration. Academic Press, 1973
1973
-
[13]
J.C. Morales Parra, K.Y. Patarroyo, A. Sharma, D.O. Alobo and L. Cronin.Eluci- dating the Size of Chemical Space with Assembly Theory. arXiv:2606.11486, 2025
Pith/arXiv arXiv 2025
-
[14]
Marshall, C
S.M. Marshall, C. Mathis, E. Carrick, G. Keenan, G.R.T. Cooper, H. Graham, M. Craven, P.S. Gromski, D.G. Moore, S.I. Walker and L. Cronin.Identifying molecules as biosignatures with assembly theory and mass spectrometry. Nature Chem- istry, 13:692–698, 2021
2021
-
[15]
Oxley.Matroid Theory, 2nd ed
J. Oxley.Matroid Theory, 2nd ed. Oxford Graduate Texts in Mathematics, Oxford University Press, 2011
2011
-
[16]
Pachter and B
L. Pachter and B. Sturmfels.Algebraic Statistics for Computational Biology. Cam- bridge University Press, 2005
2005
-
[17]
Reymond.The chemical space project
J.-L. Reymond.The chemical space project. Accounts of Chemical Research, 48(3):722–730, 2015
2015
-
[18]
Ribas Ripoll.Algebraic Varieties and Ideal Theory in Combinatorial Click-Reaction Design
V. Ribas Ripoll.Algebraic Varieties and Ideal Theory in Combinatorial Click-Reaction Design. Submitted to J. Comput. Algebra, 2026
2026
-
[19]
Sharma, D
A. Sharma, D. Czegel, M. Lachmann, C. Kempes, S. Walker and L. Cronin.Assembly theory explains and quantifies selection and evolution. Nature, 622:321–328, 2023
2023
-
[20]
Stanley.Enumerative Combinatorics, vol
R.P. Stanley.Enumerative Combinatorics, vol. 1, 2nd ed. Cambridge Studies in Ad- vanced Mathematics, vol. 49, Cambridge University Press, 2012
2012
-
[21]
Sturmfels.Gr¨ obner Bases and Convex Polytopes
B. Sturmfels.Gr¨ obner Bases and Convex Polytopes. University Lecture Series, vol. 8, AMS, 1996. ZOMBIE COMPOSITIONS IN ASSEMBLY ALGEBRAS 19
1996
-
[22]
Ziegler.Lectures on Polytopes
G.M. Ziegler.Lectures on Polytopes. Graduate Texts in Mathematics, vol. 152, Springer, 1995. Independent researcher Email address:vribas@ieee.org
1995
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.