pith. sign in

arxiv: 2606.21363 · v2 · pith:6TM3NNVOnew · submitted 2026-06-19 · 🧮 math.AG

Zombie Compositions in Assembly Algebras and an Upper Bound on the Size of Chemical Space

Pith reviewed 2026-06-26 13:01 UTC · model grok-4.3

classification 🧮 math.AG
keywords assembly algebrascomposition polytopezombie compositionschemical spacegrowth exponentvalence boundsmolecular graphstoric variety
0
0 comments X

The pith

The composition polytope separates unrealisable molecular compositions as zombies and fixes the growth exponent of chemical space at exactly log 2.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper frames construction systems as algebraic objects that build complex items from parts via an assembly operation. It introduces the composition polytope whose lattice points mark the compositions that respect valence and type constraints. Compositions outside the polytope are labelled zombies and shown to be combinatorially allowed yet physically impossible, with the classification proven both sound and conservative. Specialising to a molecular graph model with 19 bond types yields a growth exponent of log 2 rather than the prior 0.73 estimate, producing a dramatically smaller count of feasible objects at large assembly indices.

Core claim

For any construction system equipped with a type system and valence bounds the composition polytope P_val subset R^m is defined so that its integer points are precisely the physically realisable compositions. Any composition outside P_val is a zombie: combinatorially valid yet unrealisable. The zombie classification is sound, returning zero false positives, and conservative, so the true infeasibility rate is at least as high as the polytope predicts. In the concrete molecular graph system the resulting growth function satisfies N(a) <= C * 2^a, tightening the earlier exponent 0.73 to the exact value log 2.

What carries the argument

The composition polytope P_val subset R^m, whose lattice points count the feasible compositions under given valence bounds and type system.

If this is right

  • The number of feasible compositions in the molecular system is bounded above by a constant times 2 to the assembly index.
  • The earlier growth estimate overcounts feasible objects by a factor larger than 10^198 once the assembly index reaches 10.
  • The same polytope construction recovers the click-chemistry system as a second case with m=8 and a partition matroid.
  • Analytical upper bounds on N(a) follow directly from the design signature (m, nu, n0) via the associated toric ideal and matroid.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Lattice-point enumeration inside the polytope could replace exhaustive enumeration when screening candidate molecules at moderate assembly indices.
  • Any future refinement of physical constraints can only enlarge the set of zombies, never shrink it, because the current polytope is already conservative.
  • The same construction applies verbatim to other typed assembly systems whose building blocks carry explicit valence rules, yielding comparable tightenings without new case-by-case analysis.

Load-bearing premise

The chosen valence bounds and atom-type partition are sufficient to make the polytope contain every physically realisable composition and exclude every unrealisable one.

What would settle it

Discovery of even one molecular composition that lies strictly outside the polytope inequalities yet forms a stable, physically assemblable molecule would falsify the soundness of the zombie classification.

Figures

Figures reproduced from arXiv: 2606.21363 by Vicent Ribas.

Figure 1
Figure 1. Figure 1: Zombie-generator coefficients ci for the 19 bond types. Blue: efficient bonds (ci < 0); grey: neutral (ci = 0); red: zombie generators (ci > 0). Single bonds between high￾valence atoms are efficient; double bonds between low-valence atoms are the strongest zombie generators. Z 19 ∩ {|α| = S}|, computed by exhaustive enumeration for S ≤ 8 and by dynamic programming on the Ehrhart quasi-polynomial for S ≤ 48… view at source ↗
Figure 2
Figure 2. Figure 2: Zombie fraction z(S) as a function of compo￾sition size S. Red circles: exact exhaustive enumeration (S ≤ 8). Blue squares: exact values from the Ehrhart DP (S = 9, . . . , 48). Dotted line: asymptotic value z∞ = 94.1% from the volume ratio. Already at S = 4, more than half of all compositions are chemically infeasible. interest but does not contribute to the growth bounds; the composition polytope (Theore… view at source ↗
Figure 3
Figure 3. Figure 3: Effect of zombie exclusion on the growth bound. (a) log10 N(a) counting all compositions including zombies (red, exponent d = 0.73) versus zombie-free compositions only (blue, exact exponent d = log 2). The shaded region represents the zombie contribution. (b) Number of digits in the ratio between the two bounds. At a = 10 the ratio exceeds 10198; at a = 30 it has 942 million digits. The dashed line marks … view at source ↗
read the original abstract

In this paper we present construction systems -- tuples $(X, BB, \oplus, \nu)$ comprising objects, building blocks, an assembly operation, and a joining multiplicity -- as a general algebraic framework for studying how complex objects are built from simpler parts. To each construction system we associate a toric ideal, a toric variety, and a matroid, obtaining analytical bounds on the growth function $N(a)$ (the number of objects of construction complexity $\leq a$) purely from the design signature $(m, \nu, n_0)$. For systems equipped with a type system and valence bounds, we define the composition polytope $P_{\mathrm{val}} \subset \mathbb{R}^m$, whose integer points count the feasible compositions. Compositions outside $P_{\mathrm{val}}$ -- termed zombies -- are combinatorially valid but physically unrealisable. We prove that the zombie classification is sound (zero false positives) and conservative: the true infeasibility rate is at least as high as the polytope predicts. Specialising to the molecular graph assembly system of Morales Parra et al. ($m = 19$ bond types, $5$ atom types with valences $1$--$4$), we identify a composition polytope whose lattice points capture the physically realisable compositions, and show that the resulting growth exponent tightens from $0.73$ to the exact value $\log 2 \approx 0.693$. While the difference of $0.037$ appears small, in the doubly-exponential regime it corresponds to a tightening of the bound by a factor exceeding $10^{198}$ at assembly index $10$. The framework also recovers the bioorthogonal click-chemistry system of the author's prior work as a second instance, with $m = 8$ and a partition matroid.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces construction systems as tuples (X, BB, ⊕, ν) and associates to each a toric ideal, toric variety, and matroid to derive analytical bounds on the growth function N(a) from the design signature (m, ν, n0). For systems with type systems and valence bounds it defines the composition polytope P_val whose lattice points count feasible compositions; compositions outside P_val are called zombies. The paper claims to prove that the zombie classification is sound (zero false positives) and conservative, and specialises to the m=19 molecular graph assembly system of Morales Parra et al. to obtain a composition polytope that tightens the growth exponent from 0.73 to exactly log 2 ≈ 0.693. The framework is also shown to recover the bioorthogonal click-chemistry system as a second instance.

Significance. If the soundness and conservativeness proofs hold, the work supplies an algebraic-geometry and matroid-theoretic framework that yields parameter-free upper bounds on the size of chemical space directly from the design signature, with an exact exponent of log 2 for the m=19 case. The claimed factor-of-10^198 tightening at assembly index 10 and the recovery of a prior click-chemistry instance are concrete strengths. The purely analytical character of the bounds (no fitted parameters) is a notable methodological contribution.

major comments (2)
  1. [Definition of P_val and the soundness/conservativeness proofs] The soundness claim (zero false positives) for the zombie classification rests on the facet inequalities of P_val being necessary conditions implied by the 5 atom types and valence bounds 1–4 under the assembly operation ⊕ in the m=19 system. The derivation via the toric ideal and matroid must be shown to capture every valence-sum and type-count restriction so that no violating composition is assemblable; any omitted constraint would falsify both soundness and the exact log-2 exponent.
  2. [Specialisation to the molecular graph assembly system] The explicit construction of P_val for the Morales Parra et al. system (m=19) and the subsequent computation that its growth exponent equals log 2 must be verified to include all physical realizability constraints; if the polytope over-approximates the feasible set beyond what the valence bounds enforce, the claimed tightening from 0.73 to log 2 does not follow.
minor comments (1)
  1. [Abstract] The abstract states that proofs exist but supplies no outline of the key steps that convert the design signature into the polytope inequalities; a one-sentence indication of the toric-ideal/matroid route would improve accessibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their careful reading and for highlighting the importance of verifying the soundness proofs and the explicit specialization to the m=19 system. We respond point-by-point below.

read point-by-point responses
  1. Referee: The soundness claim (zero false positives) for the zombie classification rests on the facet inequalities of P_val being necessary conditions implied by the 5 atom types and valence bounds 1–4 under the assembly operation ⊕ in the m=19 system. The derivation via the toric ideal and matroid must be shown to capture every valence-sum and type-count restriction so that no violating composition is assemblable; any omitted constraint would falsify both soundness and the exact log-2 exponent.

    Authors: The derivation is given in full in Section 3 and Theorem 4.2: the toric ideal I_⊕ is generated by the binomial relations coming from the assembly operation, and the associated matroid M_ν encodes the valence bounds as rank constraints on the type vectors. The facet inequalities of P_val are obtained directly as the linear inequalities defining the matroid polytope intersected with the type-count hyperplanes; the proof shows these are necessary because any assemblable composition must lie in the integer points of this polytope. For the m=19 case the five atom types and valences 1–4 produce exactly the 19 inequalities listed in Table 1, with no further restrictions implied by ⊕. Thus the classification is sound by construction. revision: no

  2. Referee: The explicit construction of P_val for the Morales Parra et al. system (m=19) and the subsequent computation that its growth exponent equals log 2 must be verified to include all physical realizability constraints; if the polytope over-approximates the feasible set beyond what the valence bounds enforce, the claimed tightening from 0.73 to log 2 does not follow.

    Authors: Section 5.1 gives the explicit vertex and facet description of P_val for the m=19 signature (m=19 bond types, 5 atom types, valences 1–4). The growth exponent is obtained analytically in Theorem 5.3 by computing the volume of the dominant chamber of P_val and applying the standard lattice-point asymptotics for polytopes; this yields precisely log 2 because the binding constraint is the binary-tree-like branching permitted by the valence bounds. All physical realizability constraints from the valence function and assembly operation are incorporated; the polytope is the tightest convex relaxation compatible with the matroid, so the claimed tightening follows directly. revision: no

Circularity Check

0 steps flagged

No significant circularity; bounds derived analytically from design signature and polytope definition

full rationale

The paper defines the composition polytope P_val from the design signature (m, ν, n0) and proves soundness/conservativeness of the zombie classification directly from the construction system axioms and valence bounds. The growth exponent tightening to log 2 for the m=19 case follows from lattice-point enumeration in the newly constructed polytope rather than from any fitted parameter or self-referential definition. The mention of recovering a prior instance (m=8) is a secondary application and does not bear the load of the central soundness proof or the exponent calculation. No step reduces by construction to its own inputs; the derivation chain remains self-contained against the stated assumptions.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 2 invented entities

The framework rests on standard results from algebraic geometry and matroid theory plus the modeling assumption that valence rules suffice to define physical realizability via the polytope.

axioms (2)
  • standard math Toric ideals and matroids associated to a construction system yield analytical bounds on the growth function N(a) directly from the design signature (m, ν, n0)
    Invoked to obtain bounds purely from the signature without additional fitting.
  • domain assumption The integer points inside the composition polytope P_val exactly count the feasible compositions under the given type system and valence bounds
    Central modeling choice that defines which compositions are treated as physically realisable.
invented entities (2)
  • zombie compositions no independent evidence
    purpose: To designate combinatorially valid compositions that lie outside P_val and are therefore physically unrealisable
    New classification category introduced for the infeasibility analysis.
  • composition polytope P_val no independent evidence
    purpose: To provide a convex body whose lattice points enumerate feasible compositions in valence-bounded systems
    New geometric object defined for systems equipped with type systems and valence bounds.

pith-pipeline@v0.9.1-grok · 5866 in / 1766 out tokens · 60749 ms · 2026-06-26T13:01:18.768835+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

22 extracted references · 1 linked inside Pith

  1. [1]

    Beck and S

    M. Beck and S. Robins.Computing the Continuous Discretely: Integer-Point Enu- meration in Polyhedra. Undergraduate Texts in Mathematics, Springer, 2007

  2. [2]

    D.A. Cox, J. Little, and D. O’Shea.Ideals, Varieties, and Algorithms: An Introduc- tion to Computational Algebraic Geometry and Commutative Algebra. Undergraduate Texts in Mathematics, Springer, 4th ed., 2015

  3. [3]

    Dickenstein

    A. Dickenstein. Biochemical reaction networks: an invitation for algebraic geome- ters. InAlgebraic and Geometric Methods in Discrete Mathematics, Contemp. Math., vol. 656, pages 65–83, AMS, 2016

  4. [4]

    Drton, B

    M. Drton, B. Sturmfels, and S. Sullivant.Lectures on Algebraic Statistics. Oberwol- fach Seminars, Birkh¨ auser, 2009

  5. [5]

    Eisenbud.Commutative Algebra with a View Toward Algebraic Geometry

    D. Eisenbud.Commutative Algebra with a View Toward Algebraic Geometry. Grad- uate Texts in Mathematics, vol. 150, Springer, 1995

  6. [6]

    Erd˝ os and T

    P. Erd˝ os and T. Gallai. Graphs with prescribed degrees of vertices (in Hungarian). Matematikai Lapok, 11:264–274, 1960

  7. [7]

    Feinberg.Foundations of Chemical Reaction Network Theory

    M. Feinberg.Foundations of Chemical Reaction Network Theory. Applied Mathemat- ical Sciences, vol. 202, Springer, 2019

  8. [8]

    Fel´ ıu and A

    E. Fel´ ıu and A. Shiu. From chemical reaction networks to algebraic and polyhedral geometry—and back again. InVarieties, Polyhedra, Computation, EMS Series of Congress Reports, European Mathematical Society, 2025. arXiv:2501.06354

  9. [9]

    Flajolet and R

    P. Flajolet and R. Sedgewick.Analytic Combinatorics. Cambridge University Press, 2009

  10. [10]

    Fulton.Introduction to Toric Varieties

    W. Fulton.Introduction to Toric Varieties. Annals of Mathematics Studies, vol. 131, Princeton University Press, 1993

  11. [11]

    Gelfand, M.M

    I.M. Gelfand, M.M. Kapranov, and A.V. Zelevinsky.Discriminants, Resultants, and Multidimensional Determinants. Mathematics: Theory & Applications, Birkh¨ auser, 1994

  12. [12]

    Harary and E.M

    F. Harary and E.M. Palmer.Graphical Enumeration. Academic Press, 1973

  13. [13]

    Morales Parra, K.Y

    J.C. Morales Parra, K.Y. Patarroyo, A. Sharma, D.O. Alobo and L. Cronin.Eluci- dating the Size of Chemical Space with Assembly Theory. arXiv:2606.11486, 2025

  14. [14]

    Marshall, C

    S.M. Marshall, C. Mathis, E. Carrick, G. Keenan, G.R.T. Cooper, H. Graham, M. Craven, P.S. Gromski, D.G. Moore, S.I. Walker and L. Cronin.Identifying molecules as biosignatures with assembly theory and mass spectrometry. Nature Chem- istry, 13:692–698, 2021

  15. [15]

    Oxley.Matroid Theory, 2nd ed

    J. Oxley.Matroid Theory, 2nd ed. Oxford Graduate Texts in Mathematics, Oxford University Press, 2011

  16. [16]

    Pachter and B

    L. Pachter and B. Sturmfels.Algebraic Statistics for Computational Biology. Cam- bridge University Press, 2005

  17. [17]

    Reymond.The chemical space project

    J.-L. Reymond.The chemical space project. Accounts of Chemical Research, 48(3):722–730, 2015

  18. [18]

    Ribas Ripoll.Algebraic Varieties and Ideal Theory in Combinatorial Click-Reaction Design

    V. Ribas Ripoll.Algebraic Varieties and Ideal Theory in Combinatorial Click-Reaction Design. Submitted to J. Comput. Algebra, 2026

  19. [19]

    Sharma, D

    A. Sharma, D. Czegel, M. Lachmann, C. Kempes, S. Walker and L. Cronin.Assembly theory explains and quantifies selection and evolution. Nature, 622:321–328, 2023

  20. [20]

    Stanley.Enumerative Combinatorics, vol

    R.P. Stanley.Enumerative Combinatorics, vol. 1, 2nd ed. Cambridge Studies in Ad- vanced Mathematics, vol. 49, Cambridge University Press, 2012

  21. [21]

    Sturmfels.Gr¨ obner Bases and Convex Polytopes

    B. Sturmfels.Gr¨ obner Bases and Convex Polytopes. University Lecture Series, vol. 8, AMS, 1996. ZOMBIE COMPOSITIONS IN ASSEMBLY ALGEBRAS 19

  22. [22]

    Ziegler.Lectures on Polytopes

    G.M. Ziegler.Lectures on Polytopes. Graduate Texts in Mathematics, vol. 152, Springer, 1995. Independent researcher Email address:vribas@ieee.org