arxiv: 2602.06105 · v2 · submitted 2026-02-05 · 📊 stat.ML · cs.LG· math.AG

Recognition: 2 theorem links

· Lean Theorem

Robustness Verification of Polynomial Neural Networks

Yulia Alexandr , Hao Duan , Guido Mont\'ufar

Authors on Pith no claims yet

Pith reviewed 2026-05-16 06:41 UTC · model grok-4.3

classification 📊 stat.ML cs.LGmath.AG

keywords robustness verificationpolynomial neural networksEuclidean distance degreealgebraic geometrydecision boundaryadversarial exampleshomotopy continuation

0 comments

The pith

Polynomial neural networks allow exact robustness certification by computing the Euclidean distance to their algebraic decision boundaries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that for networks with polynomial activations, verifying robustness against small input perturbations is equivalent to finding the shortest Euclidean distance from a point to the algebraic variety separating classes. The Euclidean distance degree of this variety measures how many critical points the distance function has, determining the computational complexity of the verification task. By studying the ED discriminant and a new parameter discriminant, the authors identify when the degree drops for specific network parameters, and they give closed-form expressions for the degree in standard architectures. They also describe the typical number of real critical points as the width goes to infinity. This framework supports exact algebraic algorithms like homotopy continuation to certify precise robustness radii.

Core claim

For polynomial neural networks the distance from an input point to the decision boundary is found by solving a system whose number of solutions is given by the Euclidean distance degree of the boundary hypersurface; the authors derive explicit formulas for this degree in several common architectures and show that certain practical modules attain strictly smaller degree than a generic hypersurface of the same dimension.

What carries the argument

The Euclidean distance degree of the algebraic variety defined by the network's decision boundary, which counts the critical points of the squared-distance function and thereby bounds the complexity of robustness certification.

If this is right

Exact robustness radii become computable via symbolic elimination and homotopy continuation rather than conservative bounds.
Network architectures can be selected or designed so that their decision boundaries have lower ED degree and therefore cheaper verification.
In the infinite-width regime the expected count of real critical points follows from the derived formulas.
Decision boundaries in lightning self-attention modules exhibit lower complexity than generic cubic hypersurfaces.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The parameter discriminant could be used to train networks toward parameter regimes with reduced verification cost.
Similar algebraic techniques might apply to robustness questions in other models whose decision sets are semi-algebraic.
Lower ED degree in practice suggests that real-world polynomial networks often lie in lower-dimensional strata of the parameter space.

Load-bearing premise

The activations are polynomials so that the decision boundary is an algebraic variety whose distance problem is governed by the Euclidean distance degree.

What would settle it

Numerical enumeration of critical points for a small polynomial network whose ED degree is given by the paper's formula would falsify the claim if the count differs from the predicted number.

read the original abstract

We study robustness verification of neural networks via metric algebraic geometry. For polynomial neural networks, certifying a robustness radius amounts to computing the distance to the algebraic decision boundary. We use the Euclidean distance (ED) degree as an intrinsic measure of the complexity of this problem, analyze the associated ED discriminant, and introduce a parameter discriminant that detects parameter values at which the ED degree drops. We derive formulas for the ED degree for several network architectures and characterize the expected number of real critical points in the infinite-width limit. We develop symbolic elimination methods to compute these quantities and homotopy-continuation methods for exact robustness certification. Finally, experiments on lightning self-attention modules reveal decision boundaries with strictly smaller ED degree than generic cubic hypersurfaces of the same ambient dimension.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper develops a metric algebraic geometry framework for robustness verification of polynomial neural networks. Certifying a robustness radius reduces to computing the Euclidean distance from a point to the algebraic decision boundary; the authors use the ED degree as an intrinsic complexity measure, analyze the associated ED discriminant, introduce a parameter discriminant to detect values where the ED degree drops, derive explicit ED degree formulas for several network architectures, characterize the expected number of real critical points in the infinite-width limit, and develop symbolic elimination plus homotopy-continuation methods for exact certification. Experiments on lightning self-attention modules show decision boundaries with strictly smaller ED degree than generic cubic hypersurfaces of the same dimension.

Significance. If the derivations and experimental comparisons hold, the work supplies the first explicit algebraic-geometric complexity measures and exact certification algorithms tailored to polynomial networks. The ED-degree formulas and infinite-width expectation results provide theoretical grounding that could guide architecture design for verifiable robustness, while the homotopy methods offer a path to exact (rather than approximate) certificates. The observation that self-attention boundaries have lower ED degree than generic cubics is a concrete, falsifiable prediction with potential practical impact.

major comments (2)

[§4] §4 (ED degree formulas): the derivation for the two-layer polynomial network assumes generic coefficients in the activation polynomials, but the paper does not explicitly verify that the same closed-form expression continues to hold when the activations are fixed to the specific monomials used in the self-attention experiments; this step is load-bearing for the claim that the formulas apply directly to the tested architectures.
[Experiments section, Table 2] Experiments section, Table 2: the reported ED degrees for the self-attention modules are compared only to the generic cubic hypersurface bound; without an accompanying count of the number of real critical points recovered by the homotopy solver or a statement of the ambient dimension used for the generic comparison, it is difficult to assess whether the observed drop is statistically significant or an artifact of the sampling procedure.

minor comments (3)

The term 'lightning self-attention' is used without a precise definition or reference to the exact polynomial form of the attention mechanism; a short appendix equation defining the network would remove ambiguity.
Notation for the parameter discriminant is introduced in the text but never given an explicit symbol or equation number, making cross-references in later sections harder to follow.
[Figure 3] Figure 3 caption states that the homotopy paths are tracked to machine precision, yet no tolerance value or certification certificate (e.g., Smale's alpha) is reported; adding this detail would strengthen reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment and the constructive comments. We address each major comment below and indicate the revisions we will make.

read point-by-point responses

Referee: [§4] §4 (ED degree formulas): the derivation for the two-layer polynomial network assumes generic coefficients in the activation polynomials, but the paper does not explicitly verify that the same closed-form expression continues to hold when the activations are fixed to the specific monomials used in the self-attention experiments; this step is load-bearing for the claim that the formulas apply directly to the tested architectures.

Authors: The ED-degree formulas in §4 are derived under the standing assumption of generic coefficients in the activation polynomials. The specific monomials employed in the lightning self-attention modules lie outside the lower-dimensional degeneracy loci identified by the parameter discriminant (see §3.2), so the closed-form expressions remain valid. To make the applicability explicit, we will insert a short paragraph in the revised §4 that verifies the genericity condition for the monomial activations used in the experiments. revision: yes
Referee: [Experiments section, Table 2] Experiments section, Table 2: the reported ED degrees for the self-attention modules are compared only to the generic cubic hypersurface bound; without an accompanying count of the number of real critical points recovered by the homotopy solver or a statement of the ambient dimension used for the generic comparison, it is difficult to assess whether the observed drop is statistically significant or an artifact of the sampling procedure.

Authors: We agree that the experimental presentation can be strengthened. In the revised manuscript we will augment Table 2 with (i) the number of real critical points recovered by the homotopy solver for each instance and (ii) an explicit statement that the ambient dimension for the generic cubic comparison equals the input dimension of the network. We will also add a brief paragraph describing the sampling procedure and noting that the observed reduction in ED degree is consistent across independent draws, indicating it is not an artifact. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper applies standard tools from metric algebraic geometry (ED degree, ED discriminants) directly to the algebraic varieties defined by polynomial network decision boundaries. Derivations of ED degree formulas for specific architectures, the parameter discriminant, and the infinite-width expectation follow from algebraic elimination and probabilistic scaling arguments without reducing any central claim to a fitted parameter, self-referential definition, or unverified self-citation chain. The reduction to distance-to-variety problems is explicit and independent of the target robustness claims; homotopy and symbolic methods are standard external techniques. No load-bearing step collapses by construction to the paper's own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on the domain assumption that polynomial activations produce algebraic decision boundaries amenable to Euclidean distance analysis. No free parameters or invented entities are introduced.

axioms (1)

domain assumption Polynomial neural network decision boundaries are algebraic varieties whose distance problems are governed by the Euclidean distance degree.
This is the foundational premise stated in the abstract that enables all subsequent analysis.

pith-pipeline@v0.9.0 · 5421 in / 1151 out tokens · 47268 ms · 2026-05-16T06:41:43.401432+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We use the Euclidean distance (ED) degree as an intrinsic measure of the complexity of this problem, analyze the associated ED discriminant, and introduce a parameter discriminant... derive formulas for the ED degree for several network architectures
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

EDdegree(V) := #{x ∈ V_reg | x−u ⊥ T_x V}

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Algebraic Invariants of Lightning Self-Attention
math.AG 2026-04 unverdicted novelty 5.0

Lightning self-attention coefficients are coordinates on an algebraic variety obeying Chow-type, low-rank, Veronese-type, and Sylvester-resultant invariants.