
arxiv: 2605.15651 · v1 · pith:DV27UHZUnew · submitted 2026-05-15 · 💻 cs.LG · cs.AI · cs.GT

Sharp Spectral Thresholds for Logit Fixed Points

Pith reviewed 2026-05-20 20:55 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.GT
keywords softmax feedback systems · logit fixed points · spectral stability threshold · affine systems · phase transition · reinforcement learning · global predictability

The pith

Finite-dimensional affine logit systems stay stable and globally predictable up to the sharper spectral threshold β‖ΠWΠ‖_{T→T} < 2.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Softmax feedback systems underpin entropy-regularized reinforcement learning, logit game dynamics, and population choice models, where stability determines whether outcomes remain unique and predictable. Classical analysis certified stability only in a strongly over-regularized regime, because it treated softmax as a unit-scale response. This paper establishes that the true dimension-free Euclidean threshold lies further out: stability holds whenever β‖ΠWΠ‖_{T→T} < 2, where ‖ΠWΠ‖_{T→T} is the operator norm of the projected matrix ΠWΠ on the tangent space T. The result matters because it enlarges the certified parameter range in which self-reinforcing systems avoid bifurcation while still responding to rewards.

Core claim

For finite-dimensional affine logit systems, the sharp dimension-free Euclidean threshold is β‖ΠWΠ‖_{T→T}<2 rather than the previously used condition, which certifies stability only while the softmax system remains safely over-regularized. The theorem fills the previously missing pre-bifurcation regime, extending stability guarantees for affine softmax feedback systems to reward-responsive yet globally predictable systems.

What carries the argument

The operator norm ‖ΠWΠ‖_{T→T} of the projected weight matrix on the tangent space, which directly sets the critical stability value at 2/β.
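As a rough illustration of the quantity above, the sketch below computes ‖ΠWΠ‖ with NumPy and compares two certified ranges for β. It rests on assumptions not spelled out in this summary: that T is the zero-sum subspace, that Π = I − 11ᵀ/n is the orthogonal projector onto it, and that the classical certificate is β‖W‖₂ < 1 (read off from the ratio β_new/β_old = 2‖W‖₂/‖ΠWΠ‖ quoted in the Figure 2 caption). The function names are hypothetical.

```python
import numpy as np

def tangent_projector(n):
    """Assumed projector onto T = {v : sum(v) = 0}."""
    return np.eye(n) - np.ones((n, n)) / n

def projected_norm(W):
    """Spectral norm of Pi W Pi; it vanishes off T, so this is the T->T norm."""
    P = tangent_projector(W.shape[0])
    return np.linalg.norm(P @ W @ P, 2)

def certified_beta_ranges(W):
    """Return (beta_old, beta_new).

    beta_old = 1 / ||W||_2 is the assumed classical certificate;
    beta_new = 2 / ||Pi W Pi|| is the tangent-space certificate.
    """
    beta_old = 1.0 / np.linalg.norm(W, 2)
    beta_new = 2.0 / projected_norm(W)
    return beta_old, beta_new

# Two-action coordination payoffs plus a constant payoff shift 2 * 1 1^T.
base = np.array([[0.0, 1.0], [1.0, 0.0]])
shifted = base + 2.0 * np.ones((2, 2))

beta_old, beta_new = certified_beta_ranges(shifted)
print(beta_old, beta_new)
```

In this toy case the constant shift inflates ‖W‖₂ but is annihilated by Π, so the tangent certificate is unchanged while the ambient one shrinks, which is the effect the Figure 2 caption describes for payoff-shift components.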

Load-bearing premise

The feedback system must be exactly finite-dimensional and affine in the logits, with the stability question posed in the Euclidean norm on the tangent space T after projection by Π.

What would settle it

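Finding multiple distinct fixed points in a finite-dimensional affine logit system with β‖ΠWΠ‖_{T→T} < 2 would refute the stability guarantee. Conversely, showing that every such system keeps a unique fixed point for some range of values above 2 would show that the claimed threshold is not sharp.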

Figures

Figures reproduced from arXiv: 2605.15651 by Tongxi Wang.

Figure 1: Numerical illustration of the two-action logit phase diagram. Panel (a) plots the fixed points of m = tanh(βm/2). Solid curves are stable equilibria and the dashed curve is unstable. The old unit-softmax certificate covers only β < 1, while the covariance-calibrated theorem certifies the full pre-bifurcation interval β < 2. Panels (b) and (c) show deterministic logit-adjustment trajectories from several in…

Figure 2: Certificate gain for random affine logit systems. Each violin summarizes 200 fixed-seed draws with n = 32. The plotted quantity is β_new/β_old = 2‖W‖_2/‖ΠWΠ‖_{T→T}. Ordinary Gaussian symmetric and non-symmetric matrices mainly show the factor-two improvement. Adding payoff-shift components leaves the tangent certificate unchanged but shrinks the ambient certificate, so the certified inverse-temperature range e…

Figure 3: Picard convergence in the newly certified interval. Panel (a) shows that the chosen β lies above the old threshold and below the new threshold for each random shifted non-symmetric instance. Panel (b) shows the collapse of ten random initializations per instance under Picard iteration. The plot visualizes deterministic Picard convergence in finite-dimensional random affine systems certified by the theorem.…

Figure 4: Signed block cancellation separates the Euclidean and Dobrushin certificates. For the Hadamard block construction with α = 1, the Euclidean contraction factor remains q_{ℓ2} = 1/2 for all powers-of-two block counts, while the natural block ℓ1 influence radius grows as ρ(C) = √m/2 and exceeds the uniqueness threshold once m ≥ 8.
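The two-action equation m = tanh(βm/2) from the Figure 1 caption can be probed with a few lines of standard-library Python. This is a minimal sketch, not the paper's experiment: the function name, the bracketing interval, and the tolerance are all choices made here. It looks for a positive fixed point by bisection; by symmetry a positive root m* comes with −m*, so a non-`None` result means three fixed points instead of one.

```python
import math

def nonzero_fixed_point(beta, tol=1e-12):
    """Positive solution of m = tanh(beta * m / 2), or None if none is found.

    The slope of tanh(beta * m / 2) at m = 0 is beta / 2, so the curve rises
    above the diagonal near 0 only when beta > 2. Roots smaller than 1e-6
    (beta extremely close to 2) are not detected by this sketch.
    """
    def g(m):
        return math.tanh(beta * m / 2.0) - m

    lo, hi = 1e-6, 1.0
    if g(lo) <= 0.0:
        return None  # curve starts at or below the diagonal: only m = 0
    # Here g(lo) > 0 and g(1) = tanh(beta / 2) - 1 < 0, so a root is bracketed.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for beta in (0.5, 1.0, 1.9, 2.1, 2.5, 3.0):
    print(beta, nonzero_fixed_point(beta))
```

Under these assumptions the search returns `None` for the tested values below 2 and a positive root for those above 2, consistent with the pre-bifurcation interval β < 2 described in the caption.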
Original abstract

Softmax feedback systems are a common mathematical core of entropy-regularized reinforcement learning, logit game dynamics, population choice, and mean-field variational updates. Their central stability question is simple: when does a self-reinforcing softmax system produce a unique and globally predictable outcome? Classical theory gives a conservative answer. By treating softmax as a unit-scale response, it certifies stability only in a strongly randomized regime. We prove that the classical approach misses an entire stable regime and does not identify the point at which the qualitative change truly occurs. For finite-dimensional affine logit systems, the sharp dimension-free Euclidean threshold is $$\beta\|\Pi W\Pi\|_{\mathcal T\to\mathcal T}<2,$$ rather than the previously used condition, which certifies stability only while the softmax system remains safely over-regularized. Our theorem fills the previously missing pre-bifurcation regime, extending stability guarantees for affine softmax feedback systems to reward-responsive yet globally predictable systems. It enlarges the certified stability boundary for these systems and identifies where the model genuinely undergoes a phase transition.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript proves that for finite-dimensional affine logit systems, the sharp dimension-free Euclidean threshold guaranteeing uniqueness and global predictability of the fixed point p = softmax(β W p + b) is β‖ΠWΠ‖_{T→T}<2. This improves upon classical conservative bounds that certify stability only in strongly over-regularized regimes by combining the projected operator norm with the fact that the softmax Jacobian has Euclidean operator norm at most 1/2 on the tangent space T.
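The fixed-point equation p = softmax(βWp + b) in the summary can be explored by plain Picard iteration. The sketch below is illustrative only and assumes Π = I − 11ᵀ/n and a classical certificate of the form β‖W‖₂ < 1; the random instance, the seed, and the choice β‖ΠWΠ‖ = 1.5 are arbitrary. It starts from ten random points on the simplex and measures how far apart the limits are.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def picard(W, b, beta, p0, steps=200):
    """Iterate p <- softmax(beta * W p + b) a fixed number of times."""
    p = p0
    for _ in range(steps):
        p = softmax(beta * (W @ p) + b)
    return p

rng = np.random.default_rng(0)
n = 5
W = rng.standard_normal((n, n))   # non-symmetric random instance
b = rng.standard_normal(n)
P = np.eye(n) - np.ones((n, n)) / n  # assumed tangent projector

# Choose beta so that beta * ||Pi W Pi|| = 1.5, i.e. below the threshold 2.
beta = 1.5 / np.linalg.norm(P @ W @ P, 2)

starts = rng.dirichlet(np.ones(n), size=10)
limits = np.array([picard(W, b, beta, p0) for p0 in starts])

spread = np.abs(limits - limits[0]).max()
residual = np.abs(limits[0] - softmax(beta * (W @ limits[0]) + b)).max()
print(spread, residual)
```

Because ‖ΠWΠ‖ ≤ ‖W‖₂, this choice of β always satisfies β‖W‖₂ ≥ 1.5, so the instance sits outside the assumed classical certificate while remaining inside the new one.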

Significance. If the central claim holds, the result enlarges the certified stability region for softmax feedback systems that arise in entropy-regularized RL, logit dynamics, population choice models, and mean-field variational updates. The dimension-free character of the threshold is practically valuable for high-dimensional settings, and the work identifies the true location of the phase transition rather than a sufficient but loose condition.

major comments (2)
  1. [Abstract and Main Theorem] The qualifier 'sharp' is load-bearing for the central claim. While the manuscript derives a sufficient contraction condition via the bound on the softmax Jacobian, it does not supply an explicit necessity argument, a matching lower bound, or a concrete bifurcation example showing that uniqueness or global predictability fails when β‖ΠWΠ‖_{T→T} reaches or exceeds 2. Without this, the threshold remains a sufficient but not necessarily tight characterization.
  2. [§3.2 (Jacobian Analysis)] The proof that the Euclidean operator norm of the softmax Jacobian on T is at most 1/2 is central to obtaining the constant 2. The manuscript should explicitly verify whether this bound is attained at an interior fixed point or provide the precise constant used, as any gap would alter the claimed threshold.
minor comments (2)
  1. [§2 (Preliminaries)] Notation for the projection Π and the tangent space T should be introduced once with a self-contained definition before its repeated use in the main theorem statement.
  2. [Abstract] The abstract refers to 'previously used condition' without a specific citation; adding a reference to the classical bound would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We address the major comments point by point below, providing clarifications on our use of 'sharp' and the Jacobian bound. We outline revisions that will be incorporated to strengthen the presentation.

  1. Referee: [Abstract and Main Theorem] The qualifier 'sharp' is load-bearing for the central claim. While the manuscript derives a sufficient contraction condition via the bound on the softmax Jacobian, it does not supply an explicit necessity argument, a matching lower bound, or a concrete bifurcation example showing that uniqueness or global predictability fails when β‖ΠWΠ‖_{T→T} reaches or exceeds 2. Without this, the threshold remains a sufficient but not necessarily tight characterization.

    Authors: We thank the referee for this observation. The term 'sharp' in the manuscript refers specifically to the fact that the threshold β‖ΠWΠ‖_{T→T}<2 is the best constant obtainable from the contraction-mapping argument once the tight bound of 1/2 on the Euclidean operator norm of the softmax Jacobian over T is used; classical analyses employ a strictly weaker estimate and therefore certify a smaller regime. While the current version emphasizes the sufficient condition and does not contain an explicit bifurcation construction, we will add a short remark together with a low-dimensional numerical example (e.g., n=2 or n=3) demonstrating loss of uniqueness or global predictability once the threshold is crossed. This addition will make the sharpness claim with respect to the Jacobian norm fully explicit. revision: partial

  2. Referee: [§3.2 (Jacobian Analysis)] The proof that the Euclidean operator norm of the softmax Jacobian on T is at most 1/2 is central to obtaining the constant 2. The manuscript should explicitly verify whether this bound is attained at an interior fixed point or provide the precise constant used, as any gap would alter the claimed threshold.

    Authors: We agree that explicit verification of attainability is useful. The analysis in §3.2 establishes that the Jacobian of softmax, when restricted to the tangent space T, is a symmetric positive-semidefinite operator whose eigenvalues lie in [0,1/2]; the upper bound 1/2 is therefore the precise constant. Equality is attained at an interior point in the two-action case, where the uniform vector p = (1/2, 1/2) gives eigenvalue exactly 1/2; a uniform p is a fixed point when, for instance, b=0 and the rows of W sum to the same value. For n ≥ 3 the Jacobian at the uniform vector has norm only 1/n on T, and the value 1/2 is approached as p concentrates equal mass on two actions. We will insert a short proposition in the revised §3.2 that states the conditions for equality and confirms that the constant 2 cannot be improved within the present contraction framework. revision: yes
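The eigenvalue range [0, 1/2] discussed in this exchange can be checked numerically. The sketch below assumes the standard softmax Jacobian with respect to the logits, J(p) = diag(p) − ppᵀ, which is not written out in this summary; the helper names are hypothetical. It evaluates the largest eigenvalue at a uniform two-action vector, at a uniform four-action vector, and over random interior points.

```python
import numpy as np

def softmax_jacobian(p):
    """Assumed Jacobian of softmax w.r.t. logits at output p: diag(p) - p p^T."""
    return np.diag(p) - np.outer(p, p)

def top_eigenvalue(p):
    """Largest eigenvalue of the (symmetric) softmax Jacobian at p."""
    return np.linalg.eigvalsh(softmax_jacobian(p))[-1]

two_action = top_eigenvalue(np.array([0.5, 0.5]))   # uniform, n = 2
four_action = top_eigenvalue(np.full(4, 0.25))      # uniform, n = 4

rng = np.random.default_rng(1)
samples = rng.dirichlet(np.ones(6), size=1000)      # random interior points, n = 6
worst = max(top_eigenvalue(p) for p in samples)

print(two_action, four_action, worst)
```

In this check the two-action uniform vector reaches 1/2, the four-action uniform vector gives 1/n = 1/4, and no sampled point exceeds 1/2.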

Circularity Check

0 steps flagged

No circularity: direct contraction-mapping proof from softmax Jacobian bound


The manuscript derives the threshold β‖ΠWΠ‖_{T→T}<2 by combining the Euclidean operator norm of the projected linear map with the fact that the Jacobian of softmax has norm at most 1/2 on the tangent space T. This is a standard first-principles linearization argument that does not reduce any quantity to a fitted parameter, a self-referential definition, or a load-bearing self-citation. The paper explicitly contrasts the new bound with the classical over-regularized regime and presents the result as an enlargement of the certified stability region; no step equates the claimed threshold to its own inputs by construction. The derivation therefore remains self-contained against external mathematical benchmarks.
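The linearization described above can be written out in a few lines. This is a sketch consistent with that description, not a transcription of the paper's proof; the symbols F, J, σ (for softmax) and z are notation introduced here, and Π is assumed to be the orthogonal projector onto the zero-sum subspace T.

```latex
% Sketch under stated assumptions; F, J, \sigma, z are notation introduced here.
F(p) = \sigma(\beta W p + b), \qquad
J(z) = \operatorname{diag}(\sigma(z)) - \sigma(z)\sigma(z)^{\top}, \qquad
J(z)\mathbf{1} = 0,\; J(z)^{\top} = J(z) \;\Rightarrow\; J(z) = \Pi J(z) \Pi .

% For p, q in the simplex, p - q lies in T, so only the tangent part of W acts:
DF(p)\big|_{\mathcal T} = \beta\, J(z)\, \Pi W \Pi, \qquad z = \beta W p + b .

% Combining the Jacobian bound \|J(z)\|_2 \le 1/2 with the projected norm:
\|DF(p)\|_{\mathcal T\to\mathcal T}
  \le \beta\, \|J(z)\|_{2}\, \|\Pi W \Pi\|_{\mathcal T\to\mathcal T}
  \le \tfrac{\beta}{2}\, \|\Pi W \Pi\|_{\mathcal T\to\mathcal T} < 1 .
```

Since the simplex is convex and F maps it into itself, a uniform bound of this kind makes F a Euclidean contraction there, and the Banach fixed-point theorem then gives a unique fixed point together with convergence of Picard iteration from any starting point.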

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on standard properties of the softmax map, affine feedback, and operator norms on finite-dimensional spaces; no free parameters, new axioms, or invented entities are introduced in the abstract.

axioms (2)
  • domain assumption The logit system is finite-dimensional and affine with feedback through the softmax operator.
    Invoked to define the class of systems for which the sharp threshold holds.
  • domain assumption The relevant stability norm is the operator norm from the tangent space T to itself after projection Π.
    Used to state the dimension-free Euclidean threshold.

pith-pipeline@v0.9.0 · 5705 in / 1250 out tokens · 52486 ms · 2026-05-20T20:55:28.835057+00:00 · methodology


