Sharp Spectral Thresholds for Logit Fixed Points

Tongxi Wang

arxiv: 2605.15651 · v1 · pith:DV27UHZUnew · submitted 2026-05-15 · 💻 cs.LG · cs.AI · cs.GT

Pith reviewed 2026-05-20 20:55 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.GT

keywords softmax feedback systems · logit fixed points · spectral stability threshold · affine systems · phase transition · reinforcement learning · global predictability

The pith

Finite-dimensional affine logit systems stay stable and globally predictable up to the sharper spectral threshold β‖ΠWΠ‖_{T→T} < 2.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Softmax feedback systems underpin entropy-regularized reinforcement learning, logit game dynamics, and population choice models, where stability determines whether outcomes remain unique and predictable. Classical analysis only certified stability in a strongly over-regularized regime by treating softmax as a unit-scale response. This paper establishes that the true dimension-free Euclidean threshold occurs later, at β times the operator norm of the projected matrix ΠWΠ on the tangent space being less than 2. A reader would care because the result enlarges the certified parameter range where self-reinforcing systems avoid bifurcation while still responding to rewards.

Core claim

For finite-dimensional affine logit systems, the sharp dimension-free Euclidean threshold is β‖ΠWΠ‖_{T→T}<2 rather than the previously used condition, which certifies stability only while the softmax system remains safely over-regularized. The theorem fills the previously missing pre-bifurcation regime, extending stability guarantees for affine softmax feedback systems to reward-responsive yet globally predictable systems.

What carries the argument

The operator norm ‖ΠWΠ‖_{T→T} of the projected weight matrix on the tangent space, which directly sets the critical stability value at 2/β.
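A minimal sketch of how this quantity could be computed, assuming Π is the orthogonal projector onto the sum-zero vectors (the usual tangent space of the probability simplex; the page does not spell this definition out). The function names are hypothetical and are not taken from the paper.

```python
import numpy as np

def tangent_projector(n):
    """Orthogonal projector onto T = {v : sum(v) = 0} (assumed definition of Pi)."""
    return np.eye(n) - np.ones((n, n)) / n

def tangent_norm(W):
    """Spectral norm of Pi W Pi; this matrix vanishes off T, so the value
    equals the operator norm of the restriction T -> T."""
    P = tangent_projector(W.shape[0])
    return np.linalg.norm(P @ W @ P, 2)

def certified_beta_max(W):
    """Supremum of the inverse temperatures allowed by beta * ||Pi W Pi|| < 2."""
    return 2.0 / tangent_norm(W)

# Two-action example with W equal to the identity.
W = np.eye(2)
print(tangent_norm(W))        # approximately 1
print(certified_beta_max(W))  # approximately 2
```

For the two-action identity example the certified range is β < 2, which agrees with the pre-bifurcation interval quoted in the Figure 1 caption.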

Load-bearing premise

The feedback system must be exactly finite-dimensional and affine in the logits, with the stability question posed in the Euclidean norm on the tangent space T after projection by Π.

What would settle it

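Finding multiple distinct fixed points or a bifurcation in a finite-dimensional affine logit system where β‖ΠWΠ‖_{T→T} is below 2 would falsify the stability claim. Conversely, showing that uniqueness persists for every such system up to some constant strictly larger than 2 would show that the claimed threshold is not sharp.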
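The two-action equation m = tanh(βm/2) quoted in the Figure 1 caption gives a concrete picture of where uniqueness is lost. The sketch below simply iterates that scalar map from a nonzero start; the function name, the starting point, and the iteration count are illustrative choices, not the paper's experiment.

```python
import math

def picard_two_action(beta, m0=0.9, iters=2000):
    """Iterate m <- tanh(beta * m / 2) from m0 and return the final value."""
    m = m0
    for _ in range(iters):
        m = math.tanh(beta * m / 2.0)
    return m

# Below beta = 2 the iteration collapses to m = 0; above it, the sign of the
# starting point selects one of two nonzero fixed points.
for beta in (0.5, 1.5, 3.0):
    print(beta, picard_two_action(beta), picard_two_action(beta, m0=-0.9))
```

At m = 0 the slope of the map is β/2, so the symmetric fixed point stops being attracting exactly when β passes 2.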

Figures

Figures reproduced from arXiv: 2605.15651 by Tongxi Wang.

**Figure 1.** Numerical illustration of the two-action logit phase diagram. Panel (a) plots the fixed points of m = tanh(βm/2). Solid curves are stable equilibria and the dashed curve is unstable. The old unit-softmax certificate covers only β < 1, while the covariance-calibrated theorem certifies the full pre-bifurcation interval β < 2. Panels (b) and (c) show deterministic logit-adjustment trajectories from several in…

**Figure 2.** Certificate gain for random affine logit systems. Each violin summarizes 200 fixed-seed draws with n = 32. The plotted quantity is β_new/β_old = 2‖W‖_2/‖ΠWΠ‖_{T→T}. Ordinary Gaussian symmetric and non-symmetric matrices mainly show the factor-two improvement. Adding payoff-shift components leaves the tangent certificate unchanged but shrinks the ambient certificate, so the certified inverse-temperature range e…

**Figure 3.** Picard convergence in the newly certified interval. Panel (a) shows that the chosen β lies above the old threshold and below the new threshold for each random shifted non-symmetric instance. Panel (b) shows the collapse of ten random initializations per instance under Picard iteration. The plot visualizes deterministic Picard convergence in finite-dimensional random affine systems certified by the theorem.…

**Figure 4.** Signed block cancellation separates the Euclidean and Dobrushin certificates. For the Hadamard block construction with α = 1, the Euclidean contraction factor remains q_{ℓ2} = 1/2 for all powers-of-two block counts, while the natural block ℓ1 influence radius grows as ρ(C) = √m/2 and exceeds the uniqueness threshold once m ≥ 8.
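The ratio in the Figure 2 caption can be reproduced in spirit with a few lines. This is a sketch under stated assumptions: Π is taken to be the projector onto sum-zero vectors, a payoff shift is modeled as a rank-one term 1cᵀ (the same constant added to every action's payoff), and the seed, scaling, and shift size are arbitrary choices rather than the paper's setup.

```python
import numpy as np

def tangent_norm(W):
    """Spectral norm of Pi W Pi with Pi the assumed sum-zero projector."""
    n = W.shape[0]
    P = np.eye(n) - np.ones((n, n)) / n
    return np.linalg.norm(P @ W @ P, 2)

def certificate_gain(W):
    """beta_new / beta_old = 2 ||W||_2 / ||Pi W Pi||_{T->T}."""
    return 2.0 * np.linalg.norm(W, 2) / tangent_norm(W)

rng = np.random.default_rng(0)                 # fixed seed, illustrative only
n = 32
W = rng.standard_normal((n, n)) / np.sqrt(n)   # non-symmetric Gaussian matrix
c = 5.0 * rng.standard_normal(n)
W_shifted = W + np.outer(np.ones(n), c)        # payoff shift 1 c^T

print(certificate_gain(W), certificate_gain(W_shifted))
print(tangent_norm(W), tangent_norm(W_shifted))  # equal up to rounding
```

Because Π annihilates the all-ones vector, the shift term drops out of ΠWΠ while still inflating ‖W‖_2, which is the mechanism the caption describes.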

Original abstract

Softmax feedback systems are a common mathematical core of entropy-regularized reinforcement learning, logit game dynamics, population choice, and mean-field variational updates. Their central stability question is simple: when does a self-reinforcing softmax system produce a unique and globally predictable outcome? Classical theory gives a conservative answer. By treating softmax as a unit-scale response, it certifies stability only in a strongly randomized regime. We prove that the classical approach misses an entire stable regime and does not identify the point at which the qualitative change truly occurs. For finite-dimensional affine logit systems, the sharp dimension-free Euclidean threshold is $$\beta\|\Pi W\Pi\|_{\mathcal T\to\mathcal T}<2,$$ rather than the previously used condition, which certifies stability only while the softmax system remains safely over-regularized. Our theorem fills the previously missing pre-bifurcation regime, extending stability guarantees for affine softmax feedback systems to reward-responsive yet globally predictable systems. It enlarges the certified stability boundary for these systems and identifies where the model genuinely undergoes a phase transition.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper sharpens the stability threshold for finite-dimensional affine logit fixed points to β‖ΠWΠ‖_{T→T}<2, but the necessity argument for sharpness looks underdeveloped.

The one thing to know is that this paper sharpens the stability threshold for finite-dimensional affine logit fixed points to β‖ΠWΠ‖_{T→T}<2 in the Euclidean norm after projection onto the tangent space. This extends the certified regime beyond the conservative bounds obtained by treating softmax as a unit-scale response.

What the paper does well is state clearly how classical analysis misses the pre-bifurcation regime and provide a dimension-free bound that applies to reward-responsive systems while keeping global predictability. The connection to entropy-regularized RL, logit games, and mean-field variational inference is direct and useful. The use of the projected operator norm seems a natural way to handle the affine structure.

The soft spots are around the sharpness claim. The abstract presents β‖ΠWΠ‖_{T→T}<2 as the sharp threshold, but to make that stick it needs either a matching lower bound or an explicit example where the norm reaches 2 and the fixed point loses uniqueness or stability. The stress-test note flags this exactly, and without a concrete bifurcation or worst-case attainment of the 1/2 Jacobian factor, the necessity part could be underdeveloped. The soundness is hard to judge from the abstract alone, but if the full derivation uses standard contraction arguments it should be checkable.

This is aimed at theorists working on dynamical systems in ML, particularly those needing precise conditions for unique equilibria in softmax feedback. A reader who cares about tight bounds rather than merely sufficient ones would find it relevant. The citation pattern looks standard for the area. I think it deserves a serious referee to verify the proof steps and check whether the sharpness holds up with an example or converse argument. I would recommend sending it for peer review.

Referee Report

2 major / 2 minor

Summary. The manuscript proves that for finite-dimensional affine logit systems, the sharp dimension-free Euclidean threshold guaranteeing uniqueness and global predictability of the fixed point p = softmax(β W p + b) is β‖ΠWΠ‖_{T→T}<2. This improves upon classical conservative bounds that certify stability only in strongly over-regularized regimes by combining the projected operator norm with the fact that the softmax Jacobian has Euclidean operator norm at most 1/2 on the tangent space T.
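To make the fixed-point equation p = softmax(β W p + b) concrete, here is a hedged sketch of plain Picard iteration at an inverse temperature between the two certificates. It assumes Π projects onto sum-zero vectors and takes the old threshold to be β‖W‖_2 < 1, which is inferred from the ratio in the Figure 2 caption; the random instance, seed, and helper names are illustrative only.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

def picard(W, b, beta, p0, iters=500):
    """Iterate p <- softmax(beta * W p + b) starting from p0."""
    p = p0
    for _ in range(iters):
        p = softmax(beta * (W @ p) + b)
    return p

rng = np.random.default_rng(1)                 # fixed seed, illustrative only
n = 8
# Non-symmetric Gaussian matrix plus a payoff-shift term 1 c^T.
W = rng.standard_normal((n, n)) + np.outer(np.ones(n), rng.standard_normal(n))
b = rng.standard_normal(n)

P = np.eye(n) - np.ones((n, n)) / n            # assumed tangent projector
beta_old = 1.0 / np.linalg.norm(W, 2)          # inferred classical certificate
beta_new = 2.0 / np.linalg.norm(P @ W @ P, 2)  # threshold stated in the summary
beta = 0.9 * beta_new                          # inside the newly certified gap

limits = [picard(W, b, beta, rng.dirichlet(np.ones(n))) for _ in range(10)]
spread = max(np.linalg.norm(p - limits[0]) for p in limits)
print(beta_old < beta < beta_new, spread)
```

Ten random starts ending at the same point is what the stated theorem predicts for this β; a single run like this illustrates the claim and is not evidence for it.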

Significance. If the central claim holds, the result enlarges the certified stability region for softmax feedback systems that arise in entropy-regularized RL, logit dynamics, population choice models, and mean-field variational updates. The dimension-free character of the threshold is practically valuable for high-dimensional settings, and the work identifies the true location of the phase transition rather than a sufficient but loose condition.

major comments (2)

[Abstract and Main Theorem] The qualifier 'sharp' is load-bearing for the central claim. While the manuscript derives a sufficient contraction condition via the bound on the softmax Jacobian, it does not supply an explicit necessity argument, a matching lower bound, or a concrete bifurcation example showing that uniqueness or global predictability fails when β‖ΠWΠ‖_{T→T} reaches or exceeds 2. Without this, the threshold remains a sufficient but not necessarily tight characterization.
[§3.2 (Jacobian Analysis)] The proof that the Euclidean operator norm of the softmax Jacobian on T is at most 1/2 is central to obtaining the constant 2. The manuscript should explicitly verify whether this bound is attained at an interior fixed point or provide the precise constant used, as any gap would alter the claimed threshold.
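The attainment question in the second comment can be probed numerically. The sketch below assumes the standard form of the softmax Jacobian, Σ(p) = diag(p) − ppᵀ, and compares random interior points with the two-coordinate distribution that a passage quoted elsewhere on this page names as the maximizer; the dimension, seed, and sample count are arbitrary.

```python
import numpy as np

def softmax_cov(p):
    """Sigma(p) = diag(p) - p p^T, the softmax Jacobian at output p."""
    return np.diag(p) - np.outer(p, p)

def spectral_norm(p):
    return np.linalg.norm(softmax_cov(p), 2)

rng = np.random.default_rng(2)                  # fixed seed, illustrative only
worst_interior = max(spectral_norm(rng.dirichlet(np.ones(5)))
                     for _ in range(1000))

two_point = np.array([0.5, 0.5, 0.0, 0.0, 0.0])  # equal mass on two coordinates
uniform = np.full(5, 0.2)                        # uniform on five coordinates

print(worst_interior)            # stays at or below 1/2
print(spectral_norm(two_point))  # approximately 1/2
print(spectral_norm(uniform))    # approximately 1/5
```

In this check the uniform distribution on five actions gives 1/5 rather than 1/2, so for more than two actions the constant 1/2 is reached on the boundary of the simplex.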

minor comments (2)

[§2 (Preliminaries)] Notation for the projection Π and the tangent space T should be introduced once with a self-contained definition before its repeated use in the main theorem statement.
[Abstract] The abstract refers to 'previously used condition' without a specific citation; adding a reference to the classical bound would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive feedback on our manuscript. We address the major comments point by point below, providing clarifications on our use of 'sharp' and the Jacobian bound. We outline revisions that will be incorporated to strengthen the presentation.

Referee: [Abstract and Main Theorem] The qualifier 'sharp' is load-bearing for the central claim. While the manuscript derives a sufficient contraction condition via the bound on the softmax Jacobian, it does not supply an explicit necessity argument, a matching lower bound, or a concrete bifurcation example showing that uniqueness or global predictability fails when β‖ΠWΠ‖_{T→T} reaches or exceeds 2. Without this, the threshold remains a sufficient but not necessarily tight characterization.

Authors: We thank the referee for this observation. The term 'sharp' in the manuscript refers specifically to the fact that the threshold β‖ΠWΠ‖_{T→T}<2 is the best constant obtainable from the contraction-mapping argument once the tight bound of 1/2 on the Euclidean operator norm of the softmax Jacobian over T is used; classical analyses employ a strictly weaker estimate and therefore certify a smaller regime. While the current version emphasizes the sufficient condition and does not contain an explicit bifurcation construction, we will add a short remark together with a low-dimensional numerical example (e.g., n=2 or n=3) demonstrating loss of uniqueness or global predictability as the threshold is approached. This addition will make the sharpness claim with respect to the Jacobian norm fully explicit. revision: partial
Referee: [§3.2 (Jacobian Analysis)] The proof that the Euclidean operator norm of the softmax Jacobian on T is at most 1/2 is central to obtaining the constant 2. The manuscript should explicitly verify whether this bound is attained at an interior fixed point or provide the precise constant used, as any gap would alter the claimed threshold.

Authors: We agree that explicit verification of attainability is useful. The analysis in §3.2 establishes that the Jacobian of softmax, when restricted to the tangent space T, is a symmetric positive-semidefinite operator whose eigenvalues lie in [0,1/2]; the upper bound 1/2 is therefore the precise constant. Equality is attained when the probability vector p places mass 1/2 on each of two coordinates. For n=2 this is the uniform vector, an interior point that is a fixed point, for instance, when b=0 and the rows of W sum to the same value; for n>2 the uniform vector gives only 1/n, and 1/2 is reached only on the boundary of the simplex. We will insert a short proposition in the revised §3.2 that states the conditions for equality and confirms that the constant 2 cannot be improved within the present contraction framework. revision: yes

Circularity Check

0 steps flagged

No circularity: direct contraction-mapping proof from softmax Jacobian bound

The manuscript derives the threshold β‖ΠWΠ‖_{T→T}<2 by combining the Euclidean operator norm of the projected linear map with the fact that the Jacobian of softmax has norm at most 1/2 on the tangent space T. This is a standard first-principles linearization argument that does not reduce any quantity to a fitted parameter, a self-referential definition, or a load-bearing self-citation. The paper explicitly contrasts the new bound with the classical over-regularized regime and presents the result as an enlargement of the certified stability region; no step equates the claimed threshold to its own inputs by construction. The derivation therefore remains self-contained against external mathematical benchmarks.
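The contraction chain described above can be written out as a short sketch. It is reconstructed from this page's description, not quoted from the paper; the symbols F and Σ are notation introduced here, and Π is assumed to be the orthogonal projector onto the sum-zero subspace T.

```latex
\[
F(p)=\operatorname{softmax}(\beta W p+b),\qquad
DF(p)=\beta\,\Sigma\bigl(F(p)\bigr)\,W,\qquad
\Sigma(q)=\operatorname{diag}(q)-qq^{\top}.
\]
% Sigma(q) annihilates the all-ones vector, so Sigma(q) = Sigma(q) Pi,
% and for u, v in the simplex the difference h = u - v lies in T, so h = Pi h.
\[
\|F(u)-F(v)\|_2
\le \beta\,\sup_{q}\|\Sigma(q)\|_2\,\|\Pi W\Pi\|_{\mathcal T\to\mathcal T}\,\|u-v\|_2
\le \frac{\beta}{2}\,\|\Pi W\Pi\|_{\mathcal T\to\mathcal T}\,\|u-v\|_2 .
\]
```

Under these assumptions the map F is a strict contraction on the simplex whenever β‖ΠWΠ‖_{T→T} < 2, and the Banach fixed-point theorem then gives a unique fixed point with globally convergent Picard iteration.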

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The result rests on standard properties of the softmax map, affine feedback, and operator norms on finite-dimensional spaces; no free parameters, new axioms, or invented entities are introduced in the abstract.

axioms (2)

domain assumption: The logit system is finite-dimensional and affine with feedback through the softmax operator.
Invoked to define the class of systems for which the sharp threshold holds.
domain assumption: The relevant stability norm is the operator norm from the tangent space T to itself after projection Π.
Used to state the dimension-free Euclidean threshold.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · echoes
sup_{p∈Δ} ‖Σ(p)‖₂ = 1/2 ... attained by a distribution supported equally on two coordinates

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_high_calibrated_iff · echoes
Σ_i v_i² / x_i ≥ 2 ‖v‖₂² for v∈T ... entropy contributes curvature at least 2

IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean · absolute_floor_iff_bare_distinguishability · refines
W_tan := ΠWΠ|_T ... payoff-shift invariance and tangent feasibility

echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
refines: Relation between the paper passage and the cited Recognition theorem.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

