Centre manifold theorem for maps along manifolds of fixed points
Pith reviewed 2026-05-10 04:00 UTC · model grok-4.3
The pith
The centre manifold theorem extends to maps along a manifold-with-boundary of fixed points.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We prove that under suitable smoothness and spectral conditions, a map with a manifold-with-boundary of fixed points admits a locally invariant centre manifold tangent to the centre bundle along the manifold. This manifold is used to reduce the dynamics near the fixed-point set. The theorem is applied to study the behavior of large-step-size gradient descent iterates in two-layer matrix factorization.
What carries the argument
The centre manifold tangent to the centre directions along the manifold-with-boundary of fixed points, which remains invariant under the map and captures the non-hyperbolic dynamics.
Load-bearing premise
The map is sufficiently smooth and the linearization along the fixed-point manifold has a spectral gap that cleanly separates centre, stable, and unstable directions.
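One common way to make a premise of this kind precise is sketched below. The notation ($F$, $M$, $E^{s}_p$, $E^{c}_p$, $E^{u}_p$, and the constants $a$, $b$) is assumed here for illustration and is not taken from the paper, whose exact hypotheses may differ.

```latex
% Sketch of a standard spectral-splitting hypothesis (notation assumed, not the paper's).
% F : \mathbb{R}^n \to \mathbb{R}^n is C^1 and F(p) = p for every p \in M.
\[
  \mathbb{R}^n = E^{s}_p \oplus E^{c}_p \oplus E^{u}_p,
  \qquad DF(p)\,E^{\ast}_p \subseteq E^{\ast}_p \quad (\ast \in \{s, c, u\}),
\]
\[
  \sigma\bigl(DF(p)|_{E^{s}_p}\bigr) \subset \{\,|z| \le a\,\}, \quad
  \sigma\bigl(DF(p)|_{E^{c}_p}\bigr) \subset \{\,|z| = 1\,\}, \quad
  \sigma\bigl(DF(p)|_{E^{u}_p}\bigr) \subset \{\,|z| \ge b\,\},
  \qquad a < 1 < b .
\]
% Differentiating F(p) = p along M gives DF(p)v = v for every v \in T_p M, hence
\[
  T_p M \subseteq \ker\bigl(DF(p) - I\bigr) \subseteq E^{c}_p .
\]
```

In this formulation the gap is uniform when $a$ and $b$ can be chosen independently of $p \in M$, including for $p$ near the boundary of $M$.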
What would settle it
A concrete C^1 map possessing a manifold-with-boundary of fixed points for which no locally invariant centre manifold exists near a boundary point.
Original abstract
We prove a centre manifold theorem for a map along a manifold-with-boundary of fixed points, and provide an application to the study of gradient descent with large step size on two-layer matrix factorisation problems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves a centre manifold theorem for a map along a manifold-with-boundary of fixed points and applies the result to analyse gradient descent with large step sizes on two-layer matrix factorisation problems.
Significance. If the theorem holds under the stated conditions, the result supplies a reduction tool for non-hyperbolic dynamics near continua of equilibria that include boundaries; the matrix-factorisation application could then yield concrete statements about the long-term behaviour of large-step GD near rank-deficient loci.
major comments (2)
- [§3] §3 (Statement of the main theorem): the required spectral-gap and invariance hypotheses are stated only locally in the interior of the manifold-with-boundary; no uniform control near the boundary is proved or assumed, yet the application in §5 sends trajectories arbitrarily close to that boundary.
- [§5] §5 (GD application): the linearisation of the large-step GD map at points of the fixed-point manifold is never computed explicitly, so it is impossible to verify that the centre spectrum remains on the unit circle while the stable/unstable parts satisfy a uniform gap as rank deficiency is approached.
minor comments (2)
- [§2] The definition of the centre bundle in §2 is given only in local coordinates; a coordinate-free formulation would clarify the statement for readers.
- [References] Several standard references on centre-manifold theorems for maps (e.g., Vanderbauwhede 1989, Carr 1981) are omitted from the bibliography.
Simulated Author's Rebuttal
We thank the referee for their thorough review and valuable feedback on our manuscript. We address each of the major comments below and outline the revisions we plan to make to strengthen the paper.
Point-by-point responses
- Referee: [§3] §3 (Statement of the main theorem): the required spectral-gap and invariance hypotheses are stated only locally in the interior of the manifold-with-boundary; no uniform control near the boundary is proved or assumed, yet the application in §5 sends trajectories arbitrarily close to that boundary.
  Authors: We acknowledge that the spectral gap and invariance conditions in Theorem 3.1 are formulated in a pointwise manner for points in the interior of the manifold-with-boundary. The theorem itself is local in nature, constructing the center manifold in a neighborhood of each point. However, to ensure the application in §5 is rigorous, where trajectories can approach the boundary, we will add a uniform spectral gap assumption near the boundary and prove that it holds for the specific gradient descent map under consideration. This will be incorporated as an additional hypothesis in the theorem statement and verified in the application section.
  Revision: yes
- Referee: [§5] §5 (GD application): the linearisation of the large-step GD map at points of the fixed-point manifold is never computed explicitly, so it is impossible to verify that the centre spectrum remains on the unit circle while the stable/unstable parts satisfy a uniform gap as rank deficiency is approached.
  Authors: In §5, the analysis of the linearization relies on the algebraic structure of the two-layer matrix factorization and the form of the gradient descent update, allowing us to determine the spectrum without a full explicit matrix representation at every point. Nevertheless, to facilitate verification, we will include an explicit computation of the Jacobian in a new subsection or appendix, demonstrating that the center eigenvalues lie on the unit circle and that the spectral gap condition holds uniformly as the rank deficiency parameter varies and approaches the boundary. This explicit calculation will confirm the applicability of the center manifold theorem.
  Revision: yes
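The kind of explicit Jacobian computation discussed in this exchange can be illustrated on a scalar toy problem. The sketch below is not the paper's setting: the loss L(u, v) = 0.5 * (u*v - 1)^2, the step size, and the function names `gd_map`, `jacobian`, and `spectrum_on_fixed_curve` are all assumptions chosen for illustration. On this toy problem the fixed points of the gradient descent map form the curve u*v = 1, the Jacobian there has eigenvalue 1 along the curve, and the normal eigenvalue is 1 - eta * (u^2 + v^2).

```python
import numpy as np

def gd_map(w, eta):
    """One gradient descent step on the toy loss L(u, v) = 0.5 * (u*v - 1)**2."""
    u, v = w
    r = u * v - 1.0  # residual; zero exactly on the fixed-point curve u*v = 1
    return np.array([u - eta * r * v, v - eta * r * u])

def jacobian(w, eta):
    """Analytic Jacobian of gd_map at w (symmetric for this toy loss)."""
    u, v = w
    off_diag = -eta * (2.0 * u * v - 1.0)
    return np.array([
        [1.0 - eta * v * v, off_diag],
        [off_diag, 1.0 - eta * u * u],
    ])

def spectrum_on_fixed_curve(u, eta):
    """Eigenvalues, in ascending order, of the Jacobian at the fixed point (u, 1/u)."""
    w = np.array([u, 1.0 / u])
    return np.linalg.eigvalsh(jacobian(w, eta))

if __name__ == "__main__":
    eta = 0.5  # hypothetical step size
    # Three fixed points: normally stable, on the stability edge, normally unstable.
    for u in (1.0, np.sqrt(2.0 + np.sqrt(3.0)), 2.0):
        lam_normal, lam_tangent = spectrum_on_fixed_curve(u, eta)
        print(f"u={u:.4f}  normal={lam_normal:+.4f}  tangent={lam_tangent:+.4f}")
```

In this toy example the normally stable fixed points are those with u^2 + v^2 < 2/eta, and the normal eigenvalue reaches -1 where u^2 + v^2 = 2/eta, so the closed stable portion of the curve is a one-dimensional manifold with boundary. Whether the boundary in the paper arises in the same way is not something this sketch establishes.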
Circularity Check
No circularity: standard proof of extended centre manifold theorem from stated assumptions
full rationale
The paper claims to prove a centre manifold theorem for maps along a manifold-with-boundary of fixed points, plus an application to gradient descent on matrix factorisation. No equations, fitted parameters, or 'predictions' appear in the abstract or reader's summary. The derivation is a mathematical proof deriving invariance and tangency properties from hyperbolicity, spectral gap, and smoothness hypotheses on the fixed-point manifold. These are external assumptions, not self-defined or fitted to the target result. No self-citation load-bearing steps, ansatz smuggling, or renaming of known results are indicated. The application is presented as an illustration rather than a fitted prediction. The result is therefore self-contained against external benchmarks and does not reduce to its inputs by construction.
Reference graph
Works this paper leans on
- [1] Atish Agarwala, Fabian Pedregosa, and Jeffrey Pennington. Second-order regression models exhibit progressive sharpening to the edge of stability. In ICML, 2023.
- [2] K. Ahn, J. Zhang, and S. Sra. Understanding the unstable convergence of gradient descent. In ICML, 2022.
- [3]
- [4] F. Behr and G. Dolzmann. A note on Clarke’s Generalized Jacobian for the Inverse of Bi-Lipschitz Maps. Journal of Optimization Theory and Applications, 200:852–857, 2024.
- [5] C. Bonatti and S. Crovisier. Center manifolds for partially hyperbolic set without strong unstable connections. J. Inst. Math. Jussieu, 15:785–828, 2016.
- [6] Y. Cai, J. Wu, S. Mei, M. Lindsey, and P. L. Bartlett. Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization. In NeurIPS, 2024.
- [7] Lei Chen and Joan Bruna. Beyond the Edge of Stability via Two-step Gradient Updates. In ICML, 2023.
- [8] S.-N. Chow, W. Liu, and Y. Yi. Center Manifolds for Invariant Sets. Journal of Differential Equations, 168:355–385, 2000.
- [9] S.-N. Chow, W. Liu, and Y. Yi. Center manifolds for smooth invariant manifolds. Trans. Amer. Math. Soc., 352(11):5179–5211, 2000.
- [10] F. H. Clarke. On the inverse function theorem. Pacific Journal of Mathematics, 64(1):97–102, 1976.
- [11] Frank H. Clarke. Optimization and Nonsmooth Analysis. Society for Industrial and Applied Mathematics, 1990.
- [12]
- [13]
- [14] Dayal Singh Kalra, Tianyu He, and Maissam Barkeshli. Universal Sharpness Dynamics in Neural Network Training: Fixed Point Analysis, Edge of Stability, and Route to Chaos. In ICLR, 2025.
- [15] E. De Faria and P. Hazard. Generalized Whitney topologies are Baire. Proceedings of the American Mathematical Society, 148(12):5441–5455, 2020.
- [16]
- [17]
- [18] Avrajit Ghosh, Soo Min Kwon, Rongrong Wang, Saiprasad Ravishankar, and Qing Qu. Learning dynamics of deep matrix factorization beyond the edge of stability. In ICLR, 2025.
- [19]
- [20] M. W. Hirsch. Differential Topology. Springer, 1976.
- [21] M. W. Hirsch, C. C. Pugh, and M. Shub. Invariant Manifolds. Springer, 1977.
- [22] A. Kelley. The Stable, Center-Stable, Center, Center-Unstable, Unstable Manifolds. Journal of Differential Equations, 3:546–570, 1967.
- [23] N. S. Keskar, D. Mudigere, J. Nocedal, M. Smelyanskiy, and P. T. P. Tang. On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima. In ICLR, 2017.
- [24] Itai Kreisler, Mor Shpigel Nacson, Daniel Soudry, and Yair Carmon. Gradient descent monotonically decreases the sharpness of gradient flow solutions in scalar networks and beyond. In ICML, 2023.
- [25]
- [26] Liming Liu, Zixuan Zhang, Simon Du, and Tuo Zhao. A minimalist example of edge-of-stability and progressive sharpening, 2025.
- [27] L. E. MacDonald, H. Min, L. Palma, S. Tarmoun, Z. Xu, and R. Vidal. Convergence Rates for Gradient Descent on the Edge of Stability for Overparametrised Least Squares. In NeurIPS, 2025.
- [28] V. Pliss. A reduction principle in the theory of stability of motion. Izv. Akad. Nauk SSSR Ser. Mat., 28:1297–1324, 1964.
- [29] B. Sandstede and T. Theerakarn. Regularity of Center Manifolds via the Graph Transform. Journal of Dynamics and Differential Equations, 27:989–1006, 2015.
- [30] Yuqing Wang, Minshuo Chen, Tuo Zhao, and Molei Tao. Large Learning Rate Tames Homogeneity: Convergence and Balancing Effect. In ICLR, 2022.
- [31] Yuqing Wang, Zhenghao Xu, Tuo Zhao, and Molei Tao. Good regularity creates large learning rate implicit biases: edge of stability, balancing, and catapult. In NeurIPS 2023 Workshop on Mathematics of Modern Machine Learning, 2023.
- [32] Z. Wang, Z. Li, and J. Li. Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability. In NeurIPS, 2022.
- [33] J. Wu, P. L. Bartlett, M. Telgarsky, and B. Yu. Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency. In COLT, 2024.
- [34] J. Wu, V. Braverman, and J. Lee. Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability. In NeurIPS, 2023.
- [35] X. Zhu, Z. Wang, X. Wang, M. Zhou, and R. Ge. Understanding Edge-of-Stability Training Dynamics with a Minimalist Example. In ICLR, 2023.