Recognition: 2 theorem links
· Lean TheoremDiscrete Double-Bracket Flows for Isotropic-Noise Invariant Eigendecomposition
Pith reviewed 2026-05-15 22:07 UTC · model grok-4.3
The pith
A discrete double-bracket flow on SO(n) performs eigendecomposition whose trajectory and step size depend only on the trace-free signal.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the streaming model C_k equals C_sig plus sigma_k squared I plus E_k, the discrete double-bracket iteration produces a trajectory on SO(n) whose evolution, associated Lyapunov function, and maximal stable step size eta_max equals 1 over L_C are determined exclusively by the trace-free signal C_e. The same generator yields input-to-state stability with a ball controlled solely by trace-free perturbations and guarantees global convergence to the eigendecomposition by strict-saddle geometry together with a discrete Lojasiewicz inequality.
What carries the argument
The skew-symmetric generator Omega equals [A, diag(A)] in the tangent Lie algebra so(n), whose antisymmetry forces every scalar multiple of the identity to vanish identically.
If this is right
- Maximal stable step size and contraction rates stay fixed as the isotropic noise floor rises.
- Input-to-state stability ball size is controlled only by the trace-free component E_k.
- Global convergence to eigenvectors holds for any time-varying but isotropic noise level.
- The same generator extends directly to top-k tracking on the Stiefel manifold at linear cost in k.
Where Pith is reading between the lines
- The invariance could let the algorithm run in sensor streams where a large common-mode offset varies over time without retuning.
- Similar cancellations might be engineered on other matrix Lie groups whose center contains the identity.
- Numerical checks with deliberately time-varying sigma squared would confirm that the discrete path itself never feels the isotropic level.
Load-bearing premise
The perturbation must be exactly a scalar multiple of the identity plus a small trace-free matrix so the algebraic cancellation in the generator remains exact.
What would settle it
Run the iteration on two observation sequences that share the same trace-free signal but differ by a constant isotropic offset and check whether the computed eigenvectors or the sequence of step sizes remain identical.
Figures
read the original abstract
We study eigendecomposition on $SO(n)$ under streaming observations $C_k = C_{\mathrm{sig}} + \sigma_k^2 I + E_k$, where the isotropic background $\sigma_k^2 I$ may be time-varying and arbitrarily large. Standard algorithms couple their stability to $\lVert C_k \rVert_2 \approx \sigma^2$, forcing step sizes, contraction rates, and iteration counts to degrade with the noise floor. We observe that $\sigma^2 I$ lies in the center of the matrix algebra and therefore *should never enter* the eigenspace dynamics. We construct a discrete double-bracket flow whose skew-symmetric generator $\Omega = [A, \operatorname{diag}(A)]$ operates in the tangent Lie algebra $\mathfrak{so}(n)$, where scalar multiples of the identity vanish by antisymmetry. The resulting trajectory, Lyapunov function, and maximal stable step size $\eta_{\max} = 1/L_C$ depend exclusively on the trace-free signal $C_e$ -- achieving pointwise, pathwise $\sigma^2$-invariance. We establish input-to-state stability with a noise ball governed solely by trace-free perturbations, prove global convergence via strict-saddle geometry and a discrete {\L}ojasiewicz argument, and extend the framework to top-$k$ eigentracking on the Stiefel manifold $\operatorname{St}(k,n)$ at cost $k$ matrix-vector products per step.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes a discrete double-bracket flow on SO(n) for eigendecomposition under streaming covariances C_k = C_sig + σ_k² I + E_k. The skew-symmetric generator is defined as Ω = [A, diag(A)] with A = Q^T C_k Q; because scalar matrices commute to zero under the Lie bracket, the resulting trajectory, off-diagonal Frobenius Lyapunov function, and maximal stable step size η_max = 1/L_C depend only on the trace-free component C_e. The authors assert global convergence from strict-saddle geometry together with a discrete Łojasiewicz inequality, input-to-state stability with respect to trace-free perturbations alone, and an extension to top-k tracking on the Stiefel manifold St(k,n).
Significance. If the algebraic invariance and the accompanying stability and convergence arguments hold, the result supplies a principled way to perform streaming eigendecomposition whose iteration complexity and step-size selection are independent of the isotropic noise floor. This property is directly useful for online PCA and subspace tracking in high-noise regimes common in machine learning. The construction is parameter-free once the trace-free part is isolated, and the discrete formulation with an explicit Lipschitz-based step-size bound is a practical asset.
major comments (2)
- [discrete update rule and discrete Łojasiewicz section] The central invariance claim rests on the identity [σ² I, ·] ≡ 0. The manuscript must verify that this cancellation survives the discrete update exactly (i.e., that the one-step map on SO(n) remains a function of C_e alone) and that the discrete Łojasiewicz inequality inherits the same independence; otherwise the claimed pathwise σ²-invariance is only approximate.
- [input-to-state stability theorem] Input-to-state stability is stated with a noise ball governed solely by the trace-free part of E_k. The explicit radius of this ball and the associated ISS gain must be derived and shown to be independent of σ_k²; the current sketch leaves the quantitative bound implicit.
minor comments (2)
- [introduction] The notation C_e for the trace-free signal should be introduced with an explicit definition (C_e := C_k - (tr(C_k)/n) I) at first use rather than only in the abstract.
- [Stiefel extension paragraph] The extension to the Stiefel manifold is stated to cost k matrix-vector products per step; a short complexity comparison with existing top-k methods would help readers assess the practical overhead.
Simulated Author's Rebuttal
We thank the referee for the positive evaluation and the recommendation for minor revision. The two major comments concern the exactness of the invariance under discretization and the explicit quantitative bounds in the ISS result. We address each point below and will incorporate the requested clarifications.
read point-by-point responses
-
Referee: [discrete update rule and discrete Łojasiewicz section] The central invariance claim rests on the identity [σ² I, ·] ≡ 0. The manuscript must verify that this cancellation survives the discrete update exactly (i.e., that the one-step map on SO(n) remains a function of C_e alone) and that the discrete Łojasiewicz inequality inherits the same independence; otherwise the claimed pathwise σ²-invariance is only approximate.
Authors: We agree that an explicit verification is useful. Because Ω_k = [A_k, diag(A_k)] with A_k = Q_k^T C_k Q_k and [σ_k² I, M] = 0 for any M, the generator Ω_k is identical to the generator computed from the trace-free part C_e alone. Consequently the one-step map Q_{k+1} = Q_k exp(η Ω_k) depends only on C_e. The discrete Lyapunov function is defined directly on the trace-free component, V(Q) = ||Q^T C_e Q - diag(Q^T C_e Q)||_F², and the discrete Łojasiewicz inequality is proved using the Lipschitz constant L_C of the map induced by C_e; both the inequality and its constants are therefore independent of σ_k². We will insert a short paragraph after the definition of the discrete flow that records this exact cancellation and the resulting independence of the Łojasiewicz constants. revision: yes
-
Referee: [input-to-state stability theorem] Input-to-state stability is stated with a noise ball governed solely by the trace-free part of E_k. The explicit radius of this ball and the associated ISS gain must be derived and shown to be independent of σ_k²; the current sketch leaves the quantitative bound implicit.
Authors: We accept the request for explicit constants. The ISS statement is with respect to the trace-free perturbation sequence {E_k}. Because both the vector field and the Lyapunov function are insensitive to the isotropic term, the radius of the ultimate bound is of the form r = γ · sup_k ||E_k||_F where the gain γ depends only on the step-size η and the Lipschitz constant L_C of the trace-free map; it is independent of σ_k². We will add the explicit expression for γ (γ = η / (1 - η L_C)) together with the corresponding ball radius in the revised ISS theorem statement and proof. revision: yes
Circularity Check
No significant circularity identified
full rationale
The derivation chain relies on the algebraic identity [σ²I, X] ≡ 0 for any matrix X, which is a standard property of the matrix commutator and the center of the Lie algebra. This directly implies that the generator Ω = [A, diag(A)], the discrete update, the off-diagonal Frobenius Lyapunov function, and the Lipschitz bound L_C are all functions of the trace-free part C_e alone. No parameters are fitted to data and then relabeled as predictions, no uniqueness theorems are imported from self-citations, and no ansatz is smuggled in. The input-to-state stability statement is likewise a direct consequence of the same commutator cancellation rather than a self-referential construction. The paper is therefore self-contained against external mathematical benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption σ² I lies in the center of the matrix algebra and produces zero under the skew-symmetric commutator [A, diag(A)]
- domain assumption The discrete flow admits a strict-saddle geometry and satisfies a discrete Lojasiewicz inequality
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.leanwashburn_uniqueness_aczel matches?
matchesMATCHES: this paper passage directly uses, restates, or depends on the cited Recognition theorem or module.
Ω(M):=[A(M),D(M)]=AD−DA ... [αI,·]≡0 ... Ω_{C+αI}(M)=Ω_C(M) ... trajectory {Mk} ... depends exclusively on the trace-free signal Ce
-
IndisputableMonolith/Foundation/LogicAsFunctionalEquation.leanTranslation Theorem / J-uniqueness corollary echoes?
echoesECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
any generator achieving such invariance must align with the commutator Ω=[A,diag(A)] ... dissipation channel condition
What do these tags mean?
- matches
- The paper's claim is directly supported by a theorem in the formal canon.
- supports
- The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends
- The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses
- The paper appears to rely on the theorem as machinery.
- contradicts
- The paper's claim conflicts with a theorem or certificate in the canon.
- unclear
- Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Yuan, J., Xie, F., Nie, F., and Li, X
URL https://openreview.net/forum? id=RURIyF9Vuu. Yuan, J., Xie, F., Nie, F., and Li, X. Riemannian optimiza- tion on relaxed indicator matrix manifold. InThe Four- teenth International Conference on Learning Represen- tations, 2026. URL https://openreview.net/ forum?id=ERJd7dMN6U. Zhang, Y ., Congliang, C., Li, Z., Ding, T., Wu, C., Bei, Y ., and Wang, Z....
-
[2]
The entrywise condition follows fromΩ ij = (Ajj −A ii)Aij
SinceMis invertible, this holds iffΩ(M) = 0. The entrywise condition follows fromΩ ij = (Ajj −A ii)Aij. Lemma C.9(Critical Point Dichotomy).At any critical pointMoff: (a) Global Minimum:If off(A) = 0 , then f(M) = 0 and M maps eigenvectors of Ce to the standard basis (up to permutation and sign). (b)Degenerate Block:Ifoff(A)̸= 0, then there exist indicesi...
work page 1991
-
[3]
Our flow uses D= diag(A(M)) , which varies with state
assumes ˙X= [X,[X, N]] withfixed N. Our flow uses D= diag(A(M)) , which varies with state. This prevents direct application of Brockett’s Morse–Bott framework. However, our convergence proof (Theorem C.24) relies only on: (i) Łojasiewicz inequality for real-analytic gradient flows (convergence to single critical point); (ii) Strict saddle property (negati...
work page 1991
-
[4]
and taking conditional expectation: E[fk+1 | F k]≤f k −η k∥Ωk∥2 F + LCη2 k 2 (∥Ωk∥2 F +σ 2 u).(81) Whenη k is small enough that1−L Cηk/2>0, applying the local PL condition∥Ω k∥2 F ≥2δ 2fk: E[fk+1 | F k]≤(1−2δ 2ηk)fk + LCη2 kσ2 u 2 . Step 2 (Recursive solution):Substitute ηk =c/(k+k 0) and use mathematical induction to prove E[fk]≤C/(k+k 0). The key is cho...
work page 2018
-
[5]
Euclidean SGD with Riemannian projection
requires m≥ 2∥C∥2 F n2ϵ2ζ ≥ 2σ4 nϵ2ζ (120) MVP queries per iteration to achieve|ˆτ−τ| ≤ϵwith probability≥1−ζ. (iii) Algebraic filtering (commutator):The commutator generator ΩC(M) = [M ⊤CM,diag(M ⊤CM)] satisfies ΩC+αI = ΩC, achieving exactσ 2-elimination with0additional MVP cost. Proof. Part (i):By Lemmas F.16–F.17 below, non-filtering baselines exhibit s...
work page 2013
-
[6]
(146) G.2.3. LIPSCHITZBOUND ONΩ(M) Step 1: Lipschitz bound onA: A(M)−A(N) =M ⊤Ce(M−N) + (M ⊤ −N ⊤)CeN.(147) 56 Discrete Double-Bracket Flows Using∥M∥ 2 =∥N∥ 2 = 1: ∥A(M)−A(N)∥ F ≤2∥C e∥2∥M−N∥ F . Step 2: Lipschitz bound onD: ∥D(M)−D(N)∥ F ≤ ∥A(M)−A(N)∥ F ≤2∥C e∥2∥M−N∥ F . Step 3: Lipschitz bound onΩ: Ω(M)−Ω(N) = [A(M)−A(N), D(M)] + [A(N), D(M)−D(N)].(148)...
work page 2018
-
[7]
FixC sig,∆,M 0 with deterministic seed (seed=42)
-
[8]
For eachσ 2 ∈ {0,10 3,10 6}: run Cayley method withuse trace free=True
-
[9]
Compute pairwise trajectory differencesmax k ∥M(σ1) k −M (σ2) k ∥F
-
[10]
Run baselines (Orthogonal Iteration, Oja) for comparison. 61 Discrete Double-Bracket Flows Table 3.E1: Trajectory differences for Cayley method (n= 20,K= 5000). Pair(σ 2 1 vsσ 2 2)Max∥∆M k∥F Mean∥∆M k∥F 0vs10 3 2.1×10 −12 1.8×10 −12 0vs10 6 3.5×10 −12 2.9×10 −12 103 vs10 6 2.8×10 −12 2.4×10 −12 /uni00000013/uni00000014/uni00000013/uni00000013/uni00000013/...
-
[11]
Run each method to convergence (f <10 −6) or untilK max = 30000
-
[12]
Record iterations to convergence and convergence success rate
-
[13]
Results.Table 5 presents the mean iterations to convergence
Repeat with 5 random seeds. Results.Table 5 presents the mean iterations to convergence. Table 5.E14: Iterations to convergence vsσ 2 (n= 10, 5 seeds). σ2 Cayley Riem-Polar Riem-QR tf-Oja Raw Oja 0 1594 1594 1600 1880 9372 1 1594 1594 1600 1880 11317 5 1594 1594 1600 1880 20939 10 1594 1594 1600 1880 FAIL 100 1594 1594 1600 1880 FAIL 1000 1594 1594 1600 1...
-
[14]
For each(ε E, σ2)pair withε E ∈ {0.1,0.2,0.5,1.0,2.0}andσ 2 ∈ {0,10 3,10 6}:
-
[15]
trace-free noiseE k at each step
RunK= 5000iterations with i.i.d. trace-free noiseE k at each step
-
[16]
Compute steady-state estimatelim sup p f(M k)from last 10% of trajectory
-
[17]
Results.Table 6 shows linear regression fits forlim sup √fvsε E
Repeat with 10 seeds per configuration. Results.Table 6 shows linear regression fits forlim sup √fvsε E. Table 6.E3: Linear regression of steady-state error vs noise amplitude (n= 20). σ2 Slope InterceptR 2 0 0.206−0.0010.998 103 0.207−0.0020.997 106 0.205−0.0010.998 Conclusion:The slopes are nearly identical across σ2 values (coefficient of variation <1%...
-
[18]
Run deterministic Cayley flow (no noise) with fixed step size
-
[19]
Count non-monotone events:#{k:f(M k+1)> f(M k)(1 + 10−6)}. 65 Discrete Double-Bracket Flows /uni00000014/uni00000013/uni00000016 /uni00000014/uni00000013/uni00000017 /uni0000002c/uni00000057/uni00000048/uni00000055/uni00000044/uni00000057/uni0000004c/uni00000052/uni00000051/uni00000003k /uni00000014/uni00000013/uni00000016 /uni00000014/uni00000013/uni0000...
-
[20]
Results.Withn= 20and∥C e∥2 ≈0.95, theoreticalη max ≈0.03
Compute monotone fraction across 10 seeds. Results.Withn= 20and∥C e∥2 ≈0.95, theoreticalη max ≈0.03. Table 7.E5: Monotonicity vs step size (n= 20,K= 1000, 10 seeds). ηMonotone Fraction Mean Non-Monotone Count 0.001 1.00 0 0.005 1.00 0 0.01 1.00 0 0.02 1.00 0 0.03 0.90 2.3 0.05 0.00 47.8 Conclusion:Sharp phase transition nearη= 0.03matches theoretical pred...
-
[21]
66 Discrete Double-Bracket Flows
SampleM 0 ∼Haar(SO(n))for 100 independent seeds. 66 Discrete Double-Bracket Flows
-
[22]
Run Cayley iteration untilf(M k)<10 −8 orK= 10000
-
[23]
Record convergence rate and iteration count distribution. Results (n= 10,K= 5000, 50 seeds): •Convergence rate:100% (50/50 trajectories reachedf <10 −8) •Mean iterations:2847±412 •Max finalf:3.2×10 −10 Conclusion:No trajectory was trapped at a saddle point, consistent with the measure-zero stable manifold of strict saddles. H.4. Domain Stability H.4.1. NO...
-
[24]
Initialize near eigenbasis:M 0 =Q true ·Cay(εK)with small skew-symmetricK, soδ(M 0)≈g
-
[25]
For noise levelsε E ∈ {0.01,0.02,0.05,0.1,0.2,0.5}:
-
[26]
RunK= 5000iterations with bounded trace-free noise
-
[27]
Table 8.E9: Escape probability vs noise level (20 seeds per level)
Trackδ(M k)and detect escape events (δ < g/4). Table 8.E9: Escape probability vs noise level (20 seeds per level). Noiseε E P(escape) Meanδ min Mean finalδ 0.01 0.000.0996±0.0001 0.0999±0.0001 0.02 0.000.0991±0.0002 0.0999±0.0001 0.05 0.000.0982±0.0005 0.0998±0.0002 0.1 0.050.0721±0.0183 0.0987±0.0051 0.2 0.350.0412±0.0287 0.0891±0.0234 0.5 0.950.0089±0.0...
-
[28]
SampleM 0 ∼Haar(SO(n))for 20 seeds
-
[29]
Track first iterationT enter wheref(M k)≤f enter = (g−δ )2/8
-
[30]
Compare to theoretical bound. 67 Discrete Double-Bracket Flows /uni00000013/uni00000014/uni00000013/uni00000015/uni00000013/uni00000016/uni00000013/uni00000017/uni00000013/uni00000018/uni00000013 /uni0000002c/uni00000057/uni00000048/uni00000055/uni00000044/uni00000057/uni0000004c/uni00000052/uni00000051/uni00000003k /uni00000013/uni00000011/uni00000013/un...
-
[31]
Fixn= 20,g= 1, convergence thresholdf(M k)<10 −6
-
[32]
Sweepσ 2 ∈ {0,1,10,100,500,1000}
-
[33]
Measure wall-clock time and iteration count for each method
-
[34]
Declare “FAIL” if convergence not reached within10 6 iterations. Table 11.Wall-clock time (seconds) across methods. “—” indicates failure (>10 6 iterations). Methodσ 2 = 0 1 10 100 500 1000 Cayley-Neumann 2.3 2.3 2.3 2.3 2.3 2.3 Riem-QR 2.4 2.4 2.4 2.4 2.4 2.4 Riem-Polar 2.5 2.5 2.5 2.5 2.5 2.5 Raw Oja 0.18 0.52 — — — — Subspace Iter 0.21 0.89 4.7 — — — E...
-
[35]
SampleM 0 ∼Haar(SO(n))forn∈ {5,10,20,50}
-
[36]
Computeδ(M 0)/gwhereg= 1is the true spectral gap
-
[37]
Table 14.Haar initialization:δ(M 0)/gstatistics
Repeat 1000 times per dimension; estimate mean and 5th percentile. Table 14.Haar initialization:δ(M 0)/gstatistics. nE[δ(M 0)/g]5th Percentile 5 0.031 0.008 10 0.016 0.004 20 0.008 0.002 50 0.003 0.0007 Results. Interpretation.The ratio E[δ(M0)/g]≈O(1/n) confirms that random initialization places M0 far from eigenbasis (small δ). Combined with E11 (100% g...
-
[38]
Forn∈ {5,10,20}, sample 1000 pairs of nearby matricesM, M ′ ∈SO(n)with∥M−M ′∥F =ε= 10 −4
-
[39]
EstimateL est = max∥∇f(M)− ∇f(M ′)∥F /∥M−M ′∥F
-
[40]
Table 15.Lipschitz constant: empirical vs
Compare to theoretical boundL bound =c n∥Ce∥2 2. Table 15.Lipschitz constant: empirical vs. theoretical bound. n L est Lbound Ratio 5 18.3 494 0.037 10 42.7 1421 0.030 20 127 3782 0.034 Results. Interpretation.The theoretical bound is conservative (ratio ≈3% ), but crucially it remainsvalidacross all tested dimensions. The conservativeness arises from wor...
-
[41]
SampleCfrom symmetric distribution (Wishart) andM∼Haar(SO(n))
-
[42]
Compute Oja directionG Oja =CMand commutator directionG Ω =MΩwhereΩ = [A, D]
-
[43]
Computecosθ=⟨G Oja, GΩ⟩F /(∥GOja∥F ∥GΩ∥F )
-
[44]
Repeat 10000 times; estimateP(cosθ <0). Table 16.Direction alignment:P(cosθ <0)across dimensions. n P(cosθ <0)95% CI 5 0.498[0.488,0.508] 10 0.501[0.491,0.511] 20 0.499[0.489,0.509] Results. Interpretation.The probability is statistically indistinguishable from 0.5 (p >0.8 for all n), confirming that the commutator is not a “better Oja” but a fundamentall...
work page 2020
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.