Recognition: no theorem link
A Novel Schur-Decomposition-Based Weight Projection Method for Stable State-Space Neural-Network Architectures
Pith reviewed 2026-05-15 01:33 UTC · model grok-4.3
The pith
Projecting the quasi-triangular factor from the real Schur decomposition of the state matrix onto its nearest stable peer keeps discrete-time state-space neural-network layers asymptotically stable during training.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a backpropagation-compatible projection based on the real Schur decomposition of the state matrix, applied by mapping its quasi-triangular factor onto the nearest stable matrix, guarantees asymptotic stability in discrete-time state-space neural-network architectures while using minimal extra weights and preserving model accuracy comparable to state-of-the-art methods.
What carries the argument
Real Schur decomposition of the state matrix, with dynamic projection of its quasi-triangular factor onto the nearest stable peer.
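A minimal sketch of this machinery, assuming SciPy's `scipy.linalg.schur`. The uniform shrinkage of the quasi-triangular factor used here is a simplified stand-in for the paper's nearest-stable projection; it only illustrates the Schur-then-project pattern, not the optimal step:

```python
# Hedged sketch: NOT the paper's exact "nearest stable peer" map. It shows the
# mechanism the review describes: take the real Schur form A = Z T Z^T, shrink
# the quasi-triangular factor T until its spectral radius is below 1, reassemble.
import numpy as np
from scipy.linalg import schur

def project_stable(A: np.ndarray, margin: float = 1e-6) -> np.ndarray:
    """Return a matrix close to A whose spectral radius is strictly below 1."""
    T, Z = schur(A, output="real")      # A = Z @ T @ Z.T, Z orthogonal
    rho = np.max(np.abs(np.linalg.eigvals(T)))
    if rho >= 1.0 - margin:
        T = T * (1.0 - margin) / rho    # shrink all eigenvalues into the unit disk
    return Z @ T @ Z.T

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))         # almost surely unstable at this scale
A_proj = project_stable(A)
rho_proj = np.max(np.abs(np.linalg.eigvals(A_proj)))
print(rho_proj < 1.0)                   # True: the projected state matrix is stable
```

Because Z is orthogonal, rescaling T rescales the eigenvalues of A directly, which is why operating on the Schur factor suffices to control stability.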
If this is right
- Accuracy and convergence rates on synthetic linear systems remain comparable to state-of-the-art stable-system identification techniques.
- Fewer total weights allow faster convergence when the stable layers are stacked with static nonlinearities on real-world data.
- The projection supplies a numerically robust way to identify complex dynamics while satisfying strict asymptotic-stability constraints.
- Only a marginal increase in per-iteration computational cost is required.
Where Pith is reading between the lines
- The method could support reliable long-horizon forecasting in control or time-series tasks by removing the need for post-training stability fixes.
- Extending the same projection idea to continuous-time state matrices would widen its applicability to physical-system modeling.
- Embedding the projection inside larger recurrent architectures might reduce dependence on auxiliary regularization for stability.
Load-bearing premise
Repeatedly projecting the state matrix during training does not distort the optimization landscape or introduce bias that stops the model from reaching accurate solutions on real data.
What would settle it
Training runs on real-world datasets where the projected models reach visibly lower accuracy or fail to converge compared with otherwise identical non-projected models, or where long-horizon simulations of the learned dynamics still exhibit growing trajectories.
Original abstract
Building black-box models for dynamical systems from data is a challenging problem in machine learning, especially when asymptotic stability guarantees are required. In this paper, we introduce a novel stability-ensuring and backpropagation-compatible projection scheme based on the Schur decomposition for the state matrix of linear discrete-time state-space layers, as well as an alternative pre-factorized formulation of the methodology. The proposed methods dynamically project the quasi-triangular factor of the state matrix's real Schur decomposition onto its nearest stable peer, ensuring stable dynamics with minimal overparameterization. Experiments on synthetic linear systems demonstrate that the method achieves accuracy and convergence rates comparable to those of state-of-the-art stable-system identification techniques, despite a marginal increase in computational complexity. Furthermore, the lower weight count facilitates convergence during training without sacrificing accuracy in stacked neural-network architectures with static nonlinearities targeting real-world datasets. These results suggest that the Schur-based projection provides a numerically robust framework for identifying complex dynamics on par with the State of the Art while satisfying strict asymptotic-stability requirements.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to introduce a Schur-decomposition-based weight projection method for ensuring asymptotic stability in state-space neural-network architectures. The approach projects the quasi-triangular factor of the state matrix's real Schur decomposition onto its nearest stable peer during training, with an alternative pre-factorized formulation. Experiments on synthetic linear systems show comparable accuracy and convergence to state-of-the-art techniques, and the method is shown to facilitate convergence in stacked architectures with static nonlinearities on real-world datasets with minimal overparameterization.
Significance. If the results hold, this method could offer a valuable tool for learning stable dynamical models in machine learning, particularly where stability is required without excessive parameterization. The Schur-based projection appears numerically robust and backpropagation-compatible, potentially advancing the field of stable system identification and neural state-space models.
major comments (2)
- [Experiments] The reported experiments on synthetic linear systems demonstrate comparable accuracy but do not include error bars, detailed ablations, or a comparison to post-hoc projection after unconstrained optimization, leaving the impact of repeated projection on the optimization landscape unaddressed.
- [Method] The description of the projection operation lacks analysis regarding its Lipschitz continuity, sub-differentiability at the stability boundary, or effects on the loss landscape's smoothness, which are necessary to substantiate the backpropagation compatibility claim.
minor comments (2)
- [Abstract] The claim of 'marginal increase in computational complexity' should be supported by a brief quantitative comparison or complexity analysis.
- [Introduction] Ensure all acronyms and notations (e.g., for Schur factors) are defined upon first use for improved readability.
Simulated Author's Rebuttal
We thank the referee for their thorough review and positive evaluation of the significance of our work. We address each of the major comments below, proposing revisions to strengthen the manuscript where appropriate.
Point-by-point responses
-
Referee: [Experiments] The reported experiments on synthetic linear systems demonstrate comparable accuracy but do not include error bars, detailed ablations, or a comparison to post-hoc projection after unconstrained optimization, leaving the impact of repeated projection on the optimization landscape unaddressed.
Authors: We agree that error bars and additional ablations would enhance the robustness of our experimental results. In the revised manuscript, we will include error bars computed over multiple independent runs with different random seeds and provide ablations on key hyperparameters such as the projection frequency. Regarding the comparison to post-hoc projection, we note that our method is designed to enforce stability during the entire training process, which prevents the optimizer from exploring unstable regions that could lead to poor local minima. A post-hoc approach might achieve similar final accuracy but could suffer from instability during training, as highlighted by the convergence issues in unconstrained baselines. We will add a discussion clarifying that the repeated projections smooth the optimization by keeping the state matrix within the stable set, supported by the observed comparable convergence rates. revision: partial
-
Referee: [Method] The description of the projection operation lacks analysis regarding its Lipschitz continuity, sub-differentiability at the stability boundary, or effects on the loss landscape's smoothness, which are necessary to substantiate the backpropagation compatibility claim.
Authors: We appreciate this suggestion for a more formal analysis. The projection operation involves modifying the eigenvalues (via the diagonal of the Schur factor) to lie within the unit disk, which is a Lipschitz continuous map with constant 1 in the appropriate matrix norm, as it is a nearest-point projection onto a convex set. At the stability boundary, the operation is sub-differentiable, with the subgradient corresponding to the projection onto the tangent cone. We will add a new subsection in the method section providing this analysis, including a proof sketch of Lipschitz continuity and a note on how the chain rule applies through the projection for backpropagation. This will also address the smoothness of the effective loss landscape, which remains differentiable almost everywhere. revision: yes
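The scalar analogue of the rebuttal's nonexpansiveness claim is easy to check numerically: clipping an eigenvalue to the closed unit disk is a nearest-point projection onto a convex set, so no pair of points moves further apart under it. This is an illustrative sketch of the argument, not the paper's matrix-level analysis:

```python
# Spot-check that projecting a complex number onto the closed unit disk is
# nonexpansive (Lipschitz constant 1), as claimed for the eigenvalue map.
import numpy as np

def clip_to_unit_disk(z: complex) -> complex:
    r = abs(z)
    return z / r if r > 1.0 else z

rng = np.random.default_rng(1)
max_ratio = 0.0
for _ in range(10_000):
    a, b = (rng.standard_normal(2) + 1j * rng.standard_normal(2)) * 2.0
    num = abs(clip_to_unit_disk(a) - clip_to_unit_disk(b))
    den = abs(a - b)
    if den > 1e-12:
        max_ratio = max(max_ratio, num / den)
print(max_ratio <= 1.0 + 1e-12)  # True: no pair of points expands
```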
Circularity Check
Schur-based projection derives from standard linear algebra; no circular reduction to inputs
full rationale
The paper's core method projects the quasi-triangular factor of the real Schur form onto its nearest stable matrix using established properties of Schur decomposition and spectral-radius stability. This construction is independent of any fitted parameters, self-referential definitions, or load-bearing self-citations; it relies on external linear-algebra facts rather than re-deriving or renaming prior results from the same authors. Experiments compare final accuracy to existing techniques without presenting any projection step as a 'prediction' forced by data fitting. The derivation chain remains self-contained against standard matrix theory benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Every real matrix possesses a real Schur decomposition that can be computed numerically.
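This axiom is directly checkable with standard numerical software; a sketch using SciPy's `scipy.linalg.schur`, which returns an orthogonal Z and a quasi-upper-triangular T (1-by-1 and 2-by-2 diagonal blocks, the 2-by-2 blocks carrying complex-conjugate eigenvalue pairs):

```python
# Verify the real Schur decomposition A = Z T Z^T for a random real matrix.
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.standard_normal((6, 6))
T, Z = schur(A, output="real")          # A = Z @ T @ Z.T

print(np.allclose(Z @ T @ Z.T, A))      # True: valid factorization
print(np.allclose(Z.T @ Z, np.eye(6)))  # True: Z is orthogonal
print(np.allclose(np.tril(T, -2), 0.0)) # True: T is quasi-upper-triangular
```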
Reference graph
Works this paper leans on
- [1] Ubaid M. Al-Saggaf and Gene F. Franklin. Model reduction via balanced realizations: an extension and frequency weighting techniques. IEEE Transactions on Automatic Control, 33(7):687–692, 1988.
- [2] Albert Gu, Karan Goel, and Christopher Ré. Efficiently modeling long sequences with structured state spaces. arXiv preprint arXiv:2111.00396, 2021.
- [3] Jimmy T.H. Smith, Andrew Warrington, and Scott W. Linderman. Simplified state space layers for sequence modeling. arXiv preprint arXiv:2208.04933, 2022.
- [4] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023.
- [5] Albert Gu, Tri Dao, Stefano Ermon, Atri Rudra, and Christopher Ré. HiPPO: Recurrent memory with optimal polynomial projections. Advances in Neural Information Processing Systems, 33:1474–1487, 2020.
- [6] Loris Di Natale, Muhammad Zakwan, Philipp Heer, Giancarlo Ferrari-Trecate, and Colin Neil Jones. SIMBa: System identification methods leveraging backpropagation. IEEE Transactions on Control Systems Technology, 2024.
- [7] Alberto Bemporad. An L-BFGS-B approach for linear and nonlinear system identification under ℓ1 and group-Lasso regularization. IEEE Transactions on Automatic Control, 70(7):4857–4864, 2025. doi: 10.1109/TAC.2025.3541018.
- [8] Vanni Noferini and Federico Poloni. Nearest Ω-stable matrix via Riemannian optimization. Numerische Mathematik, 148(4):817–851, 2021.
- [9] Karl Johan Åström and Peter Eykhoff. System identification—a survey. Automatica, 7(2):123–162, 1971.
- [10] Peter Van Overschee and Bart De Moor. A unifying theorem for three subspace system identification algorithms. Automatica, 31(12):1853–1864, 1995.
- [11] Bi-Qiang Mu and Han-Fu Chen. Hankel matrices for system identification. Journal of Mathematical Analysis and Applications, 409(1):494–508, 2014.
- [12] S. Joe Qin. An overview of subspace identification. Computers & Chemical Engineering, 30(10-12):1502–1513, 2006.
- [13] Maarten Schoukens and Koen Tiels. Identification of block-oriented nonlinear systems starting from linear approximations: A survey. Automatica, 85:272–292, 2017.
- [14] Steven L. Brunton, Marko Budišić, Eurika Kaiser, and J. Nathan Kutz. Modern Koopman theory for dynamical systems. arXiv preprint arXiv:2102.12086, 2021.
- [15] Adrian Wills, Thomas B. Schön, Lennart Ljung, and Brett Ninness. Identification of Hammerstein–Wiener models. Automatica, 49(1):70–81, 2013.
- [16] Jan M. Maciejowski. Guaranteed stability with subspace methods. Systems & Control Letters, 26(2):153–156, 1995.
- [17] Gianluigi Pillonetto, Tianshi Chen, Alessandro Chiuso, Giuseppe De Nicolao, and Lennart Ljung. Regularized System Identification: Learning Dynamic Models from Data. Springer Nature, 2022.
- [18] J. Sjöberg, T. McKelvey, and L. Ljung. On the use of regularization in system identification. IFAC Proceedings Volumes, 26(2):75–80, 1993.
- [19] Tony Van Gestel, Johan A.K. Suykens, Paul Van Dooren, and Bart De Moor. Identification of stable models in subspace identification by using regularization. IEEE Transactions on Automatic Control, 46(9):1416–1420, 2001.
- [20] Abdus Salam, Rasel Mahmud, Tohedul Islam, Saddam Mukta, and Swakkhar Shatabda. A comprehensive survey on Mamba: Architectures, challenges, and opportunities. Computer, 58(8):64–76, 2025.
- [21] Badri N. Patro and Vijay S. Agneeswaran. SiMBA: Simplified Mamba-based architecture for vision and multivariate time series, 2024. URL http://arxiv.org/abs/2403.15360.
- [22] Biswa Datta. Numerical Methods for Linear Control Systems. Elsevier, 2004.
- [23] Roger A. Horn and Charles R. Johnson. Matrix Analysis, chapter 2, page 82. Cambridge University Press, 2012.
- [24] Gene H. Golub and Charles F. Van Loan. Computing Subspaces with the SVD, pages 601–606. JHU Press, 3rd edition, 1996. ISBN 978-0-8018-5414-9.
- [25] Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization. arXiv preprint arXiv:1711.05101, 2017.
- [26] Maarten Schoukens. Nonlinear benchmarks. https://github.com/MaartenSchoukens/nonlinear_benchmarks, 2026.
- [27] Federico Najson. The Kalman-Yakubovich-Popov lemma for discrete-time positive linear systems. In 2012 American Control Conference (ACC), pages 5188–5193. IEEE, 2012.
- [28] Leslie Hogben, editor. Handbook of Linear Algebra. Discrete Mathematics and Its Applications. Chapman & Hall/CRC, 2nd edition, 2013.
- [29] Diederich Hinrichsen and Anthony J. Pritchard. Mathematical Systems Theory I. Springer Berlin Heidelberg, 2005. ISBN 9783540264101. doi: 10.1007/b137541. URL http://dx.doi.org/10.1007/b137541.
- [30] Nicolas Gillis and Punit Sharma. On computing the distance to stability for matrices using linear dissipative Hamiltonian systems. Automatica, 85:113–121, 2017.
- [31] Nicolas Gillis, Michael Karow, and Punit Sharma. Approximating the nearest stable discrete-time system. Linear Algebra and its Applications, 573:37–53, 2019.
- [32]
- [33] James Townsend, Niklas Koep, and Sebastian Weichwald. Pymanopt: A Python toolbox for optimization on manifolds using automatic differentiation. Journal of Machine Learning Research, 17(137):1–5, 2016. URL http://jmlr.org/papers/v17/16-177.html.
- [34] Peter Arbenz. Lecture notes in numerical methods for solving large scale eigenvalue problems, 2016.
- [35] Mario Ahues and Francoise Tisseur. A new deflation criterion for the QR algorithm. LAPACK Working Note, 122, 1997.
- [36] John C. Gower and Garmt B. Dijksterhuis. Procrustes Problems, volume 30. Oxford University Press, 2004.
- [37] Berthold K.P. Horn, Hugh M. Hilden, and Shahriar Negahdaripour. Closed-form solution of absolute orientation using orthonormal matrices. Journal of the Optical Society of America A, 5(7):1127–1135, 1988.
- [38] Nagwa Sherif. On the computation of a matrix inverse square root. Computing, 46(4):295–305, December. ISSN 1436-5057. doi: 10.1007/bf02257775. URL http://dx.doi.org/10.1007/BF02257775.
- [39] Torbjorn Wigren and Johan Schoukens. Three free data sets for development and benchmarking in nonlinear system identification. In 2013 European Control Conference (ECC), pages 2933–2938. IEEE, 2013. doi: 10.23919/ecc.2013.6669201. URL http://dx.doi.org/10.23919/ECC.2013.6669201.
- [40] Torbjörn Wigren and Maarten Schoukens. Coupled electric drives data set and reference models. Technical report, Uppsala University, Sweden, 2017.
- [41] Alexandre Janot, Maxime Gautier, and Mathieu Brunot. Data set and reference models of EMPS. In Nonlinear System Identification Benchmarks, 2019.
- [42] Jonas Weigand, Julian Götz, Jonas Ulmen, and Martin Ruskowski. Dataset and baseline for an industrial robot identification benchmark, 2022. URL https://fdm-fallback.uni-kl.de/TUK/FB/MV/WSKL/0001/.
- [43] Merijn Floren, Leonardo Peri, Jeroen De Maeyer, Wim De Munter, Dirk Vandepitte, and Jean-Philippe Noël. Data-driven state-space identification and nonlinearity assessment of the CubeSpec Fine Steering Mirror. ISMA-USD2024, pages 2042–2052, 2024.
- [44] Takuya Akiba, Shotaro Sano, Toshihiko Yanase, Takeru Ohta, and Masanori Koyama. Optuna: A next-generation hyperparameter optimization framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2019.
- [45] Max Champneys, Gerben I. Beintema, Roland Tóth, Maarten Schoukens, and Timothy J. Rogers. Baseline results for selected nonlinear system identification benchmarks. IFAC-PapersOnLine, 58(15):474–479, 2024.
- [46] Merijn Floren. Baseline models for the Fine Steering Mirror benchmark dataset. https://doi.org/10.5281/zenodo.19591136, 2026. Version v2026.04.15.
Appendix A: Computing the Nearest Real 2-by-2 Schur-Stable Matrix
This appendix is included for the sake of convenience, but it is taken almost verbatim from Noferini and Poloni […]
- If A_{2,1} ≠ −A_{1,2}, then Â_{1,1} = Â_{2,2}.
- If A_{2,1} = −A_{1,2}, then there exists α ∈ [0, π/2) such that Â = U(α) Ǎ U(α)^⊺, where Ǎ is another local minimizer with the same objective value and Ǎ_{1,1} = Ǎ_{2,2}.
A non-empty closed set S ⊆ R^{2×2} is doubly rotation-invariant if X ∈ S implies U(α) X U(β)^⊺ ∈ S for all α, β ∈ R.
Lemma A.4. Let S ⊆ R^{2×2} be doubly rotation-invariant, and let Â be a local minimizer of min_{X ∈ S} ‖A − X‖_F for some diagona[…]
- If σ_1 ≠ σ_2, then Â is diagonal with Â_{1,1} − Â_{2,2} ≥ 0 and Â_{1,1} + Â_{2,2} ≥ 0.
- If σ_1 = σ_2, then there exists α ∈ [0, π/2) such that Â = U(α) Ǎ U(α)^⊺, where Ǎ is another local minimizer with the same objective value and Ǎ_{1,1} = Ǎ_{2,2} = 0.
The result in Lemma A.5 is a variant of the Routh–Hurwitz stability criterion (see Hogben [28, Sec. 26.2]).
Lemma A.5. Both roots of the polynomial p(λ) = aλ² + bλ + c, with a ≠ 0, lie in the closed left half-plane {λ ∈ C : Re[…]
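Lemma A.5's statement is cut off above; for real coefficients it reduces to the standard quadratic Routh–Hurwitz condition, which can be cross-checked numerically. The predicate below (b/a ≥ 0 and c/a ≥ 0) is a reconstruction of that standard criterion, not a quotation from the paper:

```python
# Compare the closed-form criterion against numpy's root finder on random
# coefficients bounded away from the degenerate boundary cases.
import numpy as np

def closed_lhp_by_criterion(a: float, b: float, c: float) -> bool:
    # Both roots of a*x^2 + b*x + c in the closed left half-plane (a != 0).
    return b / a >= 0.0 and c / a >= 0.0

def closed_lhp_by_roots(a: float, b: float, c: float, tol: float = 1e-9) -> bool:
    return bool(np.all(np.roots([a, b, c]).real <= tol))

rng = np.random.default_rng(3)
agree = all(
    closed_lhp_by_criterion(a, b, c) == closed_lhp_by_roots(a, b, c)
    for a, b, c in rng.uniform(-2.0, 2.0, size=(2000, 3))
    if min(abs(a), abs(b), abs(c)) > 1e-3  # avoid boundary/degenerate samples
)
print(agree)  # True
```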
The six cases are treated separately:
1. Â ∈ ∂S_0, Â ∉ ∂S_+ ∪ ∂S_−. Then Â is a local minimizer of ‖A − X‖_F in the doubly rotation-invariant set ∂S_0, and Ξ = U_0^⊺ Â V_0 is a local minimizer of ‖diag(σ_{0,1}, σ_{0,2}) − X‖_F in ∂S_0. If σ_{0,1} ≠ σ_{0,2}, then Ξ is diagonal by Lemma A.4; and for Ξ = diag(τ_1, τ_2) (with τ_1 τ_2 = det(Ξ) = 1), it is necessary to have (τ_1, τ_2) ∈ M(σ_{0,1}, σ_{0,2}). Hence, Â ∈ Â_0. Otherwise, if σ_{0,1} = σ[…]
2. Â ∈ ∂S_+, Â ∉ ∂S_0 ∪ ∂S_−. Then min_{X ∈ ∂S_+} ‖A − X‖_F = min_{X ∈ ∂S_+} ‖U_+^⊺(A − I)V_+ − U_+^⊺(X − I)V_+‖_F. Hence, Â is a local minimizer of ‖A − X‖_F if and only if Ξ = U_+^⊺(X − I)V_+ is in ∂S_d ≜ {X ∈ R^{2×2} : det(X) = 0}: indeed, X has an eigenvalue 1 if and only if Ξ has determinant 0. Thus, either Ξ = diag(σ_{+,1}, 0) and hence Â = Â_+, or a minimizer with the same objective valu[…]
3. Â ∈ ∂S_−, Â ∉ ∂S_0 ∪ ∂S_+. Proof as in case 2, swapping all plus and minus signs.
4. Â ∈ ∂S_0 ∩ ∂S_+, Â ∉ ∂S_−. Note that ∂S_0 ∩ ∂S_+ is the rotation-invariant set of matrices with double eigenvalue +1 (since they must have eigenvalue +1 and determinant 1). Â is a minimizer if and only if Ξ = G^⊺ Â G is a minimizer of ‖Ǎ − X‖ in ∂S_0 ∩ ∂S_+; so, by Lemma A.3, (1/2) tr(Ξ) = Ξ_{1,1} = Ξ_{2,2} = (1/2) tr(Â) = +1. Since det(Ξ) = 1, at least one among Ξ_{1,2} a[…]
5. Â ∈ ∂S_0 ∩ ∂S_−, Â ∉ ∂S_+. Proof as in case 4, swapping all plus and minus signs.
6. Â ∈ ∂S_+ ∩ ∂S_−, Â ∉ ∂S_0. Note that ∂S_+ ∩ ∂S_− is the rotation-invariant set of matrices with eigenvalues 1 and −1. Arguing as in case 4, G^⊺ Â G = Ξ = [[0, τ_1], [τ_2, 0]] with τ_1 τ_2 = 1, as Ξ must have Ξ_{1,1} = Ξ_{2,2} = (1/2) tr(Ξ) = 0 and det(Ξ) = −1. Moreover, to minimize ‖Ǎ − Ξ‖_F, (τ_1, τ_2) must belong to M(Ǎ_{1,2}, Ǎ_{2,1}). Thus, Â ∈ A_*.
Appendix B: Projection Scheme – Full Benchmark
The […]
- Random with independent entries in N(0,1).
- Random with independent entries in U([0,1]).
The resulting eigenvalues are shown in the complex plane in Figure 2. Following the setup from Noferini and Poloni [8], the maximum number of iterations and execution time were set to 1000 and 1000 seconds, respectively. However, the iteration limit only controls the maximum number of outer iterations, meaning that […]
- The execution time of each method (Figure 4a).
- The reconstruction error (Figure 4b), defined as the NSFE between the original matrix A and the product of its Schur factors Z T Z^⊺. This metric indicates whether or not each implementation yields a valid factorization, regardless of the factors' orthogonality and sparsity pattern.
- The orthogonality error (Figure 4c), defined as the NSFE between the identity matrix and the symmetric matrix Z Z^⊺. This metric indicates whether Z fulfils the orthogonality requirement.
- The spectral error (Figure 4c), defined as the NSSR between the original matrix A and the quasi-triangular factor T. By Theorem 3.2, under a correct factorization, this metric should only be bound by numerical error.
This experiment is executed on an isolated HPC instance with 4 Xeon Gold 6230 logical cores, 32 GiB of RAM, and a single Nvidia Volta V100 […]
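These metrics can be reproduced for SciPy's own Schur routine. NSFE is read here as a normalized squared Frobenius error, and the spectral comparison is done through spectral radii; both readings are assumptions, since the acronyms are defined outside this excerpt:

```python
import numpy as np
from scipy.linalg import schur

def nsfe(X, Y):
    # Assumed reading of NSFE: squared Frobenius error normalized by ||X||_F^2.
    return np.linalg.norm(X - Y, "fro") ** 2 / np.linalg.norm(X, "fro") ** 2

rng = np.random.default_rng(4)
A = rng.standard_normal((50, 50))
T, Z = schur(A, output="real")

reconstruction_error = nsfe(A, Z @ T @ Z.T)      # Figure 4b analogue
orthogonality_error = nsfe(np.eye(50), Z @ Z.T)  # Figure 4c analogue
spectral_gap = abs(                              # spectral-error analogue
    np.max(np.abs(np.linalg.eigvals(A))) - np.max(np.abs(np.linalg.eigvals(T)))
)
print(reconstruction_error < 1e-20, orthogonality_error < 1e-20, spectral_gap < 1e-8)
```

For a correct factorization all three quantities sit at the level of floating-point round-off, which matches the "bound by numerical error" expectation stated above.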
- The method from Golub and Van Loan [24], which finds the nearest orthogonal matrix by multiplying together the unitary factors from the singular-value decomposition Z = U Σ V^⊺; i.e., Ẑ = U V^⊺. (18)
- The method from Horn et al. [37], which defines the closed form Ẑ = Z (Z^⊺ Z)^{−1/2}, calculating the inverse square root of the product in parentheses through its eigenvalue decomposition Z^⊺ Z = Q Λ Q^{−1} as (Z^⊺ Z)^{−1/2} = Q Λ^{−1/2} Q^{−1} = Q diag(1/√λ_1, …, 1/√λ_{n_x}) Q^{−1}. (19)
- The same method from Horn et al. [37], but instead approximating the inverse square root using the stable iterative scheme from Sherif [38, Section 3]; i.e., via the recursion X^{−1/2} = lim_{r→∞} Ξ_r such that Ξ_{r+1} = Ξ_r (I + E_r), Ξ_0 = I, E_{r+1} = E_r² (2I − S_r²), E_0 = (I − X)(I + X)^{−1}. (20) Since the methods are jit-compiled during training, the stop condition for Equat[…]
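For full-rank Z, the SVD route (18) and the eigendecomposition route (19) produce the same nearest orthogonal matrix; a quick numerical cross-check (the Sherif iteration (20) is omitted here because its excerpt is incomplete):

```python
# Compare two routes to the nearest orthogonal matrix for a random full-rank Z.
import numpy as np

rng = np.random.default_rng(5)
Z = rng.standard_normal((6, 6))

# Route 1 (Golub & Van Loan): polar factor U V^T from the SVD.
U, _, Vt = np.linalg.svd(Z)
Z_svd = U @ Vt

# Route 2 (Horn et al.): Z (Z^T Z)^{-1/2} via the symmetric eigendecomposition.
lam, Q = np.linalg.eigh(Z.T @ Z)
Z_eig = Z @ Q @ np.diag(1.0 / np.sqrt(lam)) @ Q.T

print(np.allclose(Z_svd, Z_eig, atol=1e-6))     # both routes agree
print(np.allclose(Z_svd.T @ Z_svd, np.eye(6)))  # the result is orthogonal
```

The agreement follows from substituting the SVD into route 2: Z (Z^⊺Z)^{−1/2} = U Σ V^⊺ · V Σ^{−1} V^⊺ = U V^⊺.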
- The execution time of each method (Figure 4a). Execution times for the Schur-stable projection, the Schur decomposition, and the spectral norm are also included for scale and relative-cost analysis.
- The projection distance (Figure 4b), defined as the NSFE between the original matrix Z and its projection Ẑ. This metric should only be evaluated relative to other methods, as a null value should only be achievable if Z is already orthogonal.
- The orthogonality numerical error (Figure 4c), defined as the NSFE between the identity matrix and Ẑ^⊺ Ẑ. Unlike the projection distance, if Ẑ ∈ O(n_x), then this metric should be approximately 0 (bounded by numerical error). This experiment is executed on an interactive HPC instance with 4 Xeon Gold 6230 logical cores and 8 GiB of RAM. Each metric is sa[…]
[76]
Silverbox[ 39], an electronic implementation of the Duffing oscillator. It is built as a 2nd order linear time-invariant system with a 3rd degree polynomial static nonlinearity around it in feedback
-
[77]
The pulley is held by a spring, resulting in a lightly damped dynamic mode
CED[ 40], or Coupled Electric Drives, consists of two electric motors that drive a pulley using a flexible belt. The pulley is held by a spring, resulting in a lightly damped dynamic mode. The electric drives can be individually controlled, allowing the tension and the speed of the belt to be simultaneously controlled. The drive control is symmetric aroun...
-
[78]
The main source of nonlinearity is caused by friction effects that are present in the setup
EMPS[ 41], or Electro-Mechanical Positioning System, is a standard configuration of a drive system for the prismatic joint of robots or machine tools. The main source of nonlinearity is caused by friction effects that are present in the setup. Due to the presence of a pure integrator in the system, the measurements are taken in a closed-loop setting. The ...
-
[79]
Industrial Robot[ 42], an identification benchmark dataset for a full robot movement with a KUKA KR300 R2500 ultra SE industrial robot. It is a robot with a nominal payload capacity of 300 kg, a weight of 1120 kg, and a reach of 2500 mm, exhibiting 12 states accounting for position and velocity for each of the 6 joints. The robot encounters backlash in al...
-
[80]
Fine-Steering Mirror[ 43], a multi-input multi-output high-precision control platform used in a small satellite. V oltages applied to three piezo-actuators serve as inputs, while the mirror displacements measured at three non-collocated reference points serve as outputs. The excitation signals are orthogonal random-phase multisines spanning a wide frequen...