Recursive Sparse Parameter Identification of Multivariate ARMAX Systems with Non-stationary Observations and Colored Noise
Pith reviewed 2026-05-22 18:13 UTC · model grok-4.3
The pith
Recursive algorithms using a bivariate weighted L1 criterion and alternating optimization are proposed for sparse parameter identification of multivariate stochastic systems, with proofs of set and parameter convergence under non-stationary conditions.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under non-stationary and non-persistent excitation conditions on the systems, the estimates are proved to achieve (i) set convergence, i.e., correct identification of the sparse index set of the unknown parameter matrix, and (ii) parameter convergence, i.e., consistent estimation of the values of its nonzero elements.
Load-bearing premise
The systems satisfy non-stationary and non-persistent excitation conditions, as required for the theoretical properties of set convergence and parameter convergence to hold (abstract, paragraph on theoretical properties).
Original abstract
The classical sparse parameter identification methods are usually based on the iterative basis selection such as greedy algorithms, or the numerical optimization of regularized cost functions such as LASSO and Bayesian posterior probability distribution, etc., which, however, are not suitable for online sparsity inference when data arrive sequentially. This paper presents recursive algorithms for sparse parameter identification of multivariate stochastic systems with non-stationary observations. First, a new bivariate criterion function is presented by introducing an auxiliary variable matrix into a weighted $L_1$ regularization criterion. The new criterion function is subsequently decomposed into two solvable subproblems via alternating optimization of the two variable matrices, for which the optimizers can be explicitly formulated into recursive equations. Second, under the non-stationary and non-persistent excitation conditions on the systems, theoretical properties of the recursive algorithms are established. That is, the estimates are proved to be with (i) set convergence, i.e., the accurate estimation of the sparse index set of the unknown parameter matrix, and (ii) parameter convergence, i.e., the consistent estimation for values of the non-zero elements of the unknown parameter matrix. Finally, numerical examples are given to support the theoretical analysis.
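The abstract describes the bivariate criterion only in words. As a reading aid, the following is one standard variable-splitting form that matches that description. It is an assumption made for illustration, not the paper's stated criterion: the symbols $\Theta$ (parameter matrix), $Z$ (auxiliary matrix), $\varphi_k$ (regressor), $y_{k+1}$ (output), $w_{ij}$ (weights), $\lambda_n$ and $\mu$ (regularization constants) are all hypothetical.

```latex
\begin{align*}
J_n(\Theta, Z) &= \sum_{k=1}^{n} \bigl\| y_{k+1} - \Theta^{\top}\varphi_k \bigr\|^2
                 + \mu \,\|\Theta - Z\|_F^2
                 + \lambda_n \sum_{i,j} w_{ij}\,|z_{ij}|, \\[4pt]
\text{($\Theta$-step, $Z$ fixed)}\quad
\Theta &= \Bigl(\sum_{k=1}^{n} \varphi_k \varphi_k^{\top} + \mu I\Bigr)^{-1}
          \Bigl(\sum_{k=1}^{n} \varphi_k y_{k+1}^{\top} + \mu Z\Bigr), \\[4pt]
\text{($Z$-step, $\Theta$ fixed)}\quad
z_{ij} &= \operatorname{sign}(\theta_{ij})\,
          \max\Bigl(|\theta_{ij}| - \frac{\lambda_n w_{ij}}{2\mu},\, 0\Bigr).
\end{align*}
```

Under this assumed form, the $\Theta$-step is a regularized least-squares problem whose inverse can be propagated recursively, and the $Z$-step is an entrywise soft-threshold, which is consistent with the abstract's statement that both optimizers can be written as explicit recursive equations.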
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops recursive algorithms for sparse parameter identification of multivariate ARMAX systems with non-stationary observations and colored noise. It introduces a bivariate weighted L1 regularization criterion incorporating an auxiliary variable matrix, which is decomposed via alternating optimization into two subproblems with explicit recursive update equations. Under non-stationary and non-persistent excitation conditions, the paper establishes set convergence (correct recovery of the sparse index set) and parameter convergence (consistent estimation of nonzero entries). Numerical examples are included to illustrate the theoretical results.
Significance. If the convergence claims hold, the work provides a meaningful contribution to online system identification by enabling sparsity-aware recursive estimation without requiring persistent excitation, a common limitation in practical non-stationary settings. The explicit recursive formulations and handling of colored noise in the multivariate case extend applicability to adaptive control and signal processing. The use of alternating optimization to obtain closed-form recursions is a constructive strength that supports real-time implementation.
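To make the idea of closed-form recursions concrete, here is a minimal Python sketch. It is not the manuscript's algorithm: it pairs a textbook recursive least-squares update with an adaptively weighted soft-threshold. The function name `recursive_sparse_sketch`, the threshold schedule `lam * (k + 1) ** -0.75`, the weights `1 / |theta|**gamma`, and the demo data (i.i.d. Gaussian regressors and white noise rather than an ARMAX system with colored noise) are all assumptions chosen for illustration.

```python
import numpy as np


def soft_threshold(a, tau):
    """Entrywise soft-threshold: sign(a) * max(|a| - tau, 0)."""
    return np.sign(a) * np.maximum(np.abs(a) - tau, 0.0)


def recursive_sparse_sketch(phis, ys, lam=1.0, gamma=1.0, eps=1e-12):
    """Hypothetical sketch: RLS recursion plus weighted soft-thresholding.

    phis: (N, d) regressors, ys: (N, m) outputs.
    Returns (theta_ls, theta_sparse), each of shape (d, m).
    """
    n, d = phis.shape
    m = ys.shape[1]
    theta = np.zeros((d, m))   # running least-squares estimate
    P = 1e3 * np.eye(d)        # large initial matrix, i.e. weak prior
    z = np.zeros((d, m))       # sparse (thresholded) estimate
    for k in range(n):
        phi = phis[k]
        P_phi = P @ phi
        a = 1.0 / (1.0 + phi @ P_phi)
        err = ys[k] - theta.T @ phi               # one-step prediction error
        theta = theta + a * np.outer(P_phi, err)  # standard RLS update
        P = P - a * np.outer(P_phi, P_phi)        # P is symmetric
        # Assumed schedule: the penalty level decays, while the weights
        # 1 / |theta|^gamma grow for entries whose estimate is near zero.
        lam_k = lam * (k + 1) ** -0.75
        tau = lam_k / (np.abs(theta) ** gamma + eps)
        z = soft_threshold(theta, tau)
    return theta, z


# Demo on synthetic data (assumed setup, simpler than the paper's setting).
rng = np.random.default_rng(0)
theta_true = np.array([[1.0, 0.0], [0.0, -0.8], [0.0, 0.0], [0.5, 0.0]])
phis = rng.standard_normal((2000, 4))
ys = phis @ theta_true + 0.1 * rng.standard_normal((2000, 2))
theta_ls, theta_sparse = recursive_sparse_sketch(phis, ys)
print((theta_sparse != 0).astype(int))  # estimated support pattern
```

The design choice in this sketch is that the threshold for an entry scales inversely with the magnitude of its current estimate, so entries with large estimates are barely shrunk while entries whose estimates hover near zero are set exactly to zero. Whether the manuscript's recursions take this form cannot be determined from the abstract alone.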
major comments (1)
- Convergence proofs section: The set convergence result for the sparse index set is load-bearing for the central claim, yet the interaction between the alternating optimization steps and the stochastic convergence arguments under non-persistent excitation requires an explicit step showing that the auxiliary variable does not introduce bias in the support recovery when the noise is colored.
minor comments (3)
- Abstract: The bivariate criterion is described at a high level; adding a brief reference to its mathematical form (e.g., the role of the auxiliary matrix) would improve clarity for readers unfamiliar with the decomposition.
- Numerical examples: The manuscript should specify the exact system orders, dimensions of the parameter matrix, and the time-varying characteristics of the non-stationary excitation used in the simulations to facilitate reproducibility.
- Notation: The distinction between the two matrix variables in the alternating optimization could be emphasized with consistent subscripting throughout the recursive equations.
Simulated Author's Rebuttal
We thank the referee for the careful review and constructive comment on the convergence analysis. We address the major comment point by point below and will revise the manuscript to improve clarity.
- Referee: Convergence proofs section: The set convergence result for the sparse index set is load-bearing for the central claim, yet the interaction between the alternating optimization steps and the stochastic convergence arguments under non-persistent excitation requires an explicit step showing that the auxiliary variable does not introduce bias in the support recovery when the noise is colored.
  Authors: We agree that an explicit bridging step would strengthen the exposition. In the current proof of set convergence (Theorem 4.1), the alternating optimization is used to decouple the bivariate criterion, with the auxiliary matrix serving as a dynamic weight that converges to the support indicator. The stochastic arguments rely on a martingale convergence theorem adapted to non-stationary regressors and colored noise, where the noise covariance enters the variance bound but does not bias the zero/nonzero classification because the weighted L1 term dominates for sufficiently large weights. To make this interaction fully explicit, we will add a supporting lemma (new Lemma 4.3) that shows the auxiliary variable update produces no asymptotic bias in the recovered support under the stated non-persistent excitation and bounded noise conditions. The lemma will be inserted before Theorem 4.1 and referenced in the proof.
  Revision: yes
Circularity Check
No significant circularity in derivation chain
The paper introduces a bivariate weighted-L1 criterion with an auxiliary matrix variable, decomposes it via alternating optimization into two explicit recursive subproblems, and then proves set and parameter convergence under non-stationary/non-persistent excitation conditions using standard stochastic approximation arguments. These steps rely on conventional optimization techniques and convergence analysis rather than any reduction of a claimed prediction or result to a fitted quantity or self-citation by construction. The central claims retain independent mathematical content from the proofs and do not collapse to the inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The systems satisfy non-stationary and non-persistent excitation conditions.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "a new bivariate criterion function is presented by introducing an auxiliary variable matrix into a weighted L1 regularization criterion... decomposed into two solvable subproblems via alternating optimization... recursive equations"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "under the non-stationary and non-persistent excitation conditions... set convergence... parameter convergence"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.