pith. sign in

arxiv: 2505.00323 · v3 · submitted 2025-05-01 · 📡 eess.SY · cs.SY

Recursive Sparse Parameter Identification of Multivariate ARMAX Systems with Non-stationary Observations and Colored Noise

Pith reviewed 2026-05-22 18:13 UTC · model grok-4.3

classification 📡 eess.SY cs.SY
keywords parameterrecursivesparsealgorithmscriterionidentificationmatrixnon-stationary
0
0 comments X

The pith

Recursive algorithms using a bivariate weighted L1 criterion and alternating optimization are proposed for sparse parameter identification of multivariate stochastic systems, with proofs of set and parameter convergence under non-stationary conditions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The work targets the problem of identifying which parameters are zero and estimating the non-zero ones in multivariate ARMAX models when data arrives one sample at a time and the underlying system is changing. Classical batch methods such as LASSO or greedy selection do not fit online use. The authors introduce a bivariate criterion that adds an auxiliary variable matrix to a weighted L1 penalty. Alternating optimization splits the problem into two subproblems whose solutions are written as explicit recursive update equations. Under the stated conditions of non-stationary observations and non-persistent excitation, the paper claims two convergence results: the algorithm correctly recovers the set of zero parameters and consistently estimates the values of the remaining non-zero entries even in the presence of colored noise. Numerical examples are said to illustrate these properties.

Core claim

Under the non-stationary and non-persistent excitation conditions on the systems, the estimates are proved to be with (i) set convergence, i.e., the accurate estimation of the sparse index set of the unknown parameter matrix, and (ii) parameter convergence, i.e., the consistent estimation for values of the non-zero elements of the unknown parameter matrix.

Load-bearing premise

The systems satisfy non-stationary and non-persistent excitation conditions, as required for the theoretical properties of set convergence and parameter convergence to hold (abstract, paragraph on theoretical properties).

Figures

Figures reproduced from arXiv: 2505.00323 by Wenxiao Zhao, Yanxin Fu.

Figure 1
Figure 1. Figure 1: Parameter estimation errors, correct rates of spars [PITH_FULL_IMAGE:figures/full_fig_p008_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Parameter estimation errors, correct rates, and cal [PITH_FULL_IMAGE:figures/full_fig_p009_2.png] view at source ↗
read the original abstract

The classical sparse parameter identification methods are usually based on the iterative basis selection such as greedy algorithms, or the numerical optimization of regularized cost functions such as LASSO and Bayesian posterior probability distribution, etc., which, however, are not suitable for online sparsity inference when data arrive sequentially. This paper presents recursive algorithms for sparse parameter identification of multivariate stochastic systems with non-stationary observations. First, a new bivariate criterion function is presented by introducing an auxiliary variable matrix into a weighted $L_1$ regularization criterion. The new criterion function is subsequently decomposed into two solvable subproblems via alternating optimization of the two variable matrices, for which the optimizers can be explicitly formulated into recursive equations. Second, under the non-stationary and non-persistent excitation conditions on the systems, theoretical properties of the recursive algorithms are established. That is, the estimates are proved to be with (i) set convergence, i.e., the accurate estimation of the sparse index set of the unknown parameter matrix, and (ii) parameter convergence, i.e., the consistent estimation for values of the non-zero elements of the unknown parameter matrix. Finally, numerical examples are given to support the theoretical analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 3 minor

Summary. The manuscript develops recursive algorithms for sparse parameter identification of multivariate ARMAX systems with non-stationary observations and colored noise. It introduces a bivariate weighted L1 regularization criterion incorporating an auxiliary variable matrix, which is decomposed via alternating optimization into two subproblems with explicit recursive update equations. Under non-stationary and non-persistent excitation conditions, the paper establishes set convergence (correct recovery of the sparse index set) and parameter convergence (consistent estimation of nonzero entries). Numerical examples are included to illustrate the theoretical results.

Significance. If the convergence claims hold, the work provides a meaningful contribution to online system identification by enabling sparsity-aware recursive estimation without requiring persistent excitation, a common limitation in practical non-stationary settings. The explicit recursive formulations and handling of colored noise in the multivariate case extend applicability to adaptive control and signal processing. The use of alternating optimization to obtain closed-form recursions is a constructive strength that supports real-time implementation.

major comments (1)
  1. Convergence proofs section: The set convergence result for the sparse index set is load-bearing for the central claim, yet the interaction between the alternating optimization steps and the stochastic convergence arguments under non-persistent excitation requires an explicit step showing that the auxiliary variable does not introduce bias in the support recovery when the noise is colored.
minor comments (3)
  1. Abstract: The bivariate criterion is described at a high level; adding a brief reference to its mathematical form (e.g., the role of the auxiliary matrix) would improve clarity for readers unfamiliar with the decomposition.
  2. Numerical examples: The manuscript should specify the exact system orders, dimensions of the parameter matrix, and the time-varying characteristics of the non-stationary excitation used in the simulations to facilitate reproducibility.
  3. Notation: The distinction between the two matrix variables in the alternating optimization could be emphasized with consistent subscripting throughout the recursive equations.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and constructive comment on the convergence analysis. We address the major comment point by point below and will revise the manuscript to improve clarity.

read point-by-point responses
  1. Referee: Convergence proofs section: The set convergence result for the sparse index set is load-bearing for the central claim, yet the interaction between the alternating optimization steps and the stochastic convergence arguments under non-persistent excitation requires an explicit step showing that the auxiliary variable does not introduce bias in the support recovery when the noise is colored.

    Authors: We agree that an explicit bridging step would strengthen the exposition. In the current proof of set convergence (Theorem 4.1), the alternating optimization is used to decouple the bivariate criterion, with the auxiliary matrix serving as a dynamic weight that converges to the support indicator. The stochastic arguments rely on a martingale convergence theorem adapted to non-stationary regressors and colored noise, where the noise covariance enters the variance bound but does not bias the zero/nonzero classification because the weighted L1 term dominates for sufficiently large weights. To make this interaction fully explicit, we will add a supporting lemma (new Lemma 4.3) that shows the auxiliary variable update produces no asymptotic bias in the recovered support under the stated non-persistent excitation and bounded noise conditions. The lemma will be inserted before Theorem 4.1 and referenced in the proof. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper introduces a bivariate weighted-L1 criterion with an auxiliary matrix variable, decomposes it via alternating optimization into two explicit recursive subproblems, and then proves set and parameter convergence under non-stationary/non-persistent excitation conditions using standard stochastic approximation arguments. These steps rely on conventional optimization techniques and convergence analysis rather than any reduction of a claimed prediction or result to a fitted quantity or self-citation by construction. The central claims retain independent mathematical content from the proofs and do not collapse to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Review is based on abstract only; specific free parameters such as regularization weights are not enumerated, and the main added construct is the bivariate criterion function.

axioms (1)
  • domain assumption The systems satisfy non-stationary and non-persistent excitation conditions
    Invoked explicitly for establishing set convergence and parameter convergence of the recursive estimates.

pith-pipeline@v0.9.0 · 5739 in / 1204 out tokens · 103499 ms · 2026-05-22T18:13:13.978353+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages

  1. [1]

    H´ edy Attouch, J´ erˆ ome Bolte, Patrick Redont, and Antoine Soubeyran. Proximal alternating minimization and projection methods for nonconvex problems: An approach based on the Kurdyka-/suppress Lojasiewicz inequality.Mathematics of Operations Research, 35:438–457, 2010

  2. [2]

    Sparse convex optimization via adaptively regularized hard thresholdin g

    Kyriakos Axiotis and Maxim Sviridenko. Sparse convex optimization via adaptively regularized hard thresholdin g. Journal of Machine Learning Research , 22:5421–5467, 2021

  3. [3]

    Coherence-based performance guarantees for estimating a sparse vector under random noise

    Zvika Ben-Haim, Yonina C Eldar, and Michael Elad. Coherence-based performance guarantees for estimating a sparse vector under random noise. IEEE Transactions on Signal Processing, 58:5030–5043, 2010

  4. [4]

    Distributed optimization and statistical learning via the alternating direction method of multipliers

    Stephen Boyd, Neal Parikh, Eric Chu, Borja Peleato, Jonathan Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends® in Machine Learning, 3:1–122, 2011

  5. [5]

    Discovering governing equations from data by sparse identification of nonlinear dynamical systems

    Steven L Brunton, Joshua L Proctor, and J Nathan Kutz. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences , 113:3932–3937, 2016

  6. [6]

    Decoding by linear programming

    Emmanuel J Cand` es and Terence Tao. Decoding by linear programming. IEEE Transactions on Information Theory , 51:4203–4215, 2005

  7. [7]

    Enhancing sparsity by reweighted L1 minimization

    Emmanuel J Cand` es, Michael B W akin, and Stephen P Boyd. Enhancing sparsity by reweighted L1 minimization. Journal of Fourier Analysis and Applications , 14:877–905, 2008

  8. [8]

    Iteratively reweighted algorithms for compressive sensing

    Rick Chartrand and W otao Yin. Iteratively reweighted algorithms for compressive sensing. In IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

  9. [9]

    Identification and stochastic adaptive control

    Han-Fu Chen and Lei Guo. Identification and stochastic adaptive control. Birkh¨ auser Boston, MA, 1991

  10. [10]

    Atomic decomposition by basis pursuit

    Scott Shaobing Chen, David L Donoho, and Michael A Saunders. Atomic decomposition by basis pursuit. SIAM Review, 43:129–159, 2001

  11. [11]

    Sparse solution of underdetermined systems of line ar equations by stagewise orthogonal matching pursuit

    David L Donoho, Yaakov Tsaig, Iddo Drori, and Jean-Luc Starck. Sparse solution of underdetermined systems of line ar equations by stagewise orthogonal matching pursuit. IEEE Transactions on Information Theory , 58:1094–1121, 2012

  12. [12]

    Variable selection via nonco ncave penalized likelihood and its oracle properties

    Jianqing Fan and Runze Li. Variable selection via nonco ncave penalized likelihood and its oracle properties. Journal of the American Statistical Association , 96:1348–1360, 2001

  13. [13]

    Theoretical foundations and numerical methods for sparse recovery

    Massimo Fornasier. Theoretical foundations and numerical methods for sparse recovery . De Gruyter Berlin, Germany, 2010

  14. [14]

    Fast sparse regression and classific ation

    Jerome H Friedman. Fast sparse regression and classific ation. International Journal of Forecasting , 28:722–738, 2012

  15. [15]

    Support recovery and parameter identification of multivariate ARMA systems with exogenous inputs

    Yanxin Fu and W enxiao Zhao. Support recovery and parameter identification of multivariate ARMA systems with exogenous inputs. SIAM Journal on Control and Optimization, 61:1835–1860, 2023

  16. [16]

    Distributed sparse identificati on for stochastic dynamic systems under cooperative non-persist ent excitation condition

    Die Gan and Zhixin Liu. Distributed sparse identificati on for stochastic dynamic systems under cooperative non-persist ent excitation condition. Automatica, 151:110958, 2023

  17. [17]

    X. Gao, L. Dai, S. Han, I. Chih-Lin, and R. W. Heath. Energy-efficient hybrid analog and digital precoding for mmW ave MIMO systems with large antenna arrays. IEEE Journal on Selected Areas in Communications , 34(4), 998– 1009, 2016

  18. [18]

    Sparse signal recovery using iterative proximal projectio n

    Fateme Ghayem, Mostafa Sadeghi, Massoud Babaie-Zadeh , Saikat Chatterjee, Mikael Skoglund, and Christian Jutten. Sparse signal recovery using iterative proximal projectio n. IEEE Transactions on Signal Processing , 66(4):879–894, 2018

  19. [19]

    Sparse parameter identification for stochastic systems based on regularization

    Jian Guo, Ying W ang, Yanlong Zhao, and Ji-Feng Zhang. Sparse parameter identification for stochastic systems based on regularization. SIAM Journal on Control and Optimization, 62:2884–2909, 2024

  20. [20]

    Huang, J

    Y. Huang, J. L. Beck, and H. Li. Bayesian system identification based on hierarchical sparse Bayesian learn ing and Gibbs sampling with application to structural damage assessment. Computer Methods in Applied Mechanics and Engineering, 318, 382-411, 2017

  21. [21]

    On the concept of excitation in least squares identification and adaptive con trol

    Tze Leung Lai and Ching Zong W ei. On the concept of excitation in least squares identification and adaptive con trol. Stochastics, 16:227–254, 1986

  22. [22]

    Lassonet: A neural network with feature sparsi ty

    Ismael Lemhadri, Feng Ruan, Louis Abraham, and Robert Tibshirani. Lassonet: A neural network with feature sparsi ty. Journal of Machine Learning Research , 22:1–29, 2021

  23. [23]

    Online sparse identification f or regression models

    Junlin Li and Xiuting Li. Online sparse identification f or regression models. Systems & Control Letters , 141:104710, 2020

  24. [24]

    Dynamic sparse training: Find efficient sparse network from scratch with trainable masked layers

    Junjie Liu, Zhe Xu, Runbin Shi, Ray CC Cheung, and Hayden KH So. Dynamic sparse training: Find efficient sparse network from scratch with trainable masked layers. I n International Conference on Learning Representations, 2020

  25. [25]

    A survey on compressed sensing approach to systems and control

    Masaaki Nagahara and Yutaka Yamamoto. A survey on compressed sensing approach to systems and control. Mathematics of Control, Signals, and Systems , 36:1–20, 2024

  26. [26]

    Mehrotra, S

    A. Mehrotra, S. Srivastava, S. Asifa, A. K. Jagannatham , and L. Hanzo. Online Bayesian learning-aided sparse CSI estimation in OTFS modulated MIMO systems for ultra- high-Doppler scenarios. IEEE Trans. Communications , 72(4), 2182-2200, 2023

  27. [27]

    Cosamp: Iterative sign al recovery from incomplete and inaccurate samples

    Deanna Needell and Joel A Tropp. Cosamp: Iterative sign al recovery from incomplete and inaccurate samples. Applied and Computational Harmonic Analysis , 26:301–321, 2009

  28. [28]

    Sensing matrix design via mutual coherence minimization for electromagnetic compressive imaging applications

    Richard Obermeier and Jose Angel Martinez-Lorenzo. Sensing matrix design via mutual coherence minimization for electromagnetic compressive imaging applications. IEEE Transactions on Computational Imaging , 3:217–229, 2017. 15

  29. [29]

    The bayesian lasso

    Trevor Park and George Casella. The bayesian lasso. Journal of the American Statistical Association , 103:681–686, 2008

  30. [30]

    Transformation of regressors for low coherent sparse syste m identification

    Javad Parsa, Cristian R Rojas, and H ˚ akan Hjalmarsson. Transformation of regressors for low coherent sparse syste m identification. IEEE Transactions on Automatic Control , 69:2947–2962, 2023

  31. [31]

    Sparse estima tion in linear dynamic networks using the stable spline horsesho e prior

    Gianluigi Pillonetto and Akram Yazdani. Sparse estima tion in linear dynamic networks using the stable spline horsesho e prior. Automatica, 146:110666, 2022

  32. [32]

    Group sparse regularization for deep neura l networks

    Simone Scardapane, Danilo Comminiello, Amir Hussain, and Aurelio Uncini. Group sparse regularization for deep neura l networks. Neurocomputing, 241:81–89, 2017

  33. [33]

    K. Sun, H. Meng, Y. W ang, and X. W ang. Direct data domain STAP using sparse representation of clutter spectrum. IEEE Trans. Aerospace and Electronic Systems , 52(2), 539–553, 2016

  34. [34]

    A survey for sparse regularization based compression methods

    Anda Tang, Pei Quan, Lingfeng Niu, and Yong Shi. A survey for sparse regularization based compression methods. Annals of Data Science , 9:695–722, 2022

  35. [35]

    Regression shrinkage and selectio n via the lasso

    Robert Tibshirani. Regression shrinkage and selectio n via the lasso. Journal of the Royal Statistical Society: Series B (Methodological), 58:267–288, 1996

  36. [36]

    Accelerated schemes for the L1/L2 minimization

    Chao W ang, Ming Yan, Yaghoub Rahimi, and Yifei Lou. Accelerated schemes for the L1/L2 minimization. IEEE Transactions on Signal Processing , 68:2660–2669, 2020

  37. [37]

    Online alternating direction method

    Huahua W ang and Arindam Banerjee. Online alternating direction method. Proceedings of 29th International Conference on Machine Learning , pp. 1699-1706, 2012

  38. [38]

    Nearly unbiased variable selection und er minimax concave penalty

    Cun-Hui Zhang. Nearly unbiased variable selection und er minimax concave penalty. The Annals of Statistics , 38:894 – 942, 2010

  39. [39]

    Mul ti- task sparse identification for closed-loop systems with gen eral observation sequences

    Kang Zhang, Xiaoli Luan, Xiaojing Ping, and Fei Liu. Mul ti- task sparse identification for closed-loop systems with gen eral observation sequences. Journal of the Franklin Institute , 360:6609–6631, 2023

  40. [40]

    Analysis of multi-stage convex relaxation for sparse regularization

    Tong Zhang. Analysis of multi-stage convex relaxation for sparse regularization. Journal of Machine Learning Research, 11, 2010

  41. [41]

    Sparse system identification for stochastic systems with general observa tion sequences

    W enxiao Zhao, George Yin, and Er-W ei Bai. Sparse system identification for stochastic systems with general observa tion sequences. Automatica, 121:109162, 2020

  42. [42]

    Sparse Bayesian deep learning for dynamic system identification

    Hongpeng Zhou, Chahine Ibrahim, W ei Xing Zheng, and W ei Pan. Sparse Bayesian deep learning for dynamic system identification. Automatica, 144:110489, 2022

  43. [43]

    Nonparametric identification of kroneck er networks

    Mattia Zorzi. Nonparametric identification of kroneck er networks. Automatica, 145:110518, 2022

  44. [44]

    The adaptive lasso and its oracle properties

    Hui Zou. The adaptive lasso and its oracle properties. Journal of the American Statistical Association, 101:1418–1429, 2006. 16