pith. sign in

arxiv: 2604.05518 · v1 · submitted 2026-04-07 · 🧮 math.OC · cs.LG· cs.SY· eess.SY· stat.ML

Optimal Centered Active Excitation in Linear System Identification

Pith reviewed 2026-05-10 19:24 UTC · model grok-4.3

classification 🧮 math.OC cs.LGcs.SYeess.SYstat.ML
keywords linear system identificationactive learningsample complexityoptimal excitationsemidefinite programmingordinary least squarescentered noise
0
0 comments X

The pith

An active excitation algorithm matches the minimal sample complexity for identifying linear system matrices.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a method combining ordinary least squares with semidefinite programming can select centered noise inputs to identify the matrix of a linear dynamical system while using only the theoretically smallest number of samples, up to universal constants. A sympathetic reader cares because linear system identification underpins control design and prediction tasks, and cutting the required measurements reduces the time and cost of experiments. The authors first prove lower bounds that apply to every active learning procedure, then demonstrate that their algorithm meets those bounds. The resulting sample-complexity expressions depend explicitly on system parameters such as state dimension, making the guarantees easy to interpret in practice.

Core claim

We propose an active learning algorithm for linear system identification with optimal centered noise excitation. Notably, our algorithm, based on ordinary least squares and semidefinite programming, attains the minimal sample complexity while allowing for efficient computation of an estimate of a system matrix. More specifically, we first establish lower bounds of the sample complexity for any active learning algorithm to attain the prescribed accuracy and confidence levels. Next, we derive a sample complexity upper bound of the proposed algorithm, which matches the lower bound for any algorithm up to universal factors. Our tight bounds are easy to interpret and explicitly show their depend

What carries the argument

The semidefinite programming problem that designs optimal centered active excitation sequences, which minimizes the sample complexity needed for accurate ordinary-least-squares estimation of the system matrix.

If this is right

  • Every active learning algorithm needs at least a certain number of samples that scales with the state dimension and other system parameters.
  • The proposed method achieves this lower bound up to universal constant factors.
  • An estimate of the system matrix can be obtained efficiently by solving the semidefinite program and then applying ordinary least squares.
  • The explicit dependence of the bounds on system parameters allows direct comparison of sample requirements across different linear systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same centering and semidefinite-programming idea might be adapted to reduce data needs in partially observed or switched linear systems.
  • If the noise model deviates from centering, the matching of upper and lower bounds may no longer hold, suggesting a natural next test case.

Load-bearing premise

The lower bounds on sample complexity hold for every possible active learning algorithm whenever the system is linear, the noise is centered, and the system is identifiable.

What would settle it

Finding any active learning procedure that reaches the target accuracy and confidence with strictly fewer samples than the paper's lower bound on a linear system satisfying the stated conditions.

Figures

Figures reproduced from arXiv: 2604.05518 by Alexandre Proutiere, Kaito Ito.

Figure 1
Figure 1. Figure 1: Number of samples and the estimation error, where [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
read the original abstract

We propose an active learning algorithm for linear system identification with optimal centered noise excitation. Notably, our algorithm, based on ordinary least squares and semidefinite programming, attains the minimal sample complexity while allowing for efficient computation of an estimate of a system matrix. More specifically, we first establish lower bounds of the sample complexity for any active learning algorithm to attain the prescribed accuracy and confidence levels. Next, we derive a sample complexity upper bound of the proposed algorithm, which matches the lower bound for any algorithm up to universal factors. Our tight bounds are easy to interpret and explicitly show their dependence on the system parameters such as the state dimension.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript proposes an active learning algorithm for linear system identification that employs optimal centered noise excitation. It first derives information-theoretic lower bounds on the sample complexity needed for any active learning algorithm to achieve prescribed accuracy and confidence levels. It then presents an algorithm based on ordinary least squares combined with semidefinite programming whose sample-complexity upper bound matches the lower bound up to universal factors, with explicit dependence on system parameters such as state dimension, while permitting efficient computation of a system-matrix estimate.

Significance. If the claimed matching of bounds holds under the stated assumptions, the work would be a useful contribution to system identification and control theory by supplying tight, interpretable sample-complexity results together with a computationally tractable procedure. The explicit dependence on system parameters and the emphasis on centered excitation distinguish it from prior active-learning approaches.

minor comments (2)
  1. [Abstract] Abstract: the phrase 'matches the lower bound for any algorithm up to universal factors' is stated without indicating the size or origin of those factors; a brief parenthetical remark or reference to the relevant theorem would improve immediate clarity.
  2. [Abstract] Abstract: the key assumptions (linearity, centered noise, identifiability) are mentioned only in the reader's summary and not explicitly in the abstract itself; adding one sentence listing them would help readers assess applicability at a glance.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, significance assessment, and recommendation of minor revision. The work is correctly characterized as providing tight sample-complexity bounds for active linear system identification via centered excitation, with an efficient OLS+SDP procedure. As no specific major comments were raised in the report, we have no points requiring rebuttal or clarification at this stage and will incorporate any minor suggestions during revision.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper first derives information-theoretic lower bounds on sample complexity that apply to any active learning algorithm under the stated assumptions of linearity, centered noise, and identifiability. It then constructs an explicit OLS+SDP algorithm and proves a matching upper bound up to universal factors, with explicit dependence on system parameters. No load-bearing step reduces a claimed prediction to a fitted input by construction, invokes a self-citation for a uniqueness theorem, or renames an empirical pattern; the upper-bound derivation remains independent of the lower-bound argument and does not smuggle in ansatzes via prior work by the same authors.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Review performed on abstract only; full paper likely contains additional technical assumptions on system stability, noise bounds, and persistence of excitation.

axioms (2)
  • domain assumption The underlying system is linear time-invariant with finite state dimension.
    Core modeling assumption stated implicitly by the problem setup.
  • domain assumption Excitation noise can be chosen actively and is centered.
    Explicitly required for the proposed optimal excitation method.

pith-pipeline@v0.9.0 · 5406 in / 1181 out tokens · 45875 ms · 2026-05-10T19:24:34.338461+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

29 extracted references · 29 canonical work pages

  1. [1]

    Lennart Ljung, System Identification: Theory for the User , Prentice- Hall, Inc., Upper Saddle River, NJ, USA, 1986

  2. [2]

    Graham Clifford Goodwin and Robert L Payne, Dynamic System Identification: Experiment Design and Data Analysis , Academic Press New Y ork, 1977

  3. [3]

    Consistency of the least-squares ident ification method

    Lennart Ljung, “Consistency of the least-squares ident ification method”, IEEE Transactions on Automatic Control , vol. 21, no. 5, pp. 779–781, 1976

  4. [4]

    On the consistency of prediction error i dentification methods

    Lennart Ljung, “On the consistency of prediction error i dentification methods”, in Mathematics in Science and Engineering , vol. 126, pp. 121–164. Elsevier, 1976

  5. [5]

    Synthesis of optimal inputs for multiin put- multioutput (mimo) systems with process noise Part I: Frequ enc y- domain synthesis Part II: Time-domain synthesis

    Raman K. Mehra, “Synthesis of optimal inputs for multiin put- multioutput (mimo) systems with process noise Part I: Frequ enc y- domain synthesis Part II: Time-domain synthesis”, in Mathematics in Science and Engineering , vol. 126, pp. 211–249. Elsevier, 1976

  6. [6]

    Optimal experiment design for open and closed-loo p system identification

    Xavier Bombois, Michel Gevers, Roland Hildebrand, and G abriel Solari, “Optimal experiment design for open and closed-loo p system identification”, Communications in Information and Systems , vol. 11, pp. 197–224, 2011

  7. [7]

    Learning without concentration

    Shahar Mendelson, “Learning without concentration”, i n Proc. 27th Conference on Learning Theory . 2014, vol. 35, pp. 25–39, PMLR

  8. [8]

    210–268, Cambridge University Press, 2012

    Roman V ershynin, Introduction to the non-asymptotic analysis of random matrices , pp. 210–268, Cambridge University Press, 2012

  9. [9]

    Victor H Pe˜ na, Tze Leung Lai, and Qi-Man Shao, Self-Normalized Processes: Limit Theory and Statistical Applications , Springer Science & Business Media, 2009

  10. [10]

    Concentration bounds for single para meter adaptive control

    Anders Rantzer, “Concentration bounds for single para meter adaptive control”, in Proc. 2018 Annual American Control Conference (ACC) . IEEE, 2018, pp. 1862–1866

  11. [11]

    Finite time identification in unstable linea r systems

    Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, and Ge orge Michailidis, “Finite time identification in unstable linea r systems”, Automatica, vol. 96, pp. 342–353, 2018

  12. [12]

    Learning without mixing: Towards a sharp a nalysis of linear system identification

    Max Simchowitz, Horia Mania, Stephen Tu, Michael I Jord an, and Benjamin Recht, “Learning without mixing: Towards a sharp a nalysis of linear system identification”, in Proc. Conference on Learning Theory, 2018, pp. 439–473

  13. [13]

    Non-asymptotic identifi cation of LTI systems from a single trajectory

    Samet Oymak and Necmiye Ozay, “Non-asymptotic identifi cation of LTI systems from a single trajectory”, in Proc. 2019 American control conference (ACC). IEEE, 2019, pp. 5655–5661

  14. [14]

    Near optimal finit e time identification of arbitrary linear dynamical systems

    Tuhin Sarkar and Alexander Rakhlin, “Near optimal finit e time identification of arbitrary linear dynamical systems”, in Proc. 36th International Conference on Machine Learning . 2019, pp. 5610–5618, PMLR

  15. [15]

    Finite-time id entification of stable linear systems optimality of the least-squares esti mator

    Y assir Jedra and Alexandre Proutiere, “Finite-time id entification of stable linear systems optimality of the least-squares esti mator”, in Proc. 2020 59th IEEE Conference on Decision and Control (CDC ). IEEE, 2020, pp. 996–1001

  16. [16]

    Finite time LTI system identification

    Tuhin Sarkar, Alexander Rakhlin, and Munther A Dahleh, “Finite time LTI system identification”, Journal of Machine Learning Research, vol. 22, no. 26, pp. 1–61, 2021

  17. [17]

    Finite-time id entification of linear systems: Fundamental limits and optimal algorithms

    Y assir Jedra and Alexandre Proutiere, “Finite-time id entification of linear systems: Fundamental limits and optimal algorithms ”, IEEE Transactions on Automatic Control , vol. 68, no. 5, pp. 2805–2820, 2023

  18. [18]

    Active learnin g for identification of linear dynamical systems

    Andrew Wagenmaker and Kevin Jamieson, “Active learnin g for identification of linear dynamical systems”, in Proc. Conference on Learning Theory . PMLR, 2020, pp. 3487–3582

  19. [19]

    High effort, low gain: Fundamental limits of active learning for linear dynamical systems,

    Nicolas Chatzikiriakos, Kevin Jamieson, and Andrea Ia nnelli, “High effort, low gain: Fundamental limits of active learning for linear dynamical systems”, arXiv preprint arXiv:2509.11907 , 2025

  20. [20]

    Hidden convexity in active learning: A convexified o nline input design for ARX systems

    Nicolas Chatzikiriakos, Bowen Song, Philipp Rank, and Andrea Ian- nelli, “Hidden convexity in active learning: A convexified o nline input design for ARX systems”, in Proc. 2025 IEEE 64th Conference on Decision and Control (CDC) . IEEE, 2025, pp. 63–68

  21. [21]

    Chi-Tsong Chen, Linear System Theory and Design , Saunders College Publishing, 1984

  22. [22]

    Nearest stable system using successive convex approximat ions

    Francois-Xavier Orbandexivry, Y urii Nesterov, and Pa ul V an Dooren, “Nearest stable system using successive convex approximat ions”, Au- tomatica, vol. 49, no. 5, pp. 1195–1203, 2013

  23. [23]

    Appr oximating the nearest stable discrete-time system

    Nicolas Gillis, Michael Karow, and Punit Sharma, “Appr oximating the nearest stable discrete-time system”, Linear Algebra and its Applications, vol. 573, pp. 37–53, 2019

  24. [24]

    Roman V ershynin, High-Dimensional Probability: An Introduction with Applications in Data Science , Cambridge University Press, 2018

  25. [25]

    CVX: Matlab software for disciplin ed convex programming, version 2.0

    Inc. CVX Research, “CVX: Matlab software for disciplin ed convex programming, version 2.0”, https://cvxr.com/cvx, August 2012

  26. [26]

    Task- optimal exploration in linear dynamical systems

    Andrew J Wagenmaker, Max Simchowitz, and Kevin Jamieso n, “Task- optimal exploration in linear dynamical systems”, in Proc. Interna- tional Conference on Machine Learning . PMLR, 2021, pp. 10641– 10652

  27. [27]

    Certaint y equiva- lence is efficient for linear quadratic control

    Horia Mania, Stephen Tu, and Benjamin Recht, “Certaint y equiva- lence is efficient for linear quadratic control”, Advances in Neural Information Processing Systems , vol. 32, pp. 10154–10164, 2019. APPENDIX I PROOF OF PROPOSITION 3 The proof follows a similar argument as in [17, Theo- rem 1]. Since wt follows a non-degenerate Gaussian dis- tribution, the ...

  28. [28]

    Lemma 13 (Perturbed Gramians): Let A be a stable matrix

    Moreover, this result plays a crucial role in analyzing the sensitivity of the optimal solution U to the maximization of λ min(Γ ∞ (A) + Ξ ∞ (A,U )); see Proposition 11. Lemma 13 (Perturbed Gramians): Let A be a stable matrix. Then, for any ∆ ∈ Rnx× nx and{ Uk} such that ∥∆∥≤∥ Γ ∞ (A)∥− 3/ 2/ 4 and tr(Uk)≤ ¯u for any k≥ 0, it holds that for any s≥ 0, ∥Γs(...

  29. [29]

    meanε≤∥ Γ ∞ (A)∥− 3/ 2/ (12 √ 2). Then, by Proposition 14, the perturbation ∆ = Ai− A satisfies 3 √ 2ε =∥∆∥≤∥ Γ ∞ (A)∥− 3/ 2/ 4, and Lemma 13 with ∆ = Ai− A yields 1 |P| |P|∑ i=1 tr ( (Ai− A)⊤ (Ai− A)Gt(Ai) ) − 1 |P| |P|∑ i=1 tr ( (Ai− A)⊤ (Ai− A)Gt(A) ) ≤ 1 |P| |P|∑ i=1 ∥Ai− A∥2 F∥Gt(Ai)− Gt(A)∥ ≤ 1 |P| |P|∑ i=1 ∥Ai− A∥2 F(t− 1)32(1 +∥B∥2 ¯u)∥Ai− A∥ ×∥ Γ ...