Regularization in Data-driven Predictive Control: A Convex Relaxation Perspective

Xu Shang; Yang Zheng

arxiv: 2509.09027 · v2 · submitted 2025-09-10 · 🧮 math.OC · cs.SY· eess.SY

Regularization in Data-driven Predictive Control: A Convex Relaxation Perspective

Xu Shang , Yang Zheng This is my paper

Pith reviewed 2026-05-18 17:05 UTC · model grok-4.3

classification 🧮 math.OC cs.SYeess.SY

keywords data-driven predictive controlregularizationconvex relaxationbi-level optimizationsystem identificationcausality-based regularizerA-DDPC

0 comments

The pith

Regularized data-driven predictive control formulations are convex relaxations of bi-level optimization problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that several common regularization techniques in data-driven predictive control correspond to convex relaxations of a bi-level optimization setup. System identification is treated as the inner problem whose solution is needed for the outer predictive control task. This framing unifies direct and indirect data-driven methods by showing how penalties, projections, and causality constraints implicitly enforce identification. A reader would care because the view explains the strong practical performance of regularization and suggests a new iterative method that solves the identification subproblem more completely.

Core claim

Using a bi-level optimization framework, system identification is modeled as an inner problem and predictive control as an outer problem. Several regularized DDPC formulations, including l1-norm penalties, projection-based regularizers, and a newly introduced causality-based regularizer, are shown to be convex relaxations of their respective bi-level problems. This perspective clarifies the conceptual links between direct and indirect data-driven control and highlights how regularization implicitly enforces system identification. An optimality-based variant called A-DDPC is proposed that approximately solves the inner problem with all identification constraints via an iterative algorithm.

What carries the argument

The bi-level optimization framework that places system identification in the inner level and predictive control in the outer level, which permits regularizers to be read as convex relaxations enforcing identification constraints.

If this is right

Regularization implicitly enforces system identification constraints.
Direct and indirect data-driven control methods are linked through the shared relaxation view.
The A-DDPC variant reduces both bias and variance errors relative to standard regularized formulations.
Pre-processing the trajectory library with system identification techniques can improve performance in nonlinear cases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same relaxation lens might be used to derive new regularizers from other identification constraints not yet considered.
The framework could be tested on nonlinear systems where identification errors are larger to see whether the performance gains of A-DDPC persist.
Similar bi-level relaxations may apply to data-driven methods in adjacent fields such as reinforcement learning.

Load-bearing premise

The bi-level optimization structure with identification as the inner problem accurately represents the data-driven predictive control setting.

What would settle it

Solve the explicit bi-level problem on a given trajectory dataset and compare its outer-level optimum to the solution of a regularized DDPC formulation on the same data; large mismatch would falsify the relaxation interpretation.

Figures

Figures reproduced from arXiv: 2509.09027 by Xu Shang, Yang Zheng.

**Figure 2.** Figure 2: The influence of the penalty parameter λ in (15), where we have replaced y with 1/x. is the same as {(g, σy, u, y) | Dg = col(uini, yini + σy, u, y), h(g) = 0}, where D is a fixed trajectory matrix. We next move the constraint on g to the cost via regularization, leading to a direct DDPC min g,σy∈Γ, u∈U,y∈Y ∥y∥ 2 Q + ∥u∥ 2 R + λy∥σy∥ 2 2 + λw|h(g)| subject to Dg = col(uini, yini + σy, u, y). (16) Section 4… view at source ↗

**Figure 3.** Figure 3: Numerical illustration of Theorem 1. The optimal [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Comparison of realized control cost of different vari [PITH_FULL_IMAGE:figures/full_fig_p012_4.png] view at source ↗

**Figure 6.** Figure 6: Comparison of realized control cost of different [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗

read the original abstract

This paper explores the role of regularization in data-driven predictive control (DDPC) through the lens of convex relaxation. Using a bi-level optimization framework, we model system identification as an inner problem and predictive control as an outer problem. Within this framework, we show that several regularized DDPC formulations, including l1-norm penalties, projection-based regularizers, and a newly introduced causality-based regularizer, can be viewed as convex relaxations of their respective bi-level problems. This perspective clarifies the conceptual links between direct and indirect data-driven control and highlights how regularization implicitly enforces system identification. We further propose an optimality-based variant, A-DDPC, which approximately solves the inner problem with all identification constraints via an iterative algorithm. Numerical experiments demonstrate that A-DDPC outperforms existing regularized DDPC by reducing both bias and variance errors. These results indicate that further benefits may be obtained by applying system identification techniques to pre-process the trajectory library in nonlinear settings. Overall, our analysis contributes to a unified convex relaxation view of regularization in DDPC and sheds light on its strong empirical performance beyond linear time-invariant systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper frames regularization in DDPC as convex relaxations of a bi-level ID-plus-control problem, adds a causality regularizer, and gives an iterative A-DDPC that reports lower bias and variance.

read the letter

The main takeaway is that this work treats several regularized DDPC schemes as convex relaxations of a bi-level program with system identification inside and control outside. It introduces a causality-based regularizer and an optimality-based iterative algorithm called A-DDPC that tries to handle the inner constraints more explicitly. The bi-level lens also ties direct and indirect data-driven approaches together by showing how regularization implicitly enforces identification steps. That perspective is the clearest new angle here, and the experiments are presented as evidence that A-DDPC reduces both bias and variance relative to prior regularized versions. The causality regularizer itself looks like a reasonable addition that fits the framework without obvious contradictions. On the softer side, the relaxation claim depends on the inner identification problem lining up exactly with the implicit one solved by unregularized DDPC; if that match is only approximate, the argument becomes more of a useful reinterpretation than a strict convex-relaxation result. The numerical results are encouraging but rest on protocol details that are not fully spelled out in the abstract, so the size of the gains and the choice of baselines need checking. The suggestion that the same ideas extend to nonlinear cases is noted but left as a direction rather than demonstrated. This is useful reading for control researchers who already work with data-driven predictive methods and want a principled way to think about regularizer design. A reader focused on convex optimization or system identification will see the most direct value. The paper has enough new technical content and a coherent story to deserve a serious referee, even if the bi-level equivalence and experiment reporting need tightening in revision.

Referee Report

2 major / 2 minor

Summary. The paper develops a bi-level optimization framework in which system identification is cast as an inner problem and predictive control as an outer problem. It argues that several regularized data-driven predictive control (DDPC) schemes—l1-norm penalties, projection-based regularizers, and a newly proposed causality-based regularizer—arise as convex relaxations of the corresponding bi-level programs. An iterative algorithm (A-DDPC) is introduced to approximately enforce all identification constraints, and numerical experiments are reported to show that A-DDPC reduces both bias and variance relative to existing regularized DDPC methods.

Significance. A rigorous convex-relaxation interpretation would supply a principled link between direct and indirect data-driven control and could guide the systematic design of regularizers. The introduction of the causality regularizer and the A-DDPC procedure are constructive contributions; the reported numerical improvements, if reproducible, indicate practical value beyond linear time-invariant settings.

major comments (2)

[§3] §3: The central claim that standard regularized DDPC formulations are convex relaxations of bi-level problems requires an explicit demonstration that the inner system-identification optimization used in the bi-level construction coincides with the identification implicitly performed by the unregularized DDPC objective. Without this matching, the relaxation argument is not shown to recover the exact penalty terms appearing in the literature.
[§4.2] §4.2, Algorithm 1: The iterative A-DDPC is presented as an approximate solver of the inner identification problem, yet no convergence analysis or a priori error bound relating the iterates to the true bi-level solution is supplied; this gap directly affects the claim that A-DDPC more faithfully solves the underlying bi-level program.

minor comments (2)

[§2] The definition of the trajectory library and the precise statement of the unregularized DDPC objective should be restated in §2 to make the subsequent relaxation derivations self-contained.
[§5] Figure captions and axis labels in the numerical section would benefit from explicit mention of the regularization weights and the number of Monte-Carlo trials used to compute bias/variance statistics.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. We address each major comment below and describe the revisions we will make to the manuscript.

read point-by-point responses

Referee: [§3] §3: The central claim that standard regularized DDPC formulations are convex relaxations of bi-level problems requires an explicit demonstration that the inner system-identification optimization used in the bi-level construction coincides with the identification implicitly performed by the unregularized DDPC objective. Without this matching, the relaxation argument is not shown to recover the exact penalty terms appearing in the literature.

Authors: We agree that an explicit matching is required to make the relaxation argument fully rigorous. In the revised version we will insert a dedicated paragraph (or short subsection) in §3 that derives the precise equivalence between the inner identification program of the bi-level formulation and the unregularized DDPC objective. This derivation will recover the exact penalty terms used in the literature as the convex relaxations of the corresponding bi-level programs. revision: yes
Referee: [§4.2] §4.2, Algorithm 1: The iterative A-DDPC is presented as an approximate solver of the inner identification problem, yet no convergence analysis or a priori error bound relating the iterates to the true bi-level solution is supplied; this gap directly affects the claim that A-DDPC more faithfully solves the underlying bi-level program.

Authors: We acknowledge that the present manuscript provides no formal convergence guarantee or a priori error bound for the A-DDPC iterates. While the algorithm is introduced as an approximate procedure and its practical benefit is demonstrated numerically, we will add a brief discussion in §4.2 that reports the observed convergence behavior across the numerical examples and, where possible, a simple contraction argument based on the fixed-point structure of the iteration. This addition will clarify the extent to which A-DDPC approximates the bi-level solution. revision: yes

Circularity Check

0 steps flagged

Bi-level relaxation framework is independent modeling with no reduction to inputs by construction

full rationale

The paper introduces a bi-level optimization framework with system identification as the inner problem and predictive control as the outer problem. It then shows that existing regularized DDPC formulations (l1 penalties, projection regularizers, and a new causality regularizer) can be interpreted as convex relaxations within this framework. The A-DDPC variant is proposed as an iterative approximation to solve the full inner identification problem. This modeling choice is presented as a perspective that unifies direct and indirect data-driven control, and numerical experiments compare performance. No quoted step in the abstract or description reduces a prediction or central claim to a fitted parameter or self-citation by construction; the framework appears self-contained and externally interpretable via standard convex relaxation techniques.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The paper rests on standard domain assumptions from optimization and linear control theory; no new physical entities are postulated.

free parameters (1)

regularization weights
Weights for l1-norm and other penalties are introduced in the formulations and are presumably selected or tuned.

axioms (1)

domain assumption Data trajectories satisfy the conditions required for the bi-level optimization to represent the DDPC problem.
This premise is invoked when the regularized problems are framed as relaxations of the bi-level setup.

pith-pipeline@v0.9.0 · 5721 in / 1310 out tokens · 59585 ms · 2026-05-18T17:05:51.898177+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

bi-level DDPC (7) … inner identification (10) … projection-based norm … causality-based norm … l1-norm … Theorem 1 exact penalty
IndisputableMonolith/Foundation/BranchSelection.lean branch_selection unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

regularized DDPC formulations … convex relaxations of their respective bi-level problems

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

70 extracted references · 70 canonical work pages

[1]

Bridging direct and indirect data-driven control formu- lations via regularizations and relaxations.IEEE Trans- actions on Automatic Control , 68(2):883–897, 2022

Florian D¨ orfler, Jeremy Coulson, and Ivan Markovsky. Bridging direct and indirect data-driven control formu- lations via regularizations and relaxations.IEEE Trans- actions on Automatic Control , 68(2):883–897, 2022

work page 2022
[2]

Convex approximations for a bi-level formulation of data-enabled predictive con- trol

Xu Shang and Yang Zheng. Convex approximations for a bi-level formulation of data-enabled predictive con- trol. In 6th Annual Learning for Dynamics & Control Conference, pages 1071–1082. PMLR, 2024

work page 2024
[3]

Toward a theoretical foundation of policy optimization for learning control policies

Bin Hu, Kaiqing Zhang, Na Li, Mehran Mesbahi, Maryam Fazel, and Tamer Ba¸ sar. Toward a theoretical foundation of policy optimization for learning control policies. Annual Review of Control, Robotics, and Au- tonomous Systems, 6(1):123–158, 2023

work page 2023
[4]

Behavioral sys- tems theory in data-driven analysis, signal processing, and control

Ivan Markovsky and Florian D¨ orfler. Behavioral sys- tems theory in data-driven analysis, signal processing, and control. Annual Reviews in Control, 52:42–64, 2021

work page 2021
[5]

Policy optimization in control: Geometry and algorithmic implications

Shahriar Talebi, Yang Zheng, Spencer Kraisler, Na Li, and Mehran Mesbahi. Policy optimization in control: Geometry and algorithmic implications. arXiv preprint arXiv:2406.04243, 2024

work page arXiv 2024
[6]

Data-driven control: Part two of two: Hot take: Why not go with models? IEEE Control Systems Magazine, 43(6):27–31, 2023

Florian D¨ orfler. Data-driven control: Part two of two: Hot take: Why not go with models? IEEE Control Systems Magazine, 43(6):27–31, 2023

work page 2023
[7]

System identification

Lennart Ljung. System identification. In Signal analysis and prediction, pages 163–173. Springer, 1998

work page 1998
[8]

System identification: A machine learning perspective

Alessandro Chiuso and Gianluigi Pillonetto. System identification: A machine learning perspective. Annual Review of Control, Robotics, and Autonomous Systems , 2:281–304, 2019

work page 2019
[9]

Sample complexity of linear quadratic gaussian (lqg) control for output feedback systems

Yang Zheng, Luca Furieri, Maryam Kamgarpour, and Na Li. Sample complexity of linear quadratic gaussian (lqg) control for output feedback systems. In Learning for dynamics and control , pages 559–570. PMLR, 2021

work page 2021
[10]

Model predictive control

Basil Kouvaritakis and Mark Cannon. Model predictive control. Switzerland: Springer International Publishing , 38, 2016. 13

work page 2016
[11]

Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control.Automatica, 93:149–160, 2018

Milan Korda and Igor Mezi´ c. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control.Automatica, 93:149–160, 2018

work page 2018
[12]

Modeling nonlinear control systems via

Masih Haseli and Jorge Cort´ es. Modeling nonlinear control systems via koopman control family: Universal forms and subspace invariance proximity.arXiv preprint arXiv:2307.15368, 2023

work page arXiv 2023
[13]

Koopman operator in systems and control

Alexandre Mauroy, Y Susuki, and Igor Mezic. Koopman operator in systems and control . Springer, 2020

work page 2020
[14]

Willems’ fun- damental lemma for nonlinear systems with koopman linear embedding

Xu Shang, Jorge Cort´ es, and Yang Zheng. Willems’ fun- damental lemma for nonlinear systems with koopman linear embedding. arXiv preprint arXiv:2409.16389 , 2024

work page arXiv 2024
[15]

A note on persistency of excitation

Jan C Willems, Paolo Rapisarda, Ivan Markovsky, and Bart LM De Moor. A note on persistency of excitation. Systems & Control Letters , 54(4):325–329, 2005

work page 2005
[16]

Data-enabled predictive control: In the shallows of the deepc

Jeremy Coulson, John Lygeros, and Florian D¨ orfler. Data-enabled predictive control: In the shallows of the deepc. In 2019 18th European Control Conference (ECC), pages 307–312. IEEE, 2019

work page 2019
[17]

On the relationship be- tween data-enabled predictive control and subspace pre- dictive control

Felix Fiedler and Sergio Lucia. On the relationship be- tween data-enabled predictive control and subspace pre- dictive control. In 2021 European Control Conference (ECC), pages 222–229. IEEE, 2021

work page 2021
[18]

Data-driven model predictive con- trol with stability and robustness guarantees

Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. Data-driven model predictive con- trol with stability and robustness guarantees. IEEE Transactions on Automatic Control , 66(4):1702–1717, 2020

work page 2020
[19]

On the design of terminal in- gredients for data-driven mpc

Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. On the design of terminal in- gredients for data-driven mpc. IF AC-PapersOnLine, 54(6):257–263, 2021

work page 2021
[20]

Data-enabled predictive control for quadcopters

Ezzat Elokda, Jeremy Coulson, Paul N Beuchat, John Lygeros, and Florian D¨ orfler. Data-enabled predictive control for quadcopters. International Journal of Robust and Nonlinear Control , 31(18):8916–8936, 2021

work page 2021
[21]

DeeP-LCC: Data-enabled predictive leading cruise con- trol in mixed traffic flow.IEEE Transactions on Control Systems Technology, 31(6):2760–2776, 2023

Jiawei Wang, Yang Zheng, Keqiang Li, and Qing Xu. DeeP-LCC: Data-enabled predictive leading cruise con- trol in mixed traffic flow.IEEE Transactions on Control Systems Technology, 31(6):2760–2776, 2023

work page 2023
[22]

Adaptive robust data-driven build- ing control via bilevel reformulation: An experimental result

Yingzhao Lian, Jicheng Shi, Manuel Koch, and Colin Neil Jones. Adaptive robust data-driven build- ing control via bilevel reformulation: An experimental result. IEEE Transactions on Control Systems Technol- ogy, 2023

work page 2023
[23]

Imple- mentation and experimental validation of data-driven predictive control for dissipating stop-and-go waves in mixed traffic

Jiawei Wang, Yang Zheng, Jianghong Dong, Chaoyi Chen, Mengchi Cai, Keqiang Li, and Qing Xu. Imple- mentation and experimental validation of data-driven predictive control for dissipating stop-and-go waves in mixed traffic. IEEE Internet of Things Journal , 11(3):4570–4585, 2023

work page 2023
[24]

Decentral- ized robust data-driven predictive control for smooth- ing mixed traffic flow

Xu Shang, Jiawei Wang, and Yang Zheng. Decentral- ized robust data-driven predictive control for smooth- ing mixed traffic flow. IEEE Transactions on Intelligent Transportation Systems, 26(2):2075–2090, 2025

work page 2075
[25]

Linear tracking mpc for nonlinear systems—part ii: The data-driven case

Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. Linear tracking mpc for nonlinear systems—part ii: The data-driven case. IEEE Transac- tions on Automatic Control , 67(9):4406–4421, 2022

work page 2022
[26]

Dimension reduction for efficient data-enabled predictive control

Kaixiang Zhang, Yang Zheng, Chao Shang, and Zhao- jian Li. Dimension reduction for efficient data-enabled predictive control. IEEE Control Systems Letters , 7:3277–3282, 2023

work page 2023
[27]

Deep hankel matrices with random elements

Nathan Lawrence, Philip Loewen, Shuyuan Wang, Michael Forbes, and Bhushan Gopaluni. Deep hankel matrices with random elements. In6th Annual Learning for Dynamics & Control Conference , pages 1579–1591. PMLR, 2024

work page 2024
[28]

Data-driven predictive control in a stochastic setting: A unified framework

Valentina Breschi, Alessandro Chiuso, and Simone For- mentin. Data-driven predictive control in a stochastic setting: A unified framework. Automatica, 152:110961, 2023

work page 2023
[29]

Quadratic regularization of data-enabled pre- dictive control: Theory and application to power con- verter experiments

Linbin Huang, Jianzhe Zhen, John Lygeros, and Florian D¨ orfler. Quadratic regularization of data-enabled pre- dictive control: Theory and application to power con- verter experiments. IF AC-PapersOnLine, 54(7):192– 197, 2021

work page 2021
[30]

Decentralized data-enabled predictive con- trol for power system oscillation damping

Linbin Huang, Jeremy Coulson, John Lygeros, and Flo- rian D¨ orfler. Decentralized data-enabled predictive con- trol for power system oscillation damping. IEEE Trans- actions on Control Systems Technology , 30(3):1065– 1077, 2021

work page 2021
[31]

Causality-informed data-driven pre- dictive control

Malika Sader, Yibo Wang, Dexian Huang, Chao Shang, and Biao Huang. Causality-informed data-driven pre- dictive control. arXiv preprint arXiv:2311.09545 , 2023

work page arXiv 2023
[32]

Max- imum likelihood estimation in data-driven modeling and control

Mingzhou Yin, Andrea Iannelli, and Roy S Smith. Max- imum likelihood estimation in data-driven modeling and control. IEEE Transactions on Automatic Control , 68(1):317–328, 2021

work page 2021
[33]

Implicit predictors in regularized data-driven predictive control

Manuel Kl¨ adtke and Moritz Schulze Darup. Implicit predictors in regularized data-driven predictive control. IEEE Control Systems Letters , 7:2479–2484, 2023

work page 2023
[34]

arXiv preprint arXiv:2312.14788 , year=

Alessandro Chiuso, Marco Fabris, Valentina Breschi, and Simone Formentin. Harnessing the final control error for optimal data-driven predictive control. arXiv preprint arXiv:2312.14788, 2023

work page arXiv 2023
[35]

Spc: Subspace predictive control.IF AC Proceedings Vol- umes, 32(2):4004–4009, 1999

Wouter Favoreel, Bart De Moor, and Michel Gevers. Spc: Subspace predictive control.IF AC Proceedings Vol- umes, 32(2):4004–4009, 1999

work page 1999
[36]

A missing data approach to data- driven filtering and control

Ivan Markovsky. A missing data approach to data- driven filtering and control. IEEE Transactions on Au- tomatic Control, 62(4):1972–1978, 2016

work page 1972
[37]

A novel subspace identification approach with enforced causal models

S Joe Qin, Weilu Lin, and Lennart Ljung. A novel subspace identification approach with enforced causal models. Automatica, 41(12):2043–2053, 2005

work page 2043
[38]

Model predictive control: theory, computation, and design, volume 2

James Blake Rawlings, David Q Mayne, Moritz Diehl, et al. Model predictive control: theory, computation, and design, volume 2. Nob Hill Publishing Madison, WI, 2017

work page 2017
[39]

System identi- fication—a survey

Karl Johan ˚Astr¨ om and Peter Eykhoff. System identi- fication—a survey. Automatica, 7(2):123–162, 1971

work page 1971
[40]

An overview of systems-theoretic guarantees in data-driven model pre- dictive control

Julian Berberich and Frank Allg¨ ower. An overview of systems-theoretic guarantees in data-driven model pre- dictive control. Annual Review of Control, Robotics, and Autonomous Systems , 8, 2024

work page 2024
[41]

Data-driven control based on the behavioral approach: From theory to applications in power systems

Ivan Markovsky, Linbin Huang, and Florian D¨ orfler. Data-driven control based on the behavioral approach: From theory to applications in power systems. IEEE Control Systems Magazine , 43(5):28–68, 2023

work page 2023
[42]

On the equivalence of direct and in- direct data-driven predictive control approaches

Per Mattsson, Fabio Bonassi, Valentina Breschi, and Thomas B Sch¨ on. On the equivalence of direct and in- direct data-driven predictive control approaches. IEEE Control Systems Letters , 2024

work page 2024
[43]

Optimization and nonsmooth analysis

Frank H Clarke. Optimization and nonsmooth analysis . SIAM, 1990

work page 1990
[44]

Exact penaliza- tion and necessary optimality conditions for generalized 14 bilevel programming problems

JJ Ye, DL Zhu, and Qiji Jim Zhu. Exact penaliza- tion and necessary optimality conditions for generalized 14 bilevel programming problems. SIAM Journal on opti- mization, 7(2):481–507, 1997

work page 1997
[45]

Nonlinear optimization

Andrzej Ruszczynski. Nonlinear optimization. Prince- ton university press, 2011

work page 2011
[46]

Structured low-rank approximation and its applications

Ivan Markovsky. Structured low-rank approximation and its applications. Automatica, 44(4):891–909, 2008

work page 2008
[47]

On low-rank hankel matrix denoising

Mingzhou Yin and Roy S Smith. On low-rank hankel matrix denoising. IF AC-PapersOnLine, 54(7):198–203, 2021

work page 2021
[48]

Alternating projections on nontangential manifolds

Fredrik Andersson and Marcus Carlsson. Alternating projections on nontangential manifolds. Constructive approximation, 38(3):489–525, 2013

work page 2013
[49]

N4sid: Subspace algorithms for the identification of com- bined deterministic-stochastic systems

Peter Van Overschee and Bart De Moor. N4sid: Subspace algorithms for the identification of com- bined deterministic-stochastic systems. Automatica, 30(1):75–93, 1994. A Exact penalty for quadratic optimization The proof for Theorem 1 is divided into two parts: (1) We transform (17) and (18) to two new optimiza- tion problems, where the optimal solutions x...

work page 1994
[50]

for (A.1) such that x∗ 1 is an interior point ofX , there existsλ∗ w > 0, such that (A.1) and (A.2) have same optimal solutions for all λw > λ ∗ w. Proof. The key idea is to prove that the optimal solution (x∗ 1, ¯x∗ 2, x∗

work page
[51]

viola- tion

of (A.1) is a local minimum of (A.2) via the usage of (A.3). This optimal solution also becomes the global minimum since (A.2) is a convex problem. This is inspired by the notion of partial calmness [44]. (1) Problem (A.3) permits a certain degree of “viola- tion” of the constraint ∥¯x2∥p = 0. We first show that, in a local neighborhood of ( x∗ 1, ¯x∗ 2, ...

work page
[52]

(2) Next, we establish that there exists a local region around (x∗ 1, ¯x∗ 2, x∗

is a local optimal solution of (A.3). (2) Next, we establish that there exists a local region around (x∗ 1, ¯x∗ 2, x∗

work page
[53]

Combining this fact with the result of Step 1, we conclude that (x∗ 1, ¯x∗ 2, x∗

in which every feasible solution of (A.2) is also feasible for (A.3). Combining this fact with the result of Step 1, we conclude that (x∗ 1, ¯x∗ 2, x∗

work page
[54]

It is thus globally optimal due to the convexity of (A.2) (3) Finally, we demonstrate that all optimal solutions of (A.2) are also optimal solutions of (A.1)

is a local optimal solution of (A.2). It is thus globally optimal due to the convexity of (A.2) (3) Finally, we demonstrate that all optimal solutions of (A.2) are also optimal solutions of (A.1). Let A3 := [A1, A2D⊥] and P2 := col(I, 0). As x∗ 1 ∈ X ◦, there exists δr > 0 such that Bδr(x∗

work page
[55]

Let δr σr+σp > 15 δ > 0 where σr is the induced p-norm of P2A† 3A2D† and σp is the constant such that∥v∥p ≤ σp∥v∥2 for any vector v with the same dimension of x1

⊆ X . Let δr σr+σp > 15 δ > 0 where σr is the induced p-norm of P2A† 3A2D† and σp is the constant such that∥v∥p ≤ σp∥v∥2 for any vector v with the same dimension of x1. For any ϵ ∈ [0, δ), let (x1, ¯x2, x3) ∈ B δ(x∗ 1, ¯x∗ 2, x∗

work page
[56]

We first show that there exists µ, such that xT 1 M x1 + µϵ ≥ x∗T 1 M x∗

be a feasible solution for (A.3). We first show that there exists µ, such that xT 1 M x1 + µϵ ≥ x∗T 1 M x∗

work page
[57]

A feasible solution (˜x1, ¯x∗ 2, ˜x3) for (A.1) can be constructed as col(˜x1, ˜x3) = A† 3(b − A2D†¯x∗

(A.4) We can represent col(x1, x3) as col(x1, x3) = A† 3(b − A2D†¯x2) + v1, where v1 ∈ Null(A3). A feasible solution (˜x1, ¯x∗ 2, ˜x3) for (A.1) can be constructed as col(˜x1, ˜x3) = A† 3(b − A2D†¯x∗

work page
[58]

We then demonstrate ˜x1 ∈ X via deriving the difference between x1 and ˜x1

+ v1. We then demonstrate ˜x1 ∈ X via deriving the difference between x1 and ˜x1. We have ∥x1 − ˜x1∥p = ∥P2A† 3A2D†¯x2∥p ≤ σrϵ ≤ σrδ. We then obtain ∥x∗ 1 − ˜x1∥p ≤ ∥x∗ 1 − x1∥p + ∥x1 − ˜x1∥p ≤ (σr + σp)δ ≤ δr, which implies ˜x1 ∈ B δr(x∗) ⊆ X . From the optimality condition of (A.1), we have xT 1 M x1 − x∗TM x∗ ≥xT 1 M x1 − ˜xT 1 M ˜x1 ≥ − L∥x1 − ˜x1∥p ≥...

work page
[59]

We then prove that the optimal solution ( x∗ 1, ¯x∗ 2, x∗ 3) of (A.1) is a local minima of (A.2)

Thus, we can let µ ≥ Lσr so that the inequality (A.4) is satisfied. We then prove that the optimal solution ( x∗ 1, ¯x∗ 2, x∗ 3) of (A.1) is a local minima of (A.2). We can let λw > λ∗ w ≥ Lσ. Since f(x) := ∥x∥p is a continuous func- tion, there exists δx > 0 such that ∥¯x2∥p < δ for any ¯x2 ∈ B δx(¯x∗ 2). Let δ∗ := min(δx, δ). For any feasible so- lution...

work page
[60]

Thus, from the inequality (A.4), we have for any (x1, ¯x2, x3) ∈ Bδ∗(x∗ 1, ¯x∗ 2, x∗ 3), xT 1 M x1 + λw∥¯x2∥p ≥ x∗T 1 M x∗ 1, which means ( x∗ 1, ¯x∗ 2, x∗

of (A.2), there ex- ists ϵ ∈ [0, δ) such that it is feasible to (A.3). Thus, from the inequality (A.4), we have for any (x1, ¯x2, x3) ∈ Bδ∗(x∗ 1, ¯x∗ 2, x∗ 3), xT 1 M x1 + λw∥¯x2∥p ≥ x∗T 1 M x∗ 1, which means ( x∗ 1, ¯x∗ 2, x∗

work page
[61]

It is now clear that all optimal solutions of (A.1) are optimal to (A.2)

is a local minima (as well as the global minima) of (A.2). It is now clear that all optimal solutions of (A.1) are optimal to (A.2). Next, suppose ( x∗ 1w, ¯x∗ 2w, x∗ 3w) is an optimal solution of (A.2) and recall that λw > λ ∗ w ≥ Lσr. We have x∗T 1wM x∗ 1w+λw∥¯x∗ 2w∥p = x∗T 1 M x∗ 1 ≤ x∗T 1wM x∗ 1w+λ∗ w∥¯x∗ 2w∥p, which leads to (µ − ¯µ)∥¯x∗ 2w∥2 ≤ 0 and...

work page
[62]

We combine Propositions 6 and 7 to establish Theorem 1

Thus, (x∗ 1w, ¯x∗ 2w, x∗ 3w) is also feasible for (A.1) and is its optimal solution. We combine Propositions 6 and 7 to establish Theorem 1. Proof of Theorem 1: From Proposition 6, the optimal solution x∗ 1 to (17) and (A.1) is the same. Furthermore, the assumption that (x∗ 1, x∗

work page
[63]

is an optimal solution with x∗ 1 ∈ X ◦ for (17) implies there exists an optimal solution (x∗ 1, ¯x∗ 2, x∗

work page
[64]

Thus, using Proposi- tion 7, there existsλ∗ w such that the optimal solution sets x∗ 1 for (A.1) and (A.2) are equivalent for all λw > λ ∗ w

and x∗ 1 ∈ X ◦ for (A.1). Thus, using Proposi- tion 7, there existsλ∗ w such that the optimal solution sets x∗ 1 for (A.1) and (A.2) are equivalent for all λw > λ ∗ w. That means those of (17) and (A.2) are the same with λ∗ w. Since we also have the optimal solutions of x∗ 1 are the same for (18) and (A.2), this illustrates (17) and (18) have the same opt...

work page
[65]

into the objective function of (18) we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1, which means ( x∗ 1, x∗

work page
[66]

Then, suppose ( x∗ 1, x∗

is an optimal solution for (18). Then, suppose ( x∗ 1, x∗

work page
[67]

Since x∗ 1 is also an optimal solution for (17) and their optimal values are the same, we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1 ⇒ ∥ Dx∗ 2∥p = 0

is an optimal solution for (18). Since x∗ 1 is also an optimal solution for (17) and their optimal values are the same, we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1 ⇒ ∥ Dx∗ 2∥p = 0. Thus, (x∗ 1, x∗

work page
[68]

L11 0 L21 L22 #

is an optimal solution for (17). This com- pletes the proof. B Technical proofs B.1 Relation with the classcial SPC The classical SPC is of the following form min σy∈Γ,u∈U ,y∈Y ∥y∥2 Q + ∥u∥2 R + λy∥σy∥2 2 subject to y = YF   UP Yp UF   †  uini yini + σy u   , (B.1) which does not have the variableg. We can establish the following equivalen...

work page
[69]

Let x∗, g∗ be the primal optimal solution of (32) and µ∗ 1, µ∗ 2 be the dual optimal solutions for the inequality constraint and equality constraint of (32) respectively. Since (32) satisfies Slater’s condition, x∗, g∗, µ∗ 1 and µ∗ 2 satisfy the KKT condition and have the follow properties 0 ∈ ∂(f(x) + µ∗ 1(∥g∥1 − αc) + µ∗T 2 (Hg − v∗)) at (x, g) = (x∗, g...

work page
[70]

Thus, ( σ∗ y, u∗, y∗, g∗) ( i.e., (x∗, g∗)) is also an optimal solution for (33) as the KKT condition is sufficient to guarantee optimality

Then, the KKT condition (B.8) holds as it becomes equivalent to (B.7). Thus, ( σ∗ y, u∗, y∗, g∗) ( i.e., (x∗, g∗)) is also an optimal solution for (33) as the KKT condition is sufficient to guarantee optimality. B.6 Proof of Proposition 5 The key idea for the proof of the Proposition 5 is that we can first partition both Hy and ˜Hy into the part in the ro...

work page

[1] [1]

Bridging direct and indirect data-driven control formu- lations via regularizations and relaxations.IEEE Trans- actions on Automatic Control , 68(2):883–897, 2022

Florian D¨ orfler, Jeremy Coulson, and Ivan Markovsky. Bridging direct and indirect data-driven control formu- lations via regularizations and relaxations.IEEE Trans- actions on Automatic Control , 68(2):883–897, 2022

work page 2022

[2] [2]

Convex approximations for a bi-level formulation of data-enabled predictive con- trol

Xu Shang and Yang Zheng. Convex approximations for a bi-level formulation of data-enabled predictive con- trol. In 6th Annual Learning for Dynamics & Control Conference, pages 1071–1082. PMLR, 2024

work page 2024

[3] [3]

Toward a theoretical foundation of policy optimization for learning control policies

Bin Hu, Kaiqing Zhang, Na Li, Mehran Mesbahi, Maryam Fazel, and Tamer Ba¸ sar. Toward a theoretical foundation of policy optimization for learning control policies. Annual Review of Control, Robotics, and Au- tonomous Systems, 6(1):123–158, 2023

work page 2023

[4] [4]

Behavioral sys- tems theory in data-driven analysis, signal processing, and control

Ivan Markovsky and Florian D¨ orfler. Behavioral sys- tems theory in data-driven analysis, signal processing, and control. Annual Reviews in Control, 52:42–64, 2021

work page 2021

[5] [5]

Policy optimization in control: Geometry and algorithmic implications

Shahriar Talebi, Yang Zheng, Spencer Kraisler, Na Li, and Mehran Mesbahi. Policy optimization in control: Geometry and algorithmic implications. arXiv preprint arXiv:2406.04243, 2024

work page arXiv 2024

[6] [6]

Data-driven control: Part two of two: Hot take: Why not go with models? IEEE Control Systems Magazine, 43(6):27–31, 2023

Florian D¨ orfler. Data-driven control: Part two of two: Hot take: Why not go with models? IEEE Control Systems Magazine, 43(6):27–31, 2023

work page 2023

[7] [7]

System identification

Lennart Ljung. System identification. In Signal analysis and prediction, pages 163–173. Springer, 1998

work page 1998

[8] [8]

System identification: A machine learning perspective

Alessandro Chiuso and Gianluigi Pillonetto. System identification: A machine learning perspective. Annual Review of Control, Robotics, and Autonomous Systems , 2:281–304, 2019

work page 2019

[9] [9]

Sample complexity of linear quadratic gaussian (lqg) control for output feedback systems

Yang Zheng, Luca Furieri, Maryam Kamgarpour, and Na Li. Sample complexity of linear quadratic gaussian (lqg) control for output feedback systems. In Learning for dynamics and control , pages 559–570. PMLR, 2021

work page 2021

[10] [10]

Model predictive control

Basil Kouvaritakis and Mark Cannon. Model predictive control. Switzerland: Springer International Publishing , 38, 2016. 13

work page 2016

[11] [11]

Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control.Automatica, 93:149–160, 2018

Milan Korda and Igor Mezi´ c. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control.Automatica, 93:149–160, 2018

work page 2018

[12] [12]

Modeling nonlinear control systems via

Masih Haseli and Jorge Cort´ es. Modeling nonlinear control systems via koopman control family: Universal forms and subspace invariance proximity.arXiv preprint arXiv:2307.15368, 2023

work page arXiv 2023

[13] [13]

Koopman operator in systems and control

Alexandre Mauroy, Y Susuki, and Igor Mezic. Koopman operator in systems and control . Springer, 2020

work page 2020

[14] [14]

Willems’ fun- damental lemma for nonlinear systems with koopman linear embedding

Xu Shang, Jorge Cort´ es, and Yang Zheng. Willems’ fun- damental lemma for nonlinear systems with koopman linear embedding. arXiv preprint arXiv:2409.16389 , 2024

work page arXiv 2024

[15] [15]

A note on persistency of excitation

Jan C Willems, Paolo Rapisarda, Ivan Markovsky, and Bart LM De Moor. A note on persistency of excitation. Systems & Control Letters , 54(4):325–329, 2005

work page 2005

[16] [16]

Data-enabled predictive control: In the shallows of the deepc

Jeremy Coulson, John Lygeros, and Florian D¨ orfler. Data-enabled predictive control: In the shallows of the deepc. In 2019 18th European Control Conference (ECC), pages 307–312. IEEE, 2019

work page 2019

[17] [17]

On the relationship be- tween data-enabled predictive control and subspace pre- dictive control

Felix Fiedler and Sergio Lucia. On the relationship be- tween data-enabled predictive control and subspace pre- dictive control. In 2021 European Control Conference (ECC), pages 222–229. IEEE, 2021

work page 2021

[18] [18]

Data-driven model predictive con- trol with stability and robustness guarantees

Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. Data-driven model predictive con- trol with stability and robustness guarantees. IEEE Transactions on Automatic Control , 66(4):1702–1717, 2020

work page 2020

[19] [19]

On the design of terminal in- gredients for data-driven mpc

Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. On the design of terminal in- gredients for data-driven mpc. IF AC-PapersOnLine, 54(6):257–263, 2021

work page 2021

[20] [20]

Data-enabled predictive control for quadcopters

Ezzat Elokda, Jeremy Coulson, Paul N Beuchat, John Lygeros, and Florian D¨ orfler. Data-enabled predictive control for quadcopters. International Journal of Robust and Nonlinear Control , 31(18):8916–8936, 2021

work page 2021

[21] [21]

DeeP-LCC: Data-enabled predictive leading cruise con- trol in mixed traffic flow.IEEE Transactions on Control Systems Technology, 31(6):2760–2776, 2023

Jiawei Wang, Yang Zheng, Keqiang Li, and Qing Xu. DeeP-LCC: Data-enabled predictive leading cruise con- trol in mixed traffic flow.IEEE Transactions on Control Systems Technology, 31(6):2760–2776, 2023

work page 2023

[22] [22]

Adaptive robust data-driven build- ing control via bilevel reformulation: An experimental result

Yingzhao Lian, Jicheng Shi, Manuel Koch, and Colin Neil Jones. Adaptive robust data-driven build- ing control via bilevel reformulation: An experimental result. IEEE Transactions on Control Systems Technol- ogy, 2023

work page 2023

[23] [23]

Imple- mentation and experimental validation of data-driven predictive control for dissipating stop-and-go waves in mixed traffic

Jiawei Wang, Yang Zheng, Jianghong Dong, Chaoyi Chen, Mengchi Cai, Keqiang Li, and Qing Xu. Imple- mentation and experimental validation of data-driven predictive control for dissipating stop-and-go waves in mixed traffic. IEEE Internet of Things Journal , 11(3):4570–4585, 2023

work page 2023

[24] [24]

Decentral- ized robust data-driven predictive control for smooth- ing mixed traffic flow

Xu Shang, Jiawei Wang, and Yang Zheng. Decentral- ized robust data-driven predictive control for smooth- ing mixed traffic flow. IEEE Transactions on Intelligent Transportation Systems, 26(2):2075–2090, 2025

work page 2075

[25] [25]

Linear tracking mpc for nonlinear systems—part ii: The data-driven case

Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. Linear tracking mpc for nonlinear systems—part ii: The data-driven case. IEEE Transac- tions on Automatic Control , 67(9):4406–4421, 2022

work page 2022

[26] [26]

Dimension reduction for efficient data-enabled predictive control

Kaixiang Zhang, Yang Zheng, Chao Shang, and Zhao- jian Li. Dimension reduction for efficient data-enabled predictive control. IEEE Control Systems Letters , 7:3277–3282, 2023

work page 2023

[27] [27]

Deep hankel matrices with random elements

Nathan Lawrence, Philip Loewen, Shuyuan Wang, Michael Forbes, and Bhushan Gopaluni. Deep hankel matrices with random elements. In6th Annual Learning for Dynamics & Control Conference , pages 1579–1591. PMLR, 2024

work page 2024

[28] [28]

Data-driven predictive control in a stochastic setting: A unified framework

Valentina Breschi, Alessandro Chiuso, and Simone For- mentin. Data-driven predictive control in a stochastic setting: A unified framework. Automatica, 152:110961, 2023

work page 2023

[29] [29]

Quadratic regularization of data-enabled pre- dictive control: Theory and application to power con- verter experiments

Linbin Huang, Jianzhe Zhen, John Lygeros, and Florian D¨ orfler. Quadratic regularization of data-enabled pre- dictive control: Theory and application to power con- verter experiments. IF AC-PapersOnLine, 54(7):192– 197, 2021

work page 2021

[30] [30]

Decentralized data-enabled predictive con- trol for power system oscillation damping

Linbin Huang, Jeremy Coulson, John Lygeros, and Flo- rian D¨ orfler. Decentralized data-enabled predictive con- trol for power system oscillation damping. IEEE Trans- actions on Control Systems Technology , 30(3):1065– 1077, 2021

work page 2021

[31] [31]

Causality-informed data-driven pre- dictive control

Malika Sader, Yibo Wang, Dexian Huang, Chao Shang, and Biao Huang. Causality-informed data-driven pre- dictive control. arXiv preprint arXiv:2311.09545 , 2023

work page arXiv 2023

[32] [32]

Max- imum likelihood estimation in data-driven modeling and control

Mingzhou Yin, Andrea Iannelli, and Roy S Smith. Max- imum likelihood estimation in data-driven modeling and control. IEEE Transactions on Automatic Control , 68(1):317–328, 2021

work page 2021

[33] [33]

Implicit predictors in regularized data-driven predictive control

Manuel Kl¨ adtke and Moritz Schulze Darup. Implicit predictors in regularized data-driven predictive control. IEEE Control Systems Letters , 7:2479–2484, 2023

work page 2023

[34] [34]

arXiv preprint arXiv:2312.14788 , year=

Alessandro Chiuso, Marco Fabris, Valentina Breschi, and Simone Formentin. Harnessing the final control error for optimal data-driven predictive control. arXiv preprint arXiv:2312.14788, 2023

work page arXiv 2023

[35] [35]

Spc: Subspace predictive control.IF AC Proceedings Vol- umes, 32(2):4004–4009, 1999

Wouter Favoreel, Bart De Moor, and Michel Gevers. Spc: Subspace predictive control.IF AC Proceedings Vol- umes, 32(2):4004–4009, 1999

work page 1999

[36] [36]

A missing data approach to data- driven filtering and control

Ivan Markovsky. A missing data approach to data- driven filtering and control. IEEE Transactions on Au- tomatic Control, 62(4):1972–1978, 2016

work page 1972

[37] [37]

A novel subspace identification approach with enforced causal models

S Joe Qin, Weilu Lin, and Lennart Ljung. A novel subspace identification approach with enforced causal models. Automatica, 41(12):2043–2053, 2005

work page 2043

[38] [38]

Model predictive control: theory, computation, and design, volume 2

James Blake Rawlings, David Q Mayne, Moritz Diehl, et al. Model predictive control: theory, computation, and design, volume 2. Nob Hill Publishing Madison, WI, 2017

work page 2017

[39] [39]

System identi- fication—a survey

Karl Johan ˚Astr¨ om and Peter Eykhoff. System identi- fication—a survey. Automatica, 7(2):123–162, 1971

work page 1971

[40] [40]

An overview of systems-theoretic guarantees in data-driven model pre- dictive control

Julian Berberich and Frank Allg¨ ower. An overview of systems-theoretic guarantees in data-driven model pre- dictive control. Annual Review of Control, Robotics, and Autonomous Systems , 8, 2024

work page 2024

[41] [41]

Data-driven control based on the behavioral approach: From theory to applications in power systems

Ivan Markovsky, Linbin Huang, and Florian D¨ orfler. Data-driven control based on the behavioral approach: From theory to applications in power systems. IEEE Control Systems Magazine , 43(5):28–68, 2023

work page 2023

[42] [42]

On the equivalence of direct and in- direct data-driven predictive control approaches

Per Mattsson, Fabio Bonassi, Valentina Breschi, and Thomas B Sch¨ on. On the equivalence of direct and in- direct data-driven predictive control approaches. IEEE Control Systems Letters , 2024

work page 2024

[43] [43]

Optimization and nonsmooth analysis

Frank H Clarke. Optimization and nonsmooth analysis . SIAM, 1990

work page 1990

[44] [44]

Exact penaliza- tion and necessary optimality conditions for generalized 14 bilevel programming problems

JJ Ye, DL Zhu, and Qiji Jim Zhu. Exact penaliza- tion and necessary optimality conditions for generalized 14 bilevel programming problems. SIAM Journal on opti- mization, 7(2):481–507, 1997

work page 1997

[45] [45]

Nonlinear optimization

Andrzej Ruszczynski. Nonlinear optimization. Prince- ton university press, 2011

work page 2011

[46] [46]

Structured low-rank approximation and its applications

Ivan Markovsky. Structured low-rank approximation and its applications. Automatica, 44(4):891–909, 2008

work page 2008

[47] [47]

On low-rank hankel matrix denoising

Mingzhou Yin and Roy S Smith. On low-rank hankel matrix denoising. IF AC-PapersOnLine, 54(7):198–203, 2021

work page 2021

[48] [48]

Alternating projections on nontangential manifolds

Fredrik Andersson and Marcus Carlsson. Alternating projections on nontangential manifolds. Constructive approximation, 38(3):489–525, 2013

work page 2013

[49] [49]

N4sid: Subspace algorithms for the identification of com- bined deterministic-stochastic systems

Peter Van Overschee and Bart De Moor. N4sid: Subspace algorithms for the identification of com- bined deterministic-stochastic systems. Automatica, 30(1):75–93, 1994. A Exact penalty for quadratic optimization The proof for Theorem 1 is divided into two parts: (1) We transform (17) and (18) to two new optimiza- tion problems, where the optimal solutions x...

work page 1994

[50] [50]

for (A.1) such that x∗ 1 is an interior point ofX , there existsλ∗ w > 0, such that (A.1) and (A.2) have same optimal solutions for all λw > λ ∗ w. Proof. The key idea is to prove that the optimal solution (x∗ 1, ¯x∗ 2, x∗

work page

[51] [51]

viola- tion

of (A.1) is a local minimum of (A.2) via the usage of (A.3). This optimal solution also becomes the global minimum since (A.2) is a convex problem. This is inspired by the notion of partial calmness [44]. (1) Problem (A.3) permits a certain degree of “viola- tion” of the constraint ∥¯x2∥p = 0. We first show that, in a local neighborhood of ( x∗ 1, ¯x∗ 2, ...

work page

[52] [52]

(2) Next, we establish that there exists a local region around (x∗ 1, ¯x∗ 2, x∗

is a local optimal solution of (A.3). (2) Next, we establish that there exists a local region around (x∗ 1, ¯x∗ 2, x∗

work page

[53] [53]

Combining this fact with the result of Step 1, we conclude that (x∗ 1, ¯x∗ 2, x∗

in which every feasible solution of (A.2) is also feasible for (A.3). Combining this fact with the result of Step 1, we conclude that (x∗ 1, ¯x∗ 2, x∗

work page

[54] [54]

It is thus globally optimal due to the convexity of (A.2) (3) Finally, we demonstrate that all optimal solutions of (A.2) are also optimal solutions of (A.1)

is a local optimal solution of (A.2). It is thus globally optimal due to the convexity of (A.2) (3) Finally, we demonstrate that all optimal solutions of (A.2) are also optimal solutions of (A.1). Let A3 := [A1, A2D⊥] and P2 := col(I, 0). As x∗ 1 ∈ X ◦, there exists δr > 0 such that Bδr(x∗

work page

[55] [55]

Let δr σr+σp > 15 δ > 0 where σr is the induced p-norm of P2A† 3A2D† and σp is the constant such that∥v∥p ≤ σp∥v∥2 for any vector v with the same dimension of x1

⊆ X . Let δr σr+σp > 15 δ > 0 where σr is the induced p-norm of P2A† 3A2D† and σp is the constant such that∥v∥p ≤ σp∥v∥2 for any vector v with the same dimension of x1. For any ϵ ∈ [0, δ), let (x1, ¯x2, x3) ∈ B δ(x∗ 1, ¯x∗ 2, x∗

work page

[56] [56]

We first show that there exists µ, such that xT 1 M x1 + µϵ ≥ x∗T 1 M x∗

be a feasible solution for (A.3). We first show that there exists µ, such that xT 1 M x1 + µϵ ≥ x∗T 1 M x∗

work page

[57] [57]

A feasible solution (˜x1, ¯x∗ 2, ˜x3) for (A.1) can be constructed as col(˜x1, ˜x3) = A† 3(b − A2D†¯x∗

(A.4) We can represent col(x1, x3) as col(x1, x3) = A† 3(b − A2D†¯x2) + v1, where v1 ∈ Null(A3). A feasible solution (˜x1, ¯x∗ 2, ˜x3) for (A.1) can be constructed as col(˜x1, ˜x3) = A† 3(b − A2D†¯x∗

work page

[58] [58]

We then demonstrate ˜x1 ∈ X via deriving the difference between x1 and ˜x1

+ v1. We then demonstrate ˜x1 ∈ X via deriving the difference between x1 and ˜x1. We have ∥x1 − ˜x1∥p = ∥P2A† 3A2D†¯x2∥p ≤ σrϵ ≤ σrδ. We then obtain ∥x∗ 1 − ˜x1∥p ≤ ∥x∗ 1 − x1∥p + ∥x1 − ˜x1∥p ≤ (σr + σp)δ ≤ δr, which implies ˜x1 ∈ B δr(x∗) ⊆ X . From the optimality condition of (A.1), we have xT 1 M x1 − x∗TM x∗ ≥xT 1 M x1 − ˜xT 1 M ˜x1 ≥ − L∥x1 − ˜x1∥p ≥...

work page

[59] [59]

We then prove that the optimal solution ( x∗ 1, ¯x∗ 2, x∗ 3) of (A.1) is a local minima of (A.2)

Thus, we can let µ ≥ Lσr so that the inequality (A.4) is satisfied. We then prove that the optimal solution ( x∗ 1, ¯x∗ 2, x∗ 3) of (A.1) is a local minima of (A.2). We can let λw > λ∗ w ≥ Lσ. Since f(x) := ∥x∥p is a continuous func- tion, there exists δx > 0 such that ∥¯x2∥p < δ for any ¯x2 ∈ B δx(¯x∗ 2). Let δ∗ := min(δx, δ). For any feasible so- lution...

work page

[60] [60]

Thus, from the inequality (A.4), we have for any (x1, ¯x2, x3) ∈ Bδ∗(x∗ 1, ¯x∗ 2, x∗ 3), xT 1 M x1 + λw∥¯x2∥p ≥ x∗T 1 M x∗ 1, which means ( x∗ 1, ¯x∗ 2, x∗

of (A.2), there ex- ists ϵ ∈ [0, δ) such that it is feasible to (A.3). Thus, from the inequality (A.4), we have for any (x1, ¯x2, x3) ∈ Bδ∗(x∗ 1, ¯x∗ 2, x∗ 3), xT 1 M x1 + λw∥¯x2∥p ≥ x∗T 1 M x∗ 1, which means ( x∗ 1, ¯x∗ 2, x∗

work page

[61] [61]

It is now clear that all optimal solutions of (A.1) are optimal to (A.2)

is a local minima (as well as the global minima) of (A.2). It is now clear that all optimal solutions of (A.1) are optimal to (A.2). Next, suppose ( x∗ 1w, ¯x∗ 2w, x∗ 3w) is an optimal solution of (A.2) and recall that λw > λ ∗ w ≥ Lσr. We have x∗T 1wM x∗ 1w+λw∥¯x∗ 2w∥p = x∗T 1 M x∗ 1 ≤ x∗T 1wM x∗ 1w+λ∗ w∥¯x∗ 2w∥p, which leads to (µ − ¯µ)∥¯x∗ 2w∥2 ≤ 0 and...

work page

[62] [62]

We combine Propositions 6 and 7 to establish Theorem 1

Thus, (x∗ 1w, ¯x∗ 2w, x∗ 3w) is also feasible for (A.1) and is its optimal solution. We combine Propositions 6 and 7 to establish Theorem 1. Proof of Theorem 1: From Proposition 6, the optimal solution x∗ 1 to (17) and (A.1) is the same. Furthermore, the assumption that (x∗ 1, x∗

work page

[63] [63]

is an optimal solution with x∗ 1 ∈ X ◦ for (17) implies there exists an optimal solution (x∗ 1, ¯x∗ 2, x∗

work page

[64] [64]

Thus, using Proposi- tion 7, there existsλ∗ w such that the optimal solution sets x∗ 1 for (A.1) and (A.2) are equivalent for all λw > λ ∗ w

and x∗ 1 ∈ X ◦ for (A.1). Thus, using Proposi- tion 7, there existsλ∗ w such that the optimal solution sets x∗ 1 for (A.1) and (A.2) are equivalent for all λw > λ ∗ w. That means those of (17) and (A.2) are the same with λ∗ w. Since we also have the optimal solutions of x∗ 1 are the same for (18) and (A.2), this illustrates (17) and (18) have the same opt...

work page

[65] [65]

into the objective function of (18) we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1, which means ( x∗ 1, x∗

work page

[66] [66]

Then, suppose ( x∗ 1, x∗

is an optimal solution for (18). Then, suppose ( x∗ 1, x∗

work page

[67] [67]

Since x∗ 1 is also an optimal solution for (17) and their optimal values are the same, we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1 ⇒ ∥ Dx∗ 2∥p = 0

is an optimal solution for (18). Since x∗ 1 is also an optimal solution for (17) and their optimal values are the same, we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1 ⇒ ∥ Dx∗ 2∥p = 0. Thus, (x∗ 1, x∗

work page

[68] [68]

L11 0 L21 L22 #

is an optimal solution for (17). This com- pletes the proof. B Technical proofs B.1 Relation with the classcial SPC The classical SPC is of the following form min σy∈Γ,u∈U ,y∈Y ∥y∥2 Q + ∥u∥2 R + λy∥σy∥2 2 subject to y = YF   UP Yp UF   †  uini yini + σy u   , (B.1) which does not have the variableg. We can establish the following equivalen...

work page

[69] [69]

Let x∗, g∗ be the primal optimal solution of (32) and µ∗ 1, µ∗ 2 be the dual optimal solutions for the inequality constraint and equality constraint of (32) respectively. Since (32) satisfies Slater’s condition, x∗, g∗, µ∗ 1 and µ∗ 2 satisfy the KKT condition and have the follow properties 0 ∈ ∂(f(x) + µ∗ 1(∥g∥1 − αc) + µ∗T 2 (Hg − v∗)) at (x, g) = (x∗, g...

work page

[70] [70]

Thus, ( σ∗ y, u∗, y∗, g∗) ( i.e., (x∗, g∗)) is also an optimal solution for (33) as the KKT condition is sufficient to guarantee optimality

Then, the KKT condition (B.8) holds as it becomes equivalent to (B.7). Thus, ( σ∗ y, u∗, y∗, g∗) ( i.e., (x∗, g∗)) is also an optimal solution for (33) as the KKT condition is sufficient to guarantee optimality. B.6 Proof of Proposition 5 The key idea for the proof of the Proposition 5 is that we can first partition both Hy and ˜Hy into the part in the ro...

work page