pith. sign in

arxiv: 2509.09027 · v2 · submitted 2025-09-10 · 🧮 math.OC · cs.SY· eess.SY

Regularization in Data-driven Predictive Control: A Convex Relaxation Perspective

Pith reviewed 2026-05-18 17:05 UTC · model grok-4.3

classification 🧮 math.OC cs.SYeess.SY
keywords data-driven predictive controlregularizationconvex relaxationbi-level optimizationsystem identificationcausality-based regularizerA-DDPC
0
0 comments X

The pith

Regularized data-driven predictive control formulations are convex relaxations of bi-level optimization problems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that several common regularization techniques in data-driven predictive control correspond to convex relaxations of a bi-level optimization setup. System identification is treated as the inner problem whose solution is needed for the outer predictive control task. This framing unifies direct and indirect data-driven methods by showing how penalties, projections, and causality constraints implicitly enforce identification. A reader would care because the view explains the strong practical performance of regularization and suggests a new iterative method that solves the identification subproblem more completely.

Core claim

Using a bi-level optimization framework, system identification is modeled as an inner problem and predictive control as an outer problem. Several regularized DDPC formulations, including l1-norm penalties, projection-based regularizers, and a newly introduced causality-based regularizer, are shown to be convex relaxations of their respective bi-level problems. This perspective clarifies the conceptual links between direct and indirect data-driven control and highlights how regularization implicitly enforces system identification. An optimality-based variant called A-DDPC is proposed that approximately solves the inner problem with all identification constraints via an iterative algorithm.

What carries the argument

The bi-level optimization framework that places system identification in the inner level and predictive control in the outer level, which permits regularizers to be read as convex relaxations enforcing identification constraints.

If this is right

  • Regularization implicitly enforces system identification constraints.
  • Direct and indirect data-driven control methods are linked through the shared relaxation view.
  • The A-DDPC variant reduces both bias and variance errors relative to standard regularized formulations.
  • Pre-processing the trajectory library with system identification techniques can improve performance in nonlinear cases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same relaxation lens might be used to derive new regularizers from other identification constraints not yet considered.
  • The framework could be tested on nonlinear systems where identification errors are larger to see whether the performance gains of A-DDPC persist.
  • Similar bi-level relaxations may apply to data-driven methods in adjacent fields such as reinforcement learning.

Load-bearing premise

The bi-level optimization structure with identification as the inner problem accurately represents the data-driven predictive control setting.

What would settle it

Solve the explicit bi-level problem on a given trajectory dataset and compare its outer-level optimum to the solution of a regularized DDPC formulation on the same data; large mismatch would falsify the relaxation interpretation.

Figures

Figures reproduced from arXiv: 2509.09027 by Xu Shang, Yang Zheng.

Figure 1
Figure 1. Figure 1: Schematic of data-driven predictive control ( [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The influence of the penalty parameter λ in (15), where we have replaced y with 1/x. is the same as {(g, σy, u, y) | Dg = col(uini, yini + σy, u, y), h(g) = 0}, where D is a fixed trajectory matrix. We next move the constraint on g to the cost via regularization, leading to a direct DDPC min g,σy∈Γ, u∈U,y∈Y ∥y∥ 2 Q + ∥u∥ 2 R + λy∥σy∥ 2 2 + λw|h(g)| subject to Dg = col(uini, yini + σy, u, y). (16) Section 4… view at source ↗
Figure 3
Figure 3. Figure 3: Numerical illustration of Theorem 1. The optimal [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Comparison of realized control cost of different vari [PITH_FULL_IMAGE:figures/full_fig_p012_4.png] view at source ↗
Figure 6
Figure 6. Figure 6: Comparison of realized control cost of different [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗
read the original abstract

This paper explores the role of regularization in data-driven predictive control (DDPC) through the lens of convex relaxation. Using a bi-level optimization framework, we model system identification as an inner problem and predictive control as an outer problem. Within this framework, we show that several regularized DDPC formulations, including l1-norm penalties, projection-based regularizers, and a newly introduced causality-based regularizer, can be viewed as convex relaxations of their respective bi-level problems. This perspective clarifies the conceptual links between direct and indirect data-driven control and highlights how regularization implicitly enforces system identification. We further propose an optimality-based variant, A-DDPC, which approximately solves the inner problem with all identification constraints via an iterative algorithm. Numerical experiments demonstrate that A-DDPC outperforms existing regularized DDPC by reducing both bias and variance errors. These results indicate that further benefits may be obtained by applying system identification techniques to pre-process the trajectory library in nonlinear settings. Overall, our analysis contributes to a unified convex relaxation view of regularization in DDPC and sheds light on its strong empirical performance beyond linear time-invariant systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a bi-level optimization framework in which system identification is cast as an inner problem and predictive control as an outer problem. It argues that several regularized data-driven predictive control (DDPC) schemes—l1-norm penalties, projection-based regularizers, and a newly proposed causality-based regularizer—arise as convex relaxations of the corresponding bi-level programs. An iterative algorithm (A-DDPC) is introduced to approximately enforce all identification constraints, and numerical experiments are reported to show that A-DDPC reduces both bias and variance relative to existing regularized DDPC methods.

Significance. A rigorous convex-relaxation interpretation would supply a principled link between direct and indirect data-driven control and could guide the systematic design of regularizers. The introduction of the causality regularizer and the A-DDPC procedure are constructive contributions; the reported numerical improvements, if reproducible, indicate practical value beyond linear time-invariant settings.

major comments (2)
  1. [§3] §3: The central claim that standard regularized DDPC formulations are convex relaxations of bi-level problems requires an explicit demonstration that the inner system-identification optimization used in the bi-level construction coincides with the identification implicitly performed by the unregularized DDPC objective. Without this matching, the relaxation argument is not shown to recover the exact penalty terms appearing in the literature.
  2. [§4.2] §4.2, Algorithm 1: The iterative A-DDPC is presented as an approximate solver of the inner identification problem, yet no convergence analysis or a priori error bound relating the iterates to the true bi-level solution is supplied; this gap directly affects the claim that A-DDPC more faithfully solves the underlying bi-level program.
minor comments (2)
  1. [§2] The definition of the trajectory library and the precise statement of the unregularized DDPC objective should be restated in §2 to make the subsequent relaxation derivations self-contained.
  2. [§5] Figure captions and axis labels in the numerical section would benefit from explicit mention of the regularization weights and the number of Monte-Carlo trials used to compute bias/variance statistics.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive report. We address each major comment below and describe the revisions we will make to the manuscript.

read point-by-point responses
  1. Referee: [§3] §3: The central claim that standard regularized DDPC formulations are convex relaxations of bi-level problems requires an explicit demonstration that the inner system-identification optimization used in the bi-level construction coincides with the identification implicitly performed by the unregularized DDPC objective. Without this matching, the relaxation argument is not shown to recover the exact penalty terms appearing in the literature.

    Authors: We agree that an explicit matching is required to make the relaxation argument fully rigorous. In the revised version we will insert a dedicated paragraph (or short subsection) in §3 that derives the precise equivalence between the inner identification program of the bi-level formulation and the unregularized DDPC objective. This derivation will recover the exact penalty terms used in the literature as the convex relaxations of the corresponding bi-level programs. revision: yes

  2. Referee: [§4.2] §4.2, Algorithm 1: The iterative A-DDPC is presented as an approximate solver of the inner identification problem, yet no convergence analysis or a priori error bound relating the iterates to the true bi-level solution is supplied; this gap directly affects the claim that A-DDPC more faithfully solves the underlying bi-level program.

    Authors: We acknowledge that the present manuscript provides no formal convergence guarantee or a priori error bound for the A-DDPC iterates. While the algorithm is introduced as an approximate procedure and its practical benefit is demonstrated numerically, we will add a brief discussion in §4.2 that reports the observed convergence behavior across the numerical examples and, where possible, a simple contraction argument based on the fixed-point structure of the iteration. This addition will clarify the extent to which A-DDPC approximates the bi-level solution. revision: yes

Circularity Check

0 steps flagged

Bi-level relaxation framework is independent modeling with no reduction to inputs by construction

full rationale

The paper introduces a bi-level optimization framework with system identification as the inner problem and predictive control as the outer problem. It then shows that existing regularized DDPC formulations (l1 penalties, projection regularizers, and a new causality regularizer) can be interpreted as convex relaxations within this framework. The A-DDPC variant is proposed as an iterative approximation to solve the full inner identification problem. This modeling choice is presented as a perspective that unifies direct and indirect data-driven control, and numerical experiments compare performance. No quoted step in the abstract or description reduces a prediction or central claim to a fitted parameter or self-citation by construction; the framework appears self-contained and externally interpretable via standard convex relaxation techniques.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The paper rests on standard domain assumptions from optimization and linear control theory; no new physical entities are postulated.

free parameters (1)
  • regularization weights
    Weights for l1-norm and other penalties are introduced in the formulations and are presumably selected or tuned.
axioms (1)
  • domain assumption Data trajectories satisfy the conditions required for the bi-level optimization to represent the DDPC problem.
    This premise is invoked when the regularized problems are framed as relaxations of the bi-level setup.

pith-pipeline@v0.9.0 · 5721 in / 1310 out tokens · 59585 ms · 2026-05-18T17:05:51.898177+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

70 extracted references · 70 canonical work pages

  1. [1]

    Bridging direct and indirect data-driven control formu- lations via regularizations and relaxations.IEEE Trans- actions on Automatic Control , 68(2):883–897, 2022

    Florian D¨ orfler, Jeremy Coulson, and Ivan Markovsky. Bridging direct and indirect data-driven control formu- lations via regularizations and relaxations.IEEE Trans- actions on Automatic Control , 68(2):883–897, 2022

  2. [2]

    Convex approximations for a bi-level formulation of data-enabled predictive con- trol

    Xu Shang and Yang Zheng. Convex approximations for a bi-level formulation of data-enabled predictive con- trol. In 6th Annual Learning for Dynamics & Control Conference, pages 1071–1082. PMLR, 2024

  3. [3]

    Toward a theoretical foundation of policy optimization for learning control policies

    Bin Hu, Kaiqing Zhang, Na Li, Mehran Mesbahi, Maryam Fazel, and Tamer Ba¸ sar. Toward a theoretical foundation of policy optimization for learning control policies. Annual Review of Control, Robotics, and Au- tonomous Systems, 6(1):123–158, 2023

  4. [4]

    Behavioral sys- tems theory in data-driven analysis, signal processing, and control

    Ivan Markovsky and Florian D¨ orfler. Behavioral sys- tems theory in data-driven analysis, signal processing, and control. Annual Reviews in Control, 52:42–64, 2021

  5. [5]

    Policy optimization in control: Geometry and algorithmic implications

    Shahriar Talebi, Yang Zheng, Spencer Kraisler, Na Li, and Mehran Mesbahi. Policy optimization in control: Geometry and algorithmic implications. arXiv preprint arXiv:2406.04243, 2024

  6. [6]

    Data-driven control: Part two of two: Hot take: Why not go with models? IEEE Control Systems Magazine, 43(6):27–31, 2023

    Florian D¨ orfler. Data-driven control: Part two of two: Hot take: Why not go with models? IEEE Control Systems Magazine, 43(6):27–31, 2023

  7. [7]

    System identification

    Lennart Ljung. System identification. In Signal analysis and prediction, pages 163–173. Springer, 1998

  8. [8]

    System identification: A machine learning perspective

    Alessandro Chiuso and Gianluigi Pillonetto. System identification: A machine learning perspective. Annual Review of Control, Robotics, and Autonomous Systems , 2:281–304, 2019

  9. [9]

    Sample complexity of linear quadratic gaussian (lqg) control for output feedback systems

    Yang Zheng, Luca Furieri, Maryam Kamgarpour, and Na Li. Sample complexity of linear quadratic gaussian (lqg) control for output feedback systems. In Learning for dynamics and control , pages 559–570. PMLR, 2021

  10. [10]

    Model predictive control

    Basil Kouvaritakis and Mark Cannon. Model predictive control. Switzerland: Springer International Publishing , 38, 2016. 13

  11. [11]

    Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control.Automatica, 93:149–160, 2018

    Milan Korda and Igor Mezi´ c. Linear predictors for nonlinear dynamical systems: Koopman operator meets model predictive control.Automatica, 93:149–160, 2018

  12. [12]

    Modeling nonlinear control systems via

    Masih Haseli and Jorge Cort´ es. Modeling nonlinear control systems via koopman control family: Universal forms and subspace invariance proximity.arXiv preprint arXiv:2307.15368, 2023

  13. [13]

    Koopman operator in systems and control

    Alexandre Mauroy, Y Susuki, and Igor Mezic. Koopman operator in systems and control . Springer, 2020

  14. [14]

    Willems’ fun- damental lemma for nonlinear systems with koopman linear embedding

    Xu Shang, Jorge Cort´ es, and Yang Zheng. Willems’ fun- damental lemma for nonlinear systems with koopman linear embedding. arXiv preprint arXiv:2409.16389 , 2024

  15. [15]

    A note on persistency of excitation

    Jan C Willems, Paolo Rapisarda, Ivan Markovsky, and Bart LM De Moor. A note on persistency of excitation. Systems & Control Letters , 54(4):325–329, 2005

  16. [16]

    Data-enabled predictive control: In the shallows of the deepc

    Jeremy Coulson, John Lygeros, and Florian D¨ orfler. Data-enabled predictive control: In the shallows of the deepc. In 2019 18th European Control Conference (ECC), pages 307–312. IEEE, 2019

  17. [17]

    On the relationship be- tween data-enabled predictive control and subspace pre- dictive control

    Felix Fiedler and Sergio Lucia. On the relationship be- tween data-enabled predictive control and subspace pre- dictive control. In 2021 European Control Conference (ECC), pages 222–229. IEEE, 2021

  18. [18]

    Data-driven model predictive con- trol with stability and robustness guarantees

    Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. Data-driven model predictive con- trol with stability and robustness guarantees. IEEE Transactions on Automatic Control , 66(4):1702–1717, 2020

  19. [19]

    On the design of terminal in- gredients for data-driven mpc

    Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. On the design of terminal in- gredients for data-driven mpc. IF AC-PapersOnLine, 54(6):257–263, 2021

  20. [20]

    Data-enabled predictive control for quadcopters

    Ezzat Elokda, Jeremy Coulson, Paul N Beuchat, John Lygeros, and Florian D¨ orfler. Data-enabled predictive control for quadcopters. International Journal of Robust and Nonlinear Control , 31(18):8916–8936, 2021

  21. [21]

    DeeP-LCC: Data-enabled predictive leading cruise con- trol in mixed traffic flow.IEEE Transactions on Control Systems Technology, 31(6):2760–2776, 2023

    Jiawei Wang, Yang Zheng, Keqiang Li, and Qing Xu. DeeP-LCC: Data-enabled predictive leading cruise con- trol in mixed traffic flow.IEEE Transactions on Control Systems Technology, 31(6):2760–2776, 2023

  22. [22]

    Adaptive robust data-driven build- ing control via bilevel reformulation: An experimental result

    Yingzhao Lian, Jicheng Shi, Manuel Koch, and Colin Neil Jones. Adaptive robust data-driven build- ing control via bilevel reformulation: An experimental result. IEEE Transactions on Control Systems Technol- ogy, 2023

  23. [23]

    Imple- mentation and experimental validation of data-driven predictive control for dissipating stop-and-go waves in mixed traffic

    Jiawei Wang, Yang Zheng, Jianghong Dong, Chaoyi Chen, Mengchi Cai, Keqiang Li, and Qing Xu. Imple- mentation and experimental validation of data-driven predictive control for dissipating stop-and-go waves in mixed traffic. IEEE Internet of Things Journal , 11(3):4570–4585, 2023

  24. [24]

    Decentral- ized robust data-driven predictive control for smooth- ing mixed traffic flow

    Xu Shang, Jiawei Wang, and Yang Zheng. Decentral- ized robust data-driven predictive control for smooth- ing mixed traffic flow. IEEE Transactions on Intelligent Transportation Systems, 26(2):2075–2090, 2025

  25. [25]

    Linear tracking mpc for nonlinear systems—part ii: The data-driven case

    Julian Berberich, Johannes K¨ ohler, Matthias A M¨ uller, and Frank Allg¨ ower. Linear tracking mpc for nonlinear systems—part ii: The data-driven case. IEEE Transac- tions on Automatic Control , 67(9):4406–4421, 2022

  26. [26]

    Dimension reduction for efficient data-enabled predictive control

    Kaixiang Zhang, Yang Zheng, Chao Shang, and Zhao- jian Li. Dimension reduction for efficient data-enabled predictive control. IEEE Control Systems Letters , 7:3277–3282, 2023

  27. [27]

    Deep hankel matrices with random elements

    Nathan Lawrence, Philip Loewen, Shuyuan Wang, Michael Forbes, and Bhushan Gopaluni. Deep hankel matrices with random elements. In6th Annual Learning for Dynamics & Control Conference , pages 1579–1591. PMLR, 2024

  28. [28]

    Data-driven predictive control in a stochastic setting: A unified framework

    Valentina Breschi, Alessandro Chiuso, and Simone For- mentin. Data-driven predictive control in a stochastic setting: A unified framework. Automatica, 152:110961, 2023

  29. [29]

    Quadratic regularization of data-enabled pre- dictive control: Theory and application to power con- verter experiments

    Linbin Huang, Jianzhe Zhen, John Lygeros, and Florian D¨ orfler. Quadratic regularization of data-enabled pre- dictive control: Theory and application to power con- verter experiments. IF AC-PapersOnLine, 54(7):192– 197, 2021

  30. [30]

    Decentralized data-enabled predictive con- trol for power system oscillation damping

    Linbin Huang, Jeremy Coulson, John Lygeros, and Flo- rian D¨ orfler. Decentralized data-enabled predictive con- trol for power system oscillation damping. IEEE Trans- actions on Control Systems Technology , 30(3):1065– 1077, 2021

  31. [31]

    Causality-informed data-driven pre- dictive control

    Malika Sader, Yibo Wang, Dexian Huang, Chao Shang, and Biao Huang. Causality-informed data-driven pre- dictive control. arXiv preprint arXiv:2311.09545 , 2023

  32. [32]

    Max- imum likelihood estimation in data-driven modeling and control

    Mingzhou Yin, Andrea Iannelli, and Roy S Smith. Max- imum likelihood estimation in data-driven modeling and control. IEEE Transactions on Automatic Control , 68(1):317–328, 2021

  33. [33]

    Implicit predictors in regularized data-driven predictive control

    Manuel Kl¨ adtke and Moritz Schulze Darup. Implicit predictors in regularized data-driven predictive control. IEEE Control Systems Letters , 7:2479–2484, 2023

  34. [34]

    arXiv preprint arXiv:2312.14788 , year=

    Alessandro Chiuso, Marco Fabris, Valentina Breschi, and Simone Formentin. Harnessing the final control error for optimal data-driven predictive control. arXiv preprint arXiv:2312.14788, 2023

  35. [35]

    Spc: Subspace predictive control.IF AC Proceedings Vol- umes, 32(2):4004–4009, 1999

    Wouter Favoreel, Bart De Moor, and Michel Gevers. Spc: Subspace predictive control.IF AC Proceedings Vol- umes, 32(2):4004–4009, 1999

  36. [36]

    A missing data approach to data- driven filtering and control

    Ivan Markovsky. A missing data approach to data- driven filtering and control. IEEE Transactions on Au- tomatic Control, 62(4):1972–1978, 2016

  37. [37]

    A novel subspace identification approach with enforced causal models

    S Joe Qin, Weilu Lin, and Lennart Ljung. A novel subspace identification approach with enforced causal models. Automatica, 41(12):2043–2053, 2005

  38. [38]

    Model predictive control: theory, computation, and design, volume 2

    James Blake Rawlings, David Q Mayne, Moritz Diehl, et al. Model predictive control: theory, computation, and design, volume 2. Nob Hill Publishing Madison, WI, 2017

  39. [39]

    System identi- fication—a survey

    Karl Johan ˚Astr¨ om and Peter Eykhoff. System identi- fication—a survey. Automatica, 7(2):123–162, 1971

  40. [40]

    An overview of systems-theoretic guarantees in data-driven model pre- dictive control

    Julian Berberich and Frank Allg¨ ower. An overview of systems-theoretic guarantees in data-driven model pre- dictive control. Annual Review of Control, Robotics, and Autonomous Systems , 8, 2024

  41. [41]

    Data-driven control based on the behavioral approach: From theory to applications in power systems

    Ivan Markovsky, Linbin Huang, and Florian D¨ orfler. Data-driven control based on the behavioral approach: From theory to applications in power systems. IEEE Control Systems Magazine , 43(5):28–68, 2023

  42. [42]

    On the equivalence of direct and in- direct data-driven predictive control approaches

    Per Mattsson, Fabio Bonassi, Valentina Breschi, and Thomas B Sch¨ on. On the equivalence of direct and in- direct data-driven predictive control approaches. IEEE Control Systems Letters , 2024

  43. [43]

    Optimization and nonsmooth analysis

    Frank H Clarke. Optimization and nonsmooth analysis . SIAM, 1990

  44. [44]

    Exact penaliza- tion and necessary optimality conditions for generalized 14 bilevel programming problems

    JJ Ye, DL Zhu, and Qiji Jim Zhu. Exact penaliza- tion and necessary optimality conditions for generalized 14 bilevel programming problems. SIAM Journal on opti- mization, 7(2):481–507, 1997

  45. [45]

    Nonlinear optimization

    Andrzej Ruszczynski. Nonlinear optimization. Prince- ton university press, 2011

  46. [46]

    Structured low-rank approximation and its applications

    Ivan Markovsky. Structured low-rank approximation and its applications. Automatica, 44(4):891–909, 2008

  47. [47]

    On low-rank hankel matrix denoising

    Mingzhou Yin and Roy S Smith. On low-rank hankel matrix denoising. IF AC-PapersOnLine, 54(7):198–203, 2021

  48. [48]

    Alternating projections on nontangential manifolds

    Fredrik Andersson and Marcus Carlsson. Alternating projections on nontangential manifolds. Constructive approximation, 38(3):489–525, 2013

  49. [49]

    N4sid: Subspace algorithms for the identification of com- bined deterministic-stochastic systems

    Peter Van Overschee and Bart De Moor. N4sid: Subspace algorithms for the identification of com- bined deterministic-stochastic systems. Automatica, 30(1):75–93, 1994. A Exact penalty for quadratic optimization The proof for Theorem 1 is divided into two parts: (1) We transform (17) and (18) to two new optimiza- tion problems, where the optimal solutions x...

  50. [50]

    for (A.1) such that x∗ 1 is an interior point ofX , there existsλ∗ w > 0, such that (A.1) and (A.2) have same optimal solutions for all λw > λ ∗ w. Proof. The key idea is to prove that the optimal solution (x∗ 1, ¯x∗ 2, x∗

  51. [51]

    viola- tion

    of (A.1) is a local minimum of (A.2) via the usage of (A.3). This optimal solution also becomes the global minimum since (A.2) is a convex problem. This is inspired by the notion of partial calmness [44]. (1) Problem (A.3) permits a certain degree of “viola- tion” of the constraint ∥¯x2∥p = 0. We first show that, in a local neighborhood of ( x∗ 1, ¯x∗ 2, ...

  52. [52]

    (2) Next, we establish that there exists a local region around (x∗ 1, ¯x∗ 2, x∗

    is a local optimal solution of (A.3). (2) Next, we establish that there exists a local region around (x∗ 1, ¯x∗ 2, x∗

  53. [53]

    Combining this fact with the result of Step 1, we conclude that (x∗ 1, ¯x∗ 2, x∗

    in which every feasible solution of (A.2) is also feasible for (A.3). Combining this fact with the result of Step 1, we conclude that (x∗ 1, ¯x∗ 2, x∗

  54. [54]

    It is thus globally optimal due to the convexity of (A.2) (3) Finally, we demonstrate that all optimal solutions of (A.2) are also optimal solutions of (A.1)

    is a local optimal solution of (A.2). It is thus globally optimal due to the convexity of (A.2) (3) Finally, we demonstrate that all optimal solutions of (A.2) are also optimal solutions of (A.1). Let A3 := [A1, A2D⊥] and P2 := col(I, 0). As x∗ 1 ∈ X ◦, there exists δr > 0 such that Bδr(x∗

  55. [55]

    Let δr σr+σp > 15 δ > 0 where σr is the induced p-norm of P2A† 3A2D† and σp is the constant such that∥v∥p ≤ σp∥v∥2 for any vector v with the same dimension of x1

    ⊆ X . Let δr σr+σp > 15 δ > 0 where σr is the induced p-norm of P2A† 3A2D† and σp is the constant such that∥v∥p ≤ σp∥v∥2 for any vector v with the same dimension of x1. For any ϵ ∈ [0, δ), let (x1, ¯x2, x3) ∈ B δ(x∗ 1, ¯x∗ 2, x∗

  56. [56]

    We first show that there exists µ, such that xT 1 M x1 + µϵ ≥ x∗T 1 M x∗

    be a feasible solution for (A.3). We first show that there exists µ, such that xT 1 M x1 + µϵ ≥ x∗T 1 M x∗

  57. [57]

    A feasible solution (˜x1, ¯x∗ 2, ˜x3) for (A.1) can be constructed as col(˜x1, ˜x3) = A† 3(b − A2D†¯x∗

    (A.4) We can represent col(x1, x3) as col(x1, x3) = A† 3(b − A2D†¯x2) + v1, where v1 ∈ Null(A3). A feasible solution (˜x1, ¯x∗ 2, ˜x3) for (A.1) can be constructed as col(˜x1, ˜x3) = A† 3(b − A2D†¯x∗

  58. [58]

    We then demonstrate ˜x1 ∈ X via deriving the difference between x1 and ˜x1

    + v1. We then demonstrate ˜x1 ∈ X via deriving the difference between x1 and ˜x1. We have ∥x1 − ˜x1∥p = ∥P2A† 3A2D†¯x2∥p ≤ σrϵ ≤ σrδ. We then obtain ∥x∗ 1 − ˜x1∥p ≤ ∥x∗ 1 − x1∥p + ∥x1 − ˜x1∥p ≤ (σr + σp)δ ≤ δr, which implies ˜x1 ∈ B δr(x∗) ⊆ X . From the optimality condition of (A.1), we have xT 1 M x1 − x∗TM x∗ ≥xT 1 M x1 − ˜xT 1 M ˜x1 ≥ − L∥x1 − ˜x1∥p ≥...

  59. [59]

    We then prove that the optimal solution ( x∗ 1, ¯x∗ 2, x∗ 3) of (A.1) is a local minima of (A.2)

    Thus, we can let µ ≥ Lσr so that the inequality (A.4) is satisfied. We then prove that the optimal solution ( x∗ 1, ¯x∗ 2, x∗ 3) of (A.1) is a local minima of (A.2). We can let λw > λ∗ w ≥ Lσ. Since f(x) := ∥x∥p is a continuous func- tion, there exists δx > 0 such that ∥¯x2∥p < δ for any ¯x2 ∈ B δx(¯x∗ 2). Let δ∗ := min(δx, δ). For any feasible so- lution...

  60. [60]

    Thus, from the inequality (A.4), we have for any (x1, ¯x2, x3) ∈ Bδ∗(x∗ 1, ¯x∗ 2, x∗ 3), xT 1 M x1 + λw∥¯x2∥p ≥ x∗T 1 M x∗ 1, which means ( x∗ 1, ¯x∗ 2, x∗

    of (A.2), there ex- ists ϵ ∈ [0, δ) such that it is feasible to (A.3). Thus, from the inequality (A.4), we have for any (x1, ¯x2, x3) ∈ Bδ∗(x∗ 1, ¯x∗ 2, x∗ 3), xT 1 M x1 + λw∥¯x2∥p ≥ x∗T 1 M x∗ 1, which means ( x∗ 1, ¯x∗ 2, x∗

  61. [61]

    It is now clear that all optimal solutions of (A.1) are optimal to (A.2)

    is a local minima (as well as the global minima) of (A.2). It is now clear that all optimal solutions of (A.1) are optimal to (A.2). Next, suppose ( x∗ 1w, ¯x∗ 2w, x∗ 3w) is an optimal solution of (A.2) and recall that λw > λ ∗ w ≥ Lσr. We have x∗T 1wM x∗ 1w+λw∥¯x∗ 2w∥p = x∗T 1 M x∗ 1 ≤ x∗T 1wM x∗ 1w+λ∗ w∥¯x∗ 2w∥p, which leads to (µ − ¯µ)∥¯x∗ 2w∥2 ≤ 0 and...

  62. [62]

    We combine Propositions 6 and 7 to establish Theorem 1

    Thus, (x∗ 1w, ¯x∗ 2w, x∗ 3w) is also feasible for (A.1) and is its optimal solution. We combine Propositions 6 and 7 to establish Theorem 1. Proof of Theorem 1: From Proposition 6, the optimal solution x∗ 1 to (17) and (A.1) is the same. Furthermore, the assumption that (x∗ 1, x∗

  63. [63]

    is an optimal solution with x∗ 1 ∈ X ◦ for (17) implies there exists an optimal solution (x∗ 1, ¯x∗ 2, x∗

  64. [64]

    Thus, using Proposi- tion 7, there existsλ∗ w such that the optimal solution sets x∗ 1 for (A.1) and (A.2) are equivalent for all λw > λ ∗ w

    and x∗ 1 ∈ X ◦ for (A.1). Thus, using Proposi- tion 7, there existsλ∗ w such that the optimal solution sets x∗ 1 for (A.1) and (A.2) are equivalent for all λw > λ ∗ w. That means those of (17) and (A.2) are the same with λ∗ w. Since we also have the optimal solutions of x∗ 1 are the same for (18) and (A.2), this illustrates (17) and (18) have the same opt...

  65. [65]

    into the objective function of (18) we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1, which means ( x∗ 1, x∗

  66. [66]

    Then, suppose ( x∗ 1, x∗

    is an optimal solution for (18). Then, suppose ( x∗ 1, x∗

  67. [67]

    Since x∗ 1 is also an optimal solution for (17) and their optimal values are the same, we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1 ⇒ ∥ Dx∗ 2∥p = 0

    is an optimal solution for (18). Since x∗ 1 is also an optimal solution for (17) and their optimal values are the same, we have x∗T 1 M x∗ 1 + λw∥Dx∗ 2∥p = x∗T 1 M x∗ 1 ⇒ ∥ Dx∗ 2∥p = 0. Thus, (x∗ 1, x∗

  68. [68]

    L11 0 L21 L22 #

    is an optimal solution for (17). This com- pletes the proof. B Technical proofs B.1 Relation with the classcial SPC The classical SPC is of the following form min σy∈Γ,u∈U ,y∈Y ∥y∥2 Q + ∥u∥2 R + λy∥σy∥2 2 subject to y = YF   UP Yp UF   †  uini yini + σy u   , (B.1) which does not have the variableg. We can establish the following equivalen...

  69. [69]

    Let x∗, g∗ be the primal optimal solution of (32) and µ∗ 1, µ∗ 2 be the dual optimal solutions for the inequality constraint and equality constraint of (32) respectively. Since (32) satisfies Slater’s condition, x∗, g∗, µ∗ 1 and µ∗ 2 satisfy the KKT condition and have the follow properties 0 ∈ ∂(f(x) + µ∗ 1(∥g∥1 − αc) + µ∗T 2 (Hg − v∗)) at (x, g) = (x∗, g...

  70. [70]

    Thus, ( σ∗ y, u∗, y∗, g∗) ( i.e., (x∗, g∗)) is also an optimal solution for (33) as the KKT condition is sufficient to guarantee optimality

    Then, the KKT condition (B.8) holds as it becomes equivalent to (B.7). Thus, ( σ∗ y, u∗, y∗, g∗) ( i.e., (x∗, g∗)) is also an optimal solution for (33) as the KKT condition is sufficient to guarantee optimality. B.6 Proof of Proposition 5 The key idea for the proof of the Proposition 5 is that we can first partition both Hy and ˜Hy into the part in the ro...