Real-Time Auto-Optimization in Unknown Environments via Structure-Exploiting Dual Control for Exploration and Exploitation

Haoyang Yang; Qiwei Liu; Shiying Dong; Wen-Hua Chen

arxiv: 2605.22431 · v1 · pith:ANYTWMCKnew · submitted 2026-05-21 · 💻 cs.RO

Shiying Dong, Haoyang Yang, Qiwei Liu, Wen-Hua Chen

Pith reviewed 2026-05-22 05:46 UTC · model grok-4.3

classification 💻 cs.RO

keywords dual control · exploration and exploitation · auto-optimization · structure-exploiting method · Gauss-Newton approximation · unknown environments · real-time control · embedded computation

The pith

A convex-over-nonlinear reward structure allows real-time dual control for auto-optimization by linearizing only the nonlinear map.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a numerical dual control method for exploration and exploitation that addresses auto-optimization when the best operating point is unknown and changes with the environment. It identifies that the reward function combines exploitation and exploration terms into a nonlinear residual map under a convex outer loss. This structure lets the method linearize only the nonlinear residual while keeping the convex loss intact, turning each iteration into a reliably solvable convex subproblem. The resulting generalized Gauss-Newton approximation uses only first-order derivatives and stays positive semidefinite, cutting computation time enough for embedded hardware. Tests on a vehicle cruising task show both better performance and roughly tenfold faster solves than prior approaches.

Core claim

The reward function in DCEE has an inherent convex-over-nonlinear structure, where the exploitation and exploration terms form a unified nonlinear residual map equipped with a convex outer loss. Benefiting from this structure, a structure-exploiting numerical method is developed by linearizing only the nonlinear residual map while preserving the convex outer loss. Thus each subproblem is transformed into a structured convex form that can be solved reliably. The resulting generalized Gauss-Newton Hessian approximation is positive semidefinite and depends only on first-order derivatives, thereby supporting fast online computation.
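As a rough illustration of the structure named in the core claim, the problem and its linearized subproblem can be written in generic notation. The symbols $u$ (decision variable), $r$ (residual map), $\phi$ (outer loss), $J$ (Jacobian), and $B_k$ are assumptions of this sketch and are not taken from the paper, and the last line additionally assumes that $\phi$ is twice differentiable.

```latex
\begin{aligned}
&\text{Original problem:}
  && \min_{u}\; \phi\bigl(r(u)\bigr),
  \qquad \phi \text{ convex},\ r \text{ nonlinear and differentiable},\\
&\text{Subproblem at iterate } u_k:
  && \min_{u}\; \phi\bigl(r(u_k) + J(u_k)\,(u - u_k)\bigr),
  \qquad J(u_k) = \frac{\partial r}{\partial u}(u_k),\\
&\text{GGN Hessian approximation:}
  && B_k = J(u_k)^{\top}\,\nabla^2\phi\bigl(r(u_k)\bigr)\,J(u_k) \succeq 0 .
\end{aligned}
```

The subproblem is convex because a convex function composed with an affine map stays convex, and $B_k$ involves only first derivatives of $r$.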

What carries the argument

Convex-over-nonlinear structure of the DCEE reward function, which allows linearizing only the nonlinear residual map while retaining the convex outer loss to produce structured convex subproblems solved by generalized Gauss-Newton approximation.
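A minimal Python sketch of a generalized Gauss-Newton iteration on a toy problem may help make this concrete. Everything here is an assumption for illustration: the two-component `residual`, the pseudo-Huber outer loss, the small regularization `reg`, and the backtracking rule are stand-ins, not the paper's DCEE reward or its SCP-DCEE solver. The sketch also takes a Newton-type step on the quadratic model of the convex subproblem rather than solving that subproblem exactly.

```python
import numpy as np

def residual(u):
    # Hypothetical residual map stacking two nonlinear terms (not the paper's).
    return np.array([u[0] ** 2 + u[1] - 1.0, np.sin(u[0]) - 0.5 * u[1]])

def jacobian(u):
    # First-order derivatives of the residual map; the only derivatives GGN needs.
    return np.array([[2.0 * u[0], 1.0], [np.cos(u[0]), -0.5]])

def loss(r):
    # Convex outer loss (pseudo-Huber), chosen here only as an example.
    return np.sum(np.sqrt(1.0 + r ** 2) - 1.0)

def loss_grad(r):
    return r / np.sqrt(1.0 + r ** 2)

def loss_hess(r):
    # Diagonal and strictly positive, so the outer Hessian is PSD.
    return np.diag((1.0 + r ** 2) ** -1.5)

def ggn_solve(u0, iters=20, reg=1e-8):
    u = np.asarray(u0, dtype=float)
    history = [loss(residual(u))]
    min_eigs = []
    for _ in range(iters):
        r, J = residual(u), jacobian(u)
        g = J.T @ loss_grad(r)            # gradient of loss(residual(u))
        B = J.T @ loss_hess(r) @ J        # GGN Hessian approximation
        min_eigs.append(np.linalg.eigvalsh(B).min())
        d = np.linalg.solve(B + reg * np.eye(u.size), -g)
        t = 1.0                           # simple backtracking on the true loss
        while t > 1e-8 and loss(residual(u + t * d)) > history[-1]:
            t *= 0.5
        if t > 1e-8:
            u = u + t * d
        history.append(loss(residual(u)))
    return u, history, min_eigs

u_final, history, min_eigs = ggn_solve([2.0, -1.0])
print("initial loss:", history[0], "final loss:", history[-1])
print("smallest GGN eigenvalue seen:", min(min_eigs))
```

The backtracking step only accepts updates that do not increase the loss, so `history` is non-increasing by construction; the eigenvalue log is a quick check that the approximation stays positive semidefinite up to rounding.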

If this is right

The method improves control performance on the vehicle cruising auto-optimization problem.
Computation time reaches a maximum of 83 microseconds on a typical vehicle embedded CPU.
The approach achieves an approximate order-of-magnitude speedup over existing DCEE realizations.
Each iteration remains a reliably solvable structured convex problem without needing general-purpose solvers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same linearization trick could apply to other dual-control or adaptive-optimization settings whose objectives separate into convex outer losses and nonlinear maps.
Embedded systems with similar timing constraints might adopt the technique once the reward structure is verified for their specific task.
Extending the method to fully time-varying environments would require checking whether the convex-over-nonlinear property holds across changing operating regimes.

Load-bearing premise

The reward function in DCEE possesses an inherent convex-over-nonlinear structure that permits linearizing only the nonlinear residual map while preserving the convex outer loss to obtain structured convex subproblems.

What would settle it

A hardware test in which the generalized Gauss-Newton subproblems lose positive-semidefiniteness or exceed real-time deadlines on the target embedded CPU would falsify the claimed speedup and reliability.
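The two failure conditions named above could be logged with a small harness like the following sketch. The names `monitor_solve`, `solve_fn`, and `hessian_fn`, the tolerance, and the deadline value are all hypothetical, and timing measured in Python on a desktop says nothing about the paper's embedded target; the sketch only shows the shape of the check.

```python
import time
import numpy as np

def monitor_solve(solve_fn, hessian_fn, deadline_s, psd_tol=1e-10):
    """Run one solve and report whether either failure condition occurred."""
    start = time.perf_counter()
    solution = solve_fn()
    elapsed = time.perf_counter() - start
    B = hessian_fn(solution)
    B = 0.5 * (B + B.T)                      # symmetrize before the eigenvalue check
    min_eig = float(np.linalg.eigvalsh(B).min())
    return {
        "elapsed_s": elapsed,
        "deadline_missed": elapsed > deadline_s,
        "min_eig": min_eig,
        "psd_lost": min_eig < -psd_tol,
    }

# Stand-in solver and Hessian, used only to exercise the harness.
ok_report = monitor_solve(
    solve_fn=lambda: np.zeros(2),
    hessian_fn=lambda u: np.array([[2.0, 1.0], [1.0, 2.0]]),
    deadline_s=10.0,
)
bad_report = monitor_solve(
    solve_fn=lambda: np.zeros(2),
    hessian_fn=lambda u: np.array([[1.0, 0.0], [0.0, -1.0]]),
    deadline_s=10.0,
)
print(ok_report["psd_lost"], bad_report["psd_lost"])
```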

Figures

Figures reproduced from arXiv: 2605.22431 by Haoyang Yang, Qiwei Liu, Shiying Dong, Wen-Hua Chen.

**Figure 4.** CPU time comparison between the proposed SCP-DCEE method and …

**Figure 3.** Radar-chart comparison of Numerical DCEE and Classical DCEE.

**Figure 6.** The experiment result of HiL under the changing driving condition.

**Figure 5.** The HiL experiment platform.

Original abstract

This paper develops a fast numerical dual control for exploration and exploitation (DCEE) method to address auto-optimization problems in unknown environments. In auto-optimization problems, the optimal operating condition is unknown a priori and may vary with the environment. As in classical dual control techniques, computational burden remains a major concern in DCEE for active learning. Existing DCEE methods provide a principled exploration-exploitation objective, but mainly realized through standard optimization packages or explicit gradient-type update laws, where the numerical structure of the DCEE has not been fully exploited. This paper shows that the reward function in DCEE has an inherent convex-over-nonlinear structure, where the exploitation and exploration terms form a unified nonlinear residual map equipped with a convex outer loss. Benefiting from this structure, a structure-exploiting numerical method is developed by linearizing only the nonlinear residual map while preserving the convex outer loss. Thus, each subproblem is transformed into a structured convex form that can be solved reliably. The resulting generalized Gauss-Newton Hessian approximation is positive semidefinite and depends only on first-order derivatives, thereby supporting fast online computation. The proposed method is evaluated on a vehicle cruising auto-optimization problem and compared with existing methods. Simulation and hardware-in-the-loop experimental results show that the proposed method improves control performance and achieves a speedup of approximately one order of magnitude, with a microsecond-level maximum computation time of only 83 μs on a typical vehicle embedded CPU.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a practical numerical speedup for DCEE by exploiting a convex-over-nonlinear reward structure, but the gains rest on a decomposition that may be example-specific.

The main takeaway is that this paper presents a structure-exploiting numerical method for dual control for exploration and exploitation that converts subproblems into structured convex programs and reports an order-of-magnitude speedup with a maximum computation time of 83 microseconds on a vehicle embedded CPU. The approach linearizes only the nonlinear residual map while preserving the convex outer loss, then applies a generalized Gauss-Newton approximation that stays positive semidefinite and first-order only. This is the concrete technical step beyond calling standard solvers or using explicit gradient updates.

The vehicle cruising auto-optimization example is worked out with direct comparisons, simulation runs, and hardware-in-the-loop tests that include timing numbers. Those results make the real-time claim easier to evaluate and show measurable control performance gains. The paper does a reasonable job tying the method back to the known computational burden in DCEE.

The soft spot is the assumption that the reward function always possesses this clean convex-over-nonlinear separation. The stress-test note is on target here: if the outer loss is not convex or the residual does not separate for other exploration-exploitation objectives, the subproblems lose their convexity guarantee and the Hessian approximation can fail to be reliable. The paper demonstrates the structure for the chosen vehicle reward, but without a general proof or checks on varied formulations the scope remains unclear. Derivation details and experimental protocol would also benefit from more explicit coverage to support reproducibility.

This work is aimed at control researchers and engineers who need fast online optimization for adaptive systems on embedded hardware, especially in robotics or automotive settings. A reader interested in numerical techniques for real-time dual control would find the solver construction useful. It has enough technical substance and empirical grounding to warrant a serious referee, even if revisions are needed on generality and documentation.

Referee Report

2 major / 2 minor

Summary. The manuscript presents a structure-exploiting numerical method for real-time dual control for exploration and exploitation (DCEE) in auto-optimization problems in unknown environments. It identifies an inherent convex-over-nonlinear structure in the DCEE reward function, where exploitation and exploration terms form a unified nonlinear residual map with a convex outer loss. By linearizing only the nonlinear residual map while preserving the convex outer loss, each subproblem is transformed into a structured convex program solved via a generalized Gauss-Newton approximation whose Hessian is positive semidefinite and depends only on first-order derivatives. The method is evaluated on a vehicle cruising auto-optimization problem, with simulation and hardware-in-the-loop results claiming improved control performance, roughly one order of magnitude speedup, and a maximum computation time of 83 μs on a typical embedded vehicle CPU.

Significance. If the structural decomposition holds generally, the work could enable practical real-time DCEE on embedded hardware by converting otherwise expensive dual-control subproblems into reliably convex forms without sacrificing the principled exploration-exploitation objective. The emphasis on exploiting an exact mathematical structure (rather than fitted parameters or black-box solvers) and the reported microsecond-level timings on vehicle CPUs are concrete strengths that would support broader adoption in robotics if the convexity guarantee is rigorously established.

major comments (2)

[Abstract and Method] Abstract and core method description: the claim that the DCEE reward possesses an 'inherent convex-over-nonlinear structure' allowing linearization of only the nonlinear residual map while keeping the convex outer loss intact is load-bearing for the PSD property of the generalized Gauss-Newton Hessian and the reliability of the structured convex subproblems. The manuscript asserts this decomposition but supplies no general proof, set of sufficient conditions, or counter-example verification that the separation survives for other DCEE reward formulations beyond the vehicle-cruising example (which may satisfy the structure by construction).
[Experimental Results] Experimental evaluation: the abstract reports positive simulation and hardware-in-the-loop results with concrete timing numbers (83 μs maximum), yet the provided description contains no derivation details, error analysis, ablation on the structure assumption, or full experimental protocol. This leaves only moderate support for the performance and speedup claims when the convexity guarantee is stressed.

minor comments (2)

[Notation and Method] Define the nonlinear residual map and convex outer loss with explicit mathematical notation and an equation reference early in the method section to improve clarity and reproducibility.
[Figures and Results] In timing and performance figures, include multiple runs or statistical measures (e.g., mean and standard deviation) rather than single-run or best-case values to substantiate the reported order-of-magnitude speedup.
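The statistics requested in the last minor comment could be gathered with a harness along these lines. This is a hedged sketch: `time_runs` and its parameters are hypothetical names, the workload is a placeholder, and Python timings on a general-purpose machine are not comparable to the embedded measurements reported in the paper.

```python
import statistics
import time

def time_runs(fn, runs=200, warmup=20):
    """Time repeated calls of fn and summarize them in microseconds."""
    for _ in range(warmup):          # warm caches before measuring
        fn()
    samples_us = []
    for _ in range(runs):
        t0 = time.perf_counter_ns()
        fn()
        samples_us.append((time.perf_counter_ns() - t0) / 1_000.0)
    return {
        "runs": runs,
        "mean_us": statistics.fmean(samples_us),
        "std_us": statistics.stdev(samples_us),
        "max_us": max(samples_us),
    }

# Placeholder workload standing in for one solver call.
stats = time_runs(lambda: sum(range(100)), runs=50, warmup=5)
print(stats)
```

Reporting the mean, standard deviation, and maximum together makes it possible to tell a worst-case figure apart from typical behavior.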

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential of our structure-exploiting approach to enable real-time DCEE on embedded hardware. We address each major comment below and commit to revisions that strengthen the manuscript without altering its core contributions.

Referee: [Abstract and Method] Abstract and core method description: the claim that the DCEE reward possesses an 'inherent convex-over-nonlinear structure' allowing linearization of only the nonlinear residual map while keeping the convex outer loss intact is load-bearing for the PSD property of the generalized Gauss-Newton Hessian and the reliability of the structured convex subproblems. The manuscript asserts this decomposition but supplies no general proof, set of sufficient conditions, or counter-example verification that the separation survives for other DCEE reward formulations beyond the vehicle-cruising example (which may satisfy the structure by construction).

Authors: We agree that a clearer statement of the conditions under which the convex-over-nonlinear decomposition holds would improve rigor. The structure arises directly from the standard DCEE reward formulation in auto-optimization, in which both exploitation and exploration terms are expressed as a convex outer loss applied to a nonlinear residual that encodes the unknown environment model. In the revised manuscript we will insert a short subsection deriving the decomposition from the general DCEE objective, stating the mild assumptions (convex loss, differentiable nonlinear map) that guarantee the generalized Gauss-Newton Hessian remains positive semidefinite, and briefly discussing how the same structure appears in other common DCEE reward designs. This addition addresses the referee's concern while remaining faithful to the paper's focus on the vehicle-cruising application.

revision: yes
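For readers who want the reasoning behind the positive-semidefiniteness claim in this exchange, a standard argument is sketched below in generic notation. The symbols $u$, $r$, $\phi$, and $J = \partial r / \partial u$ are assumptions of this sketch rather than the paper's notation, and the argument additionally assumes that $\phi$ is twice differentiable.

```latex
\begin{aligned}
\nabla^2_u\,\phi\bigl(r(u)\bigr)
  &= \underbrace{J(u)^{\top}\,\nabla^2\phi\bigl(r(u)\bigr)\,J(u)}_{\text{GGN term: first derivatives of } r \text{ only}}
   \;+\; \underbrace{\sum_{i} \frac{\partial \phi}{\partial r_i}\bigl(r(u)\bigr)\,\nabla^2 r_i(u)}_{\text{dropped: second derivatives of } r},\\[4pt]
v^{\top}\bigl(J^{\top}\,\nabla^2\phi\,J\bigr)\,v
  &= (Jv)^{\top}\,\nabla^2\phi\,(Jv) \;\ge\; 0
  \quad \text{for all } v,
  \text{ because } \nabla^2\phi \succeq 0 \text{ when } \phi \text{ is convex.}
\end{aligned}
```

The dropped term can have either sign, which is why the exact Hessian carries no such guarantee.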
Referee: [Experimental Results] Experimental evaluation: the abstract reports positive simulation and hardware-in-the-loop results with concrete timing numbers (83 μs maximum), yet the provided description contains no derivation details, error analysis, ablation on the structure assumption, or full experimental protocol. This leaves only moderate support for the performance and speedup claims when the convexity guarantee is stressed.

Authors: We accept that the experimental section would benefit from greater transparency. In the revision we will: (i) provide the complete experimental protocol, including all hyper-parameters, environment variation ranges, and hardware specifications; (ii) report timing statistics with standard deviations and measurement methodology on the embedded CPU; (iii) add an ablation that disables the structure-exploiting linearization and compares both convexity and runtime; and (iv) include a short error analysis of the performance metrics. These changes will give stronger empirical grounding for the reported one-order-of-magnitude speedup and 83 μs maximum latency.

revision: yes
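The convexity half of the ablation mentioned in item (iii) can be illustrated on a toy problem. The sketch below is an assumption-laden example, not the paper's experiment: the residual map is hypothetical and the outer loss is taken to be 0.5 times the squared norm, so its Hessian is the identity.

```python
import numpy as np

def residual(u):
    # Hypothetical residual map; not the paper's DCEE residual.
    return np.array([u[0] ** 2 - 1.0, u[1]])

def jacobian(u):
    return np.array([[2.0 * u[0], 0.0], [0.0, 1.0]])

def ggn_hessian(u):
    # Outer loss 0.5 * ||r||^2 has identity Hessian, so GGN reduces to J^T J.
    J = jacobian(u)
    return J.T @ J

def exact_hessian(u):
    r = residual(u)
    # Only r_0 = u0^2 - 1 has curvature; its Hessian is diag(2, 0).
    curvature = r[0] * np.array([[2.0, 0.0], [0.0, 0.0]])
    return ggn_hessian(u) + curvature

u = np.array([0.0, 0.0])
ggn_min = np.linalg.eigvalsh(ggn_hessian(u)).min()
exact_min = np.linalg.eigvalsh(exact_hessian(u)).min()
print("min eigenvalue, GGN:", ggn_min, "exact:", exact_min)
```

At u = (0, 0) the GGN matrix is diag(0, 1) while the exact Hessian is diag(-2, 1), so the exact second-order model is indefinite at a point where the GGN model is still positive semidefinite.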

Circularity Check

0 steps flagged

No significant circularity; structure exploitation is an independent mathematical observation

The central derivation begins from the stated DCEE reward formulation and identifies an inherent convex-over-nonlinear decomposition (exploitation/exploration terms as nonlinear residual map with convex outer loss). This decomposition is then used to justify linearizing only the residual map, preserving convexity, and applying a generalized Gauss-Newton Hessian that is first-order and PSD by construction of the outer loss. No step reduces the claimed result to a fitted parameter renamed as prediction, a self-citation chain, or a definition that presupposes the target speedup or convexity guarantee. The vehicle-cruising example is presented as an instance that satisfies the structure, not as the source that defines it. The paper's claims about microsecond-level solve times and performance improvement are supported by simulation/HIL experiments rather than by tautological reduction to the input equations. This is the normal case of a paper that exploits an observed algebraic property without circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim depends on the domain assumption that the DCEE reward function exhibits an exploitable convex-over-nonlinear structure; no free parameters or invented entities are introduced in the abstract.

axioms (1)

Domain assumption: The reward function in DCEE has an inherent convex-over-nonlinear structure.
Invoked to justify linearizing only the nonlinear residual map while preserving the convex outer loss.

pith-pipeline@v0.9.0 · 5806 in / 1252 out tokens · 52967 ms · 2026-05-22T05:46:17.805353+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · relation: unclear
Paper passage: "the reward function in DCEE has an inherent convex-over-nonlinear structure, where the exploitation and exploration terms form a unified nonlinear residual map equipped with a convex outer loss"

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_high_calibrated_iff · relation: unclear
Paper passage: "the generalized Gauss-Newton Hessian approximation is positive semidefinite and depends only on first-order derivatives"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] [1]

Control for societal-scale challenges: Road map 2030,

A. M. Annaswamy, K. H. Johansson, and G. Pappas, “Control for societal-scale challenges: Road map 2030,”IEEE Control Systems Magazine, vol. 44, no. 3, pp. 30–32, 2024

work page 2030

[2] [2]

Sciarretta, A

A. Sciarretta, A. Vahidiet al.,Energy-efficient driving of road vehicles. Springer, 2020

work page 2020

[3] [3]

Fundamentals of energy efficient driving for combustion engine and electric vehicles: An optimal control perspective,

J. Han, A. Vahidi, and A. Sciarretta, “Fundamentals of energy efficient driving for combustion engine and electric vehicles: An optimal control perspective,”Automatica, vol. 103, pp. 558–572, 2019

work page 2019

[4] [4]

Energy-aware optimization of connected and automated electric vehicles considering vehicle-traffic nexus,

Y . Zhang, J. Chen, T. You, Y . Zhang, Z. Liu, and C. Du, “Energy-aware optimization of connected and automated electric vehicles considering vehicle-traffic nexus,”IEEE Transactions on Industrial Electronics, vol. 71, no. 1, pp. 282–293, 2024

work page 2024

[5] [5]

Information-based search for an atmospheric release using a mobile robot: Algorithm and ex- periments,

M. Hutchinson, C. Liu, and W.-H. Chen, “Information-based search for an atmospheric release using a mobile robot: Algorithm and ex- periments,”IEEE Transactions on Control Systems Technology, vol. 27, no. 6, pp. 2388–2402, 2018

work page 2018

[6] [6]

Autonomous source term estima- tion in unknown environments: From a dual control concept to UA V deployment,

C. Rhodes, C. Liu, and W.-H. Chen, “Autonomous source term estima- tion in unknown environments: From a dual control concept to UA V deployment,”IEEE Robotics and Automation Letters, vol. 7, no. 2, pp. 2274–2281, 2022

work page 2022

[7] [7]

Adaptive efficiency optimization control of VLF-PM motors considering operation uncertainties with DCEE,

X. Zhu, Y . Wu, L. Zhang, W.-H. Chen, L. Hu, and Y . Wang, “Adaptive efficiency optimization control of VLF-PM motors considering operation uncertainties with DCEE,”IEEE Transactions on Industrial Electronics, vol. 73, no. 5, pp. 6712–6721, 2026. 10

work page 2026

[8] [8]

Auto-optimization of energy genera- tion for wave energy converters with active learning,

S. Tang, W.-H. Chen, and C. Liu, “Auto-optimization of energy genera- tion for wave energy converters with active learning,”Ocean Engineer- ing, vol. 351, p. 124313, 2026

work page 2026

[9] [9]

Adaptive coordinated motion control: Automated tuning for predictive safety in electric vehicles,

H. Sun, L. Zhang, Y . Yang, X. Ye, X. Liu, and H. Chen, “Adaptive coordinated motion control: Automated tuning for predictive safety in electric vehicles,”IEEE Transactions on Industrial Electronics, vol. 72, no. 7, pp. 7415–7425, 2025

work page 2025

[10] [10]

J. B. Rawlings, D. Q. Mayne, and M. Diehl,Model predictive control: theory, computation, and design. Nob Hill Publishing Madison, WI, 2017, vol. 2

work page 2017

[11] R. Reiter, J. Hoffmann, D. Reinhardt, F. Messerer, K. Baumgärtner, S. Sawant, J. Boedecker, M. Diehl, and S. Gros, “Synthesis of model predictive control and reinforcement learning: Survey and classification,” Annual Reviews in Control, vol. 61, p. 101045, 2026.

[12] A. Saviolo, J. Frey, A. Rathod, M. Diehl, and G. Loianno, “Active learning of discrete-time dynamics for uncertainty-aware model predictive control,” IEEE Transactions on Robotics, vol. 40, pp. 1273–1291, 2023.

[13] Y. Tan, J. Yang, Z. Li, W.-H. Chen, and S. Li, “Auto-optimization with active learning in uncertain environment: A predictive control approach,” arXiv preprint arXiv:2512.04647, 2025.

[14] W.-H. Chen, “Perspective view of autonomous control in unknown environment: Dual control for exploitation and exploration vs reinforcement learning,” Neurocomputing, vol. 497, pp. 50–63, 2022.

[15] Z. Li, W.-H. Chen, J. Yang, and Y. Yan, “Dual control of exploration and exploitation for auto-optimization control with active learning,” IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 2145–2158, 2025.

[16] Y. Yu, J. Jiang, W.-H. Chen, and Y. Zuo, “k-step look-ahead active concurrent learning-based dual control of exploration and exploitation for auto-optimization,” IEEE Transactions on Cybernetics, 2026.

[17] W.-H. Chen, C. Rhodes, and C. Liu, “Dual control for exploitation and exploration (DCEE) in autonomous search,” Automatica, vol. 133, p. 109851, 2021.

[18] Z. Li, W.-H. Chen, and J. Yang, “Concurrent active learning in autonomous airborne source search: Dual control for exploration and exploitation,” IEEE Transactions on Automatic Control, vol. 68, no. 5, pp. 3123–3130, 2022.

[19] Y. Tan, J. Yang, W.-H. Chen, and S. Li, “Multistep dual control for exploration and exploitation in autonomous search with convergence guarantee,” IEEE Transactions on Industrial Informatics, vol. 20, no. 6, pp. 8207–8217, 2024.

[20] G. Tan, W.-H. Chen, J. Yang, X.-T. Tran, and Z. Li, “Dual control for autonomous airborne source search with Nesterov accelerated gradient descent: Algorithm and performance analysis,” Neurocomputing, vol. 630, p. 129729, 2025.

[21] Z. Li, W.-H. Chen, J. Yang, and C. Liu, “Cooperative active learning-based dual control for exploration and exploitation in autonomous search,” IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 2, pp. 2221–2233, 2024.

[22] P. Pashupathy, M. Coombes, W.-H. Chen, D. Lake, Y. Yu, C. Sun, M. S. Bahraini, P. Kinnell, and N. Lohse, “Dual control for active estimation and path planning in the automation of robotic assembly tasks,” IEEE Transactions on Automation Science and Engineering, 2026.

[23] L. Zhang, Y. Wu, X. Zhu, W.-H. Chen, L. Shi, and S. Luo, “Autonomous sensorless control strategy for FP-PMSM considering operation uncertainties with DCEE,” IEEE Transactions on Transportation Electrification, 2026.

[24] B. Wittenmark, “Adaptive dual control,” Control Systems, Robotics and Automation, vol. 10, pp. 122–132, 2008.

[25] N. M. Filatov and H. Unbehauen, “Survey of adaptive dual control methods,” IEE Proceedings-Control Theory and Applications, vol. 147, no. 1, pp. 118–128, 2000.

[26] F. Messerer, K. Baumgärtner, and M. Diehl, “Survey of sequential convex programming and generalized Gauss-Newton methods,” ESAIM: Proceedings and Surveys, vol. 71, pp. 64–88, 2021.

[27] M. Diehl and F. Messerer, “Local convergence of generalized Gauss-Newton and sequential convex programming,” in 2019 IEEE 58th Conference on Decision and Control (CDC). IEEE, 2019, pp. 3942–3947.

[28] A. Wächter and L. T. Biegler, “On the implementation of an interior-point filter line-search algorithm for large-scale nonlinear programming,” Mathematical Programming, vol. 106, no. 1, pp. 25–57, 2006.

[29] J. Andersson, J. Gillis, G. Horn, J. Rawlings, and M. Diehl, “CasADi—a software framework for nonlinear optimization and optimal control,” Mathematical Programming Computation, vol. 11, no. 1, pp. 1–36, 2018.

[30] G. Frison, D. Kouzoupis, T. Sartor, A. Zanelli, and M. Diehl, “BLASFEO: Basic linear algebra subroutines for embedded optimization,” ACM Transactions on Mathematical Software (TOMS), vol. 44, no. 4, pp. 1–30, 2018.

Shiying Dong received the B.S. degree in automation and the Ph.D. degree in control science and engineering from Jilin University, Changc...