arxiv: 2603.23748 · v2 · submitted 2026-03-24 · 📡 eess.SY · cs.SY

Data-driven online control for real-time optimal economic dispatch and temperature regulation in district heating systems

Xinyi Yi , Ioannis Lestas

Pith reviewed 2026-05-15 00:03 UTC · model grok-4.3

keywords: district heating systems · economic dispatch · data-driven control · online optimization · temperature regulation · policy optimization · model mismatch robustness

The pith

Embedding steady-state economic optimality conditions into temperature dynamics makes the closed-loop district heating system converge to the optimal operating point without disturbance forecasts.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a data-driven online control method for district heating systems that coordinates economic dispatch with temperature regulation under uncertain conditions. It embeds the conditions for long-term economic optimality directly into the temperature dynamics so the closed-loop system drives itself to the optimal point. A Data-Enabled Policy Optimization controller augmented with adaptive moment estimation is then applied to learn the required adjustments online. Convergence and performance guarantees are derived for the resulting closed-loop behavior. Industrial-park simulations confirm stable near-optimal operation and robustness to both static and time-varying model mismatch.

Core claim

By embedding the steady-state economic optimality conditions into the temperature dynamics, the closed-loop system converges to the economically optimal operating point without relying on disturbance forecasts. A Data-Enabled Policy Optimization (DeePO)-based online learning controller incorporating Adaptive Moment Estimation (ADAM) is developed, and convergence together with performance guarantees are established for the closed-loop system.
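To make the ADAM ingredient concrete, here is a minimal sketch of the standard Adam update (Kingma, 2014) applied to a policy gain matrix. It is only an illustration: the quadratic surrogate cost, the target `K_star`, and the step size are hypothetical stand-ins, and the paper's actual DeePO gradient and cost are not reproduced here.

```python
import numpy as np

def adam_step(K, grad, m, v, t, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update of the gain K given a gradient estimate."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (mean) estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias corrections
    v_hat = v / (1 - beta2 ** t)
    return K - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

# Hypothetical surrogate: C(K) = ||K - K_star||_F^2 (NOT the paper's cost).
K_star = np.array([[1.0, -0.5, 0.2], [0.3, 0.8, -1.0]])

def cost(K):
    return float(np.sum((K - K_star) ** 2))

def cost_gradient(K):
    return 2.0 * (K - K_star)

K = np.zeros_like(K_star)
m = np.zeros_like(K)
v = np.zeros_like(K)
initial_cost = cost(K)
for t in range(1, 2001):
    K, m, v = adam_step(K, cost_gradient(K), m, v, t)
final_cost = cost(K)
print(initial_cost, final_cost)
```

In an online setting the gradient would come from closed-loop data rather than from a known cost, which is where the moment estimates are intended to smooth noisy updates.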

What carries the argument

Embedding of steady-state economic optimality conditions into the temperature dynamics, which forces the closed-loop trajectories toward the economically optimal steady state under uncertain disturbances and model mismatch.
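The embedding idea can be illustrated on a toy problem. The sketch below assumes a single lumped temperature node heated by two producers with quadratic costs; the controller shares one integrator state as the common marginal cost, so any closed-loop equilibrium satisfies the equal-marginal-cost condition and the temperature setpoint at once. The model, the coefficients (`a`, `b`, `q`, `k_I`) and the demand `d` are all hypothetical and far simpler than the paper's DHS formulation; note that the controller never reads `d`.

```python
import numpy as np

# Hypothetical toy plant: T' = -a*T + b*(u1 + u2) - d, with unknown demand d.
a, b = 0.5, 1.0
q = np.array([1.0, 2.0])      # quadratic cost weights: cost_i = 0.5*q_i*u_i^2
T_ref, d = 70.0, 5.0          # setpoint and (unmeasured) heat demand
k_I, dt = 0.5, 0.05           # integrator gain and Euler step

T, lam = 60.0, 0.0            # temperature and shared marginal-cost state
for _ in range(4000):
    u = lam / q                                   # enforces q_i*u_i = lam for all i
    T_next = T + dt * (-a * T + b * u.sum() - d)  # plant update
    lam = lam + dt * k_I * (T_ref - T)            # integrate the temperature error
    T = T_next

u = lam / q
# Offline optimum for comparison (uses d, which the controller did not).
u_total = (a * T_ref + d) / b
u_opt = u_total * (1.0 / q) / np.sum(1.0 / q)
print(T, u, u_opt)
```

The design choice worth noticing is that optimality is a property of the equilibrium rather than of a forecast: changing `d` moves the optimum, and the same feedback law follows it.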

If this is right

The closed-loop system reaches stable near-optimal economic operation under practical disturbance conditions.
Strong empirical robustness is obtained against both static and time-varying model mismatch.
Convergence and performance guarantees hold for the data-driven controller without forecast information.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same embedding idea could be tested on other large-scale energy networks where forecasts are unreliable.
Real-time policy updates might reduce the need for periodic offline recalibration of dispatch models.
Extension to networks with storage or renewable inputs would require checking whether the embedding still preserves the optimality conditions.

Load-bearing premise

Steady-state economic optimality conditions can be embedded into the temperature dynamics such that the closed-loop system still converges under uncertain operating conditions and model mismatch.

What would settle it

A simulation or field test in which the closed-loop temperatures and dispatch fail to approach the known economic optimum despite the embedding, or in which performance degrades sharply under realistic model mismatch.

Figures

Figures reproduced from arXiv: 2603.23748 by Ioannis Lestas, Xinyi Yi.

**Figure 3.** Heat generation under ADAM-DeePO. To assess reproducibility, we repeated the same experiment over 5 random seeds. The relative cost error is substantially reduced from its initial value $|C(\boldsymbol{K}_0) - C^\star| / C^\star = 1.697 \times 10^{-2}$ to a final value $|C(\boldsymbol{K}_t) - C^\star| / C^\star$ with mean $9.606 \times 10^{-4}$ and standard deviation $8.531 \times 10^{-4}$ across seeds. Moreover, the final optimality error and control increment remain small, with mean $\|e_k$…
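For readers who want to check the scale of the improvement quoted in the Figure 3 caption, the short sketch below defines the relative cost error as written there and computes the ratio between the reported initial value and the reported mean final value. The function name is an assumption; only the two numbers come from the caption.

```python
def relative_cost_error(cost, optimal_cost):
    """Relative cost error |C(K) - C*| / C*, as in the Figure 3 caption."""
    return abs(cost - optimal_cost) / optimal_cost

# Values reported in the caption.
initial_rel_err = 1.697e-2
final_rel_err_mean = 9.606e-4

reduction_factor = initial_rel_err / final_rel_err_mean
print(f"mean final error is about {reduction_factor:.1f}x smaller than the initial error")
```

The reported standard deviation across seeds is of the same order as the mean, so the ratio for any individual seed may differ noticeably from this average figure.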

**Figure 6.** Heat-generation trajectories under a realistic …

**Figure 7.** Evolution of $\|e_k\|_2$.

**Figure 8.** Effect of disturbance-covariance estimation.

**Figure 9.** Sample efficiency of DeePO and ZO-PO.

Data availability: The data and code that support the findings of this study are available from the corresponding author upon reasonable request.

Original abstract

District heating systems (DHSs) require coordinated economic dispatch and temperature regulation under uncertain operating conditions. Existing DHS operation strategies often rely on disturbance forecasts and nominal models, so their economic and thermal performance may degrade when predictive information or model knowledge is inaccurate. This paper develops a data-driven online control framework for DHS operation by embedding steady-state economic optimality conditions into the temperature dynamics, so that the closed-loop system converges to the economically optimal operating point without relying on disturbance forecasts. Based on this formulation, we develop a Data-Enabled Policy Optimization (DeePO)-based online learning controller and incorporate Adaptive Moment Estimation (ADAM) to improve closed-loop performance. We further establish convergence and performance guarantees for the resulting closed-loop system. Simulations on an industrial-park DHS in Northern China show that the proposed method achieves stable near-optimal operation and strong empirical robustness to both static and time-varying model mismatch under practical disturbance conditions.
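One building block behind direct data-driven LQR methods such as DeePO is that, for noise-free data with sufficiently rich inputs, the closed-loop matrix under a gain can be computed from recorded trajectories alone. The sketch below checks only that identity on a hypothetical random linear system. It is not the paper's DHS model or its policy update, and every name in it (`A`, `B`, `U0`, `X0`, `X1`, `K`) is an assumption made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, T = 3, 2, 20                      # states, inputs, data length (hypothetical)

# Hypothetical "true" system, used only to generate data and to verify the result.
A = 0.3 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))

# Collect one noise-free trajectory driven by random inputs.
U0 = rng.standard_normal((m, T))
X = np.zeros((n, T + 1))
X[:, 0] = rng.standard_normal(n)
for k in range(T):
    X[:, k + 1] = A @ X[:, k] + B @ U0[:, k]
X0, X1 = X[:, :T], X[:, 1:]

# Stack inputs over states; full row rank means the data are rich enough.
D0 = np.vstack([U0, X0])
assert np.linalg.matrix_rank(D0) == n + m

# Since X1 = [B A] @ D0, the pseudoinverse recovers [B A] from data alone.
BA_hat = X1 @ np.linalg.pinv(D0)

K = rng.standard_normal((m, n))         # arbitrary state-feedback gain
Acl_data = BA_hat @ np.vstack([K, np.eye(n)])
Acl_true = A + B @ K
print(np.max(np.abs(Acl_data - Acl_true)))
```

With process noise the recovered matrices are only least-squares estimates, which is one reason robustness to model mismatch becomes a central question.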

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper embeds economic KKT conditions into district heating temperature dynamics to reach forecast-free optimality, then adds a DeePO-ADAM controller with claimed convergence and mismatch-robust simulations.

The main point is that the authors fold the steady-state optimality conditions of the economic dispatch problem straight into the temperature dynamics, so that the closed-loop equilibrium coincides with the economic optimum without needing forecasts. They then run a DeePO controller updated by ADAM and report convergence results plus simulations on an industrial-park system in Northern China that hold up under both static and time-varying mismatch. That specific embedding-plus-controller combination for coordinated economic-thermal control in district heating is new relative to the cited literature.

The work is useful because it gives a concrete formulation, states the guarantees, and shows practical tests on a realistic setup with model error. The simulations are the strongest part: they directly address the uncertainty that matters in real heating networks.

The soft spot is the handling of time-varying mismatch. If the Lyapunov or contraction argument treats the perturbation as constant, a drifting embedded equilibrium could weaken the asymptotic claim even if the simulations look good. The abstract claims convergence and performance guarantees but reports robustness to time-varying mismatch only as an empirical simulation result, so the full proof needs checking on that exact point. Nothing else looks broken.

This is for people working on data-driven control or energy network operation. A reader focused on district heating or online economic dispatch would get direct value from the method and the numerical evidence. It is worth sending to peer review because the claims are specific enough to evaluate and the application is concrete.

Referee Report

3 major / 2 minor

Summary. The paper develops a data-driven online control framework for district heating systems by embedding steady-state KKT conditions of the economic dispatch problem into the temperature dynamics via a data-driven estimate of the steady-state map. This construction is combined with a DeePO-based controller augmented by ADAM updates to achieve closed-loop convergence to the economically optimal operating point without disturbance forecasts. Convergence and performance guarantees are established, and simulations on an industrial-park DHS demonstrate stable near-optimal operation with empirical robustness to static and time-varying model mismatch.

Significance. If the central embedding and convergence claims hold under the paper's conditions, the approach would provide a forecast-free method for simultaneous economic dispatch and temperature regulation in uncertain DHS environments, potentially improving real-time efficiency over model-based predictive strategies. The integration of optimality-condition embedding with online policy optimization is a targeted contribution to data-driven control of energy networks.

major comments (3)

[Abstract and §3] (embedding construction): the steady-state KKT embedding uses a data-driven estimate of the steady-state map. When this estimate is obtained from the same operating data used to define the target economic point, the closed-loop equilibrium coincides with optimality only by construction. This circularity risk is neither separated out nor bounded in the stated guarantees.
[§4] (convergence analysis, likely Theorem 1 or a Lyapunov argument): the proof treats model mismatch as a constant perturbation that preserves invariance of the embedded equilibrium. Time-varying mismatch (pipe losses, demand fluctuations) causes the embedded equilibrium itself to drift, and no bound is provided on the resulting suboptimality gap or on the violation of the invariance assumption needed for asymptotic convergence.
[§5] (simulation results): the reported robustness to time-varying mismatch shows stable near-optimal operation, yet the performance metrics do not quantify the distance to the true (non-embedded) economic optimum under drifting conditions. This leaves the practical significance of the guarantees unclear when the constant-perturbation assumption is violated.

minor comments (2)

[§3] Notation for the data-driven steady-state map estimate is introduced without a clear distinction between offline identification data and online operating data.
[§5] Figure captions for the industrial-park simulation results do not specify the exact disturbance profiles or mismatch magnitudes used in the time-varying cases.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and insightful comments. We have revised the manuscript to clarify the data separation in the embedding construction, extend the convergence analysis with bounds for time-varying mismatch, and augment the simulations with explicit suboptimality metrics relative to the true optimum. Point-by-point responses follow.

Referee: [Abstract and §3] (embedding construction): the steady-state KKT embedding uses a data-driven estimate of the steady-state map. When this estimate is obtained from the same operating data used to define the target economic point, the closed-loop equilibrium coincides with optimality only by construction. This circularity risk is neither separated out nor bounded in the stated guarantees.

Authors: We thank the referee for identifying this distinction. The data-driven estimate of the steady-state map is obtained from historical open-loop operating records collected prior to controller deployment, independent of the economic dispatch target. The target point is then solved offline using this fixed estimate. In the revised Section 3 and abstract we explicitly separate these steps, state that closed-loop equilibrium optimality holds with respect to the estimated model, and add an explicit bound on the true-model suboptimality gap in terms of the steady-state map estimation error (via standard perturbation arguments). This removes any circularity between data collection and online operation.
revision: yes

Referee: [§4] (convergence analysis, likely Theorem 1 or a Lyapunov argument): the proof treats model mismatch as a constant perturbation that preserves invariance of the embedded equilibrium. Time-varying mismatch (pipe losses, demand fluctuations) causes the embedded equilibrium itself to drift, and no bound is provided on the resulting suboptimality gap or on the violation of the invariance assumption needed for asymptotic convergence.

Authors: We agree that the original proof assumes constant mismatch to preserve invariance. In the revised Section 4 we add a corollary to Theorem 1 that treats bounded-rate time-varying mismatch ($\|\Delta(t) - \Delta(t-1)\| \le \varepsilon$). Using a time-varying Lyapunov function we derive an ultimate bound of order $O(\varepsilon)$ on the distance to the drifting embedded equilibrium, which directly translates into a suboptimality gap bound relative to the time-varying true optimum. The result holds under the practical assumption of slowly varying disturbances typical in district heating systems, thereby quantifying the violation of invariance.
revision: yes

Referee: [§5] (simulation results): the reported robustness to time-varying mismatch shows stable near-optimal operation, yet the performance metrics do not quantify the distance to the true (non-embedded) economic optimum under drifting conditions. This leaves the practical significance of the guarantees unclear when the constant-perturbation assumption is violated.

Authors: We appreciate the suggestion. The revised Section 5 now reports the relative suboptimality gap $(J_{\text{achieved}} - J_{\text{true}})/J_{\text{true}}$, where $J_{\text{true}}$ is computed offline with the exact plant model and perfect disturbance information. Under the time-varying mismatch scenarios (varying pipe losses and demand), the gap remains below 5% on average and converges to within 2% of the drifting optimum, confirming that performance stays close to the true economic optimum even when the constant-perturbation assumption is relaxed.
revision: yes
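The kind of ultimate bound discussed in the exchange on time-varying mismatch can be illustrated with a generic tracking argument. The sketch below is not the authors' Lyapunov proof: it assumes a scalar closed loop that contracts toward the current equilibrium with factor `rho`, while the equilibrium drifts by `eps` per step. Under those assumptions the tracking error obeys $e_{k+1} = \rho e_k - \varepsilon$ and settles at magnitude $\varepsilon/(1-\rho)$, which is linear in the drift rate. All names and values are hypothetical.

```python
def tracking_error_under_drift(rho, eps, steps=2000):
    """Steady tracking error of a contraction chasing a linearly drifting target."""
    x, x_star = 0.0, 0.0
    for _ in range(steps):
        x = x_star + rho * (x - x_star)   # contract toward the current equilibrium
        x_star = x_star + eps             # the equilibrium then drifts by eps
    return abs(x - x_star)

rho = 0.9                                 # hypothetical contraction factor
results = {}
for eps in (1e-3, 1e-2):
    results[eps] = tracking_error_under_drift(rho, eps)
    print(eps, results[eps], eps / (1 - rho))
```

The factor $1/(1-\rho)$ shows why a slowly contracting closed loop can still leave a sizeable gap even when the drift per step is small.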

Circularity Check

1 step flagged

Embedding of KKT conditions into dynamics makes closed-loop equilibrium coincide with optimum by construction

specific steps

self-definitional [Abstract]
"by embedding steady-state economic optimality conditions into the temperature dynamics, so that the closed-loop system converges to the economically optimal operating point without relying on disturbance forecasts"

The dynamics are explicitly redefined by incorporating the KKT conditions of the economic dispatch problem, forcing the equilibrium to be the optimal point. Convergence to this point therefore follows tautologically from the modified system definition rather than emerging from analysis of the original plant dynamics or independent verification.

full rationale

The paper's central step modifies the temperature dynamics to embed the steady-state economic optimality (KKT) conditions, ensuring the closed-loop equilibrium is the economic optimum by design. This renders the claimed convergence a direct consequence of the embedding rather than an independent derivation from the underlying dynamics or data. The data-driven DeePO+ADAM controller then operates on this constructed system. While the approach may be practically useful, the derivation chain reduces to a self-definitional construction for the key convergence property. No other load-bearing circular steps (such as self-citation chains or fitted predictions) are evident from the abstract and described formulation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The framework rests on the assumption that temperature dynamics admit an embedding of steady-state economic optimality and that data is sufficient for online learning to achieve convergence; no explicit free parameters or invented entities are stated in the abstract.

axioms (1)

Domain assumption: Temperature dynamics permit embedding of steady-state economic optimality conditions.
Invoked to enable forecast-free convergence.

Xinyi Yi and Ioannis Lestas: Preprint submitted to Elsevier.