arXiv: 2604.20935 · v1 · submitted 2026-04-22 · cs.LG · cs.AI


Data-Driven Open-Loop Simulation for Digital-Twin Operator Decision Support in Wastewater Treatment

Gary Simethy , Daniel Ortiz Arroyo , Petar Durdevic


Pith reviewed 2026-05-10 01:04 UTC · model grok-4.3


keywords: wastewater treatment, digital twin, continuous-time state-space, missing data, irregular sampling, simulation, decision support, time series forecasting


The pith

CCSS-RS simulates wastewater treatment plant responses to control plans over long horizons despite irregular sampling and 43% missing data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CCSS-RS as a data-driven simulator that separates past state inference from future control-driven rollouts to support digital-twin planning in wastewater treatment. It combines context encoding, weighted forcing of prescribed inputs, consistent time evolution, and robust output distributions to handle real sensor gaps and irregular timing. On the full-scale Avedøre benchmark of nearly 907,000 timesteps, the model reaches RMSE of 0.696 and CRPS of 0.349 at 1000-step horizons across 10,000 windows, cutting error 40-46% below neural continuous-time baselines. Case studies with a fixed model show it tracks the effects of oxygen setpoint changes, ranks control plans, and stays accurate during sensor outages while beating simple persistence baselines.
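The two headline metrics can be made concrete with a small sketch. RMSE scores point predictions, while CRPS scores a whole predictive distribution against the observation. The code below uses the standard sample-based CRPS estimator, E|X - y| - 0.5 E|X - X'|; whether the paper evaluates CRPS from samples or in closed form is not stated on this page, so treat the estimator choice, the function names, and the synthetic data as assumptions.

```python
import math
import random


def rmse(preds, targets):
    """Root-mean-square error over paired predictions and observations."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets))


def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'| (lower is better)."""
    n = len(samples)
    term_obs = sum(abs(x - y) for x in samples) / n
    term_spread = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term_obs - 0.5 * term_spread


# Synthetic predictive samples (illustrative only, not data from the paper).
rng = random.Random(0)
samples = [rng.gauss(0.0, 1.0) for _ in range(500)]

sharp_hit = crps_from_samples([1.0] * 10, 1.0)   # point mass exactly on the truth
point_miss = crps_from_samples([2.0], 5.0)       # single sample: reduces to |error|
near = crps_from_samples(samples, 0.0)           # observation at the centre
far = crps_from_samples(samples, 3.0)            # observation in the tail
```

For a single-sample forecast the estimator collapses to the absolute error, which is why CRPS is often read as a probabilistic generalisation of MAE.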

Core claim

CCSS-RS is a controlled continuous-time state-space model that decouples historical state inference from future control and exogenous rollout. It uses typed context encoding, gain-weighted forcing of drivers, semigroup-consistent rollouts, and Student-t plus hurdle outputs suited to heavy-tailed and zero-inflated WWTP data. On the public Avedøre dataset with 906,815 timesteps, 43% missingness, and 1-20 minute irregular sampling, it achieves RMSE 0.696 and CRPS 0.349 at horizon 1000 over 10,000 test windows, outperforming Neural CDE baselines by 40-46% and simplified internal variants by 31-35%. Four frozen-model case studies confirm that oxygen-setpoint perturbations produce ammonium shifts.
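The phrase "semigroup-consistent rollouts" can be illustrated on a toy system. The sketch below is not the CCSS-RS dynamics: it assumes a scalar linear system dx/dt = -a x + b u with a constant input, and all names and parameter values are hypothetical. It shows the property in question, namely that one step of length dt1 + dt2 should match two consecutive steps of lengths dt1 and dt2, which matters when sampling intervals are irregular.

```python
import math


def exact_step(x, u, dt, a=0.5, b=1.0):
    """Exact flow of dx/dt = -a*x + b*u for constant u over an interval dt."""
    decay = math.exp(-a * dt)
    return decay * x + (1.0 - decay) * (b / a) * u


def euler_step(x, u, dt, a=0.5, b=1.0):
    """Explicit Euler step for the same system; not semigroup-consistent."""
    return x + dt * (-a * x + b * u)


x0, u = 2.0, 0.3

# One step of length 2.0 versus two steps of lengths 0.7 and 1.3.
one_exact = exact_step(x0, u, 2.0)
two_exact = exact_step(exact_step(x0, u, 0.7), u, 1.3)
one_euler = euler_step(x0, u, 2.0)
two_euler = euler_step(euler_step(x0, u, 0.7), u, 1.3)

exact_gap = abs(one_exact - two_exact)   # zero up to floating-point rounding
euler_gap = abs(one_euler - two_euler)   # clearly nonzero
```

The exact flow composes cleanly regardless of how the interval is split, whereas the Euler update gives a different answer depending on the step pattern. A learned model that enforces or regularises toward the first behaviour should be less sensitive to how irregular timestamps partition the horizon.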

What carries the argument

CCSS-RS, a controlled continuous-time state-space model that separates historical inference from future control rollout via typed context encoding and gain-weighted forcing of prescribed drivers.

If this is right

Oxygen setpoint perturbations produce ammonium shifts of -2.3 to +1.4 over horizons 300-1000.
A smoothed setpoint plan ranks highest under multi-criterion screening.
Context-only sensor outages raise monitored-variable RMSE by at most 10%.
Ammonium, nitrate, and oxygen predictions stay more accurate than persistence throughout the 1000-step rollout.
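The persistence comparison in the last point can be sketched as a skill score. The definition used here, 1 - RMSE_model / RMSE_persistence, is a common convention and an assumption on our part; the page does not give the paper's exact formula. The series below are hypothetical numbers chosen for illustration.

```python
import math


def rmse(preds, targets):
    """Root-mean-square error over paired predictions and observations."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets))


def persistence_forecast(last_observed, horizon):
    """Hold the last observed value constant over the whole rollout."""
    return [last_observed] * horizon


def skill_vs_persistence(model_preds, targets, last_observed):
    """Positive when the model beats persistence, zero when tied, negative when worse."""
    baseline = persistence_forecast(last_observed, len(targets))
    return 1.0 - rmse(model_preds, targets) / rmse(baseline, targets)


# Hypothetical decaying signal and model rollout (not data from the paper).
targets = [10.0, 8.0, 6.0, 4.0, 2.0]
model_preds = [9.5, 8.2, 6.4, 3.9, 2.5]

skill = skill_vs_persistence(model_preds, targets, last_observed=10.0)
tie = skill_vs_persistence(persistence_forecast(10.0, 5), targets, last_observed=10.0)
```

Under this convention, "more accurate than persistence throughout the rollout" corresponds to the skill staying above zero at every evaluated horizon.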

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The separation of inference from rollout could allow direct combination with existing mechanistic WWTP models for hybrid simulators.
Retraining or fine-tuning on data from additional plants would test whether the reported gains survive changes in process configuration.
The approach could extend to other industrial processes that share irregular sensing and long planning horizons, such as chemical plants or power grids.

Load-bearing premise

That the accuracy achieved on the single Avedøre benchmark's missingness pattern and sampling will hold for other plants, different control regimes, and true 12-36 hour operational horizons without retraining.

What would settle it

Run the frozen CCSS-RS checkpoint on data from a second full-scale WWTP with different sensor density, control practices, and missingness statistics; large degradation in RMSE or CRPS at H=1000 would falsify generalization.

Figures

Figures reproduced from arXiv: 2604.20935 by Daniel Ortiz Arroyo, Gary Simethy, Petar Durdevic.

**Figure 1.** CCSS-RS architecture. Phase 1 (left): historical context is encoded by a TCN and attention pooling into a context summary env and four typed initial hidden states (𝑧0, 𝑠0, 𝑐0, 𝑠𝑢0). Phase 2 (right): three parallel encoders process prescribed controls and forecast exogenous drivers; the core dynamics engine integrates the state trajectory via four parallel affine scans modulated by regime-specific exper…

**Figure 2.** RMSE (top), MAE (middle), and CRPS (bottom) vs. rollout horizon 𝐻 on 5,000 fully-observed windows. Bootstrap 95% confidence intervals (3,000 resamples) are plotted but negligibly narrow at this scale.

**Figure 3.** Prediction trajectories for Window 1 (December 18–20), characterised by rapid NH4 oscillations, sharp SS drops, and near-zero N2O. Shaded regions: 95% predictive intervals.

**Figure 4.** Prediction trajectories for Window 2 (April 27–28), featuring a pronounced NH4 decay from ∼10 to ∼2, rising N2O, and sustained O2 oscillations. Shaded regions: 95% predictive intervals.

**Figure 5.** Conceptual workflow for using CCSS-RS as a learned simulator for scenario screening and decision support. Recent plant context is paired with candidate future control plans, rolled out under the learned simulator, compared through trajectories and simple heuristics, and then reviewed by an operator or engineer before deeper analysis.

**Figure 6.** What-if control comparison for the selected ammonium-transient test window. The same historical context is rolled out under the observed future plan and three moderate control perturbations. The learned simulator predicts substantially larger trajectory separation under O2 setpoint perturbations than under the tested valve perturbation.

**Figure 7.** What-if control comparison for the selected oxygen-cycling test window. The same context produces clearly different NH4 and N2O trajectories under alternative future setpoint plans, illustrating how a learned simulator can support relative plan comparison even when the directional response is regime dependent.

**Figure 8.** Candidate-plan screening results for the selected phase-rich test window. (a) Composite heuristic ranking (lower score = more preferred); Pareto-efficient plans are shown in blue, the observed plan in grey, and dominated plans in light blue. (b) Per-criterion normalised scores (0 = best, 1 = worst) with raw values annotated. Bold text marks Pareto-efficient plans; the observed plan row is highlighted with …

**Figure 9.** Operator-support summary from the robustness and decision-horizon analyses. (a) Per-variable RMSE change under context-only sensor outages (three conditions, 64 test windows); boxplots below show the distribution of overall window-level RMSE ratios. (b) Skill against a persistence baseline as a function of rollout horizon (same 64 windows). NH4, NO3, and O2 retain positive skill across the full 1000-step…

**Figure 10.** RMSE (top), MAE (middle), and CRPS (bottom) vs. horizon 𝐻 for CCSS-RS and three ablated variants across three training seeds. All three components contribute increasingly with horizon length, but with distinct growth profiles. Three findings emerge. First, innovation forcing and semigroup consistency make almost identical contributions at 𝐻 = 1000 (+11.0% and +11.7% RMSE degradation), yet they address di…
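The Figure 2 caption mentions bootstrap 95% confidence intervals from 3,000 resamples. A minimal sketch of a percentile bootstrap over windows follows. It assumes the resampling unit is the evaluation window and that each window contributes one mean squared error; neither detail is confirmed on this page, and the data are synthetic.

```python
import math
import random


def bootstrap_rmse_ci(sq_errors, n_resamples=3000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for RMSE, resampling windows with replacement."""
    rng = random.Random(seed)
    n = len(sq_errors)
    stats = sorted(
        math.sqrt(sum(rng.choices(sq_errors, k=n)) / n)
        for _ in range(n_resamples)
    )
    lo = stats[int(alpha / 2 * n_resamples)]
    hi = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


# Hypothetical per-window mean squared errors (synthetic, not from the paper).
data_rng = random.Random(1)
sq_errors = [data_rng.gauss(0.0, 0.7) ** 2 for _ in range(200)]

point = math.sqrt(sum(sq_errors) / len(sq_errors))
lo, hi = bootstrap_rmse_ci(sq_errors)
```

The interval width shrinks roughly with the square root of the number of windows, which is consistent with the caption's remark that intervals over thousands of windows are negligibly narrow when plotted.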

Original abstract

Wastewater treatment plants (WWTPs) need digital-twin-style decision support tools that can simulate plant response under prescribed control plans, tolerate irregular and missing sensing, and remain informative over 12-36 h planning horizons. Meeting these requirements with full-scale plant data remains an open engineering-AI challenge. We present CCSS-RS, a controlled continuous-time state-space model that separates historical state inference from future control and exogenous rollout. The model combines typed context encoding, gain-weighted forcing of prescribed and forecast drivers, semigroup-consistent rollouts, and Student-t plus hurdle outputs for heavy-tailed and zero-inflated WWTP sensor data. On the public Avedøre full-scale benchmark, with 906,815 timesteps, 43% missingness, and 1-20 min irregular sampling, CCSS-RS achieves RMSE 0.696 and CRPS 0.349 at H=1000 across 10,000 test windows. This reduces RMSE by 40-46% relative to Neural CDE baselines and by 31-35% relative to simplified internal variants. Four case studies using a frozen checkpoint on test data demonstrate operational value: oxygen-setpoint perturbations shift predicted ammonium by -2.3 to +1.4 over horizons 300-1000; a smoothed setpoint plan ranks first in multi-criterion screening; context-only sensor outages raise monitored-variable RMSE by at most 10%; and ammonium, nitrate, and oxygen remain more accurate than persistence throughout the rollout. These results establish CCSS-RS as a practical learned simulator for offline scenario screening in industrial wastewater treatment, complementary to mechanistic models.
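The multi-criterion screening mentioned in the abstract can be sketched as a Pareto filter followed by a composite ranking on min-max normalised scores (0 = best, 1 = worst), mirroring the layout described in the Figure 8 caption. The plan names, criteria, values, and equal weights below are hypothetical; the paper's actual criteria and weighting are not given on this page.

```python
def pareto_efficient(plans):
    """Names of plans that no other plan dominates (every criterion: lower is better)."""
    def dominates(a, b):
        return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

    return [
        name for name, scores in plans.items()
        if not any(dominates(other, scores)
                   for other_name, other in plans.items() if other_name != name)
    ]


def normalise(plans):
    """Min-max normalise each criterion to [0, 1], where 0 is best and 1 is worst."""
    columns = list(zip(*plans.values()))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]
    return {
        name: [(v - lo) / span for v, lo, span in zip(scores, lows, spans)]
        for name, scores in plans.items()
    }


def composite_ranking(plans, weights):
    """Plan names sorted by weighted sum of normalised scores (lower = preferred)."""
    norm = normalise(plans)
    total = {n: sum(w * s for w, s in zip(weights, scores)) for n, scores in norm.items()}
    return sorted(total, key=total.get)


# Hypothetical criteria per plan: (mean NH4, mean N2O, aeration effort).
plans = {
    "observed": (3.0, 0.50, 1.0),
    "candidate_1": (2.5, 0.40, 0.9),
    "candidate_2": (2.0, 0.60, 1.2),
    "candidate_3": (3.2, 0.55, 1.1),
}

efficient = pareto_efficient(plans)
ranking = composite_ranking(plans, weights=(1 / 3, 1 / 3, 1 / 3))
```

The Pareto filter needs no weights, so it is the more defensible part of the screen. The composite ranking depends on the chosen weights, and a plan that is Pareto-efficient (here `candidate_2`) can still rank below a dominated one under a particular weighting.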

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper introduces CCSS-RS, a controlled continuous-time state-space model for open-loop simulation of wastewater treatment plant dynamics under prescribed controls. It separates state inference from future rollout using typed context encoding, gain-weighted forcing, semigroup-consistent dynamics, and Student-t/hurdle output heads to handle irregular sampling, 43% missingness, and heavy-tailed/zero-inflated sensor data. On the public Avedøre dataset (906,815 timesteps), it reports RMSE 0.696 and CRPS 0.349 at H=1000 over 10,000 test windows, with 40-46% RMSE reduction versus Neural CDE baselines and 31-35% versus internal ablations, plus four frozen-checkpoint case studies on setpoint perturbations, plan ranking, sensor outages, and persistence comparison.
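To make the "Student-t/hurdle output heads" concrete, here is one way such a likelihood could be written for a single observation: a Bernoulli gate decides whether the value is exactly zero, and a Student-t density models the nonzero values. This is a sketch under assumptions. The page does not say how the paper combines the two parts, and the parameter names (`p_nonzero`, `mu`, `sigma`, `nu`) are hypothetical.

```python
import math


def student_t_logpdf(y, mu, sigma, nu):
    """Log density of a location-scale Student-t with nu degrees of freedom."""
    z = (y - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi) - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


def gaussian_logpdf(y, mu, sigma):
    """Log density of a Gaussian, used here only for the tail comparison."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


def hurdle_student_t_nll(y, p_nonzero, mu, sigma, nu, eps=1e-12):
    """Negative log-likelihood: point mass at zero plus Student-t for nonzero values."""
    if y == 0.0:
        return -math.log(max(1.0 - p_nonzero, eps))
    return -(math.log(max(p_nonzero, eps)) + student_t_logpdf(y, mu, sigma, nu))


nll_zero = hurdle_student_t_nll(0.0, p_nonzero=0.25, mu=1.0, sigma=0.5, nu=3.0)
nll_spike = hurdle_student_t_nll(10.0, p_nonzero=0.25, mu=1.0, sigma=0.5, nu=3.0)

# Heavy tails: an outlier at 10 standard units costs far less under Student-t.
t_tail = student_t_logpdf(10.0, 0.0, 1.0, 3.0)
gauss_tail = gaussian_logpdf(10.0, 0.0, 1.0)
```

The gate handles zero-inflated channels without forcing the continuous density to put mass at zero, and the heavy tail keeps isolated sensor spikes from dominating the loss.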

Significance. If the results hold, this supplies a practical data-driven open-loop simulator for 12-36 h scenario screening in WWTP digital twins that tolerates real-world sensing imperfections and complements mechanistic models. The concrete, reproducible metrics on a large public benchmark with explicit baselines and 10,000 test windows, together with operational case studies, constitute a clear strength for applied ML in environmental engineering.

major comments (1)

[Results and Case Studies sections] All headline metrics (RMSE 0.696, CRPS 0.349 at H=1000) and the four case studies are obtained exclusively from the single Avedøre benchmark with its specific missingness pattern and sampling regime. No cross-plant evaluation, synthetic transfer tests, or ablations under altered hydraulics, sensor suites, or control regimes are reported, which is load-bearing for the central claim that CCSS-RS constitutes a practical learned simulator for industrial wastewater treatment.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for the constructive feedback and for recognizing the practical relevance of CCSS-RS for WWTP digital-twin applications. We address the major comment point by point below.


Referee: [Results and Case Studies sections] All headline metrics (RMSE 0.696, CRPS 0.349 at H=1000) and the four case studies are obtained exclusively from the single Avedøre benchmark with its specific missingness pattern and sampling regime. No cross-plant evaluation, synthetic transfer tests, or ablations under altered hydraulics, sensor suites, or control regimes are reported, which is load-bearing for the central claim that CCSS-RS constitutes a practical learned simulator for industrial wastewater treatment.

Authors: We agree that all quantitative results and case studies are derived from the single public Avedøre dataset. This benchmark was selected for its scale (906,815 timesteps), realistic 43% missingness, irregular 1-20 min sampling, and representation of full-scale industrial WWTP operations. The reported 40-46% RMSE improvement over Neural CDE baselines and the four frozen-checkpoint case studies (setpoint perturbations, plan ranking, sensor outages, persistence) are intended to demonstrate operational utility under these documented conditions. We acknowledge, however, that the absence of cross-plant or synthetic transfer experiments limits the strength of broader generalizability claims. In the revised manuscript we will add a new subsection to the Discussion that (1) explicitly notes the single-benchmark scope, (2) explains why the architectural choices (typed context encoding, gain-weighted forcing, semigroup-consistent dynamics, and robust output heads) are designed to address common WWTP data issues that recur across plants, and (3) outlines planned future work on transfer and multi-plant validation. We will also revise the abstract and conclusion to frame the contribution more precisely as a validated open-loop simulator on a representative large-scale public benchmark rather than a universally demonstrated industrial solution. This is a partial revision that incorporates the referee's concern through added limitations analysis and tempered claims.

Revision: partial

standing simulated objections not resolved

New cross-plant evaluations, synthetic transfer tests, or ablations on altered hydraulics/sensor suites would require additional independent datasets that are not currently available to the authors.

Circularity Check

0 steps flagged

No significant circularity in model derivation or performance claims


The paper defines CCSS-RS via explicit architectural choices (typed context encoding, gain-weighted forcing of drivers, semigroup-consistent rollouts, Student-t/hurdle outputs) motivated by WWTP data characteristics, then trains and evaluates the resulting model on held-out test windows from the public Avedøre benchmark (906k timesteps, 43% missingness). Reported gains (RMSE 0.696, 40-46% vs Neural CDE baselines, 31-35% vs internal ablations) are obtained by direct comparison to external methods and standard splits; no equation or claim equates a prediction to its training inputs by construction, and no load-bearing step relies on self-citation chains or imported uniqueness results. The evaluation remains falsifiable against the stated external benchmark.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The central performance claims rest on empirical evaluation of a new model architecture rather than on additional free parameters, unstated axioms, or newly invented physical entities. Standard machine-learning assumptions about stationarity and generalization are implicit but not enumerated in the abstract.

pith-pipeline@v0.9.0 · 5604 in / 1287 out tokens · 70626 ms · 2026-05-10T01:04:45.393568+00:00 · methodology

