Can Machine Learning Identify Governing Laws For Dynamics in Complex Engineered Systems ? : A Study in Chemical Engineering
Pith reviewed 2026-05-24 19:52 UTC · model grok-4.3
The pith
SINDy reduces a distillation column's dynamics from thousands of equations to 13 while recovering some terms consistent with Fick's and Henry's laws.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Applying SINDy to ASPEN-generated trajectories of a distillation column produces a sparse set of 13 differential equations that reproduce the column's dynamic response inside the training perturbation range; several of the retained nonlinear terms take the functional form of Fick's law (concentration differences driving flux) and Henry's law (equilibrium ratios of concentration to partial pressure).
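For reference, the textbook forms that such terms would be compared against can be written as follows. The notation (J, D, c, z, p, H) is an assumption introduced here rather than taken from the paper, and Henry's law is shown in only one of its several common conventions.

```latex
% Fick's first law: the diffusive flux J is driven by a concentration gradient.
% The right-hand form is a finite-difference version between adjacent cells i and i+1.
J = -D\,\frac{\partial c}{\partial z}
\quad\Longrightarrow\quad
J_{i \to i+1} \approx \frac{D}{\Delta z}\,\bigl(c_i - c_{i+1}\bigr)

% Henry's law: at equilibrium, dissolved concentration is proportional to partial pressure.
c = H\,p
\quad\Longleftrightarrow\quad
\frac{c}{p} = H
```

Read this way, a retained term proportional to a concentration difference has the shape of the Fick expression, and a retained ratio of concentration to pressure has the shape of the Henry expression. A shared functional form does not by itself establish that the fitted coefficient equals D/Δz or H.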
What carries the argument
The SINDy algorithm, which solves a sparse regression problem over a user-supplied library of candidate nonlinear functions to recover a minimal set of ordinary differential equations whose right-hand sides best match numerically estimated time derivatives.
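The regression step described above can be sketched on a toy problem. This is a minimal illustration using sequentially thresholded least squares, one common way to solve the SINDy sparse regression. The two-state oscillator, the polynomial library, the threshold of 0.05, and all function names are assumptions chosen for the example; none of them come from the paper or from ASPEN.

```python
import numpy as np

# Toy two-state system (an assumption, not the distillation column):
#   dx/dt = -0.1*x + 2*y,   dy/dt = -2*x - 0.1*y
# Its exact solution stands in for simulator output.
t = np.linspace(0.0, 10.0, 1001)
dt = t[1] - t[0]
decay = 2.0 * np.exp(-0.1 * t)
X = np.column_stack([decay * np.cos(2.0 * t), -decay * np.sin(2.0 * t)])

# Numerically estimated time derivatives (central differences in the interior).
dXdt = np.gradient(X, dt, axis=0)

def library(X):
    """Candidate functions [1, x, y, x^2, x*y, y^2] evaluated on each sample."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])

def stlsq(Theta, dXdt, threshold=0.05, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    Xi = np.linalg.lstsq(Theta, dXdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold
        Xi[small] = 0.0
        for k in range(dXdt.shape[1]):
            keep = ~small[:, k]
            if keep.any():
                # Refit only the surviving terms for state k.
                Xi[keep, k] = np.linalg.lstsq(Theta[:, keep], dXdt[:, k], rcond=None)[0]
    return Xi

names = ["1", "x", "y", "x^2", "x*y", "y^2"]
Xi = stlsq(library(X), dXdt)
for k, state in enumerate(["dx/dt", "dy/dt"]):
    terms = [f"{Xi[j, k]:+.3f}*{names[j]}" for j in range(len(names)) if Xi[j, k] != 0.0]
    print(state, "=", " ".join(terms))
```

Each column of `Xi` holds the coefficients of one recovered equation, so a 13-state system would yield 13 columns. In this toy case the true dynamics lie inside the library by construction, which is the favourable situation and is not guaranteed for a distillation column.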
If this is right
- Dynamic simulation of the distillation column can be performed with only 13 equations instead of the original thousands.
- The discovered model remains interpretable because each term corresponds to a concrete nonlinear function of the states.
- Some physical mechanisms (diffusion and gas-liquid equilibrium) are recoverable directly from the data-driven equations.
- Prediction accuracy holds only for conditions similar to those used to generate the training trajectories.
Where Pith is reading between the lines
- Extending the library with additional engineering-specific functions could increase the fraction of terms that map to known laws.
- The same reduction approach might be tested on other chemical unit operations whose full models are also too large for real-time use.
- Replacing ASPEN-generated data with measurements from an operating plant would reveal whether measurement noise prevents recovery of the same sparse structure.
Load-bearing premise
The true continuous-time dynamics are exactly a sparse combination of functions from the chosen library, and the simulation trajectories excite all relevant modes without substantial mismatch or noise.
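Stated symbolically, the premise reads roughly as below. The symbols are the conventional SINDy notation and are an assumption here, since the page does not define them: Θ(x) is the row vector of p candidate functions evaluated at the state x, and ξ_k is the coefficient vector for state k.

```latex
% Exact sparse representability of each of the 13 state equations in the library.
\dot{x}_k(t) = \Theta\bigl(x(t)\bigr)\,\xi_k,
\qquad k = 1,\dots,13,
\qquad \|\xi_k\|_0 \ll p
```

If the true right-hand side contains a function outside the library, the equality can hold only approximately, and the regression will spread that missing term across whichever library functions correlate with it on the training data.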
What would settle it
Forward integration of the 13 discovered equations diverges from the ASPEN trajectories even for new inputs inside the original perturbation range, or none of the retained terms can be rewritten in the exact mathematical form of Fick's or Henry's law.
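The first criterion amounts to a numerical check: integrate the discovered model forward and measure how far it drifts from the reference trajectory. The sketch below shows one way to set that up with a fixed-step Runge-Kutta integrator and a relative RMSE. The one-state reference system, its linearised surrogate, and the helper names are all hypothetical stand-ins, not the paper's models.

```python
import numpy as np

def rk4(f, x0, t):
    """Integrate dx/dt = f(x) on the time grid t with classical RK4."""
    X = np.empty((len(t), len(x0)))
    X[0] = x0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        k1 = f(X[i])
        k2 = f(X[i] + 0.5 * h * k1)
        k3 = f(X[i] + 0.5 * h * k2)
        k4 = f(X[i] + h * k3)
        X[i + 1] = X[i] + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return X

def relative_rmse(X_model, X_ref):
    """Root-mean-square trajectory error, normalised by the reference RMS."""
    return np.sqrt(np.mean((X_model - X_ref) ** 2)) / np.sqrt(np.mean(X_ref ** 2))

def reference(x):
    # Stand-in for the full simulator (an assumption for this example).
    return -np.sin(x)

def surrogate(x):
    # Stand-in for a sparse model that is only accurate near x = 0.
    return -x

t = np.linspace(0.0, 5.0, 501)
errors = {}
for label, x0 in [("inside range", 0.2), ("outside range", 2.5)]:
    X_ref = rk4(reference, np.array([x0]), t)
    X_mod = rk4(surrogate, np.array([x0]), t)
    errors[label] = relative_rmse(X_mod, X_ref)
    print(f"{label}: relative RMSE = {errors[label]:.4f}")
```

For the distillation column, the ASPEN trajectory would play the role of `X_ref`, with a tolerance on the error fixed before the test is run. The toy surrogate tracks the reference closely near the origin and poorly far from it, the same pattern the abstract reports for the 13 equations.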
Original abstract
Machine learning recently has been used to identify the governing equations for dynamics in physical systems. The promising results from applications on systems such as fluid dynamics and chemical kinetics inspire further investigation of these methods on complex engineered systems. Dynamics of these systems play a crucial role in design and operations. Hence, it would be advantageous to learn about the mechanisms that may be driving the complex dynamics of systems. In this work, our research question was aimed at addressing this open question about applicability and usefulness of novel machine learning approach in identifying the governing dynamical equations for engineered systems. We focused on distillation column which is an ubiquitous unit operation in chemical engineering and demonstrates complex dynamics i.e. it's dynamics is a combination of heuristics and fundamental physical laws. We tested the method of Sparse Identification of Non-Linear Dynamics (SINDy) because of it's ability to produce white-box models with terms that can be used for physical interpretation of dynamics. Time series data for dynamics was generated from simulation of distillation column using ASPEN Dynamics. One promising result was reduction of number of equations for dynamic simulation from 1000s in ASPEN to only 13 - one for each state variable. Prediction accuracy was high on the test data from system within the perturbation range, however outside perturbation range equations did not perform well. In terms of physical law extraction, some terms were interpretable as related to Fick's law of diffusion (with concentration terms) and Henry's law (with ratio of concentration and pressure terms). While some terms were interpretable, we conclude that more research is needed on combining engineering systems with machine learning approach to improve understanding of governing laws for unknown dynamics.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript applies the SINDy sparse regression method to time-series trajectories generated by an ASPEN Dynamics simulation of a distillation column. It reports that the procedure yields a 13-equation ODE model (one per state variable), some of whose terms can be matched post hoc to Fick's law and Henry's law, and notes that prediction accuracy is high inside the training perturbation range but degrades outside it.
Significance. Demonstrating SINDy on an industrially relevant engineered system with data from a commercial simulator is a useful test of applicability. However, the reported out-of-range failure directly undermines the central claim that the recovered equations constitute governing laws rather than a local interpolant; true governing laws are expected to hold for any reasonable input. The work therefore illustrates both the promise and the current limitations of the approach for complex chemical-engineering dynamics.
major comments (2)
- [Abstract] The claim that SINDy identifies 'governing laws' is load-bearing for the title and research question, yet the text states that 'outside perturbation range equations did not perform well.' Because the underlying physical laws (mass, energy, equilibrium) are expected to hold beyond the training perturbations, this extrapolation failure falsifies the assumption that the 13-equation model recovers the true continuous-time dynamics.
- [Abstract] Physical-term matching is reported only qualitatively ('some terms were interpretable as related to Fick's law of diffusion... and Henry's law'). No quantitative comparison (e.g., recovered coefficients versus known values from the ASPEN model or independent derivation) is provided, so the interpretability claim rests on post-hoc inspection rather than falsifiable validation.
minor comments (1)
- [Abstract] The abstract contains the grammatical error 'it's dynamics' (should be 'its dynamics').
Simulated Author's Rebuttal
We thank the referee for the constructive feedback highlighting the tension between our claims and the reported limitations. We address each major comment below and will revise the manuscript accordingly where possible.
- Referee: [Abstract] The claim that SINDy identifies 'governing laws' is load-bearing for the title and research question, yet the text states that 'outside perturbation range equations did not perform well.' Because the underlying physical laws (mass, energy, equilibrium) are expected to hold beyond the training perturbations, this extrapolation failure falsifies the assumption that the 13-equation model recovers the true continuous-time dynamics.
  Authors: We agree that the extrapolation failure indicates the recovered 13-equation model does not capture the true governing dynamics expected to hold universally. The abstract already states this limitation explicitly ('outside perturbation range equations did not perform well') and the conclusion emphasizes that more research is needed. The title is phrased as a question to reflect the exploratory intent rather than a definitive assertion. We will revise the abstract to qualify the 'governing laws' language more prominently and to clarify that the model is a local approximation rather than a universal recovery of the underlying physics.
  Revision: yes
- Referee: [Abstract] Physical-term matching is reported only qualitatively ('some terms were interpretable as related to Fick's law of diffusion... and Henry's law'). No quantitative comparison (e.g., recovered coefficients versus known values from the ASPEN model or independent derivation) is provided, so the interpretability claim rests on post-hoc inspection rather than falsifiable validation.
  Authors: The term matching was performed qualitatively via structural inspection of the recovered terms against known physical laws. A quantitative comparison of coefficients is not possible because ASPEN Dynamics is a proprietary black-box simulator whose internal parameters for diffusion and vapor-liquid equilibrium are not exposed for direct validation. We will revise the abstract to explicitly state that the interpretability is qualitative and structural only.
  Revision: partial
Circularity Check
SINDy-derived 13-equation model is the direct output of sparse regression fitted to ASPEN trajectories
specific steps
- Fitted input called prediction [Abstract]
"One promising result was reduction of number of equations for dynamic simulation from 1000s in ASPEN to only 13 - one for each state variable. Prediction accuracy was high on the test data from system within the perturbation range, however outside perturbation range equations did not perform well. In terms of physical law extraction, some terms were interpretable as related to Fick's law of diffusion (with concentration terms) and Henry's law (with ratio of concentration and pressure terms)."
The 13 equations and their physical interpretations are produced by SINDy sparse regression on the ASPEN-generated trajectories. The 'governing laws' are therefore the fitted model by construction; the noted failure to extrapolate outside the training perturbations confirms the result is a local data-driven approximation rather than recovery of the underlying continuous-time dynamics.
full rationale
The paper's central result (reduction from thousands of equations to 13, with terms interpreted as Fick's and Henry's laws) is obtained by applying SINDy sparse regression to time-series data generated from the ASPEN simulation. The abstract explicitly states that prediction accuracy holds only inside the training perturbation range and degrades outside it. This matches the fitted_input_called_prediction pattern: the claimed governing laws are defined by the fitted coefficients on the input trajectories rather than derived independently. The extrapolation failure is consistent with a local interpolative approximation whose library terms correlate with the data inside the training domain. No external verification or first-principles derivation is supplied that would make the 13 equations independent of the fit.
Axiom & Free-Parameter Ledger
free parameters (2)
- SINDy sparsity threshold
- Candidate function library
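How much the recovered structure depends on the first of these parameters can be seen in a small sweep. The sketch below fits the same synthetic data at three thresholds and counts the surviving terms. The data-generating equation, the noise level, the library, and the threshold values are assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption): one derivative built from terms of unequal size.
x, y = rng.uniform(-1.0, 1.0, size=(2, 500))
Theta = np.column_stack([np.ones_like(x), x, y, x * x, x * y, y * y])
names = ["1", "x", "y", "x^2", "x*y", "y^2"]
dxdt = 2.0 * x - 0.5 * y + 0.08 * x * y + 0.005 * rng.standard_normal(500)

def stlsq(Theta, target, threshold, n_iter=10):
    """Sequentially thresholded least squares for a single target column."""
    xi = np.linalg.lstsq(Theta, target, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold
        xi = np.zeros_like(xi)
        if not keep.any():
            break
        # Refit using only the terms that survived the threshold.
        xi[keep] = np.linalg.lstsq(Theta[:, keep], target, rcond=None)[0]
    return xi

counts = {}
for threshold in [0.01, 0.2, 1.0]:
    xi = stlsq(Theta, dxdt, threshold)
    kept = [n for n, c in zip(names, xi) if c != 0.0]
    counts[threshold] = len(kept)
    print(f"threshold={threshold}: kept {kept}")
```

The weak `x*y` term survives only at the smallest threshold, so whether a physically meaningful but small term appears in the final equations is partly a tuning decision. The library plays a similar role: a term can be recovered only if its functional form was offered as a candidate.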
axioms (1)
- Domain assumption: system dynamics are exactly sparse in a pre-specified finite library of nonlinear functions of the state.