arXiv: 1907.07755 · v1 · pith:O4RO2BATnew · submitted 2019-07-18 · 📡 eess.SY · cs.LG · cs.SY · stat.ML

Can Machine Learning Identify Governing Laws For Dynamics in Complex Engineered Systems ? : A Study in Chemical Engineering

Pith reviewed 2026-05-24 19:52 UTC · model grok-4.3

classification 📡 eess.SY · cs.LG · cs.SY · stat.ML
keywords SINDy · distillation column · governing equations · machine learning · chemical engineering · sparse regression · dynamics identification · ASPEN simulation

The pith

SINDy reduces a distillation column's dynamics from thousands of equations to 13 while recovering some terms consistent with Fick's and Henry's laws.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether the SINDy algorithm can discover governing dynamical equations for a complex engineered system, specifically a distillation column whose behavior combines physical laws with heuristics. Time-series data generated from an ASPEN Dynamics simulation is used to fit a sparse model consisting of one ordinary differential equation per state variable. The resulting 13-equation model predicts system trajectories accurately inside the range of input perturbations applied during data collection. Several terms in the discovered equations can be interpreted as concentration gradients or concentration-pressure ratios that match the forms of known physical relations. The work concludes that further development is needed before the approach reliably extracts governing laws from engineered-system data.

Core claim

Applying SINDy to ASPEN-generated trajectories of a distillation column produces a sparse set of 13 differential equations that reproduce the column's dynamic response inside the training perturbation range; several of the retained nonlinear terms take the functional form of Fick's law (concentration differences driving flux) and Henry's law (equilibrium ratios of concentration to partial pressure).
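For reference, the textbook forms of the two laws named above can be written as follows. The notation is one common convention and is an assumption here, not taken from the paper: $J$ is a diffusive flux, $D$ a diffusion coefficient, $c$ a concentration, $p$ a partial pressure, and $H$ a Henry solubility constant.

```latex
% Fick's first law: flux is proportional to the concentration gradient
J = -D \,\frac{\partial c}{\partial x}

% Henry's law (solubility form): concentration is proportional to partial pressure
c = H \, p
\quad\Longleftrightarrow\quad
\frac{c}{p} = H
```

Under this reading, a retained library term built from a concentration difference would be the structural counterpart of the gradient in Fick's law, and a term of the form $c/p$ would be the counterpart of the Henry ratio. Whether the recovered coefficients correspond to physical values of $D$ or $H$ is a separate question that the structural match alone does not answer.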

What carries the argument

The SINDy algorithm, which solves a sparse regression problem over a user-supplied library of candidate nonlinear functions to recover a minimal set of ordinary differential equations whose right-hand sides best match numerically estimated time derivatives.
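The regression step described above can be sketched with sequentially thresholded least squares, one common way to solve the SINDy sparse regression. This is a minimal illustration on a hypothetical two-state damped oscillator, not the paper's distillation column or its code; the library, the threshold of 0.05, and the helper names `build_library` and `stlsq` are all assumptions made for the example.

```python
import numpy as np

def build_library(X):
    """Candidate functions [1, x, y, x^2, x*y, y^2] evaluated on each sample."""
    x, y = X[:, 0], X[:, 1]
    return np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])

def stlsq(theta, dxdt, threshold, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small terms, refit."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        for k in range(dxdt.shape[1]):  # refit each state equation separately
            keep = ~small[:, k]
            if keep.any():
                xi[keep, k] = np.linalg.lstsq(
                    theta[:, keep], dxdt[:, k], rcond=None
                )[0]
    return xi

# Toy trajectory (assumed system): dx/dt = -0.1x + 2y, dy/dt = -2x - 0.1y
t = np.linspace(0.0, 10.0, 2001)
X = np.column_stack([
    2.0 * np.exp(-0.1 * t) * np.cos(2.0 * t),
    -2.0 * np.exp(-0.1 * t) * np.sin(2.0 * t),
])

# Time derivatives are estimated numerically from the sampled states
dXdt = np.gradient(X, t, axis=0, edge_order=2)

Xi = stlsq(build_library(X), dXdt, threshold=0.05)
print(np.round(Xi, 3))  # rows: 1, x, y, x^2, x*y, y^2; columns: dx/dt, dy/dt
```

Each column of `Xi` holds the coefficients of one state's differential equation, which is the sense in which the method returns one equation per state variable.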

If this is right

  • Dynamic simulation of the distillation column can be performed with only 13 equations instead of the original thousands.
  • The discovered model remains interpretable because each term corresponds to a concrete nonlinear function of the states.
  • Some physical mechanisms (diffusion and gas-liquid equilibrium) are recoverable directly from the data-driven equations.
  • Prediction accuracy holds only for conditions similar to those used to generate the training trajectories.
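The last point can be illustrated with a toy case in which the true right-hand side is not contained in the candidate library. The sketch below uses plain least squares over a cubic polynomial library (no sparsity step) and an assumed scalar system dx/dt = -sin(x); neither is taken from the paper. It only shows how a fit can be accurate inside the sampled range and poor outside it.

```python
import numpy as np

def library(x):
    """Polynomial candidate functions [1, x, x^2, x^3]."""
    return np.column_stack([np.ones_like(x), x, x**2, x**3])

def true_rhs(x):
    """Assumed 'true' dynamics; deliberately not representable in the library."""
    return -np.sin(x)

# Fit only on states visited during "training": x in [-0.5, 0.5]
x_train = np.linspace(-0.5, 0.5, 101)
coef, *_ = np.linalg.lstsq(library(x_train), true_rhs(x_train), rcond=None)

def max_error(x):
    """Largest absolute error of the fitted right-hand side on the states x."""
    return float(np.max(np.abs(library(x) @ coef - true_rhs(x))))

in_range_error = max_error(np.linspace(-0.5, 0.5, 57))
out_of_range_error = max_error(np.linspace(2.0, 3.0, 57))
print(in_range_error, out_of_range_error)
```

In this toy setting the in-range error is orders of magnitude smaller than the out-of-range error, because the cubic is a local approximation of the sine rather than the function itself.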

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Extending the library with additional engineering-specific functions could increase the fraction of terms that map to known laws.
  • The same reduction approach might be tested on other chemical unit operations whose full models are also too large for real-time use.
  • Replacing ASPEN-generated data with measurements from an operating plant would reveal whether measurement noise prevents recovery of the same sparse structure.

Load-bearing premise

The true continuous-time dynamics are exactly a sparse combination of functions from the chosen library, and the simulation trajectories excite all relevant modes without substantial mismatch or noise.
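Stated in the usual SINDy notation (an assumption here; the symbols are not quoted from the paper), the premise is that the sampled state matrix $X$ with $n$ state columns, its derivative estimate $\dot{X}$, and the library matrix $\Theta(X)$ with $p$ candidate-function columns satisfy:

```latex
\dot{X} \approx \Theta(X)\,\Xi,
\qquad
\Xi = \begin{bmatrix} \xi_1 & \xi_2 & \cdots & \xi_n \end{bmatrix} \in \mathbb{R}^{p \times n},

\xi_k = \arg\min_{\xi}\; \bigl\lVert \dot{X}_k - \Theta(X)\,\xi \bigr\rVert_2^2 + \lambda \lVert \xi \rVert_1,
\qquad k = 1, \dots, n.
```

Each column $\xi_k$ is assumed to have only a few nonzero entries. For the column studied here, $n = 13$. If the true right-hand side lies outside the span of $\Theta(X)$, or the trajectories never visit the states where a term matters, the regression can still return a sparse $\Xi$, but it need not correspond to the underlying physics.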

What would settle it

Forward integration of the 13 discovered equations diverges from the ASPEN trajectories even for new inputs inside the original perturbation range, or none of the retained terms can be rewritten in the exact mathematical form of Fick's or Henry's law.
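The first half of this test could be sketched as a trajectory comparison. The example below is a toy stand-in: `reference` plays the role of the simulator and `discovered` the role of an identified equation, both hypothetical scalar systems, and the 5% tolerance is an arbitrary assumption rather than a criterion from the paper.

```python
import math

def rk4(f, x0, dt, n_steps):
    """Integrate the autonomous scalar ODE dx/dt = f(x) with classical RK4."""
    xs = [x0]
    x = x0
    for _ in range(n_steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        xs.append(x)
    return xs

def relative_rms_error(ref, model):
    """Root-mean-square trajectory error, normalised by the reference signal."""
    num = sum((r - m) ** 2 for r, m in zip(ref, model))
    den = sum(r ** 2 for r in ref)
    return math.sqrt(num / den)

def reference(x):
    return -2.0 * x    # hypothetical stand-in for the simulator dynamics

def discovered(x):
    return -1.98 * x   # hypothetical identified model with a small coefficient error

ref_traj = rk4(reference, x0=1.0, dt=0.01, n_steps=200)
mod_traj = rk4(discovered, x0=1.0, dt=0.01, n_steps=200)

err = relative_rms_error(ref_traj, mod_traj)
passes = err < 0.05  # assumed tolerance
print(err, passes)
```

For the real system the same comparison would be run over all 13 states and over new input perturbations drawn from inside the original range, with the tolerance fixed before looking at the results.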

Original abstract

Machine learning recently has been used to identify the governing equations for dynamics in physical systems. The promising results from applications on systems such as fluid dynamics and chemical kinetics inspire further investigation of these methods on complex engineered systems. Dynamics of these systems play a crucial role in design and operations. Hence, it would be advantageous to learn about the mechanisms that may be driving the complex dynamics of systems. In this work, our research question was aimed at addressing this open question about applicability and usefulness of novel machine learning approach in identifying the governing dynamical equations for engineered systems. We focused on distillation column which is an ubiquitous unit operation in chemical engineering and demonstrates complex dynamics i.e. it's dynamics is a combination of heuristics and fundamental physical laws. We tested the method of Sparse Identification of Non-Linear Dynamics (SINDy) because of it's ability to produce white-box models with terms that can be used for physical interpretation of dynamics. Time series data for dynamics was generated from simulation of distillation column using ASPEN Dynamics. One promising result was reduction of number of equations for dynamic simulation from 1000s in ASPEN to only 13 - one for each state variable. Prediction accuracy was high on the test data from system within the perturbation range, however outside perturbation range equations did not perform well. In terms of physical law extraction, some terms were interpretable as related to Fick's law of diffusion (with concentration terms) and Henry's law (with ratio of concentration and pressure terms). While some terms were interpretable, we conclude that more research is needed on combining engineering systems with machine learning approach to improve understanding of governing laws for unknown dynamics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript applies the SINDy sparse regression method to time-series trajectories generated by an ASPEN Dynamics simulation of a binary distillation column. It reports that the procedure yields a 13-equation ODE model (one per state) whose terms include some that can be post-hoc matched to Fick's law and Henry's law, while noting that in-range prediction accuracy is high but degrades outside the training perturbation range.

Significance. Demonstrating SINDy on an industrially relevant engineered system with data from a commercial simulator is a useful test of applicability. However, the reported out-of-range failure directly undermines the central claim that the recovered equations constitute governing laws rather than a local interpolant; true governing laws are expected to hold for any reasonable input. The work therefore illustrates both the promise and the current limitations of the approach for complex chemical-engineering dynamics.

major comments (2)
  1. [Abstract] The claim that SINDy identifies 'governing laws' is load-bearing for the title and research question, yet the text states that 'outside perturbation range equations did not perform well.' Because the underlying physical laws (mass, energy, equilibrium) are expected to hold beyond the training perturbations, this extrapolation failure falsifies the assumption that the 13-equation model recovers the true continuous-time dynamics.
  2. [Abstract] Physical-term matching is reported only qualitatively ('some terms were interpretable as related to Fick's law of diffusion... and Henry's law'). No quantitative comparison (e.g., recovered coefficients versus known values from the ASPEN model or independent derivation) is provided, so the interpretability claim rests on post-hoc inspection rather than falsifiable validation.
minor comments (1)
  1. [Abstract] The abstract contains the grammatical error 'it's dynamics' (should be 'its dynamics').

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback highlighting the tension between our claims and the reported limitations. We address each major comment below and will revise the manuscript accordingly where possible.

Point-by-point responses
  1. Referee: [Abstract] The claim that SINDy identifies 'governing laws' is load-bearing for the title and research question, yet the text states that 'outside perturbation range equations did not perform well.' Because the underlying physical laws (mass, energy, equilibrium) are expected to hold beyond the training perturbations, this extrapolation failure falsifies the assumption that the 13-equation model recovers the true continuous-time dynamics.

    Authors: We agree that the extrapolation failure indicates the recovered 13-equation model does not capture the true governing dynamics expected to hold universally. The abstract already states this limitation explicitly ('outside perturbation range equations did not perform well') and the conclusion emphasizes that more research is needed. The title is phrased as a question to reflect the exploratory intent rather than a definitive assertion. We will revise the abstract to more prominently qualify the 'governing laws' language and clarify that the model is a local approximation rather than a universal recovery of the underlying physics.

    revision: yes

  2. Referee: [Abstract] Physical-term matching is reported only qualitatively ('some terms were interpretable as related to Fick's law of diffusion... and Henry's law'). No quantitative comparison (e.g., recovered coefficients versus known values from the ASPEN model or independent derivation) is provided, so the interpretability claim rests on post-hoc inspection rather than falsifiable validation.

    Authors: The term matching was performed qualitatively via structural inspection of the recovered terms against known physical laws. A quantitative comparison of coefficients is not possible because ASPEN Dynamics is a proprietary black-box simulator whose internal parameters for diffusion and vapor-liquid equilibrium are not exposed for direct validation. We will revise the abstract to explicitly state that the interpretability is qualitative and structural only.

    revision: partial

Circularity Check

1 step flagged

SINDy-derived 13-equation model is the direct output of sparse regression fitted to ASPEN trajectories

specific steps
  1. fitted input called prediction [Abstract]
    "One promising result was reduction of number of equations for dynamic simulation from 1000s in ASPEN to only 13 - one for each state variable. Prediction accuracy was high on the test data from system within the perturbation range, however outside perturbation range equations did not perform well. In terms of physical law extraction, some terms were interpretable as related to Fick's law of diffusion (with concentration terms) and Henry's law (with ratio of concentration and pressure terms)."

    The 13 equations and their physical interpretations are produced by SINDy sparse regression on the ASPEN-generated trajectories. The 'governing laws' are therefore the fitted model by construction; the noted failure to extrapolate outside the training perturbations confirms the result is a local data-driven approximation rather than recovery of the underlying continuous-time dynamics.

full rationale

The paper's central result (reduction from thousands of equations to 13, with terms interpreted as Fick's and Henry's laws) is obtained by applying SINDy sparse regression to time-series data generated from the ASPEN simulation. The abstract explicitly states that prediction accuracy holds only inside the training perturbation range and degrades outside it. This matches the fitted_input_called_prediction pattern: the claimed governing laws are defined by the fitted coefficients on the input trajectories rather than derived independently. The extrapolation failure is consistent with a local interpolative approximation whose library terms correlate with the data inside the training domain. No external verification or first-principles derivation is supplied that would make the 13 equations independent of the fit.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The claim depends on the SINDy sparsity assumption and on user-selected hyperparameters that are tuned to the data; no independent evidence is supplied that the recovered terms are the true governing physics.

free parameters (2)
  • SINDy sparsity threshold
    Controls which candidate terms survive; value chosen to produce the reported 13-equation model.
  • Candidate function library
    The set of nonlinear terms offered to the regressor is defined by the authors and directly determines which physical interpretations are possible.
axioms (1)
  • domain assumption: System dynamics are exactly sparse in a pre-specified finite library of nonlinear functions of the state.
    Core modeling assumption of SINDy invoked to justify the reduction from thousands to 13 equations.
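The influence of the sparsity threshold can be seen in a small sweep. The sketch below uses an assumed scalar system dx/dt = -2x + 0.1x^2 with exact derivatives and a cubic polynomial library; the helper `stlsq_1d` and the threshold values are hypothetical choices for illustration, not the paper's settings.

```python
import numpy as np

def stlsq_1d(theta, dxdt, threshold, n_iter=10):
    """Sequentially thresholded least squares for a single state equation."""
    xi = np.linalg.lstsq(theta, dxdt, rcond=None)[0]
    for _ in range(n_iter):
        keep = np.abs(xi) >= threshold
        xi = np.where(keep, xi, 0.0)
        if not keep.any():
            break  # every candidate term was pruned
        xi[keep] = np.linalg.lstsq(theta[:, keep], dxdt, rcond=None)[0]
    return xi

# Assumed toy data: states sampled on [0.5, 2] with exact derivatives
x = np.linspace(0.5, 2.0, 50)
theta = np.column_stack([np.ones_like(x), x, x**2, x**3])  # library [1, x, x^2, x^3]
dxdt = -2.0 * x + 0.1 * x**2

thresholds = [0.01, 0.5, 5.0]
term_counts = [
    int(np.count_nonzero(stlsq_1d(theta, dxdt, thr))) for thr in thresholds
]
print(dict(zip(thresholds, term_counts)))
```

In this toy case the smallest threshold keeps both true terms, the middle one discards the weak quadratic term, and the largest one discards everything. The number and identity of the surviving terms therefore depend on a value the analyst picks, which is why the threshold is counted as a free parameter.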

pith-pipeline@v0.9.0 · 5843 in / 1365 out tokens · 29048 ms · 2026-05-24T19:52:34.059815+00:00 · methodology
