pith. sign in

arxiv: 2606.05191 · v1 · pith:PTQXBSXSnew · submitted 2026-05-07 · 💻 cs.LG · eess.SP

PyCC.id: A package for hypothesis-driven equation discovery with structural identifiability

Pith reviewed 2026-06-30 23:05 UTC · model grok-4.3

classification 💻 cs.LG eess.SP
keywords equation discoveryODEstructural identifiabilityPython libraryhypothesis-driventime-seriesskeletondifferential equations
0
0 comments X

The pith

PyCC is a Python library for hypothesis-driven ODE discovery using structural skeletons with identifiability properties.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces PyCC as a tool to address the ill-conditioned inverse problem of discovering ODEs from time-series data by allowing definition of structural skeletons. These skeletons represent families of equations and can incorporate user hypotheses and priors iteratively. Some skeletons offer structural identifiability, enabling checks on whether the skeleton is suitable. The library's modularity supports different discovery methods like neural networks and symbolic regression.

Core claim

In this work, we present the Python library PyCC, which condenses these efforts into a flexible tool that allows researchers and engineers to seamlessly define their skeletons and hypotheses to discover ODEs from time-dependent data. The methodology uses structural skeletons inspired by characteristic curves to define a hypothesis-driven approach where identifiability properties help validate the skeleton.

What carries the argument

The structural skeleton associated with a family of ODEs, enabling hypothesis incorporation and providing structural identifiability properties for validation.

If this is right

  • Users can check if a skeleton is correct or should be discarded using identifiability properties.
  • The search space is reduced by incorporating domain knowledge as priors.
  • Multiple equation discovery paradigms can be used due to modularity.
  • Iterative refinement of models is supported based on hypotheses.
  • The ill-conditioned nature of the inverse problem is mitigated by pre-defined constraints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This approach might extend to discovering other types of equations beyond ODEs.
  • It could connect to identifiability analysis in parameter estimation problems.
  • Testing on standard benchmark datasets for equation discovery could validate the identifiability checks.
  • Automation of hypothesis suggestion could be a natural next step.

Load-bearing premise

Some skeletons have demonstrable structural identifiability properties that allow checking whether the skeleton is correct or should be discarded.

What would settle it

An experiment showing that identifiability analysis on a skeleton fails to distinguish between correct and incorrect models when applied to time-series data from a known system.

Figures

Figures reproduced from arXiv: 2606.05191 by Federico J. Gonzalez.

Figure 1
Figure 1. Figure 1: The architecture for a second-order system. Two independent neural networks (NN [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
read the original abstract

Data-driven equation discovery is fundamentally an inverse problem that seeks to infer the governing differential equations of a system directly from time-series measurements. A known issue is the ill-conditioned nature of the inverse problem, which frequently produces multiple mathematical models that fit the data similarly well. One path to address this issue is by incorporating known hypotheses and constraints into the training phase beforehand. While this approach effectively reduces the search space, it still results in multiple candidate models, forcing practitioners to rely on post-hoc manual filtering based on their own domain expertise. A recent approach incorporates structural `skeletons' inspired by characteristic curves (CCs), defining a hypothesis-driven methodology. In this methodology, practitioners define a skeleton, which is associated with a family of ordinary differential equations (ODEs), and then add their hypotheses and priors based on their domain knowledge to refine the obtained model iteratively. An important advantage of this approach is that some skeletons have demonstrable structural identifiability properties, which are useful for checking whether the skeleton is correct or should be discarded. Furthermore, this formalism enables the use of multiple equation discovery paradigms due to its modularity (such as neural networks, symbolic regression, and sparse regression). In this work, we present the Python library PyCC, which condenses these efforts into a flexible tool that allows researchers and engineers to seamlessly define their skeletons and hypotheses to discover ODEs from time-dependent data.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript presents PyCC (also styled PyCC.id), a Python library for hypothesis-driven discovery of ODEs from time-series data. Users define structural skeletons that represent families of ODEs, incorporate domain hypotheses and priors, and exploit structural identifiability properties of certain skeletons to prune or validate candidate models. The formalism is modular, supporting multiple discovery back-ends including neural networks, symbolic regression, and sparse regression.

Significance. If the implementation correctly delivers the advertised modularity and identifiability checks, the package would supply a practical, extensible framework that integrates prior knowledge directly into the discovery pipeline, potentially mitigating the ill-posedness of inverse problems in equation discovery. The explicit support for structural identifiability as a model-selection criterion is a distinctive feature that could improve reliability over purely data-driven approaches.

minor comments (3)
  1. The title uses 'PyCC.id' while the abstract and body consistently refer to 'PyCC'; the naming convention should be unified and explained.
  2. The abstract asserts that 'some skeletons have demonstrable structural identifiability properties' and that the formalism 'enables the use of multiple equation discovery paradigms,' yet provides no concrete code examples, skeleton definitions, or usage snippets to illustrate these features.
  3. No installation instructions, dependency list, or link to the source repository appear in the provided text; these are standard requirements for a software-package manuscript.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary, recognition of the package's potential significance, and recommendation for minor revision. The report contains no specific major comments requiring point-by-point responses.

Circularity Check

0 steps flagged

No circularity: software library description with no derivations

full rationale

The manuscript presents PyCC as a modular Python package for users to define ODE skeletons and hypotheses. No equations, fitted parameters, predictions, or derivation steps appear in the abstract or description. The statement that 'some skeletons have demonstrable structural identifiability properties' is presented as a pre-existing advantage of the approach rather than a result derived or fitted within this paper. No self-citation chains, ansatzes, or renamings are invoked as load-bearing steps. The work is self-contained as a tool release and contains no internal reduction of outputs to inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides no information on free parameters, axioms or invented entities; the contribution is a software library implementing an existing approach.

pith-pipeline@v0.9.1-grok · 5777 in / 999 out tokens · 25773 ms · 2026-06-30T23:05:06.867263+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

14 extracted references · 11 canonical work pages · 2 internal anchors

  1. [1]

    Scientific machine learning

    Felix Dietrich and Wil Schilders. “Scientific machine learning”. In:Mathematische Semester- berichte72.2 (2025), pp. 89–115.issn: 1432-1815.doi:10.1007/s00591-025-00399-4

  2. [2]

    Carola-Bibiane Sch¨ onlieb and Zakhar Shumaylov.Data-driven approaches to inverse problems

  3. [3]

    arXiv:2506.11732 [math.NA].url:https://arxiv.org/abs/2506.11732

  4. [4]

    Determination of the characteristic curves of a nonlinear first order system from Fourier analysis

    Federico J. Gonzalez. “Determination of the characteristic curves of a nonlinear first order system from Fourier analysis”. en. In:Sci. Rep.13.1 (Feb. 2023), p. 1955.doi:10.1038/s41598-023- 29151-5

  5. [5]

    System identification based on characteristic curves: a mathematical connection between power series and Fourier analysis for first-order nonlinear systems

    F. J. Gonzalez. “System identification based on characteristic curves: a mathematical connection between power series and Fourier analysis for first-order nonlinear systems”. In:Nonlinear Dyn. 112.18 (July 2024), pp. 16167–16197.issn: 1573-269X.doi:10.1007/s11071-024-09890-4

  6. [6]

    Interpretable neural network system identification method for two families of second-order systems based on characteristic curves

    Federico J. Gonzalez and Luis P. Lara. “Interpretable neural network system identification method for two families of second-order systems based on characteristic curves”. In:Nonlin- ear Dyn.113.24 (Sept. 2025), pp. 33063–33086.issn: 1573-269X.doi:10.1007/s11071- 025- 11744-6

  7. [7]

    Integrating prior knowledge in equation discovery: Interpretable symmetry- informed neural networks and symbolic regression via characteristic curves

    Federico J. Gonzalez. “Integrating prior knowledge in equation discovery: Interpretable symmetry- informed neural networks and symbolic regression via characteristic curves”. In: (2026). arXiv: 2601.21720 [nlin.CD].url:https://arxiv.org/abs/2601.21720

  8. [8]

    Nayfeh and P

    Ali H. Nayfeh and P. Frank Pai.Linear and Nonlinear Structural Mechanics. New York, NY: John Wiley & Sons, Aug. 2004.doi:10.1002/9783527617562

  9. [9]

    Discovering governing equations from data by sparse identification of nonlinear dynamical systems

    Steven L. Brunton, Joshua L. Proctor, and J. Nathan Kutz. “Discovering governing equations from data by sparse identification of nonlinear dynamical systems”. In:Proc. Natl. Acad. Sci. 113.15 (2016), pp. 3932–3937.doi:10.1073/pnas.1517384113

  10. [10]

    Miles Cranmer.Interpretable Machine Learning for Science with PySR and SymbolicRegression.jl

  11. [11]

    arXiv:2305.01582 [astro-ph.IM]

  12. [12]

    Ricky T. Q. Chen et al.Neural Ordinary Differential Equations. 2018. arXiv:1806 . 07366 [cs.LG].url:https://arxiv.org/abs/1806.07366

  13. [13]

    Raissi, P

    M. Raissi, P. Perdikaris, and G.E. Karniadakis. “Physics-informed neural networks: A deep learn- ing framework for solving forward and inverse problems involving nonlinear partial differential equations”. In:Journal of Computational Physics378 (2019), pp. 686–707.issn: 0021-9991.doi: 10.1016/j.jcp.2018.10.045

  14. [14]

    PyTorch: An Imperative Style, High-Performance Deep Learning Library

    Adam Paszke et al. “PyTorch: An Imperative Style, High-Performance Deep Learning Library”. In:NeurIPS 32. Ed. by H. Wallach et al. Curran Associates, Inc., 2019, pp. 8024–8035. 11