arXiv: 2508.19420 · v2 · submitted 2025-08-26 · 🧬 q-bio.QM

Using PyBioNetFit to Leverage Qualitative and Quantitative Data in Biological Model Parameterization and Uncertainty Quantification

Pith reviewed 2026-05-18 20:27 UTC · model grok-4.3

classification 🧬 q-bio.QM
keywords systems biology · model parameterization · qualitative data · quantitative data · uncertainty quantification · ODE models · cellular signaling · ERK activation

The pith

PyBioNetFit systematically incorporates qualitative observations as constraints alongside quantitative data to parameterize ODE models and quantify uncertainties.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper converts qualitative observations, such as rank orderings of signaling responses under different mutations, into formal mathematical constraints that an ODE model must satisfy. These constraints are then used together with quantitative measurements to automatically fit model parameters via PyBioNetFit, replacing the manual tuning performed in a 2013 study of MEK isoforms in ERK activation. The same workflow also enables uncertainty quantification for parameters and predictions, which was absent previously. A sympathetic reader would care because qualitative data from cellular experiments is common yet often ignored or handled inconsistently, and this approach makes parameterization more reproducible and reliable for studies of regulatory systems.
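
The formalization step described above can be illustrated with a minimal Python sketch. It assumes a rank ordering is supplied as a list of condition labels from highest to lowest response and turns it into adjacent pairwise inequalities (the remaining pairs follow by transitivity). The condition labels, the observable name, and the function names are hypothetical placeholders; they are not the paper's actual constraints and not PyBioNetFit's constraint syntax.

```python
def rank_order_to_inequalities(ordered_conditions, observable):
    """Turn a rank ordering (highest response first) into adjacent inequalities.

    Each constraint is a (higher, lower, text) tuple meaning
    observable[higher] > observable[lower].
    """
    pairs = zip(ordered_conditions, ordered_conditions[1:])
    return [(hi, lo, f"{observable}[{hi}] > {observable}[{lo}]") for hi, lo in pairs]


def is_satisfied(constraint, outputs):
    """Check one constraint against simulated outputs keyed by condition."""
    hi, lo, _ = constraint
    return outputs[hi] > outputs[lo]


# Hypothetical ordering of a readout across three conditions (illustrative only).
constraints = rank_order_to_inequalities(["cond_A", "cond_B", "cond_C"], "pERK")
simulated = {"cond_A": 3.0, "cond_B": 2.0, "cond_C": 2.5}  # made-up model outputs

for c in constraints:
    print(c[2], "->", "ok" if is_satisfied(c, simulated) else "violated")
```

Storing each constraint as data rather than burying it in fitting code is what makes the observation reusable: the same list can be checked against any candidate parameter set.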

Core claim

Starting from the same data and the same ordinary differential equation model structure as the earlier study, we generate formalized statements of qualitative observations, making these observations more reusable, and we improve the model parameterization procedure by applying a systematic and automated approach enabled by the software package PyBioNetFit. We also demonstrate uncertainty quantification, which was absent in the original study.

What carries the argument

PyBioNetFit, the software that automates model fitting by treating formalized qualitative observations as mathematical constraints to be satisfied jointly with quantitative data.
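
One common way to satisfy inequality constraints jointly with quantitative data is to add a penalty for each violated inequality to a weighted sum-of-squares misfit. The sketch below assumes that form. The toy model (dx/dt = k - d*x), the data values, the penalty weight, and every name are illustrative assumptions; none of them is the paper's ERK model or PyBioNetFit's API.

```python
import math


def response(t, k, d):
    """Closed-form solution of dx/dt = k - d*x with x(0) = 0."""
    return (k / d) * (1.0 - math.exp(-d * t))


# Hypothetical wild-type measurements: (time, value, standard deviation).
WT_DATA = [(1.0, 0.9, 0.1), (5.0, 3.2, 0.2), (10.0, 4.3, 0.2)]


def quantitative_term(params):
    """Weighted sum of squared residuals against the wild-type data."""
    k, d, _alpha = params
    return sum(((y - response(t, k, d)) / s) ** 2 for t, y, s in WT_DATA)


def qualitative_term(params, weight=10.0, t_obs=10.0):
    """Penalty for violating 'mutant response < wild-type response at t_obs'.

    The mutant is modeled (by assumption) as scaling k by alpha.
    The penalty is zero when the inequality holds and grows with the violation.
    """
    k, d, alpha = params
    g = response(t_obs, alpha * k, d) - response(t_obs, k, d)
    return weight * max(0.0, g)


def objective(params):
    """Joint objective: quantitative misfit plus qualitative penalty."""
    return quantitative_term(params) + qualitative_term(params)


print(objective((1.0, 0.2, 0.5)))  # constraint satisfied: penalty is zero
print(objective((1.0, 0.2, 1.5)))  # constraint violated: penalty is added
```

Any general-purpose optimizer can then minimize `objective`; the weight sets how strongly a qualitative violation is traded off against quantitative misfit.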

If this is right

  • Model parameters are obtained through automated optimization rather than manual adjustment.
  • Uncertainty in both estimated parameters and model predictions can be quantified directly.
  • Qualitative data becomes reusable in explicit mathematical form for future analyses.
  • Reproducibility of parameterization and model analyses increases for cellular regulatory systems.
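
To make the uncertainty-quantification bullet concrete, here is a minimal sketch of sampling a posterior that mixes a quantitative likelihood with a penalized qualitative observation. It uses a plain random-walk Metropolis sampler with a fixed proposal width, which is a simpler stand-in for the adaptive MCMC the paper reports using. The single parameter, the measurement, the threshold, and the penalty weight are all made-up assumptions.

```python
import math
import random


def log_posterior(theta):
    """Unnormalized log-posterior for one hypothetical parameter."""
    if not 0.0 <= theta <= 5.0:          # uniform prior on [0, 5]
        return -math.inf
    log_like = -0.5 * ((2.0 - theta) / 0.5) ** 2   # made-up measurement 2.0 +/- 0.5
    penalty = 10.0 * max(0.0, 1.5 - theta)         # qualitative observation: theta > 1.5
    return log_like - penalty


def metropolis(n_steps, start=2.0, step=0.5, seed=0):
    """Random-walk Metropolis with a fixed Gaussian proposal."""
    rng = random.Random(seed)
    theta, lp = start, log_posterior(start)
    samples = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_posterior(proposal)
        # Accept with probability min(1, exp(lp_new - lp)).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        samples.append(theta)
    return samples


def credible_interval(samples, level=0.95):
    """Equal-tailed interval from sorted samples."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    low = ordered[int(tail * (len(ordered) - 1))]
    high = ordered[int((1.0 - tail) * (len(ordered) - 1))]
    return low, high


samples = metropolis(5000)
low, high = credible_interval(samples[1000:])  # drop the first 1000 as burn-in
print("95% credible interval:", low, high)
```

The same samples can be pushed through the model to obtain prediction intervals, which is how parameter uncertainty becomes prediction uncertainty.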

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same formalization step could be applied to other signaling pathways where rank-order or presence/absence data is more available than precise measurements.
  • Joint fitting of mixed data types may yield models with improved predictive accuracy for interventions such as targeted inhibitors.
  • The workflow could be tested on stochastic or spatial extensions of the current ODE structure to assess robustness.

Load-bearing premise

Qualitative observations from the original experiments can be translated into accurate, reusable mathematical constraints without introducing inconsistencies or bias when combined with quantitative data during fitting.

What would settle it

An experiment showing that parameter values obtained under the formalized qualitative constraints violate one or more original qualitative observations or that the uncertainty intervals fail to cover independent validation measurements.
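
Both halves of this test are mechanical once the constraints and intervals are written down. The sketch below shows one way to run them, assuming constraints are stored as (label, higher condition, lower condition) tuples and prediction intervals as (low, high) pairs. All labels and numbers are invented for illustration and are not results from the paper.

```python
def violated_constraints(constraints, outputs):
    """Return labels of constraints 'outputs[higher] > outputs[lower]' that fail."""
    return [label for label, higher, lower in constraints
            if not outputs[higher] > outputs[lower]]


def interval_coverage(intervals, measurements):
    """Fraction of measurements that fall inside their (low, high) interval."""
    hits = sum(1 for (low, high), y in zip(intervals, measurements) if low <= y <= high)
    return hits / len(measurements)


# All values below are made up for illustration.
constraints = [("A above B", "cond_A", "cond_B"), ("B above C", "cond_B", "cond_C")]
best_fit_outputs = {"cond_A": 4.1, "cond_B": 2.7, "cond_C": 2.9}
prediction_intervals = [(1.0, 2.0), (2.5, 3.5), (0.2, 0.8)]
validation_measurements = [1.4, 3.6, 0.5]

print("violated:", violated_constraints(constraints, best_fit_outputs))
print("coverage:", interval_coverage(prediction_intervals, validation_measurements))
```

A non-empty violation list, or coverage well below the nominal interval level, would be the kind of evidence this section describes.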

Figures

Figures reproduced from arXiv: 2508.19420 by Abhishek Mallela, Ely F. Miller, Jacob Neumann, Richard G. Posner, William S. Hlavacek, Yen Ting Lin.

Figure 1: Comparison of model-generated best-fit trajectories with experimental phosphorylation data in WT conditions. Each panel displays simulated trajectories for phosphorylated EGFR (solid orange), SOS1 (dashed light blue), and ERK (dotted green), plotted alongside their respective experimental data points (colored to match model predictions). All trajectories represent Maximum Likelihood Estimates (MLEs). Panel…

Figure 2: Comparison of model-predicted phosphorylated MEK trajectories under different parameterizations for five model variants. Each subplot illustrates the maximum likelihood estimates (MLEs) of phosphorylated MEK (in molecules ×10⁻⁵) for a given model. Curves are color-coded by model variant: wild type (WT, solid blue), knockout (KO, dashed orange), N78G (dotted light blue), T292A (dashed-dotted green), and T29…

Figure 3: Comparison of model-predicted phosphorylated ERK trajectories under different parameterizations for five model variants. Each subplot illustrates the maximum likelihood estimates (MLEs) of phosphorylated ERK (in molecules ×10⁻⁶) for a given model. Curves are color-coded by model variant: wild type (WT, solid blue), knockout (KO, dashed orange), N78G (dotted light blue), …

Figure 7: Marginal posterior distributions for the two sensitive parameters used in adaptive MCMC sampling of the 5 models. Panel (A) displays the marginal posterior distribution for the EGFR dimer degradation rate constant d₃, and panel (B) shows the posterior distribution for the Sos1 dephosphorylation rate constant u₃. These histograms were generated from the adaptive MCMC samples used in Figures 5 and 6 and il…

Original abstract

Data generated in studies of cellular regulatory systems are often qualitative. For example, measurements of signaling readouts in the presence and absence of mutations may reveal a rank ordering of responses across conditions but not the precise extents of mutation-induced differences. Qualitative data are often ignored by mathematical modelers or are considered in an ad hoc manner, as in the study of Kocieniewski and Lipniacki (2013) [Phys Biol 10: 035006], which was focused on the roles of MEK isoforms in ERK activation. In this earlier study, model parameter values were tuned manually to obtain consistency with a combination of qualitative and quantitative data. This approach is not reproducible, nor does it provide insights into parametric or prediction uncertainties. Here, starting from the same data and the same ordinary differential equation (ODE) model structure, we generate formalized statements of qualitative observations, making these observations more reusable, and we improve the model parameterization procedure by applying a systematic and automated approach enabled by the software package PyBioNetFit. We also demonstrate uncertainty quantification (UQ), which was absent in the original study. Our results show that PyBioNetFit enables qualitative data to be leveraged, together with quantitative data, in parameterization of systems biology models and facilitates UQ. These capabilities are important for reliable estimation of model parameters and model analyses in studies of cellular regulatory systems and reproducibility.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The manuscript applies PyBioNetFit to re-parameterize the ODE model of MEK isoforms in ERK activation from Kocieniewski and Lipniacki (2013). Starting from the same data and model structure, the authors formalize qualitative observations (rank orderings of responses under mutations) into mathematical constraints, combine them with quantitative data for automated fitting, and perform uncertainty quantification absent from the original manual tuning.

Significance. If the formalization step is robust and documented, the work provides a reproducible, systematic alternative to ad hoc manual tuning and demonstrates how mixed qualitative-quantitative data can be leveraged for parameterization and UQ in systems biology. This has practical value for improving reliability and reproducibility in cellular regulatory modeling.

major comments (1)
  1. The section describing generation of formalized statements of qualitative observations: the translation of rank orderings into inequalities on model outputs (steady-state or transients) is not accompanied by sensitivity analysis on choices such as which variables to constrain or inequality strictness. This is load-bearing because different formalizations alter the feasible parameter region and thus the reported UQ; without cross-validation against held-out qualitative observations or explicit documentation of the translation rules, the reproducibility advantage over the 2013 manual approach cannot be fully assessed.
minor comments (2)
  1. The abstract would be strengthened by reporting at least one concrete quantitative result, such as a change in parameter uncertainty ranges or fit residual metrics relative to the original study.
  2. All formalized qualitative constraints should be explicitly tabulated (with the exact inequalities and the model variables they apply to) to support the claim of reusability.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thoughtful review and for highlighting an important aspect of reproducibility in our approach. We address the major comment below and will revise the manuscript to strengthen documentation and analysis of the formalization process.

Point-by-point responses
  1. Referee: The section describing generation of formalized statements of qualitative observations: the translation of rank orderings into inequalities on model outputs (steady-state or transients) is not accompanied by sensitivity analysis on choices such as which variables to constrain or inequality strictness. This is load-bearing because different formalizations alter the feasible parameter region and thus the reported UQ; without cross-validation against held-out qualitative observations or explicit documentation of the translation rules, the reproducibility advantage over the 2013 manual approach cannot be fully assessed.

    Authors: We agree that explicit documentation of the translation rules and sensitivity analysis on formalization choices would improve the manuscript. In the revision we will add a dedicated subsection that lists each qualitative observation from Kocieniewski and Lipniacki (2013), states the exact inequality applied to the corresponding model output (including whether steady-state or transient values are used), and provides the rationale drawn directly from the original text. We will also include a sensitivity study that perturbs inequality strictness (e.g., replacing strict inequalities with relaxed thresholds differing by small epsilon values) and reports the resulting changes in the posterior parameter distributions and key UQ metrics. Because the study incorporates every qualitative observation reported in the 2013 paper, a true held-out cross-validation set is not available; we will therefore discuss this data limitation explicitly and note that the formalization rules are intended to be reusable for future studies that may permit such validation.

    Revision: yes
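
The strictness perturbation proposed in this exchange can be sketched in a few lines. The example below assumes a single parameter, two made-up saturating output functions, and a constraint of the form output_a - output_b > margin, then measures how much of a parameter grid stays feasible as the margin grows. The functions and margins are hypothetical and unrelated to the paper's model.

```python
def output_a(theta):
    """Made-up saturating response for condition A."""
    return theta / (theta + 0.5)


def output_b(theta):
    """Made-up saturating response for condition B (weaker than A for theta > 0)."""
    return theta / (theta + 1.0)


def feasible_fraction(margin, n_grid=1001):
    """Fraction of grid points in [0, 1] where output_a - output_b > margin."""
    thetas = [i / (n_grid - 1) for i in range(n_grid)]
    feasible = [th for th in thetas if output_a(th) - output_b(th) > margin]
    return len(feasible) / n_grid


for margin in (0.0, 0.05, 0.1, 0.15, 0.2):
    print(f"margin={margin:.2f}  feasible fraction={feasible_fraction(margin):.3f}")
```

In this toy case the feasible set shrinks as the margin increases and eventually becomes empty, which is why the choice of strictness can change the reported uncertainty and deserves explicit documentation.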

Circularity Check

0 steps flagged

No significant circularity; parameterization uses external 2013 data and independent software

The paper takes the ODE model structure and both quantitative and qualitative observations from the independent 2013 Kocieniewski & Lipniacki study, formalizes the qualitative observations into reusable constraints, and applies the external PyBioNetFit package to perform joint fitting and UQ. No step reduces by construction to a fitted parameter renamed as a prediction, a self-defined quantity, or a load-bearing self-citation chain; the central demonstration rests on the software's documented ability to handle mixed data types against an external benchmark model. The formalization step is presented as an explicit modeling choice rather than a derived result, and the reported outcomes (parameter estimates and uncertainty ranges) are generated from the combined inputs rather than presupposed by them.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the ODE model structure and data from the 2013 study, plus the assumption that qualitative rank-order observations can be converted into optimization constraints. No new physical entities are postulated. Free parameters are the kinetic rate constants of the signaling model being estimated from the combined data.

free parameters (1)
  • kinetic parameters of the ODE model
    Biological rate constants and other parameters in the MEK-ERK signaling ODE model that are fitted to the combined qualitative and quantitative data.
axioms (1)
  • domain assumption: Qualitative experimental observations (e.g., rank orderings of responses across mutation conditions) can be translated into formal mathematical constraints suitable for automated optimization.
    Invoked when the paper states it generates formalized statements of qualitative observations starting from the same data as the 2013 study.

pith-pipeline@v0.9.0 · 5802 in / 1696 out tokens · 58212 ms · 2026-05-18T20:27:53.934782+00:00 · methodology

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. (Supplemental Tables 1–5). Then, following the inference approach of Mitra et al. (2018), which is elaborated above, we applied, in a single global optimization, a parallelized metaheuristic optimization method implemented in PyBioNetFit (Mitra et al., 2019) to find maximum likelihood estimates (MLEs) for 28 model parameters and 3 scaling factors that rel...
  2. Mitra, E. D., Hlavacek, W. S. (2019). Parameter estimation and uncertainty quantification for systems biology models. Current Opinion in Systems Biology, 18, 9–18.
  3. Chen, K. C., Csikasz-Nagy, A., Gyorffy, B., Val, J., Novak, B., Tyson, J. J. (2000). Kinetic analysis of a molecular model of the budding yeast cell cycle. Molecular Biology of the Cell, 11, 369–391.
  4. Chen, K. C., Calzone, L., Csikasz-Nagy, A., Cross, F. R., Novak, B., Tyson, J. J. (2004). Integrative analysis of cell cycle control in budding yeast. Molecular Biology of the Cell, 15, 3841–3862.
  5. Kraikivski, P., Chen, K. C., Laomettachit, T., Murali, T. M., Tyson, J. J. (2015). From START to FINISH: computational analysis of cell cycle control in budding yeast. NPJ Systems Biology and Applications, 1, 1–9.
  6. Barik, D., Ball, D. A., Peccoud, J., Tyson, J. J. (2016). A stochastic model of the yeast cell cycle reveals roles for feedback regulation in limiting cellular variability. PLOS Computational Biology, 12, e1005230.
  7. Tyson, J. J., Laomettachit, T., Kraikivski, P. (2019). Modeling the dynamic behavior of biochemical regulatory networks. Journal of Theoretical Biology, 462, 514–527.
  8. Kinney, J. B., Tkacik, G., Callan, C. G. (2007). Precise physical models of protein-DNA interaction from high-throughput data. Proceedings of the National Academy of Sciences USA, 104, 501–506.
  9. Kinney, J. B., Murugan, A., Callan, C. G., Jr., Cox, E. C. (2010). Using deep sequencing to characterize the biophysical mechanism of a transcriptional regulatory sequence. Proceedings of the National Academy of Sciences USA, 107, 9158–9163.
  10. Atwal, G. S., Kinney, J. B. (2016). Learning quantitative sequence–function relationships from massively parallel experiments. Journal of Statistical Physics, 162, 1203–1243.
  11. Kinney, J. B., McCandlish, D. M. (2019). Massively parallel assays and quantitative sequence-function relationships. Annual Review of Genomics and Human Genetics, 20, 99–127.
  12. Toni, T., Jovanovic, G., Huvet, M., Buck, M., Stumpf, M. P. H. (2011). From qualitative data to quantitative models: analysis of the phage shock protein stress response in Escherichia coli. BMC Systems Biology, 5, 69.
  13. Pargett, M., Umulis, D. M. (2013). Quantitative model analysis with diverse biological data: applications in developmental pattern formation. Methods, 62, 56–67.
  14. Pargett, M., Rundell, A. E., Buzzard, G. T., Umulis, D. M. (2014). Model-based analysis for qualitative data: an application in Drosophila germline stem cell regulation. PLOS Computational Biology, 10, e1003498.
  15. Schmiester, L., Weindl, D., Hasenauer, J. (2020). Parameterization of mechanistic models from qualitative data using an efficient optimal scaling approach. Journal of Mathematical Biology, 81, 603–623.
  16. Schmiester, L., Weindl, D., Hasenauer, J. (2021). Efficient gradient-based parameter estimation for dynamic models using qualitative data. Bioinformatics, 37, 4493–4500.
  17. Dorešić, D., Grein, S., Hasenauer, J. (2024). Efficient parameter estimation for ODE models of cellular processes using semi-quantitative data. Bioinformatics, 40, i558–i566 (btae210).
  18. Oguz, C., Laomettachit, T., Chen, K. C., Watson, L. T., Baumann, W. T., Tyson, J. J. (2013). Optimization and model reduction in the high dimensional parameter space of a budding yeast cell cycle model. BMC Systems Biology, 7, 1–17.
  19. Mitra, E. D., Dias, R., Posner, R. G., Hlavacek, W. S. (2018). Using both qualitative and quantitative data in parameter identification for systems biology models. Nature Communications, 9, 3901.
  20. Mitra, E. D., Hlavacek, W. S. (2020). Bayesian inference using qualitative observations of underlying continuous variables. Bioinformatics, 36, 3177–3184.
  21. Mitra, E. D., Suderman, R., Colvin, J., Ionkov, A., Hu, A., Sauro, H. M., Posner, R. G., Hlavacek, W. S. (2019). PyBioNetFit and the Biological Property Specification Language. iScience, 19, 1012–1036.
  22. Schälte, Y., Fröhlich, F., Jost, P. J., Vanhoefer, J., Pathirana, D., Stapor, P., Lakrisenko, P., Wang, D., Raimúndez, E., Merkt, S., Schmiester, L., Städter, P., Grein, S., Dudkin, E., Dorešić, D., Weindl, D., Hasenauer, J. (2023). pyPESTO: A modular and scalable tool for parameter estimation for dynamic models. Bioinformatics, 39, btad711.
  23. Kocieniewski, P., Lipniacki, T. (2013). MEK1 and MEK2 differentially control the duration and amplitude of the ERK cascade response. Physical Biology, 10, 035006.
  24. Catalanotti, F., Reyes, G., Jesenberger, V., Galabova-Kovacs, G., de Matos Simoes, R., Carugo, O., Baccarini, M. (2009). A Mek1–Mek2 heterodimer determines the strength and duration of the Erk signal. Nature Structural & Molecular Biology, 16, 294–303.
  25. Kamioka, Y., Yasuda, S., Fujita, Y., Aoki, K., Matsuda, M. (2010). Multiple decisive phosphorylation sites for the negative feedback regulation of SOS1 via ERK. Journal of Biological Chemistry, 285, 33540–33548.
  26. Kreutz, C., Raue, A., Kaschek, D., Timmer, J. (2013). Profile likelihood in systems biology. FEBS Journal, 280, 2564–2571.
  27. Andrieu, C., Thoms, J. (2008). A tutorial on adaptive MCMC. Statistics and Computing, 18, 343–373.
  28. Neumann, J., Lin, Y. T., Mallela, A., Miller, E. F., Colvin, J., Duprat, A. T., Chen, Y., Hlavacek, W. S., Posner, R. G. (2022). Implementation of a practical Markov chain Monte Carlo sampling algorithm in PyBioNetFit. Bioinformatics, 38, 1770–1772.
  29. Faeder, J. R., Blinov, M. L., Hlavacek, W. S. (2009). Rule-based modeling of biochemical systems with BioNetGen. Methods in Molecular Biology, 500, 113–167.
  30. Keating, S. M., Waltemath, D., König, M., Zhang, F., Dräger, A., Chaouiya, C., Bergmann, F. T., Finney, A., Gillespie, C. S., Helikar, T., Hoops, S., Malik-Sheriff, R. S., Moodie, S. L., Moraru, I. I., Myers, C. J., Naldi, A., Olivier, B. G., Sahle, S., Schaff, J. C., Smith, L. P., Swat, M. J., Thieffry, D., Watanabe, L., Wilkinson, D. J., Blinov, M. L., ...
  31. Harris, L. A., Hogg, J. S., Tapia, J. J., Sekar, J. A. P., Gupta, S., Korsunsky, I., Arora, A., Barua, D., Sheehan, R. P., Faeder, J. R. (2016). BioNetGen 2.2: advances in rule-based modeling. Bioinformatics, 32, 3366–3368.
  32. Vehtari, A., Gelman, A., Simpson, D., Carpenter, B., Bürkner, P. C. (2021). Rank-normalization, folding, and localization: an improved 𝑅̂ for assessing convergence of MCMC (with discussion). Bayesian Analysis, 16, 667–718.

Tables

Parameter | Original Parameter Value | Parameter Description
c1L | 2.0×10⁻² | c1 value after stimulation with ligand
c2 | 2.0×10⁻⁷ | EGF...