PG-NODE$^{\mathrm{TB}}$: Physics-Guided Neural Ordinary Differential Equations for Tuberculosis Transmission Dynamics

Emmanuel M. Kabengele; Eric M. Mafuta; Fadi Al Machot; Jean Chamberlain Chedjou; Kyandoghere Kyamakya; Selain K. Kasereka

arxiv: 2604.15620 · v1 · submitted 2026-04-17 · 🧮 math.DS · math.OC

Selain K. Kasereka, Eric M. Mafuta, Fadi Al Machot, Emmanuel M. Kabengele, Jean Chamberlain Chedjou, Kyandoghere Kyamakya

Pith reviewed 2026-05-10 08:19 UTC · model grok-4.3

classification 🧮 math.DS math.OC

keywords tuberculosis · SLIR model · neural ordinary differential equations · physics-guided neural ODE · compartmental models · transmission dynamics · epidemiological modeling · basic reproduction number

The pith

PG-NODE reformulates the SLIR TB model so neural networks can learn unknown or time-varying rates while preserving conservation laws and biological constraints.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that embeds neural network components into a standard SLIR compartmental model for tuberculosis transmission. This lets the system adapt rate functions to data-driven variations or missing effects while enforcing the same conservation laws and interpretability that classical fixed-parameter ODEs provide. A sympathetic reader cares because TB causes over a million deaths yearly and rigid models often fail to capture real-world changes in transmission or treatment, leading to poorer forecasts for policy choices. Mathematical analysis of the base SLIR system, including the basic reproduction number and sensitivity indices, precedes three simulation tests that compare adaptive tracking, error reduction on unmodeled dynamics, and long-horizon intervention comparisons. Full training on real surveillance data is noted as the remaining validation step.

Core claim

Reformulating the SLIR system as a physics-guided neural ODE allows neural components to learn unknown or time-varying rate functions from data while the overall structure continues to obey compartmental conservation laws and biological constraints, yielding lower RMSE than the classical SLIR model in simulations of unmodeled treatment and relapse effects.

What carries the argument

The PG-NODE reformulation of the SLIR ODE system, in which neural networks parameterize rate functions subject to explicit conservation and constraint penalties.
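To make the structure concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a closed-population SLIR system without births or deaths, a tiny one-hidden-layer network `beta_theta` for the transmission rate, and illustrative values for the progression rate `k` and recovery rate `gamma`; all of these names and values are hypothetical. The learned rate is kept non-negative by a softplus output, and conservation follows from writing the right-hand side as flows between compartments, which is one by-construction alternative to penalty terms.

```python
import math
import random

def softplus(x):
    # Numerically stable softplus; the output is never negative.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def init_params(hidden=4, seed=0):
    # Hypothetical tiny network: scalar time in, scalar rate out.
    rng = random.Random(seed)
    return {
        "w1": [rng.gauss(0.0, 1.0) for _ in range(hidden)],
        "b1": [rng.gauss(0.0, 1.0) for _ in range(hidden)],
        "w2": [rng.gauss(0.0, 1.0) for _ in range(hidden)],
        "b2": 0.0,
    }

def beta_theta(t, p):
    # Neural transmission rate: tanh hidden layer, softplus output.
    hidden = [math.tanh(w * t + b) for w, b in zip(p["w1"], p["b1"])]
    return softplus(sum(w * h for w, h in zip(p["w2"], hidden)) + p["b2"])

def pg_node_rhs(t, y, p, k=0.1, gamma=0.5):
    # Assumed closed-population SLIR; k and gamma are illustrative values.
    S, L, I, R = y
    n = S + L + I + R
    infection = beta_theta(t, p) * S * I / n   # S -> L
    progression = k * L                        # L -> I
    recovery = gamma * I                       # I -> R
    # Every outflow reappears as an inflow, so the derivatives sum to zero.
    return [-infection, infection - progression,
            progression - recovery, recovery]

params = init_params()
dydt = pg_node_rhs(3.0, [900.0, 50.0, 40.0, 10.0], params)
```

Because each flow leaves one compartment and enters another, the total population is conserved for any network weights, up to floating-point rounding.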

If this is right

Time-varying transmission rates can be tracked adaptively without manual re-parameterization.
Unmodeled treatment and relapse dynamics can be corrected with 27% lower RMSE than the classical SLIR model.
Competing intervention policies can be compared over a 20-year horizon with retained epidemiological interpretability.
Predictive accuracy improves while the model stays biologically interpretable.
The framework supports simulation-based testing before empirical training on surveillance data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same constrained neural-ODE structure could be applied to other compartmental disease models that share conservation requirements.
Integration with real surveillance streams might reduce reliance on manual fitting of rate parameters in public-health modeling.
Longer-term policy comparisons could become more responsive to observed shifts in transmission behavior.
Uncertainty estimates around the learned rates could be added to support risk-aware intervention planning.

Load-bearing premise

Neural network components can learn unknown or time-varying rate functions from data without violating the SLIR model's compartmental conservation laws or biological constraints.

What would settle it

Full adjoint-based training of the PG-NODE on real WHO TB surveillance data followed by out-of-sample prediction error and constraint-violation checks against the classical SLIR model on held-out periods; no improvement or clear constraint breaches would falsify the central claim.

Figures

Figures reproduced from arXiv: 2604.15620 by Emmanuel M. Kabengele, Eric M. Mafuta, Fadi Al Machot, Jean Chamberlain Chedjou, Kyandoghere Kyamakya, Selain K. Kasereka.

**Figure 2.** Scenario 1. (a) Active TB cases I(t) over 30 years: classical SLIR (blue dashed) vs. PG-NODE with learned $\beta_\theta(t)$ (red solid). The gray dotted line marks the intervention onset (year 8). (b) Learned time-varying transmission rate $\beta_\theta(t)$: the PG-NODE captures both seasonal fluctuations and the progressive reduction due to the intervention, which the classical fixed-β model cannot represent.
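The kind of ground truth Scenario 1 describes can be sketched as follows. This is an illustration under stated assumptions, not the paper's setup: the functional form of `beta_true` (a seasonal sine term followed by exponential decline after year 8), the closed-population SLIR equations, and every parameter value are hypothetical. The sketch integrates the system with a fixed β and with the time-varying rate using a hand-written fourth-order Runge-Kutta step.

```python
import math

def beta_true(t, beta0=3.0, amp=0.2, onset=8.0, decay=0.1):
    """Hypothetical rate (per year): seasonal cycle, then decline after onset."""
    seasonal = 1.0 + amp * math.sin(2.0 * math.pi * t)
    reduction = 1.0 if t < onset else math.exp(-decay * (t - onset))
    return beta0 * seasonal * reduction

def slir_rhs(t, y, beta, k=0.1, gamma=1.0):
    """Assumed closed-population SLIR flows; `beta` is a function of time."""
    S, L, I, R = y
    n = S + L + I + R
    infection = beta(t) * S * I / n
    return [-infection, infection - k * L, k * L - gamma * I, gamma * I]

def rk4(f, y0, t0, t1, h):
    """Classical fourth-order Runge-Kutta with a fixed step size h."""
    t, y = t0, list(y0)
    out = [(t, list(y))]
    for _ in range(round((t1 - t0) / h)):
        k1 = f(t, y)
        k2 = f(t + h / 2, [a + h / 2 * b for a, b in zip(y, k1)])
        k3 = f(t + h / 2, [a + h / 2 * b for a, b in zip(y, k2)])
        k4 = f(t + h, [a + h * b for a, b in zip(y, k3)])
        y = [a + h / 6 * (p + 2 * q + 2 * r + s)
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
        t += h
        out.append((t, y))
    return out

y0 = [990.0, 0.0, 10.0, 0.0]  # illustrative initial state (S, L, I, R)
fixed = rk4(lambda t, y: slir_rhs(t, y, lambda _t: 3.0), y0, 0.0, 30.0, 0.01)
varying = rk4(lambda t, y: slir_rhs(t, y, beta_true), y0, 0.0, 30.0, 0.01)
```

A fixed-β model has no degree of freedom that can follow `beta_true` after the onset year, which is the gap a learned $\beta_\theta(t)$ is meant to close.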

**Figure 3.** Scenario 2. (a) Infectious compartment I(t): SLIRT ground truth (black solid), classical SLIR (blue dashed), and PG-NODE with neural correction (red solid). (b) Combined treated+recovered population. (c) Absolute approximation error $|I_{\text{model}}(t) - I_{\text{true}}(t)|$: PG-NODE achieves 27% lower RMSE (5.20k vs. 7.10k) than classical SLIR, with the green shaded area indicating the improvement region.
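The 27% figure can be checked directly from the two RMSE values quoted in the Figure 3 caption. The helper name `relative_reduction` is introduced here for illustration only.

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

rmse_slir = 7.10    # classical SLIR RMSE, thousands of cases (Figure 3)
rmse_pgnode = 5.20  # PG-NODE RMSE, thousands of cases (Figure 3)

reduction = relative_reduction(rmse_slir, rmse_pgnode)
print(f"{reduction:.1%}")  # prints 26.8%
```

The quoted 27% is therefore the rounded value of 1.90 / 7.10.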

**Figure 4.** Scenario 3. (a) Active TB cases forecast over 20 years under four strategies. (b) Cumulative TB cases averted relative to no-intervention baseline. Strategy B (treatment scale-up) averts the most cases in absolute terms over the 20-year window (57.7k), while PG-NODE Strategy D (combined optimal) averts 51.0k cases but achieves a substantially lower final R0 (1.49 vs. 2.53 for B), indicating superior long-t…

**Figure 5.** PG-NODE architecture for TB epidemic modeling. The neural network…

Original abstract

Tuberculosis (TB) remains a leading global infectious disease, causing approximately 1.3 million deaths and 10.6 million new infections annually. Classical compartmental ODE models are the standard epidemiological tool for TB, yet their fixed-parameter structure cannot adapt to time-varying dynamics, unmodeled effects, or heterogeneous real-world data. This paper presents a methodological framework and proof-of-concept for applying Physics-Guided Neural Ordinary Differential Equations (PG-NODE) to TB transmission modeling within a SLIR (Susceptible, Latent, Infectious, Recovered) compartmental framework. We perform a rigorous mathematical analysis of the SLIR model, including derivation of the basic reproduction number $\mathcal{R}_0$, equilibrium analysis, and normalized sensitivity indices. We then reformulate the SLIR system as a PG-NODE, preserving compartmental conservation laws and biological constraints while enabling neural network components to learn unknown or time-varying rate functions from data. Three simulation scenarios illustrate the framework's intended capabilities: (i) adaptive tracking of time-varying transmission rates, (ii) correcting for unmodeled treatment and relapse dynamics with 27\% lower RMSE than the classical SLIR, and (iii) comparative forecasting of competing intervention policies over a 20-year horizon. Simulation results indicate that PG-NODE has strong potential for improving predictive accuracy while maintaining epidemiological interpretability; full adjoint-based training on real WHO surveillance data is identified as the key next step for empirical validation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

Private letter to a colleague

This is a clean proof-of-concept that embeds a standard SLIR model in a physics-guided NODE while keeping the compartmental constraints intact, but all gains are shown only on synthetic data.

The paper first works through the usual SLIR analysis: it derives R0, finds the equilibria, and computes normalized sensitivity indices. That part is straightforward and independent of the neural component. They then recast the system as a PG-NODE so that neural pieces can learn time-varying or missing rates while the conservation laws and non-negativity stay enforced by construction. The three simulation scenarios are the main new content: one tracks a changing transmission rate, one adds unmodeled treatment and relapse terms and reports a 27% RMSE drop versus plain SLIR, and one compares long-horizon intervention policies. The authors are explicit that these are illustrations and that adjoint training on real WHO data is still required. That honesty keeps the claims proportionate.

The soft spot is exactly what the stress-test note flags: every reported number comes from trajectories generated by the same or lightly modified SLIR dynamics, so the performance edge is shown under conditions where the data-generating process is known in advance. No real surveillance data, no error bars on the neural fits, and no code release are mentioned. The math itself looks internally consistent and the citation pattern is normal for this niche.

Readers already working on hybrid or constrained neural ODEs in epidemiology will find a useful worked example here. I would bring it to a reading group to talk through the constraint design and how one would actually move to real data. It is not a paradigm shift, but the framework is coherent enough that a serious editor should send it out for review rather than desk-reject.

Referee Report

2 major / 3 minor

Summary. The manuscript presents a physics-guided neural ordinary differential equation (PG-NODE) framework for modeling tuberculosis transmission dynamics using an SLIR compartmental model. It includes a mathematical analysis deriving the basic reproduction number R0, analyzing equilibria, and computing sensitivity indices for the classical SLIR ODE system. The SLIR system is then reformulated as a PG-NODE where neural network components learn unknown or time-varying rate functions subject to constraints preserving compartmental conservation laws and non-negativity. Three simulation scenarios on synthetic data are used to illustrate adaptive rate tracking, correction for unmodeled treatment/relapse (with 27% RMSE improvement), and long-term policy forecasting.

Significance. The work provides a promising hybrid approach that combines the interpretability of classical epidemiological models with the flexibility of neural ODEs for handling time-varying parameters in TB modeling. The rigorous analysis of the SLIR model and the emphasis on preserving physical constraints are strengths that could facilitate adoption in the field. If the method proves effective on real surveillance data as proposed, it could improve predictive modeling for intervention planning. Currently, the significance is methodological, as empirical validation on real data is left for future work.

major comments (2)

[Simulation scenarios] Simulation scenarios section: The 27% RMSE reduction reported for the unmodeled treatment/relapse case is presented as a single scalar without error bars, standard deviations from repeated runs, or details on the number of independent trajectories, which is load-bearing for the quantitative performance claim relative to classical SLIR.
[PG-NODE reformulation] PG-NODE reformulation section: While the architecture is stated to enforce compartmental conservation and non-negativity, the manuscript provides no explicit mechanism (e.g., projection layer, constrained loss term, or post-training verification) or numerical check confirming that these invariants hold throughout training and inference on the synthetic trajectories.
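The repeated-run reporting requested in the first major comment could look like the sketch below. It is a toy illustration: `toy_model` is a hypothetical stand-in for one training run (the reference trajectory plus seeded Gaussian noise), and the trajectory itself is synthetic, so the numbers it produces say nothing about the paper's results.

```python
import math
import random
import statistics

def rmse(pred, truth):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth))

def rmse_over_seeds(truth, run_model, seeds):
    """Per-seed RMSEs plus their mean and sample standard deviation."""
    scores = [rmse(run_model(truth, seed), truth) for seed in seeds]
    return scores, statistics.mean(scores), statistics.stdev(scores)

def toy_model(truth, seed, sigma=0.1):
    """Stand-in for one training run: the truth plus seeded Gaussian noise."""
    rng = random.Random(seed)
    return [t + rng.gauss(0.0, sigma) for t in truth]

# Synthetic reference trajectory, for illustration only.
truth = [math.sin(0.1 * i) for i in range(200)]
scores, mean_rmse, sd_rmse = rmse_over_seeds(truth, toy_model, seeds=range(10))
print(f"RMSE = {mean_rmse:.3f} +/- {sd_rmse:.3f} over {len(scores)} seeds")
```

Reporting the seed list alongside the mean and standard deviation lets a reader judge whether a single-run difference such as 5.20k vs. 7.10k exceeds run-to-run variation.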

minor comments (3)

[Abstract] Abstract: The title uses PG-NODE^{TB} but the abstract does not define the acronym or give a one-sentence overview of the constrained neural architecture, reducing standalone readability.
[Methods] Reproducibility: No statement on code or data availability is present; given that all results are simulation-based, releasing the synthetic data generators and training scripts would strengthen the proof-of-concept.
[Forecasting scenario] Forecasting scenario: The 20-year intervention policy comparison lacks explicit parameter values or functional forms for the competing policies, making it difficult to reproduce or extend the qualitative conclusions.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the recommendation of minor revision. The positive assessment of the hybrid PG-NODE framework's potential for TB modeling is appreciated. We address each major comment point by point below, with clear indications of the revisions that will be incorporated.

Referee: [Simulation scenarios] Simulation scenarios section: The 27% RMSE reduction reported for the unmodeled treatment/relapse case is presented as a single scalar without error bars, standard deviations from repeated runs, or details on the number of independent trajectories, which is load-bearing for the quantitative performance claim relative to classical SLIR.

Authors: We agree that reporting the 27% RMSE improvement as a single scalar without accompanying statistics weakens the claim. In the revised manuscript we will perform and report results from multiple independent training runs (specifying the exact number of trials and random seeds used), provide the mean RMSE together with standard deviation for both the classical SLIR and PG-NODE models, and include error bars on the relevant figure or a supplementary table. This will allow readers to assess the robustness of the reported improvement.
revision: yes

Referee: [PG-NODE reformulation] PG-NODE reformulation section: While the architecture is stated to enforce compartmental conservation and non-negativity, the manuscript provides no explicit mechanism (e.g., projection layer, constrained loss term, or post-training verification) or numerical check confirming that these invariants hold throughout training and inference on the synthetic trajectories.

Authors: The referee is correct that the manuscript asserts preservation of the invariants but does not detail the implementation or provide verification. We will expand the PG-NODE reformulation section to explicitly describe the constraint mechanism (a composite loss term penalizing deviations from total population conservation together with non-negative activations on rate parameters and state variables) and add a short subsection with numerical checks, such as time-series plots or tables of maximum deviation from conservation and negativity bounds, evaluated on the synthetic trajectories used in all three scenarios.
revision: yes
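A composite penalty and the numerical checks described in this response might be sketched as follows. The function names, the penalty weights `w_cons` and `w_neg`, and the two toy trajectories are assumptions made for illustration; a closed population is assumed so that the total should stay at its initial value.

```python
def conservation_drift(traj):
    """Maximum relative deviation of the total population from its initial value."""
    n0 = sum(traj[0])
    return max(abs(sum(y) - n0) for y in traj) / n0

def min_compartment(traj):
    """Smallest compartment value along the trajectory (negative means a violation)."""
    return min(min(y) for y in traj)

def constraint_penalty(traj, w_cons=1.0, w_neg=1.0):
    """Mean squared conservation error plus mean squared negative part."""
    n0 = sum(traj[0])
    cons = sum((sum(y) - n0) ** 2 for y in traj) / len(traj)
    neg = sum(min(v, 0.0) ** 2 for y in traj for v in y) / len(traj)
    return w_cons * cons + w_neg * neg

# Toy trajectories of (S, L, I, R) states, for illustration only.
ok = [[900.0, 50.0, 40.0, 10.0], [890.0, 55.0, 43.0, 12.0]]
bad = [[900.0, 50.0, 40.0, 10.0], [905.0, 50.0, 40.0, -5.0]]

print(conservation_drift(ok), min_compartment(ok), constraint_penalty(ok))
print(conservation_drift(bad), min_compartment(bad), constraint_penalty(bad))
```

The penalty is zero only when both invariants hold at every stored time point, so it can be added to a data-fit loss during training and reported as a diagnostic afterwards. A soft penalty of this kind discourages violations but does not rule them out, which is why the separate drift and minimum checks are still needed.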

Circularity Check

0 steps flagged

No significant circularity detected

The paper derives the standard SLIR compartmental model, its equilibria, and R0 using conventional epidemiological methods, then embeds the system in a PG-NODE architecture that is stated to preserve mass conservation and non-negativity. All reported simulation results are generated from synthetic trajectories produced by the same or lightly augmented SLIR dynamics; the authors explicitly identify adjoint training on real WHO data as future work rather than claiming empirical validity. No load-bearing self-citations, self-definitional loops, or fitted parameters renamed as independent predictions appear in the derivation chain. The work is a self-contained methodological proof-of-concept.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Abstract-only review limits detail; the framework assumes standard SLIR compartmental structure and conservation laws while introducing neural components whose parameters are fitted to data.

free parameters (1)

neural network parameters
Weights and biases of the neural components that learn unknown rate functions; these are fitted during training.

axioms (2)

domain assumption: SLIR compartmental conservation laws hold and must be preserved
Invoked when reformulating the system as PG-NODE to maintain biological constraints.
standard math: Basic reproduction number R0 and equilibrium analysis follow from standard next-generation matrix methods
Stated as part of the rigorous mathematical analysis performed on the classical SLIR model.
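As an illustration of the second axiom, the sketch below computes R0 and normalized sensitivity indices for one common SLIR form. The model is an assumption, since the paper's exact equations are not reproduced on this page: infected compartments L and I, transmission rate `beta`, progression rate `k`, recovery rate `gamma`, natural death rate `mu`, and TB-induced death rate `d`. For that form the next-generation matrix has a single nonzero eigenvalue, beta·k / ((k + mu)(gamma + mu + d)). The parameter values are hypothetical.

```python
def r0(beta, k, gamma, mu, d):
    """Spectral radius of the next-generation matrix for the assumed SLIR form."""
    return beta * k / ((k + mu) * (gamma + mu + d))

def sensitivity(params, name, rel_step=1e-6):
    """Normalized forward sensitivity index (dR0/dp) * (p / R0), by central difference."""
    base = r0(**params)
    h = params[name] * rel_step
    up = dict(params, **{name: params[name] + h})
    down = dict(params, **{name: params[name] - h})
    return (r0(**up) - r0(**down)) / (2.0 * h) * params[name] / base

# Illustrative parameter values (per year); not taken from the paper.
params = {"beta": 0.8, "k": 0.1, "gamma": 0.5, "mu": 0.02, "d": 0.1}

for name in params:
    print(name, round(sensitivity(params, name), 4))
```

An index of +1 for `beta` means a 1% rise in the transmission rate raises R0 by 1%; the negative index for `gamma` reflects that faster recovery lowers R0.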
