pith. sign in

arxiv: 2605.21676 · v1 · pith:U5JADA2Jnew · submitted 2026-05-20 · 💻 cs.LO · cs.FL

SENTIL: A Runtime Verification Tool for Probabilistic Temporal Logic

Pith reviewed 2026-05-22 08:00 UTC · model grok-4.3

classification 💻 cs.LO cs.FL
keywords runtime verificationprobabilistic signal temporal logicstatistical model checkingcyber-physical systemsstochastic systemsperformance evaluationstreaming algorithms
0
0 comments X

The pith

SENTIL enables runtime verification of Probabilistic Signal Temporal Logic with statistical guarantees and 10-1000x speedups.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SENTIL as a tool to monitor requirements written in Probabilistic Signal Temporal Logic inside systems whose behavior includes random variation, such as self-driving cars or medical devices. It seeks to make such monitoring practical during actual operation by supplying confidence levels rather than absolute yes-or-no answers. The approach rests on an efficient core implementation paired with streaming calculations and sampling methods that keep the monitoring fast enough for real platforms. A sympathetic reader would care because current tools handle only non-random cases, leaving safety checks for uncertain real-world systems largely unaddressed at runtime.

Core claim

SENTIL is a runtime verification tool for PrSTL that supplies rigorous probabilistic guarantees through statistical model checking, sequential probability ratio testing, and adaptive rare event estimation. It achieves this with an efficient Rust core, streaming algorithms for incremental robustness computation, parallel Monte Carlo sampling, and a C-ABI that permits direct use in ROS, Apollo, MATLAB Simulink, AUTOSAR, and languages including C, C++, Python, and Java. Validation on autonomous-vehicle, medical-device, and biological-network examples reports 10-1000x performance gains over prior monitors while preserving provable confidence intervals.

What carries the argument

Statistical model checking and sequential probability ratio testing applied to streaming robustness values computed from PrSTL formulas.

If this is right

  • Autonomous vehicles can check probabilistic safety properties while driving and receive quantified confidence in each result.
  • Medical-device software can be validated for stochastic behaviors during live operation rather than only in offline tests.
  • Platform integration becomes direct for ROS, MATLAB Simulink, and AUTOSAR without custom adapters.
  • Resource-limited embedded targets can run the monitor because of the reported speed gains.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same streaming-plus-statistical pattern could support other probabilistic specification languages beyond PrSTL.
  • Runtime results could feed back into system controllers to adjust behavior when confidence drops.
  • Standard certification processes for stochastic controllers might eventually require evidence of this style of monitoring.

Load-bearing premise

The chosen statistical testing methods remain valid and low-cost when run continuously on streaming data from real systems.

What would settle it

A controlled experiment on a system with known true probability where the tool's reported confidence interval fails to contain that probability or where measured runtime exceeds deterministic monitors.

Figures

Figures reproduced from arXiv: 2605.21676 by Ernest Bonnah, Paapa Kwesi Quansah.

Figure 1
Figure 1. Figure 1: CARLA simulation screenshot showing Apollo ego vehicle (red [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: MATLAB Simulink insulin pump model with SENTIL S-Function verification blocks (highlighted in blue). [PITH_FULL_IMAGE:figures/full_fig_p017_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Stochastic circadian oscillator trajectories showing Period protein concentration over 240 hours. Gray lines [PITH_FULL_IMAGE:figures/full_fig_p018_3.png] view at source ↗
read the original abstract

Stochastic cyber-physical systems (CPS) permeate critical infrastructure, from autonomous vehicles to medical devices. Yet, tools for runtime verification of such systems capturing the probabilistic dynamics in stochastic systems remain generally absent despite theoretical foundations established nearly a decade ago. In this paper, we present SENTIL, a novel runtime verification tool with provable statistical guarantees for the runtime monitoring of requirements expressed as Probabilistic Signal Temporal Logic (PrSTL). SENTIL combines an efficient Rust core with universal ecosystem integration, delivering performance exceeding existing deterministic monitors while providing rigorous probabilistic guarantees through statistical model checking, sequential probability ratio testing, and adaptive rare event estimation. SENTIL employs streaming algorithms for incremental robustness computation, parallel Monte Carlo sampling, and a language-agnostic C-ABI enabling seamless deployment across ROS, Apollo, MATLAB Simulink, and AUTOSAR platforms, and direct integration in C, C++, Python, and Java. To validate the effectiveness of the proposed tool, we validate SENTIL across various scenarios spanning autonomous vehicle monitoring, medical device validation, and biological networks, demonstrating 10-1,000$\times$ performance improvements over existing tools while maintaining provable confidence intervals. SENTIL is open source (\href{https://github.com/sedislab/SENTIL}{\texttt{sedislab/SENTIL}}) and it positions probabilistic runtime verification as a deployable infrastructure for all real-world safety-critical stochastic systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 1 minor

Summary. The manuscript presents SENTIL, a runtime verification tool for Probabilistic Signal Temporal Logic (PrSTL) targeting stochastic cyber-physical systems. It features an efficient Rust core with streaming algorithms for incremental robustness computation and parallel Monte Carlo sampling, a C-ABI for integration across ROS, Apollo, MATLAB Simulink, AUTOSAR, and languages including C, C++, Python, and Java. The tool claims to deliver rigorous probabilistic guarantees via statistical model checking, sequential probability ratio testing, and adaptive rare event estimation, along with 10-1,000× performance improvements over existing deterministic monitors. Validation is reported across autonomous vehicle monitoring, medical device validation, and biological networks, with the tool released as open source.

Significance. If the streaming implementation of adaptive rare event estimation maintains statistical validity and real-time bounds while achieving the claimed performance, this would constitute a meaningful contribution by making probabilistic runtime verification practically deployable for safety-critical stochastic systems. The broad ecosystem integration and open-source release are clear strengths that support reproducibility and adoption.

major comments (1)
  1. [Abstract and Implementation] Abstract and core algorithmic description: the central claim that adaptive rare event estimation (combined with statistical model checking and SPRT) can be realized via 'streaming algorithms for incremental robustness computation, parallel Monte Carlo sampling' to yield rigorous guarantees inside a real-time runtime monitor is load-bearing yet unsupported by any derivation, pseudocode, or restriction to past/current observations. Standard adaptive rare-event methods rely on trajectory reweighting or future simulation that risks violating either statistical independence or wall-clock latency bounds; without an explicit argument showing how the sample budget is bounded and validity preserved, the performance and guarantee claims cannot be assessed.
minor comments (1)
  1. [Evaluation] The validation section asserts '10-1,000× performance improvements' and 'provable confidence intervals' across scenarios but supplies no tables with raw timings, error bars, or exact statistical parameters; adding these would improve clarity without altering the core contribution.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive review and for highlighting the need for greater clarity on the algorithmic foundations. We address the single major comment below and will revise the manuscript to strengthen the presentation of the statistical guarantees and streaming implementation.

read point-by-point responses
  1. Referee: [Abstract and Implementation] Abstract and core algorithmic description: the central claim that adaptive rare event estimation (combined with statistical model checking and SPRT) can be realized via 'streaming algorithms for incremental robustness computation, parallel Monte Carlo sampling' to yield rigorous guarantees inside a real-time runtime monitor is load-bearing yet unsupported by any derivation, pseudocode, or restriction to past/current observations. Standard adaptive rare-event methods rely on trajectory reweighting or future simulation that risks violating either statistical independence or wall-clock latency bounds; without an explicit argument showing how the sample budget is bounded and validity preserved, the performance and guarantee claims cannot be assessed.

    Authors: We agree that the current manuscript does not supply a self-contained derivation or pseudocode for the streaming realization of adaptive rare event estimation, nor does it explicitly argue the restriction to past/current observations and the bounding of the sample budget. This is a substantive gap that prevents full assessment of the claims. In the revised version we will insert a dedicated subsection (with pseudocode and a short proof sketch) that (i) defines the incremental robustness update as a streaming operator over the observed prefix, (ii) shows how the adaptive rare-event sampler draws only from the current observation window without future simulation, (iii) proves that the resulting samples remain independent under the maintained filtration, and (iv) derives an explicit, data-dependent bound on the number of Monte Carlo trajectories that respects real-time latency constraints while preserving the validity of the SPRT and statistical model checking guarantees. revision: yes

Circularity Check

0 steps flagged

No circularity detected in tool presentation paper

full rationale

The manuscript presents SENTIL as a software artifact implementing runtime verification for PrSTL using statistical model checking, SPRT, and adaptive rare event estimation in a streaming Rust core. No derivation chain, equations, or first-principles results are exhibited that reduce a claimed prediction or guarantee to an input parameter or self-citation by construction. Performance claims (10-1000x improvements) are positioned as empirical validation outcomes across scenarios rather than fitted quantities renamed as predictions. Self-citations, if present, are not load-bearing for any central mathematical result. The paper is self-contained as an engineering contribution with external benchmarks and open-source release, satisfying the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that statistical model checking and related sequential testing methods can be adapted to streaming runtime verification while preserving both statistical rigor and the reported performance levels.

axioms (1)
  • domain assumption Statistical model checking, sequential probability ratio testing, and adaptive rare event estimation provide rigorous probabilistic guarantees when applied to runtime monitoring of PrSTL formulas.
    Invoked in the abstract to justify the provable guarantees of the tool.

pith-pipeline@v0.9.0 · 5775 in / 1236 out tokens · 34624 ms · 2026-05-22T08:00:08.028100+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

32 extracted references · 32 canonical work pages

  1. [1]

    Bak, O.S., Christiansen, M.W., Eriksen, O.V., Feo-Arenis, S., Jensen, P.G., Jensen, M.D., Juozapaitis, S., Larsen, K.G., Mikučionis,M.,Muñiz,M.,etal.:Gpuacceleratingstatisticalmodelcheckingforextendedtimedautomata.In:Principles of Verification: Cycling the Probabilistic Landscape: Essays Dedicated to Joost-Pieter Katoen on the Occasion of His 60th Birthda...

  2. [2]

    In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R

    Boyer, B., Corre, K., Legay, A., Sedwards, S.: Plasma-lab: A flexible, distributable statistical model checking library. In: Joshi, K., Siegle, M., Stoelinga, M., D’Argenio, P.R. (eds.) Quantitative Evaluation of Systems. pp. 160–164. Springer Berlin Heidelberg, Berlin, Heidelberg (2013)

  3. [3]

    The Annals of Mathematical Statistics23(4), 493–507 (1952),http://www.jstor.org/stable/2236576

    Chernoff, H.: A measure of asymptotic efficiency for tests of a hypothesis based on the sum of observations. The Annals of Mathematical Statistics23(4), 493–507 (1952),http://www.jstor.org/stable/2236576

  4. [4]

    Biometrika 26(4), 404–413 (1934)

    Clopper, C.J., Pearson, E.S.: The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26(4), 404–413 (1934)

  5. [5]

    Uppaal SMC tutorial

    David, A., Larsen, K.G., Legay, A., Mikučionis, M., Poulsen, D.B.: Uppaal smc tutorial. International Journal on Software Tools for Technology Transfer17(4), 397–415 (2015).https://doi.org/10.1007/s10009-014-0361-y,https: //doi.org/10.1007/s10009-014-0361-y

  6. [6]

    In: Touili, T., Cook, B., Jackson, P

    Donzé, A.: Breach, a toolbox for verification and parameter synthesis of hybrid systems. In: Touili, T., Cook, B., Jackson, P. (eds.) Computer Aided Verification. pp. 167–170. Springer Berlin Heidelberg, Berlin, Heidelberg (2010)

  7. [7]

    In: International conference on runtime verification

    Donzé, A.: On signal temporal logic. In: International conference on runtime verification. pp. 382–383. Springer (2013)

  8. [8]

    IEEE Transactions on Automatic Control64(8), 3324–3331 (2018)

    Farahani, S.S., Majumdar, R., Prabhu, V.S., Soudjani, S.: Shrinking horizon model predictive control with signal tem- poral logic constraints under stochastic disturbances. IEEE Transactions on Automatic Control64(8), 3324–3331 (2018)

  9. [9]

    In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems

    Hartmanns, A., Hermanns, H.: The modest toolset: An integrated environment for quantitative modelling and verifica- tion. In: International Conference on Tools and Algorithms for the Construction and Analysis of Systems. pp. 593–598. Springer (2014)

  10. [10]

    International Standard ISO 26262-1:2018, International Organization for Standardization, Geneva, Switzerland (December 2018)

    International Organization for Standardization: Road vehicles — functional safety — part 1: Vocabulary. International Standard ISO 26262-1:2018, International Organization for Standardization, Geneva, Switzerland (December 2018)

  11. [11]

    In: International Conference on Computer Aided Verification

    Jegourel, C., Legay, A., Sedwards, S.: Importance splitting for statistical model checking rare properties. In: International Conference on Computer Aided Verification. pp. 576–591. Springer (2013)

  12. [12]

    Kovatchev, B.P., Breton, M., Dalla Man, C., Cobelli, C.: In silico preclinical trials: a proof of concept in closed-loop control of type 1 diabetes (2009)

  13. [13]

    In: Gopalakrishnan, G., Qadeer, S

    Kwiatkowska, M., Norman, G., Parker, D.: Prism 4.0: Verification of probabilistic real-time systems. In: Gopalakrishnan, G., Qadeer, S. (eds.) Computer Aided Verification. pp. 585–591. Springer Berlin Heidelberg, Berlin, Heidelberg (2011)

  14. [14]

    Proceedings of the National Academy of Sciences of the United States of America100(12), 7051–7056 (2003).https://doi.org/10

    Leloup, J.C., Goldbeter, A.: Toward a detailed computational model for the mammalian circadian clock. Proceedings of the National Academy of Sciences of the United States of America100(12), 7051–7056 (2003).https://doi.org/10. 1073/pnas.1132112100, epub 2003 May 29

  15. [15]

    ACM Transactions on Embedded Computing Systems22(3), 1–31 (2023)

    Lindemann, L., Jiang, L., Matni, N., Pappas, G.J.: Risk of stochastic systems for temporal logic specifications. ACM Transactions on Embedded Computing Systems22(3), 1–31 (2023)

  16. [16]

    IEEE Transactions on Automatic Control67(10), 5262–5277 (2021)

    Lindemann, L., Pappas, G.J., Dimarogonas, D.V.: Reactive and risk-aware control for signal temporal logic. IEEE Transactions on Automatic Control67(10), 5262–5277 (2021)

  17. [17]

    Nucleic Acids Research48(D1), D407–D415 (2020).https://doi.org/10

    Malik-Sheriff, R.S., Glont, M., Nguyen, T.V.N., Tiwari, K., Roberts, M.G., Xavier, A., Vu, M.T., Men, J., Maire, M., Kananathan, S., Fairbanks, E.L., Meyer, J.P., Arankalle, C., Varusai, T.M., Knight-Schrijver, V., Li, L., Dueñas-Roca, C., Dass, G., Keating, S.M., Park, Y.M., Buso, N., Rodriguez, N., Hucka, M., Hermjakob, H.: Biomodels–15 years of sharing...

  18. [18]

    Journal of diabetes science and technology8(1), 26–34 (2014)

    Man, C.D., Micheletto, F., Lv, D., Breton, M., Kovatchev, B., Cobelli, C.: The uva/padova type 1 diabetes simulator: new features. Journal of diabetes science and technology8(1), 26–34 (2014)

  19. [19]

    Nickovic, D., Yamaguchi, T.: Rtamt: Online robustness monitors from stl (2020),https://arxiv.org/abs/2005.11827

  20. [20]

    In: 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS)

    Qin, X., Xia, Y., Zutshi, A., Fan, C., Deshmukh, J.V.: Statistical verification of cyber-physical systems using surro- gate models and conformal inference. In: 2022 ACM/IEEE 13th International Conference on Cyber-Physical Systems (ICCPS). pp. 116–126. IEEE (2022) 8 P. Quansah and E. Bonnah

  21. [21]

    In: Proceedings of the 4th ACM SIGBED international workshop on design, modeling, and evaluation of cyber-physical systems

    Raman, V., Maasoumy, M., Donzé, A.: Model predictive control from signal temporal logic specifications: A case study. In: Proceedings of the 4th ACM SIGBED international workshop on design, modeling, and evaluation of cyber-physical systems. pp. 52–55 (2014)

  22. [22]

    Aviation Industry Supplement DO- 333(withEUROCAEED-216),RadioTechnicalCommissionforAeronautics(RTCA,Inc.),Washington,D.C.(December 2011)

    RTCA, Inc., EUROCAE: Formal Methods Supplement to DO-178C and DO-278A. Aviation Industry Supplement DO- 333(withEUROCAEED-216),RadioTechnicalCommissionforAeronautics(RTCA,Inc.),Washington,D.C.(December 2011)

  23. [23]

    Aviation Industry Document DO-178C (with EUROCAE ED-12C), Radio Technical Commission for Aeronautics (RTCA, Inc.), Washing- ton, D.C

    RTCA, Inc., EUROCAE: Software Considerations in Airborne Systems and Equipment Certification. Aviation Industry Document DO-178C (with EUROCAE ED-12C), Radio Technical Commission for Aeronautics (RTCA, Inc.), Washing- ton, D.C. (December 2011)

  24. [24]

    In: Proceedings of Robotics: Science and Systems XII (2016)

    Sadigh, D., Kapoor, A.: Safe control under uncertainty with probabilistic signal temporal logic. In: Proceedings of Robotics: Science and Systems XII (2016)

  25. [25]

    In: 2022 American Control Conference (ACC)

    Safaoui, S., Lindemann, L., Shames, I., Summers, T.H.: Risk-bounded temporal logic control of continuous-time stochas- tic systems. In: 2022 American Control Conference (ACC). pp. 1555–1562. IEEE (2022)

  26. [26]

    Tiger, M., Heintz, F.: Incremental reasoning in probabilistic signal temporal logic. International Journal of Ap- proximate Reasoning119, 325–352 (2020).https://doi.org/https://doi.org/10.1016/j.ijar.2020.01.009,https: //www.sciencedirect.com/science/article/pii/S0888613X19302671

  27. [27]

    Department of Transportation, National Highway Traffic Safety Administration: A vision for safety 2.0

    U.S. Department of Transportation, National Highway Traffic Safety Administration: A vision for safety 2.0. Tech. rep., U.S. Department of Transportation, National Highway Traffic Safety Administration (September 2017),https: //www.nhtsa.gov/node/36071

  28. [28]

    U.S. Food and Drug Administration: Assessing the credibility of computational modeling and simulation in medical device submissions; guidance for industry and food and drug administration staff. Tech. Rep. Docket No. FDA-2021-D- 0980, U.S. Department of Health and Human Services, Food and Drug Administration, Center for Devices and Radio- logical Health (...

  29. [29]

    Journal of the American Statistical Association22(158), 209–212 (1927)

    Wilson, E.B.: Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association22(158), 209–212 (1927)

  30. [30]

    Automatica173, 112038 (2025)

    Yao, Y., Sun, J., Zhang, Y.: Model predictive control of stochastic hybrid systems with signal temporal logic constraints. Automatica173, 112038 (2025)

  31. [31]

    Carnegie Mellon University (2004)

    Younes, H.L.S.: Verification and planning for stochastic processes with asynchronous events. Carnegie Mellon University (2004)

  32. [32]

    P" comparison number

    Yu, Z., Gao, H., Cong, X., Wu, N., Song, H.H.: A survey on cyber–physical systems security. IEEE Internet of Things Journal10(24), 21670–21686 (2023) SENTIL: A Runtime Verification Tool for Probabilistic Temporal Logic 9 A Complete PrSTL Language Specification This appendix provides the formal syntax and semantics of Probabilistic Signal Temporal Logic as...