
arxiv: 2604.07679 · v1 · submitted 2026-04-09 · 💻 cs.SE · cs.LG · cs.SY · eess.SY

Towards Counterfactual Explanation and Assertion Inference for CPS Debugging

Pith reviewed 2026-05-10 18:33 UTC · model grok-4.3

classification 💻 cs.SE · cs.LG · cs.SY · eess.SY
keywords counterfactual explanation · assertion inference · CPS debugging · cyber-physical systems · simulation failures · input signal changes · causal models · test verification

The pith

DeCaF generates minimal counterfactual changes to input signals that turn failing CPS tests into passing ones and infers generalizable assertions from those changes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces DeCaF to help engineers interpret hard-to-explain failures in large-scale CPS simulations, where violations often arise from specific interactions between continuous signals and discrete events at particular times. Given a failing test, DeCaF produces small, targeted modifications to the input signals that make the test pass, using combinations of counterfactual generators and causal models to ensure the changes are minimal, necessary, and sufficient. It then extracts logical assertions over the input values and timings that capture the conditions for success in a form engineers can read and reason about, without any access to the internal structure of the simulated model. A sympathetic reader would care because current debugging tools can point to faulty components but rarely reveal the precise input conditions that trigger the problem or the smallest fix that would have avoided it.

Core claim

DeCaF combines three counterfactual generators with two causal models to create minimal, necessary, and sufficient changes to the input signals of a failing CPS test so that the test becomes passing, then infers success assertions as logical predicates over those inputs that generalize the recovery conditions in an interpretable way.

What carries the argument

The DeCaF framework, which pairs counterfactual generators (the abstract counts three; KD-Tree Nearest Neighbors and Genetic Algorithm are the two it names) with two causal models (M5 model tree, Random Forest) to produce precise input-signal corrections and derive assertions.
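
As a rough illustration of the nearest-neighbor idea, the sketch below picks, from a pool of inputs already known to pass, the one closest to a failing input and reports the per-sample change. It is a minimal sketch under stated assumptions, not DeCaF's implementation: the data, the Euclidean metric, and the names `nearest_passing` and `counterfactual_delta` are hypothetical, and a linear scan stands in for the KD-tree index, which would answer the same query faster on large pools.

```python
import math


def nearest_passing(failing, passing_pool):
    """Return the passing input closest (Euclidean distance) to the failing one.

    Assumption: a linear scan replaces the KD-tree lookup for brevity.
    """
    return min(passing_pool, key=lambda candidate: math.dist(failing, candidate))


def counterfactual_delta(failing, passing):
    """Per-sample change that turns the failing input into the passing one."""
    return [new - old for old, new in zip(failing, passing)]


# Hypothetical data: each input is one signal sampled at four time steps.
failing_input = [80.0, 85.0, 90.0, 95.0]
passing_pool = [
    [10.0, 20.0, 30.0, 40.0],
    [80.0, 85.0, 88.0, 90.0],
    [60.0, 60.0, 60.0, 60.0],
]

counterfactual = nearest_passing(failing_input, passing_pool)
delta = counterfactual_delta(failing_input, counterfactual)
print(counterfactual, delta)
```

Only the last two samples change here, which is the kind of localized, timed modification the framework aims to surface.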

If this is right

  • Engineers obtain interpretable logical predicates that describe the exact input values and timings responsible for a violation.
  • The framework works on black-box models since it requires no internal access to the CPS simulation code.
  • Different generator-model pairs trade off success rate against causal precision, with KD-Tree Nearest Neighbors plus M5 model tree showing the highest success rate across the evaluated case studies.
  • The generated assertions characterize recovery conditions that can be checked on future inputs without rerunning full simulations.
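
To make the last point concrete, an assertion over input values and timings can be evaluated directly on a sampled signal, with no simulation run. The sketch below assumes a hypothetical assertion of the form "throttle(t) <= 90 for all t in [2, 4]"; the signal name, the bound, the time window, and the helper `always_at_most` are illustrative and not taken from the paper.

```python
def always_at_most(signal, t_start, t_end, bound):
    """True if every sample with t_start <= t <= t_end has value <= bound."""
    return all(
        value <= bound
        for t, value in signal
        if t_start <= t <= t_end
    )


def assertion(throttle):
    """Hypothetical success assertion: throttle(t) <= 90 for all t in [2, 4]."""
    return always_at_most(throttle, t_start=2.0, t_end=4.0, bound=90.0)


# Signals are lists of (time, value) samples; both are made up for illustration.
ok_signal = [(0.0, 95.0), (1.0, 92.0), (2.0, 88.0), (3.0, 85.0), (4.0, 90.0)]
bad_signal = [(0.0, 70.0), (1.0, 75.0), (2.0, 80.0), (3.0, 93.0), (4.0, 85.0)]

ok_verdict = assertion(ok_signal)    # high values before t=2 are outside the window
bad_verdict = assertion(bad_signal)  # the sample at t=3 exceeds the bound
print(ok_verdict, bad_verdict)
```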

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The assertions could be reused to filter or generate new test inputs that are guaranteed to avoid the identified failure modes.
  • If the causal models prove accurate on additional CPS examples, the same generator combinations might reduce the total number of simulations needed during verification.
  • The approach implicitly treats the input-signal space as the primary diagnostic surface rather than the model internals, which may shift debugging effort toward input specification and test design.

Load-bearing premise

That the chosen counterfactual generators and causal models can reliably produce minimal, necessary, and sufficient input changes, and that the resulting assertions accurately generalize the recovery conditions beyond the original failing tests.
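
One way to read "necessary" and "sufficient" operationally: the changed input must pass, and undoing any single change must make the test fail again. The sketch below checks both against a stand-in oracle. Everything in it is an assumption for illustration: `passes` replaces a real simulation run, and the per-sample revert is only one possible notion of necessity.

```python
def passes(inputs):
    """Hypothetical oracle standing in for a simulation run plus requirement check."""
    return max(inputs) <= 90.0


def is_sufficient(counterfactual, oracle):
    """Sufficiency: the changed input makes the test pass."""
    return oracle(counterfactual)


def unnecessary_changes(failing, counterfactual, oracle):
    """Indices whose change can be reverted while the test still passes."""
    redundant = []
    for i, (old, new) in enumerate(zip(failing, counterfactual)):
        if old == new:
            continue  # this sample was not changed
        trial = list(counterfactual)
        trial[i] = old  # undo only this one change
        if oracle(trial):
            redundant.append(i)
    return redundant


failing = [80.0, 85.0, 90.0, 95.0]
counterfactual = [80.0, 85.0, 88.0, 90.0]

sufficient = is_sufficient(counterfactual, passes)
redundant = unnecessary_changes(failing, counterfactual, passes)
print(sufficient, redundant)
```

In this toy case the change at index 2 is flagged as unnecessary, because the test still passes with that sample restored to its original value.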

What would settle it

Apply the counterfactual changes or the inferred assertions to new, previously unseen failing tests in the same CPS models and observe whether the changes actually make the tests pass or whether the assertions correctly predict success versus failure.
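
The assertion half of this experiment amounts to comparing assertion verdicts with observed outcomes on held-out tests. A minimal sketch follows, assuming a hypothetical assertion and made-up labelled inputs; the helper names and the choice of precision and recall as summary numbers are illustrative, not the paper's metrics.

```python
def confusion_counts(assertion, labelled_inputs):
    """Tally assertion verdicts against observed pass/fail outcomes."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for inputs, passed in labelled_inputs:
        predicted = assertion(inputs)
        if predicted and passed:
            counts["tp"] += 1
        elif predicted and not passed:
            counts["fp"] += 1
        elif not predicted and passed:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def assertion(inputs):
    """Hypothetical inferred assertion: every sample stays at or below 90."""
    return max(inputs) <= 90.0


# Hypothetical held-out tests: (input samples, did the simulation pass?).
held_out = [
    ([80.0, 85.0], True),
    ([95.0, 70.0], False),
    ([88.0, 89.0], False),
    ([91.0, 60.0], True),
]

counts = confusion_counts(assertion, held_out)
precision = ratio(counts["tp"], counts["tp"] + counts["fp"])
recall = ratio(counts["tp"], counts["tp"] + counts["fn"])
print(counts, precision, recall)
```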

Figures

Figures reproduced from arXiv: 2604.07679 by Hadiza Yusuf, Khouloud Gaaloul, Zaid Ghazal.

Figure 1: Example illustration of DeCaF and example signals of a counterfactual explanation generated for the AT case study.
Figure 2: Overview of the DeCaF framework.
Figure 3: Distribution of evaluation metrics of the ML techniques.
Original abstract

Verification and validation of cyber-physical systems (CPS) via large-scale simulation often surface failures that are hard to interpret, especially when triggered by interactions between continuous and discrete behaviors at specific events or times. Existing debugging techniques can localize anomalies to specific model components, but they provide little insight into the input-signal values and timing conditions that trigger violations, or the minimal, precisely timed changes that could have prevented the failure. In this article, we introduce DeCaF, a counterfactual-guided explanation and assertion-based characterization framework for CPS debugging. Given a failing test input, DeCaF generates counterfactual changes to the input signals that transform the test from failing to passing. These changes are designed to be minimal, necessary, and sufficient to precisely restore correctness. Then, it infers assertions as logical predicates over inputs that generalize recovery conditions in an interpretable form engineers can reason about, without requiring access to internal model details. Our approach combines three counterfactual generators with two causal models, and infers success assertions. Across three CPS case studies, DeCaF achieves its best success rate with KD-Tree Nearest Neighbors combined with M5 model tree, while Genetic Algorithm combined with Random Forest provides the strongest balance between success and causal precision.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper introduces DeCaF, a counterfactual-guided explanation and assertion-based characterization framework for debugging cyber-physical systems (CPS). Given a failing test input, DeCaF uses three counterfactual generators (including Genetic Algorithm and KD-Tree Nearest Neighbors) combined with causal models (such as M5 model trees and Random Forest) to produce changes to input signals that turn the test from failing to passing; these changes are claimed to be minimal, necessary, and sufficient. It then infers interpretable logical predicates (assertions) over inputs that generalize the recovery conditions. Evaluation across three CPS case studies reports that KD-Tree NN with M5 achieves the highest success rate while GA with Random Forest provides the best balance between success and causal precision.

Significance. If the central claims hold, DeCaF would offer a practical advance for CPS debugging by delivering actionable, minimal input modifications and human-readable assertions without requiring white-box access to the system under test. The multi-generator design and emphasis on causal precision address a real gap between localization techniques and interpretable root-cause analysis. The empirical results on three case studies provide initial evidence of feasibility, though the absence of verification for the minimality/necessity/sufficiency properties and generalization reduces the immediate strength of the contribution.

major comments (2)
  1. [Abstract] Abstract: The central claim that generated counterfactual changes are 'minimal, necessary, and sufficient' to restore correctness and that inferred assertions 'generalize recovery conditions' is load-bearing for the entire contribution, yet the reported evaluation provides only aggregate success rates and a balance metric with no quantitative checks (e.g., whether a strictly smaller perturbation still fails, whether the change lies on the decision boundary, or whether the assertion holds on held-out inputs).
  2. [Evaluation] Evaluation section (implied by the three case studies): The abstract and results summary give no methodology specifics, statistical details, or discussion of limitations for the success-rate and causal-precision numbers; without these, it is impossible to determine whether the heuristic generators plus causal models reliably enforce the required properties or merely produce plausible but non-minimal recoveries.
minor comments (1)
  1. [Abstract] The abstract would benefit from at least one concrete quantitative result (e.g., success rate or precision value) rather than only qualitative statements about 'best' and 'strongest balance'.
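
The check named in the first major comment (does a strictly smaller perturbation still fail?) can be sketched as a scan along the straight line from the failing input to the counterfactual. This is a hedged illustration only: `passes` is a hypothetical stand-in for a simulation run, the data are made up, and scanning one direction at a fixed resolution can refute minimality but cannot prove it.

```python
def passes(inputs):
    """Hypothetical oracle standing in for a simulation run plus requirement check."""
    return max(inputs) <= 90.0


def smallest_passing_fraction(failing, counterfactual, oracle, steps=10):
    """Return the smallest fraction alpha < 1 of the change that already passes.

    Returns None if every scaled-down change on the grid still fails.
    """
    for k in range(1, steps):
        alpha = k / steps
        trial = [old + alpha * (new - old)
                 for old, new in zip(failing, counterfactual)]
        if oracle(trial):
            return alpha
    return None


failing = [80.0, 85.0, 90.0, 95.0]
tight_cf = [80.0, 85.0, 88.0, 90.0]  # lands exactly on the passing boundary
loose_cf = [80.0, 85.0, 80.0, 80.0]  # overshoots: a smaller change would do

tight_result = smallest_passing_fraction(failing, tight_cf, passes)
loose_result = smallest_passing_fraction(failing, loose_cf, passes)
print(tight_result, loose_result)
```

A `None` result means no smaller change on the scanned grid passes; a numeric result is direct evidence that the reported counterfactual is not minimal.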

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their insightful comments, which highlight important aspects of our evaluation that can be strengthened. We provide point-by-point responses to the major comments and outline the revisions we will make to address them.

Point-by-point responses
  1. Referee: [Abstract] Abstract: The central claim that generated counterfactual changes are 'minimal, necessary, and sufficient' to restore correctness and that inferred assertions 'generalize recovery conditions' is load-bearing for the entire contribution, yet the reported evaluation provides only aggregate success rates and a balance metric with no quantitative checks (e.g., whether a strictly smaller perturbation still fails, whether the change lies on the decision boundary, or whether the assertion holds on held-out inputs).

    Authors: We acknowledge that the evaluation in the current manuscript focuses on success rates and causal precision without explicit quantitative verification of minimality, necessity, sufficiency, or generalization on held-out data. The success rate indicates that the generated counterfactuals lead to passing tests, and the balance metric considers causal precision, but direct checks such as testing smaller perturbations or boundary conditions were not performed. In the revised version, we will add these validations: we will report the average perturbation size compared to random baselines, verify that the failing test is repaired only when the full change is applied, and evaluate the inferred assertions on a held-out set of test cases to demonstrate generalization. These additions will be incorporated into the Evaluation section. revision: yes

  2. Referee: [Evaluation] Evaluation section (implied by the three case studies): The abstract and results summary give no methodology specifics, statistical details, or discussion of limitations for the success-rate and causal-precision numbers; without these, it is impossible to determine whether the heuristic generators plus causal models reliably enforce the required properties or merely produce plausible but non-minimal recoveries.

    Authors: The manuscript does provide methodology details in the Evaluation section, including descriptions of the three CPS case studies, the counterfactual generators (Genetic Algorithm and KD-Tree Nearest Neighbors), the causal models (M5 model trees and Random Forest), and how success is measured. Statistical details such as the number of experiments and averaging over runs are included. However, we agree that a more explicit discussion of limitations and potential issues with the heuristic nature of the generators is needed to fully address concerns about reliability and minimality. We will revise the Evaluation section to include additional statistical analysis (e.g., standard deviations, significance tests) and a new subsection on limitations, discussing the assumptions of the causal models and the heuristic search for counterfactuals. revision: partial

Circularity Check

0 steps flagged

No circularity detected; empirical method proposal evaluated on external case studies

Full rationale

The paper introduces DeCaF as a framework that combines three existing counterfactual generators (GA, KD-Tree NN, etc.) with two causal models to produce input changes and infer assertions for CPS debugging. All load-bearing claims are empirical performance results (success rates, causal precision) measured on three separate CPS case studies. No equations, derivations, or self-citations are presented that reduce the central claims to tautological redefinitions or fitted inputs renamed as predictions. The method is self-contained against external benchmarks and does not invoke uniqueness theorems or ansatzes from prior author work to force its conclusions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The provided abstract does not detail any free parameters, background axioms, or newly invented entities used in the framework.

pith-pipeline@v0.9.0 · 5524 in / 1147 out tokens · 74711 ms · 2026-05-10T18:33:48.109440+00:00 · methodology


