Prophecy: Inferring Formal Properties from Neuron Activations

Corina S. Pasareanu; Divya Gopinath; Muhammad Usman

arxiv: 2509.21677 · v2 · submitted 2025-09-25 · 💻 cs.LG · cs.SE

Prophecy: Inferring Formal Properties from Neuron Activations

Divya Gopinath , Corina S. Pasareanu , Muhammad Usman This is my paper

Pith reviewed 2026-05-18 13:27 UTC · model grok-4.3

classification 💻 cs.LG cs.SE

keywords neural networksformal propertiesneuron activationsrule extractionformal verificationproperty inferenceexplainability

0 comments

The pith

Prophecy infers formal properties of feed-forward neural networks by deriving logical rules from neuron activation patterns in hidden layers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Prophecy as a tool to automatically infer formal properties from neural networks. It extracts rules where specific neuron activation values or on/off states in inner layers act as preconditions that guarantee desirable outputs such as a particular class prediction. This approach rests on the idea that much of a network's decision logic lives in those activation patterns rather than only in the final output. A sympathetic reader would care because the rules turn internal states into usable logical statements that support verification and explanation tasks. If the extraction succeeds, it provides a direct way to reason about network behavior using the same activations the network itself produces.

Core claim

Prophecy is based on the observation that a significant part of the logic of feed-forward networks is captured in the activation status of the neurons at inner layers. Prophecy works by extracting rules based on neuron activations (values or on/off statuses) as preconditions that imply certain desirable output property, e.g., the prediction being a certain class. These rules represent network properties captured in the hidden layers that imply the desired output behavior.

What carries the argument

Extraction of rules that treat neuron activation patterns (continuous values or binary on/off states) as preconditions guaranteeing specific output properties.

If this is right

The rules enable inferring and proving formal explanations for network decisions.
They support compositional verification of systems that include neural networks.
The approach allows run-time monitoring to check whether properties hold during execution.
Extracted rules can guide repair steps that modify the network to satisfy desired properties.
The same technique shows promise for property inference in large vision-language models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the rules prove sound across many inputs, verification efforts could shift from exhaustive input testing to checking activation preconditions.
The method suggests a route to interpretability by turning internal activation patterns into explicit logical statements rather than post-hoc approximations.
Extending the rule extraction to other network types could test whether activation-based preconditions generalize beyond feed-forward architectures.

Load-bearing premise

The activation status of neurons at inner layers captures enough of the network's logic to produce rules that reliably imply the desired output properties without gaps or incorrect implications.

What would settle it

An input that satisfies the activation precondition of an extracted rule but produces an output violating the implied property would demonstrate that the rules do not hold.

Figures

Figures reproduced from arXiv: 2509.21677 by Corina S. Pasareanu, Divya Gopinath, Muhammad Usman.

**Figure 1.** Figure 1: Prophecy Tool. Overview Diagram. Several studies [17, 31, 21] have shown that neuron activations (values or on/off statuses) at a layer effectively represent the features at that layer. Typically, dense layers closer to the output encompass the logic relevant to model decisions. Guided by these insights, simple rules in terms of conditions on neuron outputs at these layers, have the potential to capture t… view at source ↗

**Figure 2.** Figure 2: Example Layer Properties. Let F denote a feedforward neural network and Fl denote the function representing the transformations applied by all layers from the input till layer l. A layer property for a post-condition P encodes a pattern σ over the neuron values in layer l that satisfies the rule, ∀x ∈ X σ(Fl(x)) ⇒ P(F(x)). In the original paper [11], we expressed σ in terms of on/off activations, σ(X) … view at source ↗

**Figure 3.** Figure 3: Inference times comparison. Pima Diabetes (PD [2])), MNIST [20], CIFAR [27]. The command when invoked with the rules parameter performs a linear search on the dictionary of rules (simulating original Prophecy implementation) [PITH_FULL_IMAGE:figures/full_fig_p011_3.png] view at source ↗

read the original abstract

We present Prophecy, a tool for automatically inferring formal properties of feed-forward neural networks. Prophecy is based on the observation that a significant part of the logic of feed-forward networks is captured in the activation status of the neurons at inner layers. Prophecy works by extracting rules based on neuron activations (values or on/off statuses) as preconditions that imply certain desirable output property, e.g., the prediction being a certain class. These rules represent network properties captured in the hidden layers that imply the desired output behavior. We present the architecture of the tool, highlight its features and demonstrate its usage on different types of models and output properties. We present an overview of its applications, such as inferring and proving formal explanations of neural networks, compositional verification, run-time monitoring, repair, and others. We also show novel results highlighting its potential in the era of large vision-language models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Prophecy introduces a tool that extracts rules from neuron activations to infer formal properties in neural nets, but the formal guarantees look under-supported without verification steps.

read the letter

The punchline is that Prophecy turns activation patterns or on/off states at hidden layers into preconditions meant to imply output properties like class membership. The paper lays out the tool architecture, its features, and shows usage on feed-forward models plus large vision-language models, with applications in explanations, compositional verification, runtime monitoring, and repair. What is actually new is this specific rule-extraction setup focused on inner-layer activations as a way to capture network logic, combined with demonstrations on modern large models rather than just small examples. They connect the approach to several practical uses, which gives it some applied direction beyond abstract ideas. The paper does well at sketching how such a tool could fit into safety workflows and at highlighting the potential for vision-language models, which is relevant given current AI trends. The soft spots center on soundness. The abstract and description give no quantitative validation, error analysis, or derivation showing the rules actually hold as implications. There is no sign of SMT encoding, region enumeration, or over-approximation to rule out counterexamples outside observed samples. The stress-test concern about needing exhaustive or symbolic checks to make the implication formal holds up here; without that, the rules risk being patterns that fail on new inputs hitting the same activations. This is aimed at the formal verification and explainable-AI crowd who build or evaluate tools for neural net properties. A reader working on safety or interpretability might pick up useful architecture ideas and use-case framing, though they would want the full experiments and code to judge reliability. It deserves peer review so referees can check the implementation details and any supporting results for whether the formal claims stand.

Referee Report

2 major / 1 minor

Summary. The paper presents Prophecy, a tool for automatically inferring formal properties of feed-forward neural networks. Prophecy extracts rules from neuron activation values or on/off statuses at hidden layers, treating these as preconditions that imply desirable output properties such as specific class predictions. The manuscript describes the tool architecture and features, demonstrates usage across model types, and outlines applications including formal explanations, compositional verification, runtime monitoring, repair, and potential extensions to large vision-language models.

Significance. If the extracted rules can be established as sound formal implications, the work would offer a practical bridge between empirical activation patterns and verifiable network properties, supporting interpretability and safety applications in neural networks. The focus on internal layer logic as a basis for property inference is a constructive direction that could complement existing verification techniques, and the broad application overview provides useful context for practitioners.

major comments (2)

[Abstract and method overview] Abstract and method overview: the claim that activation-based preconditions 'imply' output properties is presented as formal, yet the description provides no soundness argument, over-approximation technique, SMT encoding, or enumeration of consistent ReLU regions to guarantee that every input satisfying the precondition produces the claimed output with no counterexamples in the continuous domain.
[Evaluation section] Evaluation section: no quantitative validation, error rates, or counterexample analysis is reported for the inferred rules, leaving the reliability of the claimed implications for applications such as verification and monitoring unsupported.

minor comments (1)

[Applications discussion] Clarify whether the 'novel results' for vision-language models are supported by concrete examples or remain prospective.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and for recognizing the potential of Prophecy to connect neuron activations with verifiable network properties. We address each major comment below, indicating the revisions we will incorporate to strengthen the formal claims and empirical support.

read point-by-point responses

Referee: [Abstract and method overview] Abstract and method overview: the claim that activation-based preconditions 'imply' output properties is presented as formal, yet the description provides no soundness argument, over-approximation technique, SMT encoding, or enumeration of consistent ReLU regions to guarantee that every input satisfying the precondition produces the claimed output with no counterexamples in the continuous domain.

Authors: We agree that the current presentation would be strengthened by an explicit treatment of soundness. The rules in Prophecy are derived by mining activation patterns (values or binary on/off states) that hold for sampled inputs and correlate with target outputs. To guarantee that every continuous input satisfying the precondition yields the claimed output, the implications must be verified over the linear regions induced by ReLU activations. In the revised manuscript we will add a dedicated subsection that (i) states the precise conditions under which the extracted rules are sound, (ii) describes a lightweight over-approximation based on linear relaxations of ReLU, and (iii) outlines how an SMT encoding or region enumeration can be used to certify absence of counterexamples. This addition will make the formal character of the claims explicit without altering the core extraction algorithm. revision: yes
Referee: [Evaluation section] Evaluation section: no quantitative validation, error rates, or counterexample analysis is reported for the inferred rules, leaving the reliability of the claimed implications for applications such as verification and monitoring unsupported.

Authors: The present evaluation focuses on demonstrating usage across model architectures and on illustrating the intended applications. We acknowledge that quantitative evidence of rule reliability is necessary to support claims about verification and runtime monitoring. We will therefore augment the evaluation section with a new set of experiments that (a) measure the fraction of rules that admit counterexamples when checked with an SMT solver or gradient-based search, (b) report precision and recall of the inferred preconditions on held-out inputs, and (c) analyze failure cases. These metrics will be presented for both small feed-forward networks and larger vision-language models, directly addressing the concern about unsupported reliability. revision: yes

Circularity Check

0 steps flagged

No circularity: rule extraction presented as independent procedure from activation observations

full rationale

The paper introduces Prophecy as a tool that extracts rules from neuron activation patterns (values or binary statuses) at inner layers to serve as preconditions implying output properties such as class membership. This is framed as a direct consequence of the stated observation that hidden-layer activations capture significant network logic, without any equations, fitted parameters, or self-citations that define the output properties in terms of the extracted rules themselves. No derivation step reduces a claimed formal implication back to its inputs by construction; the method is described as a new extraction and application procedure for verification, monitoring, and repair tasks, remaining self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests primarily on one domain assumption about neuron activations capturing network logic. No free parameters, invented entities, or additional axioms are stated in the abstract.

axioms (1)

domain assumption A significant part of the logic of feed-forward networks is captured in the activation status of the neurons at inner layers.
This observation is explicitly stated as the basis for the rule-extraction approach in the abstract.

pith-pipeline@v0.9.0 · 5680 in / 1310 out tokens · 48613 ms · 2026-05-18T13:27:08.785254+00:00 · methodology

discussion (0)

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Engineering Resource-constrained Software Systems with DNN Components: a Concept-based Pruning Approach
cs.SE 2026-04 unverdicted novelty 5.0

A concept-based pruning method for DNNs guided by interpretable concepts and system requirements produces smaller, computationally efficient models that maintain effectiveness on image classification tasks.

Reference graph

Works this paper leans on

31 extracted references · 31 canonical work pages · cited by 1 Pith paper · 7 internal anchors

[1]

https://scikit-learn.org/stable/

work page
[2]

https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database, [Ac- cessed 31-01-2025]

Pima Indians Diabetes Database — kaggle.com. https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database, [Ac- cessed 31-01-2025]

work page 2025
[3]

Understanding intermediate layers using linear classifier probes

Alain, G., Bengio, Y.: Understanding intermediate layers using linear classifier probes. CoRRabs/1610.01644(2016), http://arxiv.org/abs/1610.01644

work page internal anchor Pith review Pith/arXiv arXiv 2016
[4]

CoRR abs/2210.05189(2022)

Aytekin, Ç.: Neural networks are decision trees. CoRR abs/2210.05189(2022). https://doi.org/10.48550/ARXIV.2210.05189, https://doi.org/10.48550/arXiv.2210.05189

work page doi:10.48550/arxiv.2210.05189 2022
[5]

Neural Decision Trees

Balestriero, R.: Neural decision trees. CoRRabs/1702.07360(2017), http://arxiv.org/abs/1702.07360

work page internal anchor Pith review Pith/arXiv arXiv 2017
[6]

Probing classifiers: Promises, shortcomings, and advances.Computational Linguistics, 48(1):207–219, 2022

Belinkov, Y.: Probing classifiers: Promises, shortcomings, and alternatives. CoRR abs/2102.12452(2021), https://arxiv.org/abs/2102.12452

work page internal anchor Pith review Pith/arXiv arXiv 2021
[7]

In: Wani, M.A., Arabnia, H.R., Cios, K.J., Hafeez, K., Kendall, G

Boz, O.: Converting A trained neural network to a decision tree dectext - decision tree extractor. In: Wani, M.A., Arabnia, H.R., Cios, K.J., Hafeez, K., Kendall, G. (eds.) Proceedings of the 2002 International Conference on Machine Learning and Applications - ICMLA 2002, June 24-27, 2002, Las Vegas, Nevada, USA. pp. 110–116. CSREA Press (2002)

work page 2002
[8]

Elazar, Y., Ravfogel, S., Jacovi, A., Goldberg, Y.: Amnesic probing: Be- havioral explanation with amnesic counterfactuals. Trans. Assoc. Com- put. Linguistics9, 160–175 (2021). https://doi.org/10.1162/TACL\_A\_00359, https://doi.org/10.1162/tacl_a_00359

work page internal anchor Pith review doi:10.1162/tacl 2021
[9]

In: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepE- val@ACL 2016, Berlin, Germany, August 2016

Ettinger, A., Elgohary, A., Resnik, P.: Probing for semantic evidence of composition by means of simple classification tasks. In: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepE- val@ACL 2016, Berlin, Germany, August 2016. pp. 134–139. Association for Computational Linguistics (2016). https://doi.org/10.18653/V1/W1...

work page doi:10.18653/v1/w16-2524 2016
[10]

Distilling a Neural Network Into a Soft Decision Tree

Frosst, N., Hinton, G.E.: Distilling a neural network into a soft decision tree. CoRR abs/1711.09784(2017), http://arxiv.org/abs/1711.09784

work page internal anchor Pith review Pith/arXiv arXiv 2017
[11]

In: 2019 34th IEEE/ACM International Con- ference on Automated Software Engineering (ASE)

Gopinath, D., Converse, H., Pasareanu, C., Taly, A.: Property inference for deep neural networks. In: 2019 34th IEEE/ACM International Con- ference on Automated Software Engineering (ASE). pp. 797–809 (2019). https://doi.org/10.1109/ASE.2019.00079

work page doi:10.1109/ase.2019.00079 2019
[12]

Runtime enforcement using knowledge bases

Gopinath, D., Lungeanu, L., Mangal, R., Pasareanu, C.S., Xie, S., Yu, H.: Feature-guided analysis of neural networks. In: Lambers, L., Uchitel, S. (eds.) Fundamental Approaches to Software Engineering - 26th Interna- tional Conference, FASE 2023, Held as Part of the European Joint Confer- ences on Theory and Practice of Software, ETAPS 2023, Paris, France...

work page doi:10.1007/978-3-031-30826-0 2023
[13]

https://doi.org/10.48550/arXiv.2104.08197

Ivanova, A., Hewitt, J., Zaslavsky, N.: Probing artificial neural networks: insights from neuroscience (04 2021). https://doi.org/10.48550/arXiv.2104.08197

work page doi:10.48550/arxiv.2104.08197 2021
[14]

In: Proc

Julian, K., Lopez, J., Brush, J., Owen, M., Kochenderfer, M.: Policy compression for aircraft collision avoidance systems. In: Proc. 35th Digital Avionics System Conf. (DASC). pp. 1–10 (2016) 14 Divya Gopinath, Corina Păsăreanu, and Muhammad Usman

work page 2016
[15]

In: Dillig, I., Tasiran, S

Katz, G., Huang, D.A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljić, A., Dill, D.L., Kochenderfer, M.J., Barrett, C.: The marabou framework for verification and analysis of deep neural networks. In: Dillig, I., Tasiran, S. (eds.) Computer Aided Verification. pp. 443–452. Springer Interna- tional Publishing, Cham (2019)

work page 2019
[16]

In: Dy, J.G., Krause, A

Kim, B., Wattenberg, M., Gilmer, J., Cai, C.J., Wexler, J., Viégas, F.B., Sayres, R.: Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). In: Dy, J.G., Krause, A. (eds.) Pro- ceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2...

work page 2018
[17]

In: Pereira, F., Burges, C., Bottou, L., Weinberger, K

Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep con- volutional neural networks. In: Pereira, F., Burges, C., Bottou, L., Weinberger, K. (eds.) Advances in Neural Information Processing Systems. vol. 25. Curran Asso- ciates, Inc. (2012)

work page 2012
[18]

In: The Eleventh International Conference on Learning Represen- tations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023

Li, K., Hopkins, A.K., Bau, D., Viégas, F.B., Pfister, H., Wattenberg, M.: Emergent world representations: Exploring a sequence model trained on a syn- thetic task. In: The Eleventh International Conference on Learning Represen- tations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net (2023), https://openreview.net/forum?id=DeG07_TcZvT

work page 2023
[19]

Marabou repository, https://github.com/NeuralNetworkVerification/Marabou

work page
[20]

The MNIST database of handwritten digits Home Page, http://yann.lecun.com/exdb/mnist/

work page
[21]

Distill (2017), https://distill.pub/2017/feature-visualization/

Olah, C., Schubert, L., Mordvintsev, A.: Feature visualization. Distill (2017), https://distill.pub/2017/feature-visualization/

work page 2017
[22]

In: Christakis, M., Pradel, M

Pinconschi, E., Gopinath, D., Abreu, R., Pasareanu, C.S.: Evaluat- ing deep neural networks in deployment: A comparative study (repli- cability study). In: Christakis, M., Pradel, M. (eds.) Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Test- ing and Analysis, ISSTA 2024, Vienna, Austria, September 16-20, 2024. pp. 1300–1311. ACM ...

work page doi:10.1145/3650212.3680401 2024
[23]

Machine Learning1(1), 81–106 (1986)

Quinlan, J.R.: Induction of decision trees. Machine Learning1(1), 81–106 (1986). https://doi.org/10.1023/A:1022643204877

work page doi:10.1023/a:1022643204877 1986
[24]

Morgan Kaufmann, San Ma- teo, CA (1993)

Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Ma- teo, CA (1993)

work page 1993
[25]

In: Meila, M., Zhang, T

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceed- ings of Machine Learning Res...

work page 2021
[26]

Sun, M., Chen, X., Kolter, J.Z., Liu, Z.: Massive activations in large language models (2024), https://arxiv.org/abs/2402.17762

work page internal anchor Pith review Pith/arXiv arXiv 2024
[27]

https://keras.io/api/datasets/cifar10/, [Accessed 31-01-2025]

Team, K.: Keras documentation: CIFAR10 small images classification dataset — keras.io. https://keras.io/api/datasets/cifar10/, [Accessed 31-01-2025]

work page 2025
[28]

In: Computer Aided Verifica- tion: 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part I

Usman, M., Gopinath, D., Sun, Y., Noller, Y., Păsăreanu, C.S.: Nnrepair: Constraint-based repair of neural network classifiers. In: Computer Aided Verifica- tion: 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part I. p. 3–25. Springer-Verlag, Berlin, Heidelberg (2021) Prophecy: Inferring Formal Properties from Neur...

work page 2021
[29]

In: Dang, T., Stolz, V

Usman,M.,Gopinath,D.,Sun,Y.,Păsăreanu,C.S.:Rule-basedruntimemitigation against poison attacks on neural networks. In: Dang, T., Stolz, V. (eds.) Runtime Verification. pp. 67–84. Springer International Publishing, Cham (2022)

work page 2022
[30]

In: Proceedings of the 1st International Workshop on Depend- ability and Trustworthiness of Safety-Critical Systems with Machine Learned Components

Usman, M., Sun, Y., Gopinath, D., Păsăreanu, C.S.: Rule-based testing of neu- ral networks. In: Proceedings of the 1st International Workshop on Depend- ability and Trustworthiness of Safety-Critical Systems with Machine Learned Components. p. 1–5. SE4SafeML 2023, Association for Computing Machin- ery, New York, NY, USA (2023). https://doi.org/10.1145/361...

work page doi:10.1145/3617574.3622747 2023
[31]

Visualizing and Understanding Convolutional Networks

Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks (2013), https://arxiv.org/abs/1311.2901 16 Divya Gopinath, Corina Păsăreanu, and Muhammad Usman Appendix 5.1 Tool Commands and Parameters. Command Parameters Description ALL -m Pre-trained model in keras (.h5) format. -wd Working directory path analyze -tx,-ty Data and labels (...

work page internal anchor Pith review Pith/arXiv arXiv 2013

[1] [1]

https://scikit-learn.org/stable/

work page

[2] [2]

https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database, [Ac- cessed 31-01-2025]

Pima Indians Diabetes Database — kaggle.com. https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database, [Ac- cessed 31-01-2025]

work page 2025

[3] [3]

Understanding intermediate layers using linear classifier probes

Alain, G., Bengio, Y.: Understanding intermediate layers using linear classifier probes. CoRRabs/1610.01644(2016), http://arxiv.org/abs/1610.01644

work page internal anchor Pith review Pith/arXiv arXiv 2016

[4] [4]

CoRR abs/2210.05189(2022)

Aytekin, Ç.: Neural networks are decision trees. CoRR abs/2210.05189(2022). https://doi.org/10.48550/ARXIV.2210.05189, https://doi.org/10.48550/arXiv.2210.05189

work page doi:10.48550/arxiv.2210.05189 2022

[5] [5]

Neural Decision Trees

Balestriero, R.: Neural decision trees. CoRRabs/1702.07360(2017), http://arxiv.org/abs/1702.07360

work page internal anchor Pith review Pith/arXiv arXiv 2017

[6] [6]

Probing classifiers: Promises, shortcomings, and advances.Computational Linguistics, 48(1):207–219, 2022

Belinkov, Y.: Probing classifiers: Promises, shortcomings, and alternatives. CoRR abs/2102.12452(2021), https://arxiv.org/abs/2102.12452

work page internal anchor Pith review Pith/arXiv arXiv 2021

[7] [7]

In: Wani, M.A., Arabnia, H.R., Cios, K.J., Hafeez, K., Kendall, G

Boz, O.: Converting A trained neural network to a decision tree dectext - decision tree extractor. In: Wani, M.A., Arabnia, H.R., Cios, K.J., Hafeez, K., Kendall, G. (eds.) Proceedings of the 2002 International Conference on Machine Learning and Applications - ICMLA 2002, June 24-27, 2002, Las Vegas, Nevada, USA. pp. 110–116. CSREA Press (2002)

work page 2002

[8] [8]

Elazar, Y., Ravfogel, S., Jacovi, A., Goldberg, Y.: Amnesic probing: Be- havioral explanation with amnesic counterfactuals. Trans. Assoc. Com- put. Linguistics9, 160–175 (2021). https://doi.org/10.1162/TACL\_A\_00359, https://doi.org/10.1162/tacl_a_00359

work page internal anchor Pith review doi:10.1162/tacl 2021

[9] [9]

In: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepE- val@ACL 2016, Berlin, Germany, August 2016

Ettinger, A., Elgohary, A., Resnik, P.: Probing for semantic evidence of composition by means of simple classification tasks. In: Proceedings of the 1st Workshop on Evaluating Vector-Space Representations for NLP, RepE- val@ACL 2016, Berlin, Germany, August 2016. pp. 134–139. Association for Computational Linguistics (2016). https://doi.org/10.18653/V1/W1...

work page doi:10.18653/v1/w16-2524 2016

[10] [10]

Distilling a Neural Network Into a Soft Decision Tree

Frosst, N., Hinton, G.E.: Distilling a neural network into a soft decision tree. CoRR abs/1711.09784(2017), http://arxiv.org/abs/1711.09784

work page internal anchor Pith review Pith/arXiv arXiv 2017

[11] [11]

In: 2019 34th IEEE/ACM International Con- ference on Automated Software Engineering (ASE)

Gopinath, D., Converse, H., Pasareanu, C., Taly, A.: Property inference for deep neural networks. In: 2019 34th IEEE/ACM International Con- ference on Automated Software Engineering (ASE). pp. 797–809 (2019). https://doi.org/10.1109/ASE.2019.00079

work page doi:10.1109/ase.2019.00079 2019

[12] [12]

Runtime enforcement using knowledge bases

Gopinath, D., Lungeanu, L., Mangal, R., Pasareanu, C.S., Xie, S., Yu, H.: Feature-guided analysis of neural networks. In: Lambers, L., Uchitel, S. (eds.) Fundamental Approaches to Software Engineering - 26th Interna- tional Conference, FASE 2023, Held as Part of the European Joint Confer- ences on Theory and Practice of Software, ETAPS 2023, Paris, France...

work page doi:10.1007/978-3-031-30826-0 2023

[13] [13]

https://doi.org/10.48550/arXiv.2104.08197

Ivanova, A., Hewitt, J., Zaslavsky, N.: Probing artificial neural networks: insights from neuroscience (04 2021). https://doi.org/10.48550/arXiv.2104.08197

work page doi:10.48550/arxiv.2104.08197 2021

[14] [14]

In: Proc

Julian, K., Lopez, J., Brush, J., Owen, M., Kochenderfer, M.: Policy compression for aircraft collision avoidance systems. In: Proc. 35th Digital Avionics System Conf. (DASC). pp. 1–10 (2016) 14 Divya Gopinath, Corina Păsăreanu, and Muhammad Usman

work page 2016

[15] [15]

In: Dillig, I., Tasiran, S

Katz, G., Huang, D.A., Ibeling, D., Julian, K., Lazarus, C., Lim, R., Shah, P., Thakoor, S., Wu, H., Zeljić, A., Dill, D.L., Kochenderfer, M.J., Barrett, C.: The marabou framework for verification and analysis of deep neural networks. In: Dillig, I., Tasiran, S. (eds.) Computer Aided Verification. pp. 443–452. Springer Interna- tional Publishing, Cham (2019)

work page 2019

[16] [16]

In: Dy, J.G., Krause, A

Kim, B., Wattenberg, M., Gilmer, J., Cai, C.J., Wexler, J., Viégas, F.B., Sayres, R.: Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (TCAV). In: Dy, J.G., Krause, A. (eds.) Pro- ceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2...

work page 2018

[17] [17]

In: Pereira, F., Burges, C., Bottou, L., Weinberger, K

Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep con- volutional neural networks. In: Pereira, F., Burges, C., Bottou, L., Weinberger, K. (eds.) Advances in Neural Information Processing Systems. vol. 25. Curran Asso- ciates, Inc. (2012)

work page 2012

[18] [18]

In: The Eleventh International Conference on Learning Represen- tations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023

Li, K., Hopkins, A.K., Bau, D., Viégas, F.B., Pfister, H., Wattenberg, M.: Emergent world representations: Exploring a sequence model trained on a syn- thetic task. In: The Eleventh International Conference on Learning Represen- tations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net (2023), https://openreview.net/forum?id=DeG07_TcZvT

work page 2023

[19] [19]

Marabou repository, https://github.com/NeuralNetworkVerification/Marabou

work page

[20] [20]

The MNIST database of handwritten digits Home Page, http://yann.lecun.com/exdb/mnist/

work page

[21] [21]

Distill (2017), https://distill.pub/2017/feature-visualization/

Olah, C., Schubert, L., Mordvintsev, A.: Feature visualization. Distill (2017), https://distill.pub/2017/feature-visualization/

work page 2017

[22] [22]

In: Christakis, M., Pradel, M

Pinconschi, E., Gopinath, D., Abreu, R., Pasareanu, C.S.: Evaluat- ing deep neural networks in deployment: A comparative study (repli- cability study). In: Christakis, M., Pradel, M. (eds.) Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Test- ing and Analysis, ISSTA 2024, Vienna, Austria, September 16-20, 2024. pp. 1300–1311. ACM ...

work page doi:10.1145/3650212.3680401 2024

[23] [23]

Machine Learning1(1), 81–106 (1986)

Quinlan, J.R.: Induction of decision trees. Machine Learning1(1), 81–106 (1986). https://doi.org/10.1023/A:1022643204877

work page doi:10.1023/a:1022643204877 1986

[24] [24]

Morgan Kaufmann, San Ma- teo, CA (1993)

Quinlan, J.R.: C4.5: Programs for Machine Learning. Morgan Kaufmann, San Ma- teo, CA (1993)

work page 1993

[25] [25]

In: Meila, M., Zhang, T

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: Meila, M., Zhang, T. (eds.) Proceedings of the 38th International Conference on Machine Learning. Proceed- ings of Machine Learning Res...

work page 2021

[26] [26]

Sun, M., Chen, X., Kolter, J.Z., Liu, Z.: Massive activations in large language models (2024), https://arxiv.org/abs/2402.17762

work page internal anchor Pith review Pith/arXiv arXiv 2024

[27] [27]

https://keras.io/api/datasets/cifar10/, [Accessed 31-01-2025]

Team, K.: Keras documentation: CIFAR10 small images classification dataset — keras.io. https://keras.io/api/datasets/cifar10/, [Accessed 31-01-2025]

work page 2025

[28] [28]

In: Computer Aided Verifica- tion: 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part I

Usman, M., Gopinath, D., Sun, Y., Noller, Y., Păsăreanu, C.S.: Nnrepair: Constraint-based repair of neural network classifiers. In: Computer Aided Verifica- tion: 33rd International Conference, CAV 2021, Virtual Event, July 20–23, 2021, Proceedings, Part I. p. 3–25. Springer-Verlag, Berlin, Heidelberg (2021) Prophecy: Inferring Formal Properties from Neur...

work page 2021

[29] [29]

In: Dang, T., Stolz, V

Usman,M.,Gopinath,D.,Sun,Y.,Păsăreanu,C.S.:Rule-basedruntimemitigation against poison attacks on neural networks. In: Dang, T., Stolz, V. (eds.) Runtime Verification. pp. 67–84. Springer International Publishing, Cham (2022)

work page 2022

[30] [30]

In: Proceedings of the 1st International Workshop on Depend- ability and Trustworthiness of Safety-Critical Systems with Machine Learned Components

Usman, M., Sun, Y., Gopinath, D., Păsăreanu, C.S.: Rule-based testing of neu- ral networks. In: Proceedings of the 1st International Workshop on Depend- ability and Trustworthiness of Safety-Critical Systems with Machine Learned Components. p. 1–5. SE4SafeML 2023, Association for Computing Machin- ery, New York, NY, USA (2023). https://doi.org/10.1145/361...

work page doi:10.1145/3617574.3622747 2023

[31] [31]

Visualizing and Understanding Convolutional Networks

Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks (2013), https://arxiv.org/abs/1311.2901 16 Divya Gopinath, Corina Păsăreanu, and Muhammad Usman Appendix 5.1 Tool Commands and Parameters. Command Parameters Description ALL -m Pre-trained model in keras (.h5) format. -wd Working directory path analyze -tx,-ty Data and labels (...

work page internal anchor Pith review Pith/arXiv arXiv 2013