Quantitative Verification of Neural Networks And its Security Applications
Pith reviewed 2026-05-25 16:45 UTC · model grok-4.3
The pith
A sampling-based framework delivers bounded-error estimates of how many inputs satisfy logical properties in neural networks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The framework is the first to offer PAC-style soundness guarantees for quantitative verification, meaning its estimates of the number of inputs satisfying a property are within a controllable and bounded error from the true count, achieved through an efficient sampling procedure when the network is binarized.
What carries the argument
NPAQ, a prototype tool that encodes the property as a logical formula and uses sampling to produce PAC-bounded quantitative estimates for binarized neural networks.
If this is right
- Quantifies the robustness of a network to adversarial inputs by counting violating cases.
- Measures the efficacy of trojan attacks by estimating how many inputs trigger the backdoor.
- Assesses fairness and bias by counting inputs that lead to discriminatory outputs.
- Supports richer probabilistic reasoning about neural network behavior beyond binary checks.
Where Pith is reading between the lines
- Similar sampling techniques might be adapted for continuous-valued networks with discretization.
- The approach could inspire quantitative verification in other machine learning models like decision trees.
- Security certifications might one day require such counts instead of existential checks alone.
Load-bearing premise
The estimates rely on the network being binarized so that the property check can be efficiently sampled with bounded error via PAC learning.
What would settle it
Finding a binarized network and property where repeated sampling produces an estimate outside the claimed error bound at the stated confidence level.
Figures
read the original abstract
Neural networks are increasingly employed in safety-critical domains. This has prompted interest in verifying or certifying logically encoded properties of neural networks. Prior work has largely focused on checking existential properties, wherein the goal is to check whether there exists any input that violates a given property of interest. However, neural network training is a stochastic process, and many questions arising in their analysis require probabilistic and quantitative reasoning, i.e., estimating how many inputs satisfy a given property. To this end, our paper proposes a novel and principled framework to quantitative verification of logical properties specified over neural networks. Our framework is the first to provide PAC-style soundness guarantees, in that its quantitative estimates are within a controllable and bounded error from the true count. We instantiate our algorithmic framework by building a prototype tool called NPAQ that enables checking rich properties over binarized neural networks. We show how emerging security analyses can utilize our framework in 3 concrete point applications: quantifying robustness to adversarial inputs, efficacy of trojan attacks, and fairness/bias of given neural networks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a framework for quantitative verification of logical properties over neural networks. It claims to be the first to provide PAC-style soundness guarantees, meaning quantitative estimates are within a controllable bounded error from the true count. The framework is instantiated as the NPAQ prototype tool for binarized neural networks and demonstrated on three security applications: quantifying robustness to adversarial inputs, efficacy of trojan attacks, and fairness/bias of neural networks.
Significance. If the claimed reduction to an efficient sampling procedure over binarized networks with PAC-bounded error holds, the work would enable probabilistic and quantitative reasoning in NN verification, which is valuable for security analyses where counting satisfying inputs matters. The grounding in standard statistical learning theory for the bounds is a potential strength.
major comments (1)
- [Abstract / Framework description] The central claim rests on the existence of an efficient sampling procedure whose output count is within a PAC-bounded error of the true measure after reducing the NN property to a logical formula over a binarized network. The abstract asserts this is achieved in NPAQ, but no derivation of the PAC bounds, error analysis, or explicit sampling procedure is visible; the manuscript must show the concrete reduction and bound application in the relevant technical section.
minor comments (1)
- Clarify the precise form of the PAC guarantee (additive vs. multiplicative error, dependence on network size or formula complexity) to make the soundness claim easier to evaluate.
Simulated Author's Rebuttal
We thank the referee for their review and constructive feedback on our manuscript. We address the single major comment below.
read point-by-point responses
-
Referee: [Abstract / Framework description] The central claim rests on the existence of an efficient sampling procedure whose output count is within a PAC-bounded error of the true measure after reducing the NN property to a logical formula over a binarized network. The abstract asserts this is achieved in NPAQ, but no derivation of the PAC bounds, error analysis, or explicit sampling procedure is visible; the manuscript must show the concrete reduction and bound application in the relevant technical section.
Authors: We agree that the central technical elements require more explicit presentation. Sections 3 and 4 describe the property reduction to a logical formula over the binarized network and invoke the PAC sampling procedure, but we will revise the manuscript to add a dedicated subsection that derives the PAC bounds from statistical learning theory, provides the error analysis, and includes pseudocode for the sampling procedure. revision: yes
Circularity Check
No significant circularity identified
full rationale
The paper's central claim of PAC-style soundness guarantees rests on applying standard statistical learning theory to bound sampling error for properties encoded over binarized networks. This is an external result from PAC learning, not derived from or equivalent to the paper's own fitted parameters, definitions, or self-citations. No load-bearing step reduces by construction to its inputs; the framework is self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption PAC learning bounds apply to the sampling procedure used for counting satisfying inputs to logical properties over binarized neural networks
Reference graph
Works this paper leans on
-
[1]
Correctional Offender Management Profiling for Alternative Sanctions
2012. Correctional Offender Management Profiling for Alternative Sanctions. http://www.northpointeinc.com/files/downloads/FAQ_Document.pdf. (2012)
work page 2012
-
[2]
UCI Machine Learning Repository
2017. UCI Machine Learning Repository. http://archive.ics.uci.edu/ml. (2017)
work page 2017
-
[3]
BDDs for pseudo-Boolean constraints–revisited
Ignasi Abío, Robert Nieuwenhuis, Albert Oliveras, and Enric Rodríguez- Carbonell. BDDs for pseudo-Boolean constraints–revisited. In SAT’11
-
[4]
A parametric approach for smaller and better encodings of cardinality constraints
Ignasi Abío, Robert Nieuwenhuis, Albert Oliveras, and Enric Rodríguez- Carbonell. A parametric approach for smaller and better encodings of cardinality constraints. In CP’13
-
[5]
On the solution-space geome- try of random constraint satisfaction problems
Dimitris Achlioptas and Federico Ricci-Tersenghi. On the solution-space geome- try of random constraint satisfaction problems. In STOC’06
-
[6]
Dimitris Achlioptas and P. Theodoropoulos. Probabilistic Model Counting with Short XORs. In SAT’17
-
[7]
FairSquare: probabilistic verification of program fairness
Aws Albarghouthi, Loris D’Antoni, Samuel Drews, and Aditya V Nori. FairSquare: probabilistic verification of program fairness. In OOPSLA’17
-
[8]
Roberto Asín, Robert Nieuwenhuis, Albert Oliveras, and Enric Rodríguez- Carbonell. 2011. Cardinality networks: a theoretical and empirical study. Con- straints (2011)
work page 2011
-
[9]
Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples
Anish Athalye, Nicholas Carlini, and David Wagner. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In ICML’18
-
[10]
#∃ SAT: Projected Model Counting
Rehan Abdul Aziz, Geoffrey Chu, Christian Muise, and Peter Stuckey. #∃ SAT: Projected Model Counting. In SAT’15
-
[11]
Neural machine translation by jointly learning to align and translate
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. In ICLR’15. 5https://www.nscc.sg/
-
[12]
Sorting networks and their applications
Kenneth E Batcher. Sorting networks and their applications. In AFIPS’1968
work page 1968
-
[13]
Certification considerations for adaptive systems
Siddhartha Bhattacharyya, Darren Cofer, D Musliner, Joseph Mueller, and Eric Engstrom. Certification considerations for adaptive systems. In ICUAS’15
-
[14]
Poisoning attacks against support vector machines
Battista Biggio, Blaine Nelson, and Pavel Laskov. Poisoning attacks against support vector machines. In ICML’12
-
[15]
An effective decision procedure for linear arithmetic over the integers and reals
Bernard Boigelot, Sébastien Jodogne, and Pierre Wolper. An effective decision procedure for linear arithmetic over the integers and reals. In TOCL’05
-
[16]
Mariusz Bojarski, Davide Del Testa, Daniel Dworakowski, Bernhard Firner, Beat Flepp, Prasoon Goyal, Lawrence D Jackel, Mathew Monfort, Urs Muller, Jiakai Zhang, and Others. 2016. End to end learning for self-driving cars. arXiv (2016)
work page 2016
-
[17]
Nicholas Carlini, Chang Liu, Jernej Kos, Úlfar Erlingsson, and Dawn Song. 2018. The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets. arXiv (2018)
work page 2018
-
[18]
Nicholas Carlini, Pratyush Mishra, Tavish Vaidya, Yuankai Zhang, Micah Sherr, Clay Shields, David Wagner, and Wenchao Zhou. Hidden Voice Commands.. In USENIX’16
-
[19]
Towards Evaluating the Robustness of Neural Networks
Nicholas Carlini and David Wagner. Towards Evaluating the Robustness of Neural Networks. In SP’17
-
[20]
From Weighted to Unweighted Model Counting
Supratik Chakraborty, Dror Fried, Kuldeep S Meel, and Moshe Y Vardi. From Weighted to Unweighted Model Counting.. In IJCAI’15
-
[21]
A scalable approxi- mate model counter
Supratik Chakraborty, Kuldeep S Meel, and Moshe Y Vardi. A scalable approxi- mate model counter. In CP’13
-
[22]
Supratik Chakraborty, Kuldeep S Meel, and Moshe Y Vardi. Algorithmic im- provements in approximate counting for probabilistic inference: From linear to logarithmic SAT calls. In IJCAI’16
-
[23]
Balancing scalability and uniformity in SAT witness generator
Supratik Chakraborty, Kuldeep S Meel, and Moshe Y Vardi. Balancing scalability and uniformity in SAT witness generator. In DAC’14
-
[24]
Use privacy in data-driven systems: Theory and experiments with machine learnt programs
Anupam Datta, Matthew Fredrikson, Gihyuk Ko, Piotr Mardziel, and Shayak Sen. Use privacy in data-driven systems: Theory and experiments with machine learnt programs. In CCS’17
-
[25]
Anupam Datta, Shayak Sen, and Yair Zick. Algorithmic transparency via quanti- tative input influence: Theory and experiments with learning systems. In SP’16
-
[26]
Elvis Dohmatob. 2018. Limitations of adversarial robustness: strong No Free Lunch Theorem. arXiv (2018)
work page 2018
-
[27]
Jeffrey M. Dudek, Kuldeep S. Meel, and Moshe Y. Vardi. Combining the k-CNF and XOR phase-transitions. In IJCAI’16
-
[28]
Jeffrey M. Dudek, Kuldeep S. Meel, and Moshe Y. Vardi. The Hard Problems Are Almost Everywhere For Random CNF-XOR Formulas. In IJCAI’17
-
[29]
Krishnamurthy Dvijotham, Robert Stanforth, Sven Gowal, Timothy Mann, and Pushmeet Kohli. 2018. A Dual Approach to Scalable Verification of Deep Net- works. arXiv (2018)
work page 2018
-
[30]
Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In ITCS’12
-
[31]
Niklas Eén and Niklas Sörensson. 2006. Translating Pseudo-Boolean Constraints into SAT. JSAT 2, 1-4 (2006), 1–26
work page 2006
-
[32]
Formal verification of piece-wise linear feed-forward neural networks
Ruediger Ehlers. Formal verification of piece-wise linear feed-forward neural networks. In ATV A’17
-
[33]
Embed and project: Discrete sampling with universal hashing
Stefano Ermon, Carla P Gomes, Ashish Sabharwal, and Bart Selman. Embed and project: Discrete sampling with universal hashing. In NIPS’13
-
[34]
Taming the Curse of Dimensionality: Discrete Integration by Hashing and Optimization
Stefano Ermon, Carla P Gomes, Ashish Sabharwal, and Bart Selman. Taming the Curse of Dimensionality: Discrete Integration by Hashing and Optimization. In ICML’13
-
[35]
Stefano Ermon, Carla P Gomes, Ashish Sabharwal, and Bart Selman. 2013. Op- timization with parity constraints: From binary codes to discrete integration. arXiv (2013)
work page 2013
-
[36]
Robust physical-world attacks on deep learning models
Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, and Dawn Song. Robust physical-world attacks on deep learning models. In CVPR’18
-
[37]
Certifying and removing disparate impact
Michael Feldman, Sorelle A Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In SIGKDD’15
-
[38]
Nic Ford, Justin Gilmer, Nicolas Carlini, and Dogus Cubuk. 2019. Adversarial Examples Are a Natural Consequence of Test Error in Noise. arXiv (2019)
work page 2019
-
[39]
Model inversion attacks that exploit confidence information and basic countermeasures
Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. Model inversion attacks that exploit confidence information and basic countermeasures. In CCS’15
-
[40]
Privacy in Pharmacogenetics: An End-to-End Case Study of Personalized Warfarin Dosing
Matt Fredrikson, Eric Lantz, Somesh Jha, Simon Lin, David Page, and Thomas Ris- tenpart. Privacy in Pharmacogenetics: An End-to-End Case Study of Personalized Warfarin Dosing.. In USENIX’14’
-
[41]
Attacking Binarized Neural Networks
Angus Galloway, Graham W Taylor, and Medhat Moussa. Attacking Binarized Neural Networks. In ICLR’18
-
[42]
A decision procedure for bit-vectors and arrays
Vijay Ganesh and David L Dill. A decision procedure for bit-vectors and arrays. In CA V’07
-
[43]
Yansong Gao, Chang Xu, Derui Wang, Shiping Chen, Damith C Ranasinghe, and Surya Nepal. 2019. STRIP: A Defence Against Trojan Attacks on Deep Neural Networks. arXiv (2019)
work page 2019
-
[44]
AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation
Timon Gehr, Matthew Mirman, Dana Drachsler-Cohen, Petar Tsankov, Swarat Chaudhuri, and Martin Vechev. AI2: Safety and Robustness Certification of Neural Networks with Abstract Interpretation. In SP’18
-
[45]
Arturo Geigel. 2013. Neural network trojan. Journal of Computer Security (2013)
work page 2013
-
[46]
Justin Gilmer, Ryan P Adams, Ian Goodfellow, David Andersen, and George E Dahl. 2018. Motivating the rules of the game for adversarial example research. arXiv (2018)
work page 2018
-
[47]
Justin Gilmer, Luke Metz, Fartash Faghri, Samuel S Schoenholz, Maithra Raghu, Martin Wattenberg, and Ian Goodfellow. 2018. Adversarial spheres.arXiv (2018)
work page 2018
-
[48]
Alessandro Giusti, Jerome Guzzi, Dan C. Ciresan, Fang-Lin He, Juan P. Rodriguez, Flavio Fontana, Matthias Faessler, Christian Forster, Jurgen Schmidhuber, Gi- anni Di Caro, Davide Scaramuzza, and Luca M. Gambardella. 2016. A Machine Learning Approach to Visual Perception of Forest Trails for Mobile Robots. IEEE Robotics and Automation Letters (2016)
work page 2016
-
[49]
Gomes, Ashish Sabharwal, and Bart Selman
Carla P. Gomes, Ashish Sabharwal, and Bart Selman. Model counting: A new strategy for obtaining good bounds. In AAAI’06
-
[50]
Explaining and Har- nessing Adversarial Examples
Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and Har- nessing Adversarial Examples. In ICLR’15
-
[51]
Radu Grosu and Scott A Smolka. Monte carlo model checking. In TACAS’05
-
[52]
Equality of opportunity in supervised learning
Moritz Hardt, Eric Price, Nati Srebro, et al. Equality of opportunity in supervised learning. In NIPS’16
-
[53]
W Keith Hastings. 1970. Monte Carlo sampling methods using Markov chains and their applications. (1970)
work page 1970
-
[54]
Deep residual learning for image recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In CVPR’16
-
[55]
Sound Abstraction and Decom- position of Probabilistic Programs
Steven Holtzen, Guy Broeck, and Todd Millstein. Sound Abstraction and Decom- position of Probabilistic Programs. In ICML’18
-
[56]
Safety Verification of Deep Neural Networks
Xiaowei Huang, Marta Kwiatkowska, Sen Wang, and Min Wu. Safety Verification of Deep Neural Networks. In CA V’17
-
[57]
Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, and Yoshua Bengio. Binarized neural networks. In NIPS’16
-
[58]
Mark R. Jerrum and Alistair Sinclair. 1996. The Markov chain Monte Carlo method: an approach to approximate counting and integration. Approximation algorithms for NP-hard problems (1996), 482–520
work page 1996
-
[59]
Policy compression for aircraft collision avoidance systems
Kyle D Julian, Jessica Lopez, Jeffrey S Brush, Michael P Owen, and Mykel J Kochenderfer. Policy compression for aircraft collision avoidance systems. In DASC’16
-
[60]
Reluplex: An efficient SMT solver for verifying deep neural networks
Guy Katz, Clark Barrett, David L Dill, Kyle Julian, and Mykel J Kochenderfer. Reluplex: An efficient SMT solver for verifying deep neural networks. InCA V’17
-
[61]
Understanding black-box predictions via influ- ence functions
Pang Wei Koh and Percy Liang. Understanding black-box predictions via influ- ence functions. In ICML’17
-
[62]
A decision procedure for the propositional µ-calculus
Dexter Kozen and Rohit Parikh. A decision procedure for the propositional µ-calculus. In WLP’1983
work page 1983
-
[63]
Imagenet classification with deep convolutional neural networks
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In NIPS’12
-
[64]
Jaeha Kung, David Zhang, Gooitzen van der Wal, Sek Chai, and Saibal Mukhopad- hyay. 2018. Efficient object detection using embedded binarized neural networks. Journal of Signal Processing Systems (2018)
work page 2018
-
[65]
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. nature (2015)
work page 2015
-
[66]
Yann LeCun and Corinna Cortes. 2010. MNIST handwritten digit database. (2010)
work page 2010
-
[67]
Defense against adversarial attacks using high-level representation guided denoiser
Fangzhou Liao, Ming Liang, Yinpeng Dong, Tianyu Pang, Xiaolin Hu, and Jun Zhu. Defense against adversarial attacks using high-level representation guided denoiser. In CVPR’18
-
[68]
Trojaning Attack on Neural Networks
Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. Trojaning Attack on Neural Networks. In NDSS’18
-
[69]
Saeed Mahloujifar, Dimitrios I Diochnos, and Mohammad Mahmoody. 2018. The curse of concentration in robust learning: Evasion and poisoning attacks from concentration of measure. arXiv (2018)
work page 2018
-
[70]
Embedded Binarized Neural Networks
Bradley McDanel, Surat Teerapittayanon, and H T Kung. Embedded Binarized Neural Networks. In EWSN’17
-
[71]
Kuldeep S. Meel. 2017. Constrained Counting and Sampling: Bridging the Gap between Theory and Practice . Ph.D. Dissertation. Rice University
work page 2017
-
[72]
Kuldeep S. Meel, Moshe Y. Vardi, Supratik Chakraborty, Daniel J Fremont, Sanjit A Seshia, Dror Fried, Alexander Ivrii, and Sharad Malik. Constrained Sampling and Counting: Universal Hashing Meets SAT Solving. In AAAI’16 (Beyond NP Workshop)
-
[73]
Verifying properties of binarized deep neural networks
Nina Narodytska, Shiva Prasad Kasiviswanathan, Leonid Ryzhyk, Mooly Sagiv, and Toby Walsh. Verifying properties of binarized deep neural networks. In AAAI’18
-
[74]
Radford M Neal. 1993. Probabilistic inference using Markov chain Monte Carlo methods. (1993)
work page 1993
-
[75]
Robert Nieuwenhuis and Albert Oliveras. 2005. Decision procedures for SAT, SAT modulo theories and beyond. the BarcelogicTools. In International Conference on Logic for Programming Artificial Intelligence and Reasoning . Springer, 23–46
work page 2005
-
[76]
Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. 2016. Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples. arXiv (2016)
work page 2016
-
[77]
The limitations of deep learning in adversarial settings
Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z Berkay Celik, and Ananthram Swami. The limitations of deep learning in adversarial settings. In EuroS&P’16
-
[78]
Distillation as a defense to adversarial perturbations against deep neural net- works
Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural net- works. In SP’16
-
[79]
Automatic differentiation in PyTorch
Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. InNIPS-W’17
-
[80]
Deepxplore: Automated whitebox testing of deep learning systems
Kexin Pei, Yinzhi Cao, Junfeng Yang, and Suman Jana. Deepxplore: Automated whitebox testing of deep learning systems. In SOSP’17
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.