Recognition: 1 Lean theorem link
BEAVER: An Efficient Deterministic LLM Verifier
Pith reviewed 2026-05-17 01:54 UTC · model grok-4.3
The pith
BEAVER computes deterministic probability bounds on whether LLMs meet safety properties by exploring token sequences.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BEAVER is the first practical framework for computing deterministic, sound probability bounds on LLM satisfaction of safety properties. Given a prompt and any safety property, BEAVER systematically explores the model output space using novel Token trie and Frontier data structures, maintaining provably sound bounds at every iteration. The authors formalize the verification problem, prove soundness of the approach, and evaluate it on four safety properties across twelve open-weight LLMs, where it identifies two to three times more risky instances than baselines at one-tenth the compute budget.
What carries the argument
Token trie and Frontier data structures that systematically enumerate output sequences while tracking provably sound probability bounds on property satisfaction.
If this is right
- Sound bounds replace sampling estimates for any safety property that can be checked on complete outputs.
- More risky generations are surfaced than random sampling achieves at the same budget.
- The bounds remain valid after every exploration step, allowing early stopping with guarantees.
- The approach applies across multiple open-weight models and different safety properties.
Where Pith is reading between the lines
- The same exploration structure could be adapted to verify other output properties such as factual correctness or style constraints.
- When only black-box access is available, combining the method with targeted sampling might still yield useful approximate bounds.
- Efficiency improvements open the door to running verification inside generation loops rather than after the fact.
- Scaling to longer outputs would likely require pruning rules on the frontier to keep the search tractable.
Load-bearing premise
The LLM must allow token-by-token queries that return logits or probabilities, and safety properties must be expressible as predicates on finite output sequences.
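Both halves of this premise can be made concrete with a small sketch (illustrative names and predicate, not BEAVER's actual interface): a softmax that turns per-prefix logits into the required next-token distribution, and a safety property expressed as a predicate on a finished string.

```python
import math
import re

# (1) White-box access: next-token logits at a given prefix, normalized to
# probabilities with a numerically stable softmax.
def probs_from_logits(logits):
    m = max(logits.values())
    exps = {tok: math.exp(z - m) for tok, z in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

# (2) A safety property as a predicate on a finite, complete output,
# e.g. "the response never leaks an email address" (illustrative).
def safe(text):
    return re.search(r"[\w.]+@[\w.]+\.\w+", text) is None

dist = probs_from_logits({"yes": 2.0, "no": 0.5, "<eos>": -1.0})
```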
What would settle it
Running the method on a black-box API that returns only complete strings without intermediate probabilities, and showing that sound bounds can no longer be maintained.
Figures
Original abstract
As large language models (LLMs) transition from research prototypes to production systems, practitioners often need reliable methods to verify model outputs and characterize tail risk for safe deployment. While sampling-based estimates provide an ad-hoc intuition of model behavior, they offer no sound guarantees. We present BEAVER, the first practical framework for computing deterministic, sound probability bounds on LLM satisfaction of safety properties. Given a prompt and any safety property, BEAVER systematically explores the model output space using novel Token trie and Frontier data structures, maintaining provably sound bounds at every iteration. We formalize the verification problem, prove soundness of our approach, and evaluate BEAVER on 4 safety properties across 12 open-weight LLMs. BEAVER identifies 2-3x more risky instances compared to baselines while taking 1/10 of the compute budget, surfacing tail risks that loose bounds and ad-hoc evaluation miss.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents BEAVER, a framework that uses novel Token Trie and Frontier data structures to systematically explore the token-level output space of LLMs, computing deterministic sound probability bounds on satisfaction of arbitrary safety properties expressed as predicates over finite sequences. It formalizes the verification problem, proves soundness with bounds maintained at every iteration, and reports empirical results on 4 safety properties across 12 open-weight LLMs, claiming 2-3x more risky instance detections than baselines at 1/10 the compute budget.
Significance. If the central claims hold, the work provides a meaningful step toward guaranteed, non-sampling-based characterization of tail risks in LLM outputs, which is valuable for safe deployment. The formalization, soundness proof, and data-structure-driven exploration are strengths that distinguish it from ad-hoc sampling methods. The reported efficiency gains could be impactful if they generalize beyond the evaluated predicates.
major comments (2)
- [Algorithm description and predicate integration (around the Frontier update and early-termination rules)] The efficiency and 'first practical framework' claims rest on the ability to prune or bound the frontier using partial-sequence evaluations of the safety predicate. For general predicates (e.g., semantic toxicity classifiers or checks over large forbidden-phrase sets) that are only decidable on complete sequences, the algorithm reduces to exhaustive expansion of all frontier paths to leaves before tightening the remaining-mass bound, restoring exponential cost in output length. No section explicitly analyzes this case or provides a complexity bound conditioned on predicate monotonicity or prefix-decidability.
- [Soundness proof and experimental setup] The soundness proof is stated to maintain bounds at every iteration, but the manuscript does not detail how the predicate is encoded to support incremental checks or how early termination preserves the guarantee when the predicate cannot be evaluated on prefixes. This leaves the practical 1/10-compute claim only partially supported for the 'any safety property' scope asserted in the abstract.
minor comments (2)
- [Preliminaries] Notation for the probability bounds and remaining mass could be clarified with a single running example that shows bound evolution over iterations.
- [Experiments] The baseline comparison would benefit from explicit pseudocode or parameter settings for the sampling methods to ensure reproducibility.
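The running example the first minor comment asks for might look like the following toy trace (numbers invented for illustration, in integer percentage points so the arithmetic is exact): each iteration moves mass from the unexplored frontier into proven-satisfying or proven-violating buckets, and the bracket [lower, lower + unexplored] never stops containing the true probability.

```python
# Invented toy trace of bound evolution; values are in integer percentage
# points and are not taken from the paper.
p_true = 40  # true satisfaction probability (%), unknown to the verifier
trace = [
    (10, 5),   # iteration 1: 10% proven satisfying, 5% proven violating
    (20, 15),  # iteration 2
    (10, 40),  # iteration 3: frontier fully resolved
]
lower, unexplored = 0, 100
for sat, unsat in trace:
    lower += sat
    unexplored -= sat + unsat
    upper = lower + unexplored
    assert lower <= p_true <= upper  # soundness invariant at every iteration
```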
Simulated Author's Rebuttal
Thank you for the constructive feedback on our manuscript. We value the recognition of the formalization, soundness proof, and data structures as distinguishing strengths. We respond point-by-point to the major comments below, offering clarifications on the framework's design and committing to revisions that will better delineate the conditions for efficiency while preserving the soundness guarantees.
Point-by-point responses
Referee: [Algorithm description and predicate integration (around the Frontier update and early-termination rules)] The efficiency and 'first practical framework' claims rest on the ability to prune or bound the frontier using partial-sequence evaluations of the safety predicate. For general predicates (e.g., semantic toxicity classifiers or checks over large forbidden-phrase sets) that are only decidable on complete sequences, the algorithm reduces to exhaustive expansion of all frontier paths to leaves before tightening the remaining-mass bound, restoring exponential cost in output length. No section explicitly analyzes this case or provides a complexity bound conditioned on predicate monotonicity or prefix-decidability.
Authors: We thank the referee for this precise observation. Our approach is defined for arbitrary predicates over finite sequences, and the Token Trie and Frontier structures maintain a conservative bound on unexplored probability mass at every step. When a predicate is only decidable on complete sequences, expansion continues until leaves are reached for those paths, after which the predicate is evaluated and the bound is tightened; however, the Frontier still permits early termination of the overall search once the remaining mass is below the target precision, without requiring full enumeration of all possible outputs. We agree that the manuscript would be strengthened by an explicit discussion of this distinction. In revision, we will add a subsection analyzing predicate classes (prefix-decidable, monotonic, or terminal-only) and the resulting complexity implications, including a high-level bound showing that the cost is exponential in sequence length only for the terminal-only case but remains practical due to mass-based pruning. revision: yes
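The predicate taxonomy promised in this revision (prefix-decidable, monotonic, terminal-only) suggests a three-valued interface, sketched hypothetically below: a monotone substring ban can refute a prefix early, while a terminal-only classifier must answer UNKNOWN until the sequence is complete.

```python
from enum import Enum

class Verdict(Enum):
    SAT = "every completion of this prefix satisfies the property"
    UNSAT = "every completion of this prefix violates the property"
    UNKNOWN = "keep expanding"

# Hypothetical prefix-decidable (monotone) predicate: the output must never
# contain a forbidden substring. Once violated, no suffix can repair it.
FORBIDDEN = "rm -rf"

def check_prefix(prefix: str) -> Verdict:
    if FORBIDDEN in prefix:
        return Verdict.UNSAT
    # Absence so far proves nothing about future tokens.
    return Verdict.UNKNOWN

# A terminal-only predicate (e.g., a semantic toxicity classifier) would
# return UNKNOWN on every proper prefix, forcing expansion to the leaves:
# exactly the case the referee flags as worst-case exponential.
```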
Referee: [Soundness proof and experimental setup] The soundness proof is stated to maintain bounds at every iteration, but the manuscript does not detail how the predicate is encoded to support incremental checks or how early termination preserves the guarantee when the predicate cannot be evaluated on prefixes. This leaves the practical 1/10-compute claim only partially supported for the 'any safety property' scope asserted in the abstract.
Authors: The soundness result (Theorem 1) holds unconditionally: the upper and lower bounds on the probability of predicate satisfaction are invariants maintained by the Frontier updates, regardless of when or whether the predicate can be evaluated on a given prefix. If the predicate cannot be decided on a prefix, that path's full probability mass is retained in the Frontier and no predicate-based pruning is applied; termination occurs only when the total unexplored mass is sufficiently small, which conservatively over-approximates any possible violations in the remaining space. We will revise the manuscript to include a clearer description of the predicate interface (with examples of incremental vs. non-incremental encodings), additional pseudocode for the update rules, and an explicit statement that soundness is independent of prefix decidability while efficiency gains are largest for predicates admitting early decisions. The reported 1/10-compute results are for the four evaluated properties, which admit varying degrees of prefix information; we will add this qualification to the experimental section. revision: yes
Circularity Check
No circularity: algorithmic formalization with independent soundness proof
Full rationale
The paper describes BEAVER as an algorithmic procedure that systematically explores LLM output space via Token trie and Frontier data structures while maintaining provably sound probability bounds. It explicitly states that the verification problem is formalized and soundness is proven, with evaluation performed on concrete safety properties across models. No step reduces a claimed result to a fitted parameter, self-defined quantity, or self-citation chain by construction; the central claims rest on data-structure invariants and the formalization rather than on any enumerated circular pattern. The method is also compared against external benchmarks such as sampling-based estimates rather than only against itself.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The LLM can be queried for next-token probabilities at any prefix, enabling systematic enumeration of the output space.
- domain assumption: Safety properties can be decided on finite output sequences.
invented entities (2)
- Token trie: no independent evidence
- Frontier: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/ArithmeticFromLogic.lean · reality_from_one_distinction: unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
"We present BEAVER, the first practical framework for computing deterministic, sound probability bounds on LLM satisfaction of safety properties... using novel Token trie and Frontier data structures"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.