pith. machine review for the scientific record. sign in

arxiv: 2605.13616 · v1 · submitted 2026-05-13 · 💻 cs.SE

Recognition: 1 theorem link

· Lean Theorem

Scalable Deductive Verification of Data-Level Parallel Programs

Authors on Pith no claims yet

Pith reviewed 2026-05-14 17:54 UTC · model grok-4.3

classification 💻 cs.SE
keywords deductive verificationdata-level parallelismGPU kernelsquantifier rewritingnon-aliasing specificationsimmutabilityarray programsVerCors
0
0 comments X

The pith

Quantifier rewrite rules and non-aliasing specifications reduce verification time for data-parallel array programs by a factor of nine on average.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to scale deductive verification for programs that operate on arrays and matrices in a data-parallel style, such as GPU kernels. A proven rewrite technique turns expressions with nested quantifiers into forms that produce effective triggers for automated provers. New specification constructs let users declare pairs of arrays as non-aliases or mark arrays as immutable so they can be treated as mathematical sequences. Experiments on representative GPU kernels demonstrate that these changes cut average verification time by a factor of nine, with some cases reaching 150 times faster, and they make previously unreachable verification results attainable.

Core claim

We introduce a sound rewrite technique for expressions containing nested quantifiers that generates suitable triggers, together with specification constructs that assert and verify non-aliasing or immutability of arrays so they can be modelled as sequences, all implemented inside the VerCors verifier, and we demonstrate that these changes yield large reductions in verification effort for data-level parallel programs.

What carries the argument

The quantifier-rewrite procedure that produces usable triggers for nested quantifiers, combined with the non-aliasing and immutability specification constructs that permit arrays to be treated as mathematical sequences.

If this is right

  • Verification of larger or more complex GPU kernels becomes feasible within practical time limits.
  • Earlier case studies, such as the radio telescope pipeline, can now be completed where they previously could not.
  • Re-running prior experiments with the new constructs yields both faster results and newly obtainable proofs.
  • Programs that fit the data-parallel model and are annotated with the required specifications verify substantially quicker.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same rewrite approach could be ported to other deductive verifiers that rely on SMT solvers for quantifier instantiation.
  • If the non-aliasing constructs were adopted more widely, they might reduce the manual annotation burden for array-based parallel code in general.
  • The observed speed-ups suggest that similar trigger-generation improvements could benefit verification of other array-heavy domains such as scientific computing libraries.

Load-bearing premise

Users can supply accurate non-aliasing and immutability specifications and the programs conform to the data-level parallel model without complex aliasing that defeats the new constructs.

What would settle it

Re-running the reported GPU kernel verifications after disabling the quantifier rewrite and the non-aliasing/immutability constructs produces verification times that show no average factor-of-nine improvement.

Figures

Figures reproduced from arXiv: 2605.13616 by Anton Wijs, Lars B. van den Haak, Marieke Huisman.

Figure 1
Figure 1. Figure 1: An example of storing two matrices together in one array. The second condition may be confusing, but note that with integer division, get_global_size(0)*(n/get_global_size(0)) is not necessarily equal to n . Furthermore, note that x represents the original index expression \gtid+ i*get_global_size(0), hence x - \gtid equals i*get_global_size(0). From this, we can derive that 0 ≤i*get_global_size(0)<get_glo… view at source ↗
Figure 2
Figure 2. Figure 2: Verification of CLBlast kernels, normalized so Base=100%. Base is compared against versions with unique type qualifiers (Unique), immutable arrays (Immutable), extracted kernel bodies (Extract), and All which enables all these features. Below the bars: number of successful runs (✓= 10/10, ✗ 0/10). Next to base bars: average time of the 10 base runs in seconds. The † indicates we (try to) prove functional c… view at source ↗
read the original abstract

This paper introduces several techniques that improve the scalability of the deductive verification of data-level programs working on arrays and matrices. First of all, we introduce a technique to rewrite expressions with (nested) quantifiers, so suitable triggers can be generated for these expressions. We have proven this rewrite technique correct in a theorem prover. Second, we make reasoning about potentially overlapping arrays easier, by providing specification constructs to indicate and verify that two arrays are not aliases, or that they are immutable, so they can be modelled as mathematical sequences. All our techniques are implemented in the VerCors program verifier. We illustrate how our techniques improve scalability through a large number of experiments. Using our techniques on a set of typical GPU kernels, we achieve a reduction of verification time by, on average, a factor of 9, with outliers being up to 150 times faster. Additionally, applying these techniques to earlier experiments and an earlier case study of a radio telescope pipeline permitted the verification of results which were previously unobtainable and significantly reduced the verification time.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 1 minor

Summary. The paper introduces a technique to rewrite expressions with (nested) quantifiers to enable suitable trigger generation, proven correct inside a theorem prover, together with new specification constructs for asserting and verifying non-aliasing and immutability of arrays so they can be modeled as mathematical sequences. Both features are implemented in the VerCors verifier and evaluated on GPU kernels, yielding an average 9-fold reduction in verification time (outliers up to 150 times faster) and enabling verification of previously unobtainable results on a radio-telescope pipeline case study.

Significance. If the reported speedups hold, the contribution is significant for practical deductive verification of data-parallel code. The machine-checked proof of the rewrite rule and the concrete timing numbers on external GPU benchmarks provide credible, reproducible support. By directly addressing quantifier instantiation and aliasing bottlenecks that commonly arise in GPU kernels, the work lowers a key barrier to applying deductive verification at scale.

minor comments (1)
  1. A summary table listing per-benchmark verification times before and after the new techniques would make the experimental claims easier to assess at a glance.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper introduces a quantifier rewrite technique proven correct via an independent theorem prover and new specification constructs for non-aliasing/immutability, both implemented in VerCors and evaluated on external GPU kernel benchmarks. No derivation step reduces by construction to fitted parameters, self-definitions, or load-bearing self-citations; the speedup claims rest on empirical runs against independent test cases rather than internal redefinitions.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on standard first-order logic and the existing VerCors framework; no new free parameters, ad-hoc axioms, or invented entities are introduced.

axioms (1)
  • standard math Standard first-order logic axioms and trigger mechanisms in SMT solvers
    Invoked for the correctness proof of the quantifier rewrite.

pith-pipeline@v0.9.0 · 5479 in / 1085 out tokens · 38102 ms · 2026-05-14T17:54:40.878012+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

  1. [1]

    In: Gurfinkel, A., Ganesh, V

    Armborst, L., Bos, P., van den Haak, L.B., Huisman, M., Rubbens, R., Şakar, Ö., Tasche, P.: The VerCors Verifier: A Progress Report. In: Gurfinkel, A., Ganesh, V. (eds.) Computer Aided Verification. Lecture Notes in Computer Science, vol. 14682, pp. 3–18. Springer Nature Switzerland, Cham (2024).https://doi.org/ 10.1007/978-3-031-65630-9_1

  2. [2]

    In: In Proceedings of the 1st Workshop on Grids, Clouds and P2P Programming (CGWS)

    Bertolli, C., Betts, A., Mudalige, G., Giles, M., Kelly, P.: Design and Performance of the OP2 Library for Unstructured Mesh Applications. In: In Proceedings of the 1st Workshop on Grids, Clouds and P2P Programming (CGWS). Lecture Notes in Computer Science, vol. 7155, pp. 191–200. Springer (2011).https://doi.org/ 10.1007/978-3-642-29737-3_22

  3. [3]

    In: Proceedings of the 10th SIG- PLAN Symposium on New Ideas, New Paradigms, and Reflections on Program- ming and Software

    Bierhoff, K.: Automated program verification made SYMPLAR: Symbolic per- missions for lightweight automated reasoning. In: Proceedings of the 10th SIG- PLAN Symposium on New Ideas, New Paradigms, and Reflections on Program- ming and Software. pp. 19–32. ACM, Portland Oregon USA (Oct 2011).https: //doi.org/10.1145/2048237.2048242

  4. [4]

    Science of Computer Programming95, 376–388 (Dec 2014).https: //doi.org/10.1016/j.scico.2014.03.013

    Blom, S., Huisman, M., Mihelčić, M.: Specification and verification of GPGPU programs. Science of Computer Programming95, 376–388 (Dec 2014).https: //doi.org/10.1016/j.scico.2014.03.013

  5. [5]

    In: Gardner, P., Yoshida, N

    Brookes, S.: A Semantics for Concurrent Separation Logic. In: Gardner, P., Yoshida, N. (eds.) CONCUR 2004 - Concurrency Theory. pp. 16–34. Springer, Berlin, Heidelberg (2004).https://doi.org/10.1007/978-3-540-28644-8_2

  6. [6]

    In: Yang, H

    Charguéraud, A., Pottier, F.: Temporary Read-Only Permissions for Separation Logic. In: Yang, H. (ed.) Programming Languages and Systems, vol. 10201, pp. 260–286. Springer Berlin Heidelberg, Berlin, Heidelberg (2017).https://doi.or g/10.1007/978-3-662-54434-1_10

  7. [7]

    Journal of the ACM52(3), 365–473 (May 2005).https://doi.org/10.1145/ 1066100.1066102

    Detlefs, D., Nelson, G., Saxe, J.B.: Simplify: A theorem prover for program check- ing. Journal of the ACM52(3), 365–473 (May 2005).https://doi.org/10.1145/ 1066100.1066102

  8. [8]

    In: Computer Aided Verification (CAV) (2025)

    Eilers, M., Schwerhoff, M., Summers, A.J., Müller, P.: Fifteen Years of Viper. In: Computer Aided Verification (CAV) (2025)

  9. [9]

    In: Pro- ceedings of the 4th Workshop on General Purpose Processing on Graphics Process- ing Units (GPGPU)

    Grewe, D., Lokhmotov, A.: Automatically Generating and Tuning GPU Code for Sparse Matrix-Vector Multiplication from a High-Level Representation. In: Pro- ceedings of the 4th Workshop on General Purpose Processing on Graphics Process- ing Units (GPGPU). ACM (2011).https://doi.org/10.1145/1964179.1964196

  10. [10]

    In: Drossopoulou, S

    Haack, C., Poll, E.: Type-Based Object Immutability with Flexible Initializa- tion. In: Drossopoulou, S. (ed.) ECOOP 2009 – Object-Oriented Programming, vol. 5653, pp. 520–545. Springer Berlin Heidelberg, Berlin, Heidelberg (2009). https://doi.org/10.1007/978-3-642-03013-0_24

  11. [11]

    Hillis, W.D., Steele, G.L.: Data parallel algorithms. Commun. ACM29(12), 1170– 1183 (Dec 1986).https://doi.org/10.1145/7902.7903

  12. [12]

    In: Proceedings of the 28th International Conference on Machine Learning (ICML)

    Le, Q., Ngiam, J., Coates, A., Lahiri, A., Prochnow, B., Ng, A.: On Optimization Methods for Deep Learning. In: Proceedings of the 28th International Conference on Machine Learning (ICML). pp. 265–272. Omnipress (2011)

  13. [13]

    In: Chaudhuri, S., Farzan, A

    Leino, K.R.M., Pit-Claudel, C.: Trigger selection strategies to stabilize program verifiers. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Verification. pp. 361–381. Springer International Publishing, Cham (2016)

  14. [14]

    Rustan M

    Leino, K.R.M.: Dafny: An Automatic Program Verifier for Functional Correctness. In: Clarke, E.M., Voronkov, A. (eds.) Logic for Programming, Artificial Intelli- 20 L. B. van den Haak, A.J. Wijs and M. Huisman gence, and Reasoning, vol. 6355, pp. 348–370. Springer Berlin Heidelberg, Berlin, Heidelberg (2010).https://doi.org/10.1007/978-3-642-17511-4_20

  15. [15]

    In: Proceedings of the 2012 Conference and Exhibition on Design, Automation & Test in Europe (DATE)

    Liu, X., Tan, S., Wang, H.: Parallel Statistical Analysis of Analog Circuits by GPU-Accelerated Graph-Based Approach. In: Proceedings of the 2012 Conference and Exhibition on Design, Automation & Test in Europe (DATE). pp. 852–857. IEEE Computer Society (2012).https://doi.org/10.1109/DATE.2012.6176615

  16. [17]

    In: Chaudhuri, S., Farzan, A

    Müller, P., Schwerhoff, M., Summers, A.J.: Automatic Verification of Iterated Sep- arating Conjunctions Using Symbolic Execution. In: Chaudhuri, S., Farzan, A. (eds.) Computer Aided Verification, vol. 9779, pp. 405–425. Springer International Publishing, Cham (2016).https://doi.org/10.1007/978-3-319-41528-4_22

  17. [18]

    In: Jobstmann, B., Leino, K.R.M

    Müller, P., Schwerhoff, M., Summers, A.J.: Viper: A Verification Infrastructure for Permission-Based Reasoning. In: Jobstmann, B., Leino, K.R.M. (eds.) Verifi- cation, Model Checking, and Abstract Interpretation. pp. 41–62. Springer, Berlin, Heidelberg (2016).https://doi.org/10.1007/978-3-662-49122-5_2

  18. [19]

    In: Proceedings of the International Workshop on OpenCL

    Nugteren, C.: CLBlast: A Tuned OpenCL BLAS Library. In: Proceedings of the International Workshop on OpenCL. pp. 1–10. IWOCL ’18, Association for Com- puting Machinery, New York, NY, USA (May 2018).https://doi.org/10.1145/ 3204919.3204924

  19. [20]

    Communications of the ACM61(1), 106–115 (Dec 2017).https://doi.org/10.1145/3150211

    Ragan-Kelley, J., Adams, A., Sharlet, D., Barnes, C., Paris, S., Levoy, M., Ama- rasinghe, S., Durand, F.: Halide: Decoupling algorithms from schedules for high- performance image processing. Communications of the ACM61(1), 106–115 (Dec 2017).https://doi.org/10.1145/3150211

  20. [21]

    Doctoral Thesis, ETH Zurich (2016).https://doi.or g/10.3929/ethz-a-010835519

    Schwerhoff, M.H.: Advancing Automated, Permission-Based Program Verification Using Symbolic Execution. Doctoral Thesis, ETH Zurich (2016).https://doi.or g/10.3929/ethz-a-010835519

  21. [22]

    In: Finkbeiner, B., Kovács, L

    van den Haak, L.B., Wijs, A.J., Huisman, M., van den Brand, M.G.J.: HaliVer: Deductive Verification and Scheduling Languages Join Forces. In: Finkbeiner, B., Kovács, L. (eds.) Tools and Algorithms for the Construction and Analysis of Sys- tems. Lecture Notes in Computer Science, vol. 14572, pp. 71–89. Springer Nature Switzerland, Cham (2024).https://doi.o...

  22. [23]

    In: Haxthausen, A.E., Serwe, W

    van den Haak, L.B., Wijs, A.J., Huisman, M., van den Brand, M.G.J.: Verify- ing a Radio Telescope Pipeline Using HaliVer: Solving Nonlinear and Quantifier Challenges. In: Haxthausen, A.E., Serwe, W. (eds.) Formal Methods for Industrial Critical Systems. Lecture Notes in Computer Science, vol. 14952, pp. 152–169. Springer Nature Switzerland, Cham (2024).ht...

  23. [24]

    doi: 10.1007/97 8-3-319-99579-3_21

    Wienke, S., Springer, P., Terboven, C., Mey, D.: OpenACC - First Experiences with Real-World Applications. In: Proceedings of the 18th European Conference on Parallel and Distributed Computing (EuroPar). Lecture Notes in Computer Science, vol. 7484, pp. 859–870. Springer (2012).https://doi.org/10.1007/97 8-3-642-32820-6_85

  24. [25]

    David, F

    Wijs, A., Bošnački, D.: Improving GPU Sparse Matrix-Vector Multiplication for Probabilistic Model Checking. In: Proceedings of the 19th International SPIN Workshop on Model Checking of Software (SPIN). Lecture Notes in Computer Science, vol. 7385, pp. 98–116. Springer (2012).https://doi.org/10.1007/978-3 -642-31759-0_9 Scalable Deductive Verification of D...

  25. [26]

    In: Silva, A., Leino, K.R.M

    Wolf, F.A., Arquint, L., Clochard, M., Oortwijn, W., Pereira, J.C., Müller, P.: Go- bra: Modular Specification and Verification of Go Programs. In: Silva, A., Leino, K.R.M. (eds.) Computer Aided Verification, vol. 12759, pp. 367–379. Springer In- ternational Publishing, Cham (2021).https://doi.org/10.1007/978-3-030-816 85-8_17 22 L. B. van den Haak, A.J. ...