Synthesizing Inductive Invariants for Distributed Protocols via IC3 and Large Language Models
Pith reviewed 2026-06-30 13:01 UTC · model grok-4.3
The pith
IC3Syn combines IC3 with LLMs to synthesize inductive invariants for all 29 tested distributed protocols in TLA+.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
IC3Syn executes an IC3-style process over TLA+ states where the IC3 controller decomposes the invariant search into blocking tasks and the LLM provides the missing protocol-level reasoning, enabling discovery of invariants on finite instances that TLAPS confirms are inductive for the full protocol.
What carries the argument
The IC3Syn neuro-symbolic framework that pairs a symbolic IC3 controller for task decomposition with LLM protocol-level reasoning.
If this is right
- Safety can be established for protocols like MongoLoglessDynamicRaft that previous tools could not handle.
- The approach applies to consensus, reconfiguration, and client-server protocols without logical restrictions.
- Invariants synthesized on finite instances transfer to unbounded models.
- No manual templates are required for the synthesis process.
Where Pith is reading between the lines
- This integration could reduce the need for expert-crafted templates in verifying additional classes of systems.
- Similar neuro-symbolic loops might be tested on liveness properties or other specification languages.
- The results on industrial-scale examples point to possible use in automated protocol design pipelines.
Load-bearing premise
The invariants discovered on finite instances remain inductive when applied to the full unbounded protocol without subtle flaws that the theorem prover overlooks.
What would settle it
A case where the candidate invariants pass checks on finite instances but fail to be inductive or imply safety when checked by TLAPS on the unbounded TLA+ model.
Figures
read the original abstract
Distributed protocols are notoriously difficult to verify correctly. Proving safety typically requires inductive invariants that both imply the desired property and are preserved by every protocol transition; yet inferring such invariants remains a major bottleneck: existing approaches either restrict the protocol models to a decidable fragment of first-order logic or demand expert-crafted templates. We present IC3Syn, a neuro-symbolic framework that synthesizes inductive invariants by executing an IC3-style process over TLA+ states with the assistance of Large Language Models (LLMs). At large, IC3Syn combines a symbolic IC3 controller, which decomposes invariant synthesis into focused blocking tasks and an LLM which provides protocol-level reasoning that IC3 alone lacks for TLA+ specifications. This integration enables a disciplined yet flexible search for invariants without imposing logical restrictions or requiring manual templates. We evaluate IC3Syn on 29 distributed protocols spanning consensus, reconfiguration and client-server systems, and compare it against Endive, IC3PO, SWISS and DistAI. IC3Syn discovers candidate invariants for all 29 protocols, including MongoLoglessDynamicRaft (MLDR), an industrial-scale Raft-based reconfiguration protocol for which none of the compared tools reports a solution, as well as one complex Paxos variant. In each case, the invariants synthesized on finite instances are shown in TLAPS to be inductive for the full unbounded protocol, thereby establishing safety.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents IC3Syn, a neuro-symbolic framework that combines a symbolic IC3 controller with LLMs to synthesize inductive invariants for TLA+ specifications of distributed protocols. The approach decomposes synthesis into focused blocking tasks on finite instances, using the LLM for protocol-level reasoning to generate candidate clauses. It reports success on all 29 evaluated protocols (including MongoLoglessDynamicRaft and a complex Paxos variant), with the synthesized invariants verified via TLAPS to be inductive and to imply safety over the full unbounded models, outperforming Endive, IC3PO, SWISS, and DistAI.
Significance. If the results hold, the work is significant for reducing reliance on expert templates or decidable fragments in distributed protocol verification. Success on an industrial-scale reconfiguration protocol like MLDR, where prior tools report no solution, is a notable practical advance. Explicit credit is due for the machine-checked TLAPS proofs establishing inductiveness and safety on the unbounded TLA+ models, and for the reproducible empirical comparison across multiple baselines.
major comments (2)
- [Evaluation section] Evaluation section (and abstract): The 100% success rate on 29 protocols is reported without quantitative data on LLM usage, such as the number of candidates generated per protocol, rejection rates, prompt variations tested, or failure modes encountered (especially for MLDR). This information is load-bearing for assessing the reliability and reproducibility of the LLM component in the neuro-symbolic integration.
- [Approach and evaluation sections] Approach and evaluation sections: The method relies on finite-instance IC3 blocking tasks to surface candidates that TLAPS then certifies for the unbounded model. No argument or sensitivity analysis is provided showing that the chosen finite bounds (e.g., for MLDR) are adequate to surface all invariants required for unbounded inductiveness, even though TLAPS accepts the final sets. This is load-bearing for the transfer claim underlying success on complex protocols.
minor comments (2)
- [Evaluation section] The paper would benefit from a table summarizing per-protocol metrics (e.g., number of invariants found, TLAPS proof times) to complement the high-level success claims.
- [§3] Notation for the IC3 blocking tasks and LLM interaction could be clarified with a small example in §3 to make the decomposition more accessible.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments on our submission. We address each major comment below and indicate where revisions will be made to strengthen the manuscript.
read point-by-point responses
-
Referee: [Evaluation section] Evaluation section (and abstract): The 100% success rate on 29 protocols is reported without quantitative data on LLM usage, such as the number of candidates generated per protocol, rejection rates, prompt variations tested, or failure modes encountered (especially for MLDR). This information is load-bearing for assessing the reliability and reproducibility of the LLM component in the neuro-symbolic integration.
Authors: We agree that quantitative details on LLM usage would strengthen reproducibility claims. In the revised manuscript we will add a new subsection (or table) in the evaluation reporting, for each of the 29 protocols: (i) total LLM queries issued, (ii) number of unique candidate clauses generated, (iii) rejection rate by the IC3 controller, and (iv) any prompt-template variations that were explored. For MLDR we will additionally document the specific failure modes observed before a successful invariant set was obtained. These statistics were recorded during our experiments and can be included without altering the core claims. revision: yes
-
Referee: [Approach and evaluation sections] Approach and evaluation sections: The method relies on finite-instance IC3 blocking tasks to surface candidates that TLAPS then certifies for the unbounded model. No argument or sensitivity analysis is provided showing that the chosen finite bounds (e.g., for MLDR) are adequate to surface all invariants required for unbounded inductiveness, even though TLAPS accepts the final sets. This is load-bearing for the transfer claim underlying success on complex protocols.
Authors: We acknowledge the value of an explicit argument for bound selection. The IC3 blocking phase operates only on finite instances to produce candidate clauses; the subsequent TLAPS proof establishes inductiveness and safety directly on the unbounded TLA+ model, which is the transfer guarantee. Nevertheless, we will insert a short paragraph in the approach section explaining the heuristic used to choose finite bounds (typically the smallest number of nodes/processes sufficient to expose all relevant blocking obligations) and will report, for MLDR, the concrete bounds employed together with a brief note that increasing those bounds did not yield additional necessary clauses. A full sensitivity sweep across multiple bound sizes for every protocol was not performed; we therefore mark this revision as partial. revision: partial
Circularity Check
No circularity; empirical synthesis and external verification are independent
full rationale
The paper describes a neuro-symbolic IC3+LLM framework evaluated empirically on 29 protocols against external baselines (Endive, IC3PO, SWISS, DistAI) with TLAPS certification on unbounded TLA+ models. No equations, fitted parameters, self-definitional loops, or load-bearing self-citations reduce any claimed result to its inputs by construction. The finite-to-unbounded transfer is presented as an empirical outcome verified externally rather than derived tautologically from the method's own definitions.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption TLA+ transition semantics are faithfully represented in the finite instances used by IC3
Reference graph
Works this paper leans on
-
[1]
Aaron R. Bradley. 2011. SAT-based model checking without unrolling. InInternational Conference on Verification, Model Checking, and Ab- stract Interpretation. Springer, Springer, Berlin, Heidelberg, 70–87. doi:10.1007/978-3-642-18275-4_7
-
[2]
Aaron R. Bradley. 2012. Understanding IC3. InInternational Conference on Theory and Applications of Satisfiability Testing (SAT). Springer, Berlin, Heidelberg, 1–14. doi:10.1007/978-3-642-31612-8_1
-
[3]
Weining Cao, Guangyuan Wu, Tangzhi Xu, Yuan Yao, Hengfeng Wei, Taolue Chen, and Xiaoxing Ma. 2025. Clause2Inv: A Generate- Combine-Check Framework for Loop Invariant Inference.Proceed- ings of the ACM on Software Engineering2, ISSTA (2025), 1009–1030. doi:10.1145/3728920
-
[4]
Saikat Chakraborty, Shuvendu Lahiri, Sarah Fakhoury, Akash Lal, Madanlal Musuvathi, Aseem Rastogi, Aditya Senthilnathan, Rahul Sharma, and Nikhil Swamy. 2023. Ranking LLM-Generated Loop Invariants for Program Verification. InFindings of the Association for Computational Linguistics: EMNLP 2023, Houda Bouamor, Juan Pino, and Kalika Bali (Eds.). Association...
-
[5]
Kaustuv Chaudhuri, Damien Doligez, Leslie Lamport, and Stephan Merz. 2008. A TLA+ proof system. InProceedings of the LPAR Work- shop on Knowledge Exchange: Automated Provers and Proof Assis- tants (KEAPPA) (CEUR Workshop Proceedings, Vol. 418). CEUR-WS.org, Aachen, Germany, 17–37.https://ceur-ws.org/Vol-418/paper2.pdf
2008
-
[6]
Alessandro Cimatti and Alberto Griggio. 2016. Infinite-state invariant checking with IC3 and predicate abstraction.Formal Methods in System Design49, 3 (2016), 190–218. doi:10.1007/s10703-016-0257-4
-
[7]
Alessandro Cimatti, Alberto Griggio, Sergio Mover, and Stefano Tonetta. 2014. IC3 Modulo Theories via Implicit Predicate Abstraction. InTools and Algorithms for the Construction and Analysis of Systems (TACAS). Springer, Berlin, Heidelberg, 46–61. doi:10.1007/978-3-642- 54862-8_4
- [8]
-
[9]
Yotam M. Y. Feldman, James R. Wilcox, Sharon Shoham, and Mooly Sagiv. 2019. Inferring Inductive Invariants from Phase Structures. InComputer Aided Verification (CA V). Springer, Berlin, Heidelberg, 405–425. doi:10.1007/978-3-030-25543-5_23
-
[10]
Aman Goel and Karem A. Sakallah. 2021. On symmetry and quan- tification: A new approach to verify distributed protocols. InNASA Formal Methods: 13th International Symposium, NFM 2021, Virtual Event, May 24–28, 2021, Proceedings. Springer, Cham, Switzerland, 131–150. doi:10.1007/978-3-030-76384-8_9
- [11]
-
[12]
Travis Hance, Marijn Heule, Ruben Martins, and Bryan Parno. 2021. Finding Invariants of Distributed Systems: It’s a Small (Enough) World After All. In18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21). USENIX Association, Virtual Event, 115–131. https://www.usenix.org/conference/nsdi21/presentation/hance
2021
-
[13]
Chris Hawblitzel, Jon Howell, Manos Kapritsos, Jacob R. Lorch, Bryan Parno, Michael L. Roberts, Srinath Setty, and Brian Zill. 2015. IronFleet: Proving practical distributed systems correct. InProceedings of the 25th Symposium on Operating Systems Principles (SOSP). Association for Computing Machinery, New York, NY, USA, 1–17. doi:10.1145/2815400. 2815428
-
[14]
Kryštof Hoder and Nikolaj Bjørner. 2012. Generalized property di- rected reachability. InInternational Conference on Theory and Appli- cations of Satisfiability Testing. Springer, Springer, Berlin, Heidelberg, 157–171. doi:10.1007/978-3-642-31612-8_13
-
[15]
Adharsh Kamath, Nausheen Mohammed, Aditya Senthilnathan, Saikat Chakraborty, Pantazis Deligiannis, Shuvendu K Lahiri, Akash Lal, Aseem Rastogi, Subhajit Roy, and Rahul Sharma. 2024. Leveraging LLMs for program verification. InFormal Methods in Computer-Aided Design (FMCAD). FMCAD, Inc., Prague, Czech Republic, 107–118. doi:10.34727/2024/isbn.978-3-85448-065-5_16
-
[16]
Koenig, Oded Padon, Neil Immerman, and Alex Aiken
Jason R. Koenig, Oded Padon, Neil Immerman, and Alex Aiken. 2020. First-order quantified separators. InProceedings of the 41st ACM SIG- PLAN Conference on Programming Language Design and Implemen- tation. Association for Computing Machinery, New York, NY, USA, 703–717. doi:10.1145/3385412.3386018
-
[17]
Igor Konnov, Jure Kukovec, and Thanh-Hai Tran. 2019. TLA+ model checking made symbolic.Proceedings of the ACM on Programming Lan- guages3, OOPSLA, Article 123 (2019), 30 pages. doi:10.1145/3360549
-
[18]
Leslie Lamport. 1998. The Part-Time Parliament.ACM Transactions on Computer Systems16, 2 (1998), 133–169. doi:10.1145/279227.279229
-
[19]
Leslie Lamport. 2001. Paxos Made Simple.ACM SIGACT News32, 4 (2001), 51–58.https://lamport.azurewebsites.net/pubs/paxos-simple. pdf
2001
-
[20]
2002.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers
Leslie Lamport. 2002.Specifying Systems: The TLA+ Language and Tools for Hardware and Software Engineers. Addison-Wesley, Boston, MA.https://lamport.org/tla/book.html
2002
-
[21]
Haojun Ma, Aman Goel, Jean-Baptiste Jeannin, Manos Kapritsos, Baris Kasikci, and Karem A. Sakallah. 2019. I4: Incremental Inference of Inductive Invariants for Verification of Distributed Protocols. InPro- ceedings of the 27th ACM Symposium on Operating Systems Principles. Association for Computing Machinery, New York, NY, USA, 370–384. doi:10.1145/334130...
-
[22]
Kenneth L. McMillan. 2003. Interpolation and SAT-Based Model Check- ing. InComputer Aided Verification (CA V). Springer, Berlin, Heidelberg, 1–13. doi:10.1007/978-3-540-45069-6_1
-
[23]
Chris Newcombe, Tim Rath, Fan Zhang, Bogdan Munteanu, Marc Brooker, and Michael Deardeuff. 2015. How Amazon Web Services Uses Formal Methods.Commun. ACM58, 4 (2015), 66–73. doi:10.1145/ 2699417
2015
-
[24]
Oded Padon, Giuliano Losa, Mooly Sagiv, and Sharon Shoham. 2017. Paxos Made EPR: Decidable Reasoning about Distributed Protocols. Proceedings of the ACM on Programming Languages1, OOPSLA, Article 108 (2017), 31 pages. doi:10.1145/3140568
-
[25]
McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham
Oded Padon, Kenneth L. McMillan, Aurojit Panda, Mooly Sagiv, and Sharon Shoham. 2016. Ivy: Safety verification by interactive generaliza- tion. InProceedings of the 37th ACM SIGPLAN Conference on Program- ming Language Design and Implementation. Association for Computing Machinery, New York, NY, USA, 614–630. doi:10.1145/2908080.2908118
-
[26]
Muhammad A. A. Pirzada, Giles Reger, Ahmed Bhayat, and Lucas C. Cordeiro. 2024. LLM-Generated Invariants for Bounded Model Check- ing Without Loop Unrolling. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering. Asso- ciation for Computing Machinery, New York, NY, USA, 1395–1407. doi:10.1145/3691620.3695512
-
[27]
William Schultz, Edward Ashton, Heidi Howard, and Stavros Tri- pakis. 2024. Scalable, Interpretable Distributed Protocol Verifi- cation by Inductive Proof Slicing. doi:10.48550/arXiv.2404.18048 arXiv:2404.18048 [cs.DC]
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2404.18048 2024
-
[28]
William Schultz, Ian Dardik, and Stavros Tripakis. 2022. Plain and simple inductive invariant inference for distributed protocols in TLA+. InProceedings of the 22nd Conference on Formal Methods in Computer- Aided Design (FMCAD). FMCAD, Inc., Austin, TX, USA, 273–283. doi:10. 34727/2022/isbn.978-3-85448-053-2_34
2022
-
[29]
Anjiang Wei, Tarun Suresh, Tianran Sun, Haoze Wu, Ke Wang, and Alex Aiken. 2025. Quokka: Accelerating Program Verification with LLMs via Invariant Synthesis. doi:10.48550/arXiv.2509.21629 arXiv:2509.21629 [cs.PL] 14
-
[30]
Cheng Wen, Jialun Cao, Jie Su, Zhiwu Xu, Shengchao Qin, Mengda He, Haokun Li, Shing-Chi Cheung, and Cong Tian. 2024. Enchanting Pro- gram Specification Synthesis by Large Language Models Using Static Analysis and Program Verification. InComputer Aided Verification: 36th International Conference, CA V 2024, Montreal, QC, Canada, July 24–27, 2024, Proceedin...
-
[31]
Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D
James R. Wilcox, Doug Woos, Pavel Panchekha, Zachary Tatlock, Xi Wang, Michael D. Ernst, and Thomas Anderson. 2015. Verdi: A frame- work for implementing and formally verifying distributed systems. InProceedings of the 36th ACM SIGPLAN Conference on Programming Language Design and Implementation. Association for Computing Ma- chinery, New York, NY, USA, 3...
-
[32]
Guangyuan Wu, Weining Cao, Yuan Yao, Hengfeng Wei, Taolue Chen, and Xiaoxing Ma. 2024. LLM Meets Bounded Model Checking: Neuro- symbolic Loop Invariant Inference. InProceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering(Sacra- mento, CA, USA)(ASE ’24). Association for Computing Machinery, New York, NY, USA, 406–417. do...
-
[33]
Haoze Wu, Clark Barrett, and Nina Narodytska. 2024. Lemur: Inte- grating Large Language Models in Automated Program Verification. https://openreview.net/forum?id=Q3YaCghZNtICLR 2024 poster, OpenReview.net
2024
-
[34]
VeruSAGE: A Study of Agent-Based Verification for Rust Systems
Chenyuan Yang, Natalie Neamtu, Chris Hawblitzel, Jacob R. Lorch, and Shan Lu. 2025. VeruSAGE: A Study of Agent-Based Verification for Rust Systems. doi:10.48550/arXiv.2512.18436arXiv:2512.18436 [cs.SE]
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2512.18436arxiv:2512.18436 2025
-
[35]
Jianan Yao, Runzhou Tao, Ronghui Gu, Jason Nieh, Suman Jana, and Gabriel Ryan. 2021. DistAI: Data-driven automated invariant learning for distributed protocols. In15th USENIX Symposium on Operating Systems Design and Implementation (OSDI 21). USENIX Association, Santa Clara, CA, 405–421.https://www.usenix.org/conference/osdi21/ presentation/yao
2021
-
[36]
Yuan Yu, Panagiotis Manolios, and Leslie Lamport. 1999. Model checking TLA+ specifications. InCorrect Hardware Design and Veri- fication Methods: 10th IFIP WG 10.5 Advanced Research Working Con- ference, CHARME’99. Springer, Springer, Berlin, Heidelberg, 54–66. doi:10.1007/3-540-48153-2_6
-
[37]
Tony Nuda Zhang, Travis Hance, Manos Kapritsos, Tej Chajed, and Bryan Parno. 2024. Inductive Invariants That Spark Joy: Using In- variant Taxonomies to Streamline Distributed Protocol Proofs. In18th USENIX Symposium on Operating Systems Design and Implementation (OSDI 24). USENIX Association, Santa Clara, CA, 837–853.https: //www.usenix.org/conference/osd...
2024
-
[38]
Tony Nuda Zhang, Keshav Singh, Tej Chajed, Manos Kapritsos, and Bryan Parno. 2025. Basilisk: Using Provenance Invariants to Auto- mate Proofs of Undecidable Protocols. In19th USENIX Symposium on Operating Systems Design and Implementation (OSDI 25). USENIX Association, Boston, MA, 1–18.https://www.usenix.org/conference/ osdi25/presentation/zhang-tonyBest ...
2025
-
[39]
Yuhao Zhou and Stavros Tripakis. 2025. Towards Language Model Guided TLA+ Proof Automation. doi:10.48550/arXiv.2512.09758 arXiv:2512.09758 [cs.LO] 15
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.