Automata Learning with an Incomplete but Inductive Teacher (Technical Report)
Pith reviewed 2026-05-15 19:36 UTC · model grok-4.3
The pith
IdMAT teacher formalism allows active automata learning to find any regular language in a target class using incomplete inductive answers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce IdMAT as a teacher formalism that can provide incomplete answers or inductive facts consistent with any language in the target class, and pair it with the LIndA algorithm that reduces the learning problem to solving a unique SAT instance using incremental solving and UNSAT core analysis, handling both simple and inductive counterexamples efficiently.
What carries the argument
IdMAT, a teacher formalism supplying incomplete or inductive answers consistent with any target-class language, and LIndA, the algorithm that encodes all uncertainties into a single SAT instance without forking.
If this is right
- Any regular language in the target class can be found without fully characterizing the language.
- All uncertainties are encoded in one SAT instance, avoiding exponential forking.
- Incremental SAT solving and UNSAT core analysis improve efficiency over repeated independent solves.
- Simple and inductive counterexamples are processed frugally using a Rivest-Schapire-inspired refinement.
- The method applies directly to regular language separation and to inductive invariant synthesis in regular model checking.
Where Pith is reading between the lines
- If inductive facts turn out to be dense in practice, the number of membership queries needed may drop below that of a standard MAT teacher.
- The single-SAT encoding could be reused for other learning settings that admit multiple consistent models.
- Tractability of the SAT instances will determine whether the approach scales to larger alphabets or longer words in real verification tasks.
- Pairing the method with existing query-minimization heuristics could further reduce total learning cost.
Load-bearing premise
The inductive facts supplied by the teacher are sufficiently dense and the resulting SAT instances remain tractable for regular-language separation and regular model checking.
What would settle it
An experiment on a regular language separation instance in which LIndA either fails to return a correct separator when one exists or requires more SAT solving time than a forking baseline on the same input.
Figures
read the original abstract
Active automata learning (AAL) under a Minimally Adequate Teacher (MAT) has been successfully used to infer a regular language through membership and equivalence queries. This language might not be fully characterized: we are then interested in finding any solution in a target class of possibly many regular languages. Some problems such as regular language separation or inductive invariant synthesis in the context of regular model checking (RMC) may indeed admit more than one answer. We therefore introduce IdMAT: a new teacher formalism answering queries with respect to any language in the target class, all at once. Such a teacher tailored towards invariant synthesis might provide incomplete "don't know" answers, but also inductive facts of the form "if w1 is accepted, so is w2". We pair IdMAT with a novel AAL algorithm LIndA that 1. encodes all uncertainties as a unique SAT instance and does not fork, 2. leverages incremental SAT solving and UNSAT core analysis, and 3. handles counterexamples of the simple or inductive type in a frugal manner inspired by the Rivest-Schapire refinement technique. We finally evaluate a prototype implementation in the context of regular language separation and RMC.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces IdMAT, a new teacher formalism for active automata learning that answers queries with respect to any language in a target class (possibly many regular languages), providing incomplete 'don't know' answers along with inductive facts of the form 'if w1 is accepted then w2 is'. It pairs this with the LIndA algorithm, which encodes all uncertainties (including inductive implications) into a single SAT instance without forking, leverages incremental SAT solving and UNSAT-core analysis, and handles both simple and inductive counterexamples in a frugal way inspired by Rivest-Schapire. The approach is evaluated via a prototype on regular language separation and regular model checking (RMC) problems.
Significance. If the central claims on correctness and tractability hold, the work could enable more scalable automata learning for under-specified problems such as invariant synthesis in RMC and language separation, where multiple solutions exist. The single-SAT encoding plus incremental solving is a technically interesting departure from standard forking-based AAL, and the inductive-teacher model directly addresses a practical gap in incomplete teachers. Strengths include the explicit use of standard SAT primitives and the frugal counterexample handling; these could be reusable if the encoding remains compact.
major comments (3)
- [§3] §3 (LIndA algorithm description): The headline claim that all uncertainties and inductive facts are captured in one SAT instance without forking is load-bearing for the efficiency argument, yet the manuscript supplies neither a formal definition of the encoding nor a size bound on the resulting CNF (or on the number of inductive facts needed per query). Without this, it is impossible to verify that the approach avoids the exponential cost it claims to sidestep when inductive facts are sparse.
- [Evaluation] Evaluation section: The abstract states that a prototype was evaluated on regular language separation and RMC, but no quantitative results (query counts, SAT instance sizes, runtimes, or comparison against forking baselines) are reported. This absence makes it impossible to assess whether the incremental-SAT + UNSAT-core strategy remains practical for the target problems.
- [§4] §4 (correctness argument): No proof or even sketch is supplied that LIndA correctly learns a language consistent with the IdMAT answers (including inductive implications). The absence of a correctness argument is a load-bearing gap for any claim that the single-instance encoding preserves soundness.
minor comments (2)
- [Abstract] The abstract introduces the acronyms IdMAT and LIndA without first spelling them out; this should be corrected for readability.
- [Introduction] Notation for inductive facts (e.g., the 'if w1 accepted then w2' implication) is used without a compact formal syntax; a small definition box would help.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate the revisions planned for the next version.
read point-by-point responses
-
Referee: [§3] §3 (LIndA algorithm description): The headline claim that all uncertainties and inductive facts are captured in one SAT instance without forking is load-bearing for the efficiency argument, yet the manuscript supplies neither a formal definition of the encoding nor a size bound on the resulting CNF (or on the number of inductive facts needed per query). Without this, it is impossible to verify that the approach avoids the exponential cost it claims to sidestep when inductive facts are sparse.
Authors: We agree that a formal definition of the encoding and an accompanying size analysis are required to substantiate the efficiency claims. In the revised manuscript we will supply a precise definition of the CNF encoding that captures both membership uncertainties and inductive implications in a single instance. We will also prove a polynomial bound on the number of clauses and variables in terms of the number of queries and inductive facts, confirming that the encoding remains compact when inductive facts are sparse. revision: yes
-
Referee: [Evaluation] Evaluation section: The abstract states that a prototype was evaluated on regular language separation and RMC, but no quantitative results (query counts, SAT instance sizes, runtimes, or comparison against forking baselines) are reported. This absence makes it impossible to assess whether the incremental-SAT + UNSAT-core strategy remains practical for the target problems.
Authors: The prototype was run on the cited benchmarks, but the detailed metrics were not included in the submitted draft. We will expand the evaluation section with tables reporting query counts, SAT instance sizes, runtimes, and direct comparisons against forking-based baselines for both language-separation and RMC instances, thereby allowing assessment of practicality. revision: yes
-
Referee: [§4] §4 (correctness argument): No proof or even sketch is supplied that LIndA correctly learns a language consistent with the IdMAT answers (including inductive implications). The absence of a correctness argument is a load-bearing gap for any claim that the single-instance encoding preserves soundness.
Authors: We will add a correctness section containing a proof sketch. The argument will show that the single SAT encoding is sound with respect to the IdMAT answers and inductive facts, that every hypothesis produced is consistent with the accumulated information, and that the algorithm terminates with a regular language satisfying all constraints. revision: yes
Circularity Check
No circularity: algorithmic construction rests on standard SAT and automata primitives
full rationale
The paper introduces IdMAT (a teacher formalism) and LIndA (an AAL algorithm) as explicit constructions that encode uncertainties into a single SAT instance, leverage incremental solving, and handle counterexamples via a Rivest-Schapire-inspired refinement. No derivation step reduces by definition to its own inputs, no fitted parameter is relabeled as a prediction, and no load-bearing claim depends on a self-citation chain whose validity is presupposed. The single-SAT encoding is presented as a direct algorithmic choice whose tractability is an empirical question for the target domains rather than a self-referential assumption. The derivation chain is therefore self-contained against external benchmarks of SAT solvers and automata learning.
Axiom & Free-Parameter Ledger
axioms (1)
- standard math Membership and equivalence queries can be encoded as Boolean constraints whose satisfiability corresponds to existence of a consistent regular language.
Reference graph
Works this paper leans on
-
[1]
Abdulla, P.A.: Regular model checking. Int. J. Softw. Tools Technol. Transf.14(2), 109–118 (2012). https://doi.org/10.1007/S10009-011-0216-8, https://doi.org/10. 1007/s10009-011-0216-8
-
[2]
Abdulla, P.A., Haziza, F., Holík, L.: Parameterized verification through view ab- straction. Int. J. Softw. Tools Technol. Transf.18(5), 495–516 (2016). https://doi. org/10.1007/S10009-015-0406-X, https://doi.org/10.1007/s10009-015-0406-x
-
[3]
Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987). https://doi.org/10.1016/0890-5401(87)90052-6, https://doi. org/10.1016/0890-5401(87)90052-6 Automata Learning with an Incomplete but Inductive Teacher 17
-
[4]
(eds.): Proceedings of SAT Competition 2023: Solver, Benchmark and Proof Checker Descriptions
Balyo, T., Heule, M., Iser, M., Järvisalo, M., Suda, M. (eds.): Proceedings of SAT Competition 2023: Solver, Benchmark and Proof Checker Descriptions. Depart- ment of Computer Science Series of Publications B, Department of Computer Sci- ence, University of Helsinki, Finland (2023)
work page 2023
-
[5]
Biere, A., Faller, T., Fazekas, K., Fleury, M., Froleyks, N., Pollitt, F.: Cadical 2.0. In: Gurfinkel, A., Ganesh, V. (eds.) Computer Aided Verification. pp. 133–152. Springer Nature Switzerland, Cham (2024)
work page 2024
-
[6]
In: Kowalewski, S., Philippou, A
Chen, Y.F., Farzan, A., Clarke, E.M., Tsay, Y.K., Wang, B.Y.: Learning minimal separating dfa’s for compositional verification. In: Kowalewski, S., Philippou, A. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 31–45. Springer Berlin Heidelberg, Berlin, Heidelberg (2009)
work page 2009
-
[7]
In: Stewart, D., Weissenbacher, G
Chen, Y., Hong, C., Lin, A.W., Rümmer, P.: Learning to prove safety over param- eterised concurrent systems. In: Stewart, D., Weissenbacher, G. (eds.) 2017 Formal Methods in Computer Aided Design, FMCAD 2017, Vienna, Austria, October 2- 6, 2017. pp. 76–83. IEEE (2017). https://doi.org/10.23919/FMCAD.2017.8102244, https://doi.org/10.23919/FMCAD.2017.8102244
-
[8]
Czerner, P., Esparza, J., Krasotin, V., Welzel-Mohr, C.: Computing Inductive In- variants of Regular Abstraction Frameworks. In: Majumdar, R., Silva, A. (eds.) 35th International Conference on Concurrency Theory (CONCUR 2024). Leibniz International Proceedings in Informatics (LIPIcs), vol. 311, pp. 19:1–19:18. Schloss Dagstuhl – Leibniz-Zentrum für Inform...
-
[9]
Esparza, J., Raskin, M., Welzel-Mohr, C.: Regular model checking upside-down: An invariant-based approach. Log. Methods Comput. Sci.21(1), 4 (2025). https:// doi.org/10.46298/LMCS-21(1:4)2025, https://doi.org/10.46298/lmcs-21(1:4)2025
-
[10]
In: Klin, B., Lasota, S., Muscholl, A
Esparza, J., Raskin, M.A., Welzel, C.: Regular model checking upside-down: An invariant-based approach. In: Klin, B., Lasota, S., Muscholl, A. (eds.) 33rd In- ternational Conference on Concurrency Theory, CONCUR 2022, September 12-16, 2022, Warsaw, Poland. LIPIcs, vol. 243, pp. 23:1–23:19. Schloss Dagstuhl - Leibniz- Zentrum für Informatik (2022). https:/...
-
[11]
Journal of the ACM39(3), 675–735 (Jul 1992)
German, S.M., Sistla, A.P.: Reasoning about systems with many processes. Journal of the ACM39(3), 675–735 (Jul 1992)
work page 1992
-
[12]
Grinchtein, O., Leucker, M., Piterman, N.: Inferring network invariants automat- ically. In: Furbach, U., Shankar, N. (eds.) Automated Reasoning. pp. 483–497. Springer Berlin Heidelberg, Berlin, Heidelberg (2006)
work page 2006
-
[13]
In: Jansen, N., Stoelinga, M., van den Bos, P
Howar, F., Steffen, B.: Active automata learning as black-box search and lazy par- tition refinement. In: Jansen, N., Stoelinga, M., van den Bos, P. (eds.) A Journey from Process Algebra via Timed Automata to Model Learning - Essays Dedicated to Frits Vaandrager on the Occasion of His 60th Birthday. Lecture Notes in Com- puter Science, vol. 13560, pp. 321...
-
[14]
Ignatiev, A., Morgado, A., Marques-Silva, J.: PySAT: A Python toolkit for pro- totyping with SAT oracles. In: SAT. pp. 428–437 (2018). https://doi.org/10.1007/ 978-3-319-94144-8_26, https://doi.org/10.1007/978-3-319-94144-8_26
-
[15]
Ignatiev, A., Tan, Z.L., Karamanos, C.: Towards universally accessible SAT tech- nology. In: SAT. pp. 4:1–4:11 (2024). https://doi.org/10.4230/LIPICS.SAT.2024. 16, https://doi.org/10.4230/LIPIcs.SAT.2024.16 18 D. Stan, A. Pommellet, J. Jacquot
-
[16]
Jonsson, B., Nilsson, M.: Transitive closures of regular relations for verifying infinite-state systems. In: Graf, S., Schwartzbach, M. (eds.) Tools and Algorithms for the Construction and Analysis of Systems. pp. 220–235. Springer Berlin Hei- delberg, Berlin, Heidelberg (2000)
work page 2000
-
[17]
Leucker, M., Neider, D.: Learning minimal deterministic automata from inexpe- rienced teachers. In: Margaria, T., Steffen, B. (eds.) Leveraging Applications of Formal Methods, Verification and Validation. Technologies for Mastering Change. pp. 524–538. Springer Berlin Heidelberg, Berlin, Heidelberg (2012)
work page 2012
-
[18]
Liffiton, M.H., Sakallah, K.A.: Algorithms for computing minimal unsatisfiable subsets of constraints. J. Autom. Reason.40(1), 1–33 (2008). https://doi.org/10. 1007/S10817-007-9084-Z, https://doi.org/10.1007/s10817-007-9084-z
-
[19]
Moeller, M., Wiener, T., Solko-Breslin, A., Koch, C., Foster, N., Silva, A.: Au- tomata learning with an incomplete teacher. In: Ali, K., Salvaneschi, G. (eds.) 37th European Conference on Object-Oriented Programming, ECOOP 2023, July 17-21, 2023, Seattle, Washington, United States. LIPIcs, vol. 263, pp. 21:1–21:30. Schloss Dagstuhl - Leibniz-Zentrum für ...
-
[20]
Dagstuhl Artifacts Se- ries9(2), 21:1–21:3 (2023)
Moeller, M., Wiener, T., Solko-Breslin, A., Koch, C., Foster, N., Silva, A.: Au- tomata Learning with an Incomplete Teacher (Artifact). Dagstuhl Artifacts Se- ries9(2), 21:1–21:3 (2023). https://doi.org/10.4230/DARTS.9.2.21, https://drops. dagstuhl.de/entities/document/10.4230/DARTS.9.2.21
-
[21]
Neider, D.: Applications of automata learning in verification and synthesis. Ph.D. thesis, RWTH Aachen University (2014), http://darwin.bth.rwth-aachen. de/opus3/volltexte/2014/5169
work page 2014
-
[22]
In: Brat, G., Rungta, N., Venet, A
Neider, D., Jansen, N.: Regular model checking using solver technologies and automata learning. In: Brat, G., Rungta, N., Venet, A. (eds.) NASA Formal Methods, 5th International Symposium, NFM 2013, Moffett Field, CA, USA, May 14-16, 2013. Proceedings. Lecture Notes in Computer Science, vol. 7871, pp. 16–31. Springer (2013). https://doi.org/10.1007/978-3-...
-
[23]
Oliveira, A.L., Silva, J.P.M.: Efficient algorithms for the inference of mini- mum size dfas. Mach. Learn.44(1/2), 93–119 (2001). https://doi.org/10.1023/A: 1010828029885, https://doi.org/10.1023/A:1010828029885
work page doi:10.1023/a: 2001
-
[24]
Rivest, R.L., Schapire, R.E.: Inference of finite automata using homing sequences. Inf. Comput.103(2), 299–347 (1993). https://doi.org/10.1006/INCO.1993.1021, https://doi.org/10.1006/inco.1993.1021
-
[25]
Automata Learning with an Incomplete but Inductive Teacher (Technical Report)
Stan, D., Pommellet, A., Jacquot, J.: Automata learning with an incomplete but inductive teacher. CoRRabs/2602.21073(2026). https://doi.org/10.48550/ ARXIV.2602.21073, https://doi.org/10.48550/arXiv.2602.21073
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2602.21073 2026
-
[26]
Steffen, B., Howar, F., Merten, M.: Introduction to Active Automata Learning from a Practical Perspective, pp. 256–296. Springer Berlin Heidelberg, Berlin, Hei- delberg (2011). https://doi.org/10.1007/978-3-642-21455-4_8, https://doi.org/10. 1007/978-3-642-21455-4_8
-
[27]
Touili, T.: Regular model checking using widening techniques. In: Mayr, R. (ed.) Verification of Parameterized Systems, VEPAS 2001, Satellite Workshop of ICALP 2001, Crete, Greece, July 13, 2001. Electronic Notes in Theoretical Computer Science, vol. 50, pp. 342–356. Elsevier (2001). https://doi.org/10.1016/ S1571-0661(04)00187-2, https://doi.org/10.1016/...
-
[28]
Vaandrager, F., Garhewal, B., Rot, J., Wißmann, T.: A new approach for active automata learning based on apartness. In: Fisman, D., Rosu, G. (eds.) Tools and Automata Learning with an Incomplete but Inductive Teacher 19 Algorithms for the Construction and Analysis of Systems. pp. 223–243. Springer International Publishing, Cham (2022)
work page 2022
-
[29]
In: Davies, J., Schulte, W., Barnett, M
Vardhan, A., Sen, K., Viswanathan, M., Agha, G.: Learning to verify safety prop- erties. In: Davies, J., Schulte, W., Barnett, M. (eds.) Formal Methods and Software Engineering. pp. 274–289. Springer Berlin Heidelberg, Berlin, Heidelberg (2004)
work page 2004
-
[30]
Yaacov, T., Weiss, G., Amram, G., Hayoun, A.: Automata models for effective bug pattern description. In: 28th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2025, Grand Rapids, MI, USA, October 5-10, 2025. pp. 119–129. IEEE (2025). https://doi.org/10.1109/ MODELS67397.2025.00017, https://doi.org/10.1109/MODELS6...
-
[31]
ifMEM(c w1,i),MEM(c w2,j)∈ {0,1}, thenα w(i, j) =MEM(c w1,i)→MEM(c w2,j),
-
[32]
else if(c w1,i, cw2,j)∈MEM(c w1,i,{c w2,j}), thenα w(i, j) = 1,
-
[33]
else ifm[xcw1 ,i ],m[x cw2 ,j ]are defined, thenαw(i, j) =m[x cw1 ,i ]→m[x cw2 ,j ],
-
[34]
Property 2.α w(0,0) = 1andα w(|w1|,|w 2|) = 0
elseα w(i, j) =□. Property 2.α w(0,0) = 1andα w(|w1|,|w 2|) = 0. Let⪯be the partial order relation on[0..|w 1|]×[0..|w 2|]such that (i, j)⪯(i ′, j′)↔(i≤i ′)∧(j≤j ′). We considerrectanglesinstead of intervals: Definition 12.ABreaking Rectangle(BR) is a rectangle[i .. i ′]×[j .. j ′]such that(i, j),(i ′, j′)∈[0 :|w 1|]×[0 :|w 2|],(i, j)≺(i ′, j′),{α w(i, j)...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.