pith. sign in

arxiv: 2605.15294 · v1 · pith:QMR3XZRXnew · submitted 2026-05-14 · 💻 cs.FL

An L^(\#) Based Algorithm for Active Learning of Minimal Separating Automata

Pith reviewed 2026-05-19 15:47 UTC · model grok-4.3

classification 💻 cs.FL
keywords active learningseparating automataminimal DFAL#formal languagesautomata learningverification
0
0 comments X

The pith

A simple algorithm based on L# learns minimal DFAs that separate two disjoint languages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces an active learning algorithm inspired by L# for finding the smallest DFA that separates two disjoint languages L1 and L2. This DFA accepts all words from L1 while rejecting all from L2. Readers would care because this has direct uses in verifying network invariants, learning assumptions for compositional verification, extracting models from logs, and identifying bug patterns, with experiments showing better performance than existing methods.

Core claim

We propose a simple active learning algorithm, inspired by L#, that learns a minimal separating DFA for disjoint languages L1 and L2 if one exists. Experiments show that our algorithm significantly outperforms existing active learning algorithms on both randomly generated and industrial benchmarks.

What carries the argument

The L# inspired active learning algorithm that builds a minimal hypothesis DFA through targeted queries to an oracle.

Load-bearing premise

The two languages are disjoint and there exists a finite minimal DFA that separates them.

What would settle it

Take two disjoint regular languages with a known small minimal separating DFA and run the algorithm to see if it returns exactly that minimal DFA or a correct separator of that size.

Figures

Figures reproduced from arXiv: 2605.15294 by Frits Vaandrager, Jasper Laumen, Leonne Snel.

Figure 1
Figure 1. Figure 1: A prefix tree acceptor T (top) for a DFA A (bottom). Example 2. For the PTA of [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Final PTA (left), hypothesis H1 (top right) and hypothesis H2 (bottom right). Remark. Unlike L #, the L # ✷ algorithm does not require that the basis is con￾nected (prefixed closed). This assumption is useful in L # because construction of a hypothesis becomes trivial: just merge frontier states with nonconflicting basis states. In L #, without this assumption, some basis states may become unreachable in t… view at source ↗
Figure 3
Figure 3. Figure 3: States p and q are incompatible but not apart. sequences that are enabled by both p and q are ϵ and b, but these do not lead to conflicting pairs of states. Appendix B shows how RPNI state merging [25] can be used to prove incompatibility of p and q. Informally, the argument goes as follows. Assume p and q are compatible. Then there is a morphism from C that maps p and q to the same state of some DFA B. Bu… view at source ↗
Figure 4
Figure 4. Figure 4: Merging states p and q (left), and then merging p and r (right). The example suggests an optimisation of L # ✷ with a modified promotion rule, in which a state may be added to the basis when it is incompatible with all current basis states. In this way we maintain the crucial invariant that all states in the basis correspond to different states of the hidden DFA B. A key advantage of apartness over incompa… view at source ↗
Figure 5
Figure 5. Figure 5: Number of completed benchmarks for L ∗ □ and L # □ [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Comparison of running time of L # □ and L ∗ □ 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Benchmark 10 1 10 2 10 3 10 4 Membership Queries L# L* [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Comparison of membership queries of L # □ and L ∗ □ [PITH_FULL_IMAGE:figures/full_fig_p016_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Comparison of validity queries of L # □ and L ∗ □ 5.2 Learning Bug Description DFAs We use the setup from [36], which is based on industrial benchmarks of control software components of ASML’s TWINSCAN lithography machines from the RERS challenge 2019 [17]. These models are given in obfuscated Java code, each containing a bug. Our goal is to learn a DFA that accepts any valid input that leads to this bug, … view at source ↗
Figure 10
Figure 10. Figure 10: These results are taken as the mean of 10 runs. Besides the performance [PITH_FULL_IMAGE:figures/full_fig_p017_10.png] view at source ↗
Figure 9
Figure 9. Figure 9: Concise bug description of online store using an incomplete teacher interpretable DFAs. For example, we can ask the solver to maximise the number of self-transitions. Sometimes, the model learned by L # □ is larger than the model learned by 3DFA-L ∗ , even though L # □ theoretically learns minimal models. This can be explained by the fact that we do not have access to a perfect validity oracle in this sett… view at source ↗
Figure 10
Figure 10. Figure 10: Comparison of L ∗ for 3DFAs and L # □ 5.3 Evaluation of Optimisations To evaluate the performance of the optimisations of Section 4 and the redun￾dant clauses of the SMT solver, we performed an ablation-style study for the benchmarks of Oliveira and Silva [24] [PITH_FULL_IMAGE:figures/full_fig_p018_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Comparison of running time of redundant clauses [PITH_FULL_IMAGE:figures/full_fig_p024_11.png] view at source ↗
Figure 12
Figure 12. Figure 12: Comparison of input symbols of basis replacement 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 Benchmark 10 2 10 1 10 0 10 1 10 2 Running Time (seconds) Using apartness Using compatibility [PITH_FULL_IMAGE:figures/full_fig_p025_12.png] view at source ↗
Figure 13
Figure 13. Figure 13: Comparison of running time of apartness and compatibility [PITH_FULL_IMAGE:figures/full_fig_p025_13.png] view at source ↗
Figure 14
Figure 14. Figure 14: A prefix tree acceptor T (top) for a NFA B (bottom). In general, provided the teacher answers queries, we may always establish apartness of incomplete basis states: (a) let c be the number of transitions from p to q. (a) There must be two states r0 and rk of T that are apart and mapped to the same state s on the cycle in B (otherwise p and q would be compatible). (b) Let w be a witness for r0 # rk and let… view at source ↗
read the original abstract

A DFA separates two disjoint languages $L_1$ and $L_2$ if it accepts every word in $L_1$ and rejects every word in $L_2$. Algorithms for active learning of small separating DFAs have many applications, e.g., for learning network invariants, learning contextual assumptions in compositional verification, learning state machines from large amounts of log data, and learning bug pattern descriptions. We propose a simple active learning algorithm, inspired by $L^{\#}$, that learns a minimal separating DFA for disjoint languages $L_1$ and $L_2$ if one exists. Experiments show that our algorithm significantly outperforms existing active learning algorithms on both randomly generated and industrial benchmarks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript proposes a simple active learning algorithm inspired by L# that constructs a minimal separating DFA for two disjoint languages L1 and L2 whenever a finite minimal separator exists. The central claim is conditional correctness under the stated preconditions together with empirical evidence of outperformance versus prior active-learning methods on random and industrial benchmarks.

Significance. If the algorithm is correct, the work supplies a practical tool for automata-learning tasks that arise in compositional verification, network-invariant inference, log-data mining, and bug-pattern extraction. The explicit focus on minimality and the reported experimental gains on both synthetic and real-world instances constitute a concrete contribution to the active-learning literature.

minor comments (2)
  1. Abstract: the statement that the algorithm 'learns a minimal separating DFA' should be accompanied by a brief indication of the query model (membership/equivalence) and the precise termination condition to avoid ambiguity for readers who consult only the abstract.
  2. The experimental section should report the precise performance metrics (e.g., number of queries, automaton size, runtime) and the baseline algorithms used, together with any statistical significance tests, so that the claimed outperformance can be independently assessed.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, the recognition of potential applications in compositional verification and related areas, and the recommendation for minor revision. We are pleased that the conditional correctness claim and experimental outperformance are viewed as a concrete contribution.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper presents a conditional algorithmic construction: an active-learning procedure, derived from the existing L# algorithm, that returns a minimal separating DFA for disjoint languages L1 and L2 whenever at least one finite separating DFA exists. The stated preconditions (disjointness and existence of a finite minimal separator) directly match the claim's scope, and the algorithm is offered as a simple, terminating procedure under those preconditions rather than a derivation that reduces to its own fitted values or self-referential definitions. No equations, parameter fits, or load-bearing self-citations that would force the result by construction appear in the provided material; empirical outperformance on random and industrial benchmarks is reported separately. The derivation chain is therefore self-contained against external benchmarks and does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available, so the ledger is necessarily incomplete; the algorithm appears to rest on standard automata theory rather than new postulates.

axioms (1)
  • standard math Deterministic finite automata are closed under the operations needed to construct separators from query answers.
    Implicit in any active-learning algorithm for DFAs; invoked by the claim that a minimal separator can be learned.

pith-pipeline@v0.9.0 · 5650 in / 1178 out tokens · 36359 ms · 2026-05-19T15:47:36.624540+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

36 extracted references · 36 canonical work pages

  1. [1]

    Angluin, D.: Learning regular sets from queries and counterexamples. Inf. Comput. 75(2), 87–106 (1987).https://doi.org/10.1016/0890-5401(87)90052-6

  2. [2]

    Balle, B., Carreras, X., Luque, F.M., Quattoni, A.: Spectral learning of weighted automata - A forward-backward perspective. Mach. Learn.96(1-2), 33–63 (2014). https://doi.org/10.1007/S10994-013-5416-X

  3. [3]

    Bar-Sinai, M., Elyasaf, A., Weiss, G., Weiss, Y.: Provengo: A tool suite for scenario driven model-based testing. In: ASE. pp. 2062–2065. IEEE (2023).https://doi. org/10.48550/ARXIV.2308.15938

  4. [4]

    IEEE Trans

    Biermann, A.W., Feldman, J.A.: On the synthesis of finite-state machines from samples of their behavior. IEEE Trans. Computers21(6), 592–597 (1972),https: //doi.org/10.1109/TC.1972.5009015

  5. [5]

    In: Prabhakar, P., Vandin, A

    Bruy` ere, V., Garhewal, B., P´ erez, G.A., Staquet, G., Vaandrager, F.W.: Active learning of mealy machines with timers. In: Prabhakar, P., Vandin, A. (eds.) Quan- titative Evaluation of Systems and Formal Modeling and Analysis of Timed Sys- tems - Second International Joint Conference, QEST+FORMATS 2025, Aarhus, Active Learning of Minimal Separating Aut...

  6. [6]

    In: Kowalewski, S., Philippou, A

    Chen, Y., Farzan, A., Clarke, E.M., Tsay, Y., Wang, B.: Learning minimal sepa- rating dfa’s for compositional verification. In: Kowalewski, S., Philippou, A. (eds.) Tools and Algorithms for the Construction and Analysis of Systems, 15th Interna- tional Conference, TACAS 2009, Held as Part of the Joint European Conferences on Theory and Practice of Softwar...

  7. [7]

    IEEE Trans

    Chow, T.: Testing software design modeled by finite-state machines. IEEE Trans. Software Eng.4(3), 178–187 (1978).https://doi.org/10.1109/TSE.1978.231496

  8. [8]

    D’Antoni, L., Veanes, M.: Automata modulo theories. Commun. ACM64(5), 86– 95 (2021).https://doi.org/10.1145/3419404

  9. [9]

    In: Finkbeiner, B., Kov´ acs, L

    Dierl, S., Fiterau-Brostean, P., Howar, F., Jonsson, B., Sagonas, K., T˚ aquist, F.: Scalable tree-based register automata learning. In: Finkbeiner, B., Kov´ acs, L. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 30th Interna- tional Conference, TACAS 2024, Held as Part of the European Joint Conferences on Theory and Practice o...

  10. [10]

    Logical Methods in Computer ScienceV olume 17, Issue 3(Jul 2021).https://doi.org/10.46298/ lmcs-17(3:15)2021

    Geuvers, H., Jacobs, B.: Relating apartness and bisimulation. Logical Methods in Computer ScienceV olume 17, Issue 3(Jul 2021).https://doi.org/10.46298/ lmcs-17(3:15)2021

  11. [11]

    Gold, E.M.: Complexity of automaton identification from given data. Inf. Control. 37(3), 302–320 (1978).https://doi.org/10.1016/S0019-9958(78)90562-4

  12. [12]

    In: Furbach, U., Shankar, N

    Grinchtein, O., Leucker, M., Piterman, N.: Inferring network invariants automat- ically. In: Furbach, U., Shankar, N. (eds.) Automated Reasoning, Third Interna- tional Joint Conference, IJCAR 2006, Seattle, WA, USA, August 17-20, 2006, Proceedings. Lecture Notes in Computer Science, vol. 4130, pp. 483–497. Springer (2006),https://doi.org/10.1007/11814771_40

  13. [13]

    In: ICGI

    Heule, M., Verwer, S.: Exact DFA identification using SAT solvers. In: ICGI. Lec- ture Notes in Computer Science, vol. 6339, pp. 66–79. Springer (2010).https: //doi.org/10.1007/978-3-642-15488-1_7

  14. [14]

    Cambridge University Press (Apr 2010).https://doi.org/10.1017/ CBO9781139194655

    de la Higuera, C.: Grammatical Inference: Learning Automata and Gram- mars. Cambridge University Press (Apr 2010).https://doi.org/10.1017/ CBO9781139194655

  15. [15]

    Addison-Wesley (1979)

    Hopcroft, J., Ullman, J.: Introduction to Automata Theory, Languages and Com- putation. Addison-Wesley (1979)

  16. [16]

    Ishigami, Y., Tani, S.: VC-dimensions of finite automata and commutative finite automata with k letters and n states. Discret. Appl. Math.74(2), 123–134 (1997). https://doi.org/10.1016/S0166-218X(96)00025-X

  17. [17]

    In: TACAS (3)

    Jasper, M., Mues, M., Murtovi, A., Schl¨ uter, M., Howar, F., Steffen, B., Schordan, M., Hendriks, D., Schiffelers, R.R.H., Kuppens, H., Vaandrager, F.W.: RERS 2019: Combining synthesis with real-world models. In: TACAS (3). Lecture Notes in Computer Science, vol. 11429, pp. 101–115. Springer (2019).https://doi.org/ 10.1007/978-3-030-17502-3_7

  18. [18]

    In: Fischer, B., Schaefer, I

    Lee, M., So, S., Oh, H.: Synthesizing regular expressions from examples for intro- ductory automata assignments. In: Fischer, B., Schaefer, I. (eds.) Proceedings of the 2016 ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences, GPCE 2016, Amsterdam, The Netherlands, October 31 22 Jasper Laumen, Leonne Snel, Frits Vaandr...

  19. [19]

    In: Margaria, T., Steffen, B

    Leucker, M., Neider, D.: Learning minimal deterministic automata from inexpe- rienced teachers. In: Margaria, T., Steffen, B. (eds.) Leveraging Applications of Formal Methods, Verification and Validation. Technologies for Mastering Change - 5th International Symposium, ISoLA 2012, Heraklion, Crete, Greece, October 15-18, 2012, Proceedings, Part I. Lecture...

  20. [20]

    In: ECOOP

    Moeller, M., Wiener, T., Solko-Breslin, A., Koch, C., Foster, N., Silva, A.: Au- tomata learning with an incomplete teacher. In: ECOOP. LIPIcs, vol. 263, pp. 21:1–21:30. Schloss Dagstuhl - Leibniz-Zentrum f¨ ur Informatik (2023).https: //doi.org/10.4230/LIPICS.ECOOP.2023.21

  21. [21]

    Z3 : An efficient SMT solver

    de Moura, L.M., Bjørner, N.S.: Z3: an efficient SMT solver. In: TACAS. Lecture Notes in Computer Science, vol. 4963, pp. 337–340. Springer (2008).https://doi. org/10.1007/978-3-540-78800-3_24

  22. [22]

    Innovations in Systems and Software Engineering18, 1–10 (03 2022).https://doi.org/10.1007/s11334-022-00449-3

    Muˇ skardin, E., Aichernig, B., Pferscher, A., Tappler, M.: Aalpy: an active automata learning library. Innovations in Systems and Software Engineering18, 1–10 (03 2022).https://doi.org/10.1007/s11334-022-00449-3

  23. [23]

    In: Chakraborty, S., Mukund, M

    Neider, D.: Computing minimal separating dfas and regular invariants using SAT and SMT solvers. In: Chakraborty, S., Mukund, M. (eds.) Automated Technol- ogy for Verification and Analysis - 10th International Symposium, ATVA 2012, Thiruvananthapuram, India, October 3-6, 2012. Proceedings. Lecture Notes in Computer Science, vol. 7561, pp. 354–369. Springer...

  24. [24]

    Oliveira, A.L., Silva, J.P.M.: Efficient algorithms for the inference of minimum size DFAs. Mach. Learn.44(1/2), 93–119 (2001).https://doi.org/10.1023/A: 1010828029885

  25. [25]

    World Scientific (01 1992).https://doi.org/10.1142/9789812797902_0004

    Oncina, J., Garc´ ıa, P.: Inferring regular languages in polynomial update time. World Scientific (01 1992).https://doi.org/10.1142/9789812797902_0004

  26. [26]

    IEEE Trans

    Pena, J.M., Oliveira, A.L.: A new algorithm for exact reduction of incompletely specified finite state machines. IEEE Trans. Comput. Aided Des. Integr. Circuits Syst.18(11), 1619–1632 (1999),https://doi.org/10.1109/43.806807

  27. [27]

    IEEE Trans

    Pfleeger, C.P.: State reduction in incompletely specified finite-state machines. IEEE Trans. Computers22(12), 1099–1102 (1973),https://doi.org/10.1109/ T-C.1973.223655

  28. [28]

    In: 10th Conference on Algebra and Coalgebra in Computer Science (CALCO 2023)

    Rot, J., Wißmann, T.: Bisimilar states in uncertain structures. In: 10th Conference on Algebra and Coalgebra in Computer Science (CALCO 2023). pp. 12:1–12:17. Leibniz International Proceedings in Informatics (09 2023).https://doi.org/10. 4230/LIPIcs.CALCO.2023.12

  29. [29]

    In: LATA

    Smetsers, R., Fiterau-Brostean, P., Vaandrager, F.W.: Model learning as a satisfiability modulo theories problem. In: LATA. Lecture Notes in Computer Science, vol. 10792, pp. 182–194. Springer (2018).https://doi.org/10.1007/ 978-3-319-77313-1_14

  30. [30]

    Cambridge University Press, July 2000.isbn: 9781139168717.doi: 10.1017/cbo9781139168717

    Troelstra, A.S., Schwichtenberg, H.: Basic Proof Theory. Cambridge Tracts in The- oretical Computer Science, Cambridge University Press, 2 edn. (2000).https: //doi.org/10.1017/CBO9781139168717

  31. [31]

    In: Fisman, D., Rosu, G

    Vaandrager, F.W., Garhewal, B., Rot, J., Wißmann, T.: A new approach for active automata learning based on apartness. In: Fisman, D., Rosu, G. (eds.) Tools and Algorithms for the Construction and Analysis of Systems - 28th International Conference, TACAS 2022, Munich, Germany, April 2-7, 2022, Proceedings, Part Active Learning of Minimal Separating Automa...

  32. [32]

    In: Jansen, N., Junges, S., Kaminski, B.L., Matheja, C., Noll, T., Quatmann, T., Stoelinga, M., Volk, M

    Vaandrager, F.W., Sanders, M.: L# for DFAs. In: Jansen, N., Junges, S., Kaminski, B.L., Matheja, C., Noll, T., Quatmann, T., Stoelinga, M., Volk, M. (eds.) Principles of Verification: Cycling the Probabilistic Landscape - Essays Dedicated to Joost- Pieter Katoen on the Occasion of His 60th Birthday, Part III. Lecture Notes in Computer Science, vol. 15262,...

  33. [33]

    Cybernetics and System Analysis 9(4), 653–665 (1973).https://doi.org/https://doi.org/10.1007/BF01068590, (Translated from Kibernetika, No

    Vasilevskii, M.: Failure diagnosis of automata. Cybernetics and System Analysis 9(4), 653–665 (1973).https://doi.org/https://doi.org/10.1007/BF01068590, (Translated from Kibernetika, No. 4, pp. 98-108, July-August, 1973.)

  34. [34]

    In: 2017 IEEE International Conference on Software Maintenance and Evo- lution, ICSME 2017, Shanghai, China, September 17-22, 2017

    Verwer, S., Hammerschmidt, C.A.: flexfringe: A passive automaton learning pack- age. In: 2017 IEEE International Conference on Software Maintenance and Evo- lution, ICSME 2017, Shanghai, China, September 17-22, 2017. pp. 638–642. IEEE Computer Society (2017),https://doi.org/10.1109/ICSME.2017.58

  35. [35]

    CoRRabs/2406.07208(2024),https://doi.org/10.48550/arXiv.2406.07208

    Walinga, H., Baumgartner, R., Verwer, S.: Database-assisted automata learning. CoRRabs/2406.07208(2024),https://doi.org/10.48550/arXiv.2406.07208

  36. [36]

    yes” or “no

    Yaacov, T., Weiss, G., Amram, G., Hayoun, A.: Automata models for effective bug pattern description. In: 28th ACM/IEEE International Conference on Model Driven Engineering Languages and Systems, MODELS 2025, Grand Rapids, MI, USA, October 5-10, 2025. pp. 119–129. IEEE (2025),https://doi.org/10.1109/ MODELS67397.2025.00017 24 Jasper Laumen, Leonne Snel, Fr...