Recognition: 2 Lean theorem links
Learning Tree Automata with Term Rewriting
Pith reviewed 2026-05-11 02:28 UTC · model grok-4.3
The pith
Supplying a term rewriting system lets the learning algorithm deduce the answers to some queries, reducing the total number of queries needed to identify a tree automaton.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that an Angluin-style learning procedure for tree automata can be augmented with deductive inference: a user-provided term rewriting system that is sound and consistent with the target language is used to answer some queries without oracle interaction, producing a correct automaton while lowering overall query complexity.
What carries the argument
A term rewriting system that encodes algebraic properties of the target tree language and is consulted to deduce correct query answers during the learning loop.
If this is right
- Languages whose tree structure obeys rewrite rules such as commutativity or associativity become learnable with far fewer direct queries.
- The learned automaton remains correct whenever the supplied rewrite system is sound and matches the language.
- Natural data properties expressible by short rewrite systems produce the largest practical savings in query volume.
- The same deductive mechanism can be applied to any property that can be stated as a finite set of rewrite rules over the alphabet.
- Equivalence queries can also be answered in part by rewriting, further trimming the interaction with the oracle.
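The deduction mechanism can be made concrete as a membership-query cache keyed by normal forms under the rewrite relation. Below is a minimal sketch, assuming a single commutative symbol f and trees encoded as nested tuples; all names (DeductiveMQ, COMMUTATIVE, the toy oracle) are illustrative, not from the paper:

```python
# Hypothetical sketch: trees are nested tuples ('f', left, right) or
# leaves ('a',); the only rewrite property assumed is f(x, y) ~ f(y, x).
COMMUTATIVE = {'f'}

def canonical(t):
    """Normal form modulo the rewrite relation: sorting the children of
    commutative symbols collapses equivalent trees to one representative."""
    sym, *kids = t
    kids = [canonical(k) for k in kids]
    if sym in COMMUTATIVE:
        kids.sort()
    return (sym, *kids)

class DeductiveMQ:
    """Wrap a membership oracle; answer a query by deduction whenever a
    rewrite-equivalent tree has already been asked."""
    def __init__(self, oracle):
        self.oracle, self.cache, self.calls = oracle, {}, 0

    def member(self, t):
        key = canonical(t)
        if key not in self.cache:        # no deduction possible: ask
            self.cache[key] = self.oracle(t)
            self.calls += 1
        return self.cache[key]

# Toy target language, closed under the rewrite relation by construction.
oracle = lambda t: t in {('f', ('a',), ('b',)), ('f', ('b',), ('a',))}
mq = DeductiveMQ(oracle)
mq.member(('f', ('a',), ('b',)))   # first query: one oracle call
mq.member(('f', ('b',), ('a',)))   # deduced via commutativity: free
assert mq.calls == 1
```

Note that the toy oracle is itself closed under the rewrite relation; this mirrors the paper's consistency assumption, without which a deduced answer could contradict the target language.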
Where Pith is reading between the lines
- The technique could be paired with active sampling strategies to further cut queries when the rewrite system is only partially known.
- Domains that already maintain rewrite systems, such as term databases or algebraic data types, could feed those systems directly into the learner.
- If the rewrite system is allowed to be approximate, the algorithm might still converge faster while producing an automaton that is correct up to the stated properties.
- The same idea of deductive query answering might transfer to learning other structures, such as string automata or graph grammars, whenever a suitable rewriting relation exists.
Load-bearing premise
The supplied term rewriting system must be sound, complete, and consistent with the unknown target language, so that every deduction it produces is correct and all relevant equivalences are captured.
What would settle it
Run the algorithm on a small tree language whose properties are captured by a simple rewrite system and count whether the number of oracle queries actually drops below that of the unaugmented algorithm; if it does not, or if the returned automaton is incorrect, the claim fails.
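One way to run that count on the abstract's own example property (subtree order under f is irrelevant) is to tally distinct oracle calls with and without canonicalization over all small trees; a minimal sketch, with all names illustrative:

```python
from itertools import product

LEAVES = [('a',), ('b',)]

def trees(depth):
    """All trees over leaves {a, b} and binary symbol f up to `depth`."""
    if depth == 0:
        return list(LEAVES)
    smaller = trees(depth - 1)
    return smaller + [('f', l, r) for l, r in product(smaller, smaller)]

def canonical(t):
    """f is assumed commutative, as in the abstract's example property."""
    sym, *kids = t
    kids = sorted(canonical(k) for k in kids)
    return (sym, *kids)

def count_queries(ts, deduce):
    """Oracle calls needed to label every tree in `ts`, with or without
    rewriting-based deduction (here: a canonical-form cache)."""
    seen, calls = set(), 0
    for t in ts:
        key = canonical(t) if deduce else t
        if key not in seen:
            seen.add(key)
            calls += 1
    return calls

ts = trees(2)
plain, deduced = count_queries(ts, False), count_queries(ts, True)
assert deduced < plain   # the falsifiable prediction: fewer oracle calls
```

If the inequality failed, or the resulting automaton misclassified a tree, the core claim would be falsified on this instance.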
read the original abstract
We present an extension of the Angluin-style learning algorithm for tree automata that incorporates deductive inference. The learning algorithm is provided with a term rewriting system that specifies properties of the target tree language (e.g., the order of subtrees under a symbol f is irrelevant). This term rewriting system is used to infer answers to some queries, which reduces the query complexity of the learning algorithm. We present examples of rewrite systems that express natural properties of tree-structured data, which yield a significant reduction in the number of queries.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper extends the Angluin-style learning algorithm for tree automata by supplying a user-provided term rewriting system (TRS) that encodes properties of the target language (e.g., commutativity of subtrees). The TRS is used to deduce answers to selected membership queries, with the goal of lowering oracle query complexity. Examples of natural rewrite systems are presented that are claimed to yield significant query savings.
Significance. If the extension preserves correctness and the supplied TRS is sound and consistent with the target language, the technique offers a principled way to inject domain knowledge into tree-automaton learning and could reduce query cost in applications involving structured data. The approach builds directly on prior Angluin-style procedures without introducing new free parameters.
major comments (2)
- Abstract: the central claim of query reduction is stated at a high level but is unsupported by any correctness proof for the extended observation-table construction, any complexity analysis, or concrete query-count measurements; these are load-bearing for the contribution.
- Algorithm integration (high-level description): it is not shown that deductive answers from the TRS preserve the termination and equivalence invariants of the base Angluin procedure when the TRS is only assumed sound and consistent; a formal argument or invariant is required.
minor comments (1)
- Abstract: the weakest assumption on the TRS (soundness, completeness, consistency) should be stated explicitly rather than left implicit.
Simulated Author's Rebuttal
We thank the referee for the detailed review and constructive suggestions. We address each major comment below and will incorporate revisions to strengthen the formal aspects of the work.
read point-by-point responses
- Referee: Abstract: the central claim of query reduction is stated at a high level but is unsupported by any correctness proof for the extended observation-table construction, any complexity analysis, or concrete query-count measurements; these are load-bearing for the contribution.
Authors: We agree that the abstract states the query-reduction claim at a high level. The manuscript presents the extended algorithm, shows how the TRS is used to answer selected membership queries, and includes concrete examples of natural rewrite systems (e.g., commutativity) together with illustrative query savings on small instances. However, we acknowledge that a self-contained correctness argument for the modified observation table, a complexity analysis, and tabulated query-count measurements are not provided. In the revision we will add a dedicated subsection proving that the extended table construction preserves the invariants of the base Angluin procedure (under the stated soundness and consistency assumptions on the TRS), include asymptotic bounds on the number of remaining oracle queries, and report explicit query counts for the running examples. Revision: yes.
- Referee: Algorithm integration (high-level description): it is not shown that deductive answers from the TRS preserve the termination and equivalence invariants of the base Angluin procedure when the TRS is only assumed sound and consistent; a formal argument or invariant is required.
Authors: We appreciate this observation. The current text assumes that any answer deduced from a sound and consistent TRS is identical to the answer the target language would give, and therefore that the standard termination and equivalence arguments carry over. We will make this reasoning explicit by adding an invariant stating that every entry in the observation table (whether obtained from the oracle or deduced) remains consistent with the target language. Under this invariant the original proofs of termination and correctness of the learned automaton apply unchanged. The revised manuscript will contain a short lemma formalizing the invariant together with a proof sketch. Revision: yes.
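The invariant described in this response can be phrased as an executable check; a hypothetical sketch (the table layout and all names are illustrative, not from the manuscript):

```python
# Hypothetical sketch of the proposed invariant: every observation-table
# entry, whether obtained from the oracle or deduced via the TRS, must
# agree with the target language L. All names here are illustrative.
def table_invariant(table, in_language):
    """table: dict mapping a tree to (answer, source), where source is
    'oracle' or 'deduced'; in_language: the (hidden) target predicate."""
    return all(answer == in_language(tree)
               for tree, (answer, source) in table.items())

# Toy target language closed under commutativity of f.
L = lambda t: t in {('f', 'a', 'b'), ('f', 'b', 'a')}
table = {('f', 'a', 'b'): (True, 'oracle'),
         ('f', 'b', 'a'): (True, 'deduced')}   # deduced by commutativity
assert table_invariant(table, L)
```

Under this invariant, deduced entries are indistinguishable from oracle answers, which is why the base procedure's termination and correctness proofs would apply unchanged.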
Circularity Check
No significant circularity; the extension relies on an external TRS input and the prior Angluin algorithm.
full rationale
The claimed contribution is an algorithmic extension of the established Angluin-style tree-automaton learning procedure. A user-supplied term rewriting system (TRS) is provided as an external artifact that is assumed sound, complete, and consistent with the target language; this TRS is used only to answer a subset of membership queries by deduction rather than oracle calls. The base learning invariants, termination, and correctness are inherited from the prior Angluin algorithm, which is treated as independent prior work. No step in the described derivation reduces a result to a fitted parameter, renames an input as a prediction, or rests on a self-citation chain whose cited result itself depends on the present paper. The derivation chain therefore contains independent content and does not collapse by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: The term rewriting system correctly and completely captures the relevant equivalences of the target tree language.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, theorem absolute_floor_iff_bare_distinguishability (connection unclear). Matched passage: "We present an extension of the Angluin-style learning algorithm for tree automata that incorporates deductive inference. The learning algorithm is provided with a term rewriting system that specifies properties of the target tree language."
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (connection unclear). Matched passage: "Lemma 6. ... The TRS R is consistent with L(A) if and only if the functions ⃗δ_A(l) and ⃗δ_A(r) coincide, for each rule l → r ∈ R."
Reference graph
Works this paper leans on
- [1] A. V. Aho, M. R. Garey, and J. D. Ullman. The transitive reduction of a directed graph. SIAM Journal on Computing, 1(2):131–137, 1972.
- [2] Siva Anantharaman, Paliath Narendran, and Michaël Rusinowitch. Closure properties and decision problems of dag automata. Inf. Process. Lett., 94(5):231–240, 2005.
- [3] Dana Angluin. Learning regular sets from queries and counterexamples. Information and Computation, 75(2):87–106, 1987.
- [4] Franz Baader and Paliath Narendran. Unification of concept terms in description logics. In Henri Prade, editor, ECAI 1998, pages 331–335. John Wiley and Sons, 1998.
- [5] Franz Baader and Tobias Nipkow. Term Rewriting and All That. Cambridge University Press, 1998.
- [6] Benoît Barbot, Benedikt Bollig, Alain Finkel, Serge Haddad, Igor Khmelnitsky, Martin Leucker, Daniel Neider, Rajarshi Roy, and Lina Ye. Extracting context-free grammars from recurrent neural networks using tree-automata learning and A* search. In ICGI 2021, pages 113–129. PMLR, 2021.
- [7]
- [8] Satwik Bhattamishra, Michael Hahn, and Varun Kanade. Automata learning and identification of the support of language models.
- [9] Henrik Björklund, Johanna Björklund, and Petter Ericson. On the regularity and learnability of ordered DAG languages. In Arnaud Carayol and Cyril Nicaud, editors, CIAA 2017, volume 10329 of Lecture Notes in Computer Science, pages 27–39. Springer, 2017.
- [10] Iovka Boneva and Jean-Marc Talbot. Automata and logics for unranked and unordered trees. In Jürgen Giesl, editor, RTA 2005, volume 3467 of Lecture Notes in Computer Science, pages 500–515. Springer, 2005.
- [11] Ján Černý. Poznámka k homogénnym experimentom s konečnými automatmi [A note on homogeneous experiments with finite automata]. Matematicko-fyzikálny časopis, 14(3):208–216, 1964.
- [12] Witold Charatonik. Automata on dag representations of finite trees. 1999.
- [13] Hubert Comon, Max Dauchet, Rémi Gilleron, Florent Jacquemard, Denis Lugiez, Christof Löding, Sophie Tison, and Marc Tommasi. Tree Automata Techniques and Applications. 2008.
- [14] Simon Dierl, Paul Fiterau-Brostean, Falk Howar, Bengt Jonsson, Konstantinos Sagonas, and Fredrik Tåquist. Scalable tree-based register automata learning. In TACAS 2024, pages 87–108, 2024.
- [15] Frank Drewes and Johanna Högberg. Learning a regular tree language from a teacher. In Zoltán Ésik and Zoltán Fülöp, editors, DLT 2003, volume 2710 of Lecture Notes in Computer Science, pages 279–291. Springer, 2003.
- [16] Frank Drewes and Johanna Högberg. Query learning of regular tree languages: How to avoid dead states. Theory Comput. Syst., 40(2):163–185, 2007.
- [17] David Eppstein. Reset sequences for monotonic automata. SIAM J. Comput., 19(3):500–510, 1990.
- [18] Michal Fica and Jan Otop. Active automata learning with advice. In ECAI 2025, volume 413 of Frontiers in Artificial Intelligence and Applications, pages 1655–1662. IOS Press, 2025.
- [19] Markus Theo Frohme. Active automata learning with adaptive distinguishing sequences. CoRR, abs/1902.01139, 2019.
- [20] Émilie Grienenberger and Martin Ritzert. Learning definable hypotheses on trees. In Pablo Barceló and Marco Calautti, editors, ICDT 2019, volume 127 of LIPIcs, pages 24:1–24:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.
- [21] Amaury Habrard and José Oncina. Learning multiplicity tree automata. In Yasubumi Sakakibara, Satoshi Kobayashi, Kengo Sato, Tetsuro Nishino, and Etsuji Tomita, editors, ICGI 2006, volume 4201 of Lecture Notes in Computer Science, pages 268–280. Springer, 2006.
- [22] Malte Isberner, Falk Howar, and Bernhard Steffen. The TTT algorithm: A redundancy-free approach to active automata learning. In RV 2014, pages 307–322, 2014.
- [23] Anna Kasprzik. Four one-shot learners for regular tree languages and their polynomial characterizability. Theor. Comput. Sci., 485:85–106, 2013.
- [24] Jakub Kopystiański and Jan Otop. Learning tree automata with term rewriting. To appear in IJCAI 2026. ijcai.org, 2026.
- [25] Jakub Kopystiański and Jan Otop. Learning tree automata with term rewriting: code and data. https://github.com/jotop/LearnDFTAwithTRS, 2026.
- [26] Raymond Kosala, Maurice Bruynooghe, Jan Van den Bussche, and Hendrik Blockeel. Information extraction from web documents based on local unranked tree automaton inference. In IJCAI 2003, pages 403–408. Morgan Kaufmann, 2003.
- [27] Loes Kruger, Sebastian Junges, and Jurriaan Rot. Small test suites for active automata learning. In TACAS 2024, pages 109–129, 2024.
- [28] Matteo Magnini, Riccardo Squarcialupi, Martin Tunge Sterri, Ana Ozaki, and Rivera Castillo. Actively learning EL terminologies from large language models. In ECAI 2025, volume 413 of Frontiers in Artificial Intelligence and Applications, pages 1792–1799. IOS Press, 2025.
- [29] Ines Marusic and James Worrell. Complexity of equivalence and learning for multiplicity tree automata. Journal of Machine Learning Research, 16:2465–2500, 2015.
- [30] Daniele Nardi, Ronald J. Brachman, et al. An introduction to description logics. Description Logic Handbook, 1:40, 2003.
- [31] Dolav Nitay, Dana Fisman, and Michal Ziv-Ukelson. Learning of structurally unambiguous probabilistic grammars. In AAAI 2021, pages 9170–9178. AAAI Press, 2021.
- [32] Moeketsi Raselimo and Bernd Fischer. Automatic grammar repair. In Eelco Visser, Dimitris S. Kolovos, and Emma Söderberg, editors, SLE 2021, pages 126–142. ACM, 2021.
- [33] Catalin Ionut Tirnăucă and Cristina Tirnăucă. Learning regular tree languages from correction and equivalence queries. J. Autom. Lang. Comb., 12(4):501–524, 2007.
- [34] Craig A. Tovey. A simplified NP-complete satisfiability problem. Discret. Appl. Math., 8(1):85–89, 1984.
- [35] Frits Vaandrager. Model learning. Communications of the ACM, 60(2):86–95, 2017.
- [36] Frits W. Vaandrager, Bharat Garhewal, Jurriaan Rot, and Thorsten Wißmann. A new approach for active automata learning based on apartness. In TACAS 2022, pages 223–243, 2022.
- [37] Gerco van Heerdt, Tobias Kappé, Jurriaan Rot, and Alexandra Silva. Learning pomset automata. In Stefan Kiefer and Christine Tasson, editors, ETAPS 2021, volume 12650, pages 510–530. Springer, 2021.
- [38] Marcell Vazquez-Chanlatte, Karim Elmaaroufi, Stefan J. Witwicki, Matei Zaharia, and Sanjit A. Seshia. L*LM: Learning automata from examples using natural language oracles, 2025.
- [39] Navid Yaghmazadeh, Christian Klinger, Isil Dillig, and Swarat Chaudhuri. Synthesizing transformations on hierarchically structured data. In Chandra Krintz and Emery D. Berger, editors, PLDI 2016, pages 508–521. ACM, 2016.
discussion (0)