Learning higher-order logic programs
Pith reviewed 2026-05-24 16:10 UTC · model grok-4.3
The pith
Learning higher-order programs reduces textual complexity and hypothesis space size in inductive logic programming.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Our theoretical results show that learning higher-order programs, rather than first-order programs, can reduce the textual complexity required to express programs which in turn reduces the size of the hypothesis space and sample complexity. We extend meta-interpretive learning to support learning higher-order programs by allowing for higher-order definitions to be used as background knowledge. Both new systems support learning higher-order programs and higher-order predicate invention, such as inventing functions for map/3 and conditions for filter/3.
What carries the argument
Higher-order definitions supplied as background knowledge in meta-interpretive learning, which enables programs using abstractions such as map and filter.
If this is right
- Reduced textual complexity for expressing the target programs.
- Smaller hypothesis space sizes and therefore lower sample complexity.
- Support for higher-order predicate invention during learning.
- Higher predictive accuracy and shorter learning times on the tested domains.
Where Pith is reading between the lines
- The same reduction might appear in other inductive synthesis settings that allow reusable abstractions.
- One could test whether the benefit persists when the higher-order definitions themselves must be discovered rather than given.
- The approach might make logic-based learners more competitive with systems that already use higher-order constructs by default.
Load-bearing premise
Suitable higher-order definitions can be supplied as background knowledge in advance without introducing new sources of complexity that offset the claimed reduction in hypothesis space size.
What would settle it
An experiment on one of the four domains where supplying higher-order definitions increases the number of examples or the search time needed compared with first-order learning.
Figures
read the original abstract
A key feature of inductive logic programming (ILP) is its ability to learn first-order programs, which are intrinsically more expressive than propositional programs. In this paper, we introduce techniques to learn higher-order programs. Specifically, we extend meta-interpretive learning (MIL) to support learning higher-order programs by allowing for \emph{higher-order definitions} to be used as background knowledge. Our theoretical results show that learning higher-order programs, rather than first-order programs, can reduce the textual complexity required to express programs which in turn reduces the size of the hypothesis space and sample complexity. We implement our idea in two new MIL systems: the Prolog system \namea{} and the ASP system \nameb{}. Both systems support learning higher-order programs and higher-order predicate invention, such as inventing functions for \tw{map/3} and conditions for \tw{filter/3}. We conduct experiments on four domains (robot strategies, chess playing, list transformations, and string decryption) that compare learning first-order and higher-order programs. Our experimental results support our theoretical claims and show that, compared to learning first-order programs, learning higher-order programs can significantly improve predictive accuracies and reduce learning times.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper extends meta-interpretive learning (MIL) to higher-order programs by permitting higher-order definitions as background knowledge. It presents theoretical results claiming that higher-order programs reduce textual complexity relative to first-order programs, thereby shrinking hypothesis-space size and sample complexity. Two implementations are described (a Prolog system and an ASP system) that also support higher-order predicate invention. Experiments on robot strategies, chess, list transformations, and string decryption compare first-order and higher-order learning and report improved predictive accuracy and reduced runtimes for the higher-order case.
Significance. If the claimed reduction in hypothesis-space cardinality holds after accounting for the enlarged signature, the work would provide a principled route to more compact and sample-efficient ILP. The dual-system implementation and the experimental coverage across four domains are concrete strengths; the support for inventing higher-order predicates such as map/3 and filter/3 conditions adds practical value.
major comments (1)
- [theoretical results] The central theoretical claim (stated in the abstract and developed in the theoretical-results section) asserts that higher-order definitions reduce |H| and sample complexity. This reduction is not automatic once the language bias is enlarged by the higher-order predicates and their associated clauses; an explicit cardinality comparison or covering-number bound that includes the cost of the added background knowledge is required to establish the net decrease.
minor comments (1)
- [abstract] The abstract refers to the two systems as “namea” and “nameb”; the full names and citation details should appear at first use in the main text.
Simulated Author's Rebuttal
We thank the referee for the constructive review and for highlighting the need for a more precise accounting in the theoretical claims. We address the single major comment below.
read point-by-point responses
-
Referee: The central theoretical claim (stated in the abstract and developed in the theoretical-results section) asserts that higher-order definitions reduce |H| and sample complexity. This reduction is not automatic once the language bias is enlarged by the higher-order predicates and their associated clauses; an explicit cardinality comparison or covering-number bound that includes the cost of the added background knowledge is required to establish the net decrease.
Authors: We agree that establishing a net reduction in hypothesis-space size requires an explicit comparison that accounts for the enlarged signature. Section 3 of the manuscript shows that, for programs expressible in both settings, higher-order definitions can reduce textual complexity (measured by the number of symbols in the shortest program), which in turn yields a smaller hypothesis space when the search is bounded by program length. However, the current proofs do not include a direct cardinality argument that subtracts the additional hypotheses introduced by the higher-order background-knowledge predicates themselves. We will revise the theoretical section to add a clarifying discussion of this point and, where feasible, a bound that incorporates the cost of the added language bias. This revision will make explicit the conditions under which the claimed net decrease holds. revision: partial
Circularity Check
No circularity: theoretical claims rest on new MIL extensions without self-referential reduction
full rationale
The paper extends meta-interpretive learning by allowing higher-order definitions as background knowledge and claims this reduces textual complexity, hypothesis space size, and sample complexity. No quoted equations, definitions, or self-citations in the abstract or description reduce the central claim to a fit, renaming, or prior self-work by construction. The derivation is presented as following from the new language bias and theoretical analysis of program length, which is independent of the target result. The skeptic concern about net |H| cardinality is a potential gap in the proof but does not constitute circularity under the specified criteria, as no load-bearing step is shown to be equivalent to its inputs by definition.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
Top-down induction of first-order logical decision trees
Hendrik Blockeel and Luc De Raedt. Top-down induction of first-order logical decision trees. Artif. Intell., 101(1-2):285–297, 1998
work page 1998
-
[2]
Anselm Blumer, Andrzej Ehrenfeucht, David Haussler, and Manfred K. Warmuth. Occam’s razor.Inf. Process. Lett., 24(6):377–380, 1987
work page 1987
-
[3]
A representation for pattern-knowledge in chess endgames
Ivan Bratko and Donald Michie. A representation for pattern-knowledge in chess endgames. Ad- vances in Computer Chess, 2:31–56, 1980
work page 1980
-
[4]
On understanding types, data abstraction, and polymorphism.ACM Comput
Luca Cardelli and Peter Wegner. On understanding types, data abstraction, and polymorphism.ACM Comput. Surv., 17(4):471–522, 1985
work page 1985
-
[5]
K.L. Clark. Negation as failure. In M. L. Ginsberg, editor, Readings in Nonmonotonic Reasoning, pages 311–325. Kaufmann, Los Altos, CA, 1987
work page 1987
-
[6]
Efficiently learning efficient programs
Andrew Cropper. Efficiently learning efficient programs . PhD thesis, Imperial College London, UK, 2017
work page 2017
-
[7]
Andrew Cropper and Stephen H. Muggleton. Learning efficient logical robot strategies involving composable objects. In Qiang Yang and Michael Wooldridge, editors, Proceedings of the Twenty- Fourth International Joint Conference on Artificial Intelligence, IJCAI 2015, Buenos Aires, Argentina, July 25-31, 2015, pages 3423–3429. AAAI Press, 2015
work page 2015
-
[8]
Andrew Cropper and Stephen H. Muggleton. Learning higher-order logic programs through abstrac- tion and invention. In Subbarao Kambhampati, editor, Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9-15 July 2016 , pages 1418–1424. IJCAI/AAAI Press, 2016
work page 2016
- [9]
- [10]
-
[11]
Andrew Cropper, Alireza Tamaddoni-Nezhad, and Stephen H. Muggleton. Meta-interpretive learn- ing of data transformation programs. In Katsumi Inoue, Hayato Ohwada, and Akihiro Yamamoto, editors, Inductive Logic Programming - 25th International Conference, ILP 2015, Kyoto, Japan, August 20-22, 2015, Revised Selected Papers, volume 9575 ofLecture Notes in Co...
work page 2015
-
[12]
Derivation reduction of metarules in meta-interpretive learn- ing
Andrew Cropper and Sophie Tourret. Derivation reduction of metarules in meta-interpretive learn- ing. In Fabrizio Riguzzi, Elena Bellodi, and Riccardo Zese, editors, Inductive Logic Programming - 28th International Conference, ILP 2018, Ferrara, Italy, September 2-4, 2018, Proceedings , volume 11105 of Lecture Notes in Computer Science, pages 1–21. Springer, 2018
work page 2018
-
[13]
A model building framework for answer set programming with external computations
Thomas Eiter, Michael Fink, Giovambattista Ianni, Thomas Krennwallner, Christoph Redl, and Peter Schüller. A model building framework for answer set programming with external computations. TPLP, 16(4):418–464, 2016
work page 2016
-
[14]
The discovery of the equator or con- cept driven learning
Werner Emde, Christopher Habel, and Claus-Rainer Rollinger. The discovery of the equator or con- cept driven learning. In Alan Bundy , editor,Proceedings of the 8th International Joint Conference on Artificial Intelligence. Karlsruhe, FRG, August 1983, pages 455–458. William Kaufmann, 1983
work page 1983
-
[15]
Towards inductive generalization in higher order logic
Cao Feng and Stephen Muggleton. Towards inductive generalization in higher order logic. In Derek H. Sleeman and Peter Edwards, editors, Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), Aberdeen, Scotland, UK, July 1-3, 1992, pages 154–162. Morgan Kauf- mann, 1992
work page 1992
-
[16]
Feser, Swarat Chaudhuri, and Isil Dillig
John K. Feser, Swarat Chaudhuri, and Isil Dillig. Synthesizing data structure transformations from input-output examples. In Proceedings of the 36th ACM SIGPLAN Conference on Programming Lan- guage Design and Implementation, Portland, OR, USA, June 15-17, 2015, pages 229–239, 2015
work page 2015
-
[17]
Inductive synthesis of recursive logic programs: Achievements and prospects
Pierre Flener and Serap Yilmaz. Inductive synthesis of recursive logic programs: Achievements and prospects. J. Log. Program., 41(2-3):141–195, 1999
work page 1999
-
[18]
Example-directed syn- thesis: a type-theoretic interpretation
Jonathan Frankle, Peter-Michael Osera, David Walker, and Steve Zdancewic. Example-directed syn- thesis: a type-theoretic interpretation. In Rastislav Bodík and Rupak Majumdar, editors,Proceedings of the 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2016, St. Petersburg, FL, USA, January 20 - 22, 2016, pages 802–815....
work page 2016
-
[19]
Classical negation in logic programs and disjunctive databases
Michael Gelfond and Vladimir Lifschitz. Classical negation in logic programs and disjunctive databases. New Generation Comput., 9(3/4):365–386, 1991
work page 1991
-
[20]
Automating string processing in spreadsheets using input-output examples
Sumit Gulwani. Automating string processing in spreadsheets using input-output examples. In Proceedings of the 38th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, POPL 2011, Austin, TX, USA, January 26-28, 2011, pages 317–330, 2011
work page 2011
-
[21]
The heuristic search and the game of chess
Larry Harris. The heuristic search and the game of chess. a study of quiescence, sacrifices, and plan oriented play . InComputer Chess Compendium, pages 136–142. Springer, 1988. Learning higher-order logic programs 35
work page 1988
-
[22]
Completing causal networks by meta- level abduction
Katsumi Inoue, Andrei Doncescu, and Hidetomo Nabeshima. Completing causal networks by meta- level abduction. Machine Learning, 91(2):239–277, 2013
work page 2013
-
[23]
Exploiting answer set programming with exter- nal sources for meta-interpretive learning
Tobias Kaminski, Thomas Eiter, and Katsumi Inoue. Exploiting answer set programming with exter- nal sources for meta-interpretive learning. TPLP, 18(3-4):571–588, 2018
work page 2018
-
[24]
Susumu Katayama. Efficient exhaustive generation of functional programs using monte-carlo search with iterative deepening. In PRICAI 2008: Trends in Artificial Intelligence, 10th Pacific Rim Interna- tional Conference on Artificial Intelligence, Hanoi, Vietnam, December 15-19, 2008. Proceedings, pages 199–210, 2008
work page 2008
-
[25]
Data-driven induction of functional programs
Emanuel Kitzelmann. Data-driven induction of functional programs. In ECAI 2008 - 18th European Conference on Artificial Intelligence, Patras, Greece, July 21-25, 2008, Proceedings , pages 781–782, 2008
work page 2008
-
[26]
J.W . Lloyd. Logic for Learning. Springer, Berlin, 2003
work page 2003
- [27]
-
[28]
Making robots conscious of their mental states
John McCarthy . Making robots conscious of their mental states. InMachine Intelligence 15, Intelligent Agents [St. Catherine’s College, Oxford, July 1995], pages 3–17, 1995
work page 1995
- [29]
-
[30]
Rolf Morel, Andrew Cropper, and C.-H. Luke Ong. Typed meta-interpretive learning of logic pro- grams. In Francesco Calimeri, Nicola Leone, and Marco Manna, editors, Logics in Artificial Intel- ligence - 16th European Conference, JELIA 2019, Rende, Italy, May 7-11, 2019, Proceedings , volume 11468 of Lecture Notes in Computer Science, pages 198–213. Springer, 2019
work page 2019
-
[31]
Stephen Muggleton. Inverse entailment and progol. New Generation Comput., 13(3&4):245–286, 1995
work page 1995
-
[32]
Stephen Muggleton and Wray L. Buntine. Machine invention of first order predicates by inverting resolution. In Machine Learning, Proceedings of the Fifth International Conference on Machine Learn- ing, Ann Arbor, Michigan, USA, June 12-14, 1988, pages 339–352, 1988
work page 1988
-
[33]
Flach, Katsumi Inoue, and Ashwin Srinivasan
Stephen Muggleton, Luc De Raedt, David Poole, Ivan Bratko, Peter A. Flach, Katsumi Inoue, and Ashwin Srinivasan. ILP turns 20 - biography and future challenges. Machine Learning, 86(1):3–23, 2012
work page 2012
-
[34]
Muggleton, Dianhuan Lin, Niels Pahlavi, and Alireza Tamaddoni-Nezhad
Stephen H. Muggleton, Dianhuan Lin, Niels Pahlavi, and Alireza Tamaddoni-Nezhad. Meta- interpretive learning: application to grammatical inference. Machine Learning, 94(1):25–49, 2014
work page 2014
-
[35]
Muggleton, Dianhuan Lin, and Alireza Tamaddoni-Nezhad
Stephen H. Muggleton, Dianhuan Lin, and Alireza Tamaddoni-Nezhad. Meta-interpretive learning of higher-order dyadic datalog: predicate invention revisited. Machine Learning, 100(1):49–73, 2015
work page 2015
-
[36]
Type-and-example-directed program synthesis
Peter-Michael Osera and Steve Zdancewic. Type-and-example-directed program synthesis. In David Grove and Steve Blackburn, editors, Proceedings of the 36th ACM SIGPLAN Conference on Program- ming Language Design and Implementation, Portland, OR, USA, June 15-17, 2015 , pages 619–630. ACM, 2015
work page 2015
-
[37]
J. Ross Quinlan. Learning logical definitions from relations. Machine Learning, 5:239–266, 1990
work page 1990
-
[38]
Luc De Raedt and Maurice Bruynooghe. Interactive concept-learning and constructive induction by analogy .Machine Learning, 8:107–150, 1992
work page 1992
-
[39]
Abstraction in artificial intelligence and complex systems
Lorenza Saitta and Jean-Daniel Zucker. Abstraction in artificial intelligence and complex systems . Springer, 2013
work page 2013
- [40]
-
[41]
A. Srinivasan. The ALEPH manual. Machine Learning at the Computing Laboratory, Oxford University, 2001
work page 2001
-
[42]
The appropriateness of predicate invention as bias shift operation in ILP
Irene Stahl. The appropriateness of predicate invention as bias shift operation in ILP. Machine Learning, 20(1-2):95–117, 1995
work page 1995
-
[43]
SWI-Prolog.Theory and Practice of Logic Programming, 12(1-2):67–96, 2012
Jan Wielemaker, Tom Schrijvers, Markus Triska, and Torbjörn Lager. SWI-Prolog.Theory and Practice of Logic Programming, 12(1-2):67–96, 2012
work page 2012
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.