pith. sign in

arxiv: 2605.07584 · v1 · submitted 2026-05-08 · 💻 cs.AI

Parallel Lifted Planning via Semi-Naive Datalog Evaluation

Pith reviewed 2026-05-11 01:56 UTC · model grok-4.3

classification 💻 cs.AI
keywords lifted planningDatalog evaluationparallel executionsemi-naive evaluationclassical planningclique enumerationheuristic searchFF heuristic
0
0 comments X

The pith

Lifted planners using parallel semi-naive Datalog evaluation solve more tasks than baselines, with speedups increasing on multiple cores.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a parallel execution model for Datalog-based lifted planning with rule-level and grounding parallelism. This model is specialized using a clique enumeration grounder that supports semi-naive evaluation for planning tasks. By exploiting multi-core processors, it reduces the time spent on repeated instantiations during search. Experiments demonstrate improved task coverage over baselines even on one core, with the advantage growing as cores are added and up to 6x speedup on hard instances.

Core claim

The central claim is that introducing two levels of parallelism in semi-naive Datalog evaluation, combined with a planning-specific grounder based on clique enumeration, enables lifted planners to solve more tasks and achieve substantial speedups on multi-core systems compared to existing approaches.

What carries the argument

Two-level parallel execution model (rule-level parallelism and grounding parallelism) for semi-naive Datalog evaluation, powered by a clique-enumeration grounder.

If this is right

  • The implementation solves more tasks than baselines using greedy best-first search with the FF heuristic.
  • The performance gap widens as additional cores are used.
  • On hard-to-ground tasks, an average parallel fraction of 92.4% is achieved where Datalog takes 97.6% of runtime.
  • Up to a 6-fold speedup is realized on 8 cores in practice.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This parallel Datalog approach could be applied to other lifted or relational reasoning tasks in artificial intelligence.
  • Integrating it with different heuristics or search algorithms might yield further gains in planning efficiency.
  • The technique suggests lifted planning can scale better with hardware parallelism, potentially reducing reliance on full grounding methods.

Load-bearing premise

The measured parallel fraction and speedups will generalize beyond the chosen benchmark set and that the overhead of the new grounder and parallelism management remains negligible when Datalog is not the dominant cost.

What would settle it

Evaluating the planner on a different set of planning benchmarks with varying ratios of grounding time to see if task coverage and speedups still exceed those of the baselines.

Figures

Figures reproduced from arXiv: 2605.07584 by Dominik Drexler, Jendrik Seipp, Oliver Joergensen.

Figure 1
Figure 1. Figure 1: Synchronous execution architecture with two-level parallelization. Level L1 denotes rule-level parallelism, where rules [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The number of solved tasks (S) grouped by their [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 4
Figure 4. Figure 4: Per domain average rule skew against relative speed [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
read the original abstract

Lifted classical planners operate directly on first-order planning tasks to avoid the computationally demanding grounding step. However, lifted planning is typically slower, as planners must repeatedly instantiate ground structures during search. Many core components of lifted classical planning, such as successor generation, axiom evaluation, task grounding, and delete-relaxed heuristics, have previously been studied through the lens of Datalog evaluation. We build upon this line of work and extend it by developing and analyzing an execution model with two levels of parallelism: rule-level parallelism and grounding parallelism. We further specialize this solver for planning-specific workloads with a grounder based on clique enumeration, which we extend to support semi-naive Datalog evaluation. Our experimental evaluation using greedy best-first search with the FF heuristic shows that our implementation already solves more tasks than the baselines on a single core, and the gap widens as additional cores are used. Moreover, on hard-to-ground tasks where on average 97.6% of the total runtime is spent in Datalog execution, the proposed execution model exhibits an average parallel fraction of 92.4%, while achieving up to a 6-fold speedup on 8 cores in practice.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper develops a two-level parallel execution model (rule-level and grounding parallelism) for lifted classical planning by extending semi-naive Datalog evaluation. It introduces a planning-specialized grounder based on clique enumeration that supports semi-naive evaluation, and evaluates the resulting planner using greedy best-first search with the FF heuristic on standard benchmarks. The central empirical claims are that the implementation already solves more tasks than baselines on a single core (with the advantage increasing with core count), that Datalog execution dominates runtime (97.6% on hard-to-ground tasks), and that the model achieves an average parallel fraction of 92.4% with up to 6-fold speedup on 8 cores.

Significance. If the reported speedups and coverage gains hold under scrutiny, the work would be a meaningful step toward making lifted planning competitive by exploiting parallelism in the dominant Datalog components. The concrete metrics on parallel fraction and Datalog dominance on a recognizable benchmark subset provide a clear, falsifiable basis for assessing the approach's practical value.

major comments (2)
  1. §5 (Experimental Evaluation), paragraph on hard-to-ground task selection: the criteria used to identify the subset where Datalog accounts for 97.6% of runtime are not stated explicitly. Without this, it is impossible to verify that the reported parallel fraction and speedups are not an artifact of post-hoc filtering, which directly affects the strength of the scaling claim.
  2. §4.2 (Grounder based on clique enumeration) and §5.1 (baseline implementations): the paper does not provide sufficient detail on how the new grounder and the baseline lifted planners were implemented or whether they share the same underlying Datalog engine. This makes it difficult to isolate whether the observed single-core advantage stems from the parallel model, the clique-based grounder, or implementation differences.
minor comments (2)
  1. Figure 3 (speedup plots): axis labels and legend entries should explicitly state the number of cores and the exact metric (wall-clock time or CPU time) to avoid ambiguity in interpreting the 6-fold speedup.
  2. Table 1 (coverage results): include the total number of tasks per domain and the number of tasks solved by each configuration to allow direct computation of coverage percentages.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment below with clarifications and commit to revisions that improve transparency without altering the core claims or results.

read point-by-point responses
  1. Referee: §5 (Experimental Evaluation), paragraph on hard-to-ground task selection: the criteria used to identify the subset where Datalog accounts for 97.6% of runtime are not stated explicitly. Without this, it is impossible to verify that the reported parallel fraction and speedups are not an artifact of post-hoc filtering, which directly affects the strength of the scaling claim.

    Authors: We agree that the selection criteria must be stated explicitly to permit verification and rule out post-hoc concerns. In the revised manuscript we will add a precise description of the criteria applied to the IPC benchmarks (based on runtime profiling that identified tasks with high Datalog dominance). This addition will make the subset reproducible and strengthen the empirical support for the reported parallel fraction and speedups. revision: yes

  2. Referee: §4.2 (Grounder based on clique enumeration) and §5.1 (baseline implementations): the paper does not provide sufficient detail on how the new grounder and the baseline lifted planners were implemented or whether they share the same underlying Datalog engine. This makes it difficult to isolate whether the observed single-core advantage stems from the parallel model, the clique-based grounder, or implementation differences.

    Authors: We acknowledge that the current level of detail is insufficient for isolating contributions and ensuring reproducibility. Our grounder is a planning-specialized clique-enumeration implementation extended to support semi-naive evaluation and integrated into our Datalog engine; baselines were evaluated within comparable search frameworks. In the revision we will expand §4.2 with additional algorithmic description of the grounder and §5.1 with explicit statements on shared versus distinct components of the Datalog engine. These changes will clarify the sources of the single-core coverage gains. revision: yes

Circularity Check

0 steps flagged

No significant circularity; purely empirical claims

full rationale

The paper's core contribution is an implementation of a two-level parallel execution model for lifted planning via semi-naive Datalog, together with a clique-enumeration grounder. All reported results (more tasks solved than baselines, 92.4% average parallel fraction, up to 6x speedup on 8 cores) are direct measurements on IPC benchmarks under greedy best-first search with FF heuristic. No equations, uniqueness theorems, or ansatzes are invoked whose outputs are definitionally identical to their inputs; the work cites prior Datalog-planning literature for context but does not rely on any self-citation chain to establish its measured performance. The derivation chain is therefore self-contained and non-circular.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The paper relies on standard Datalog semantics and existing lifted-planning encodings; no new free parameters, ad-hoc axioms, or invented entities are introduced in the abstract.

axioms (1)
  • standard math Standard semantics of Datalog and semi-naive evaluation
    Invoked when extending the grounder to support semi-naive evaluation.

pith-pipeline@v0.9.0 · 5504 in / 1298 out tokens · 31582 ms · 2026-05-11T01:56:27.869242+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

22 extracted references · 22 canonical work pages

  1. [1]

    Abiteboul, S.; Hull, R.; and Vianu, V. 1995. Foundations of Databases. Addison-Wesley

  2. [2]

    Bancilhon, F.; and Ramakrishnan, R. 1986. An amateur's introduction to recursive query processing strategies. In Proceedings of the 1986 ACM SIGMOD International Conference on Management of Data, 16--52. Association for Computing Machinery

  3. [3]

    Bron, C.; and Kerbosch, J. 1973. Algorithm 457: finding all cliques of an undirected graph. Communications of the ACM, 16: 575--577

  4. [4]

    Ceri, S.; Gottlob, G.; and Tanca, L. 1989. What you Always Wanted to Know About Datalog (And Never Dared to Ask). IEEE Trans. Knowl. Data Eng., 1: 146--166

  5. [5]

    B.; Franc \`e s, G.; Pommerening, F.; and Helmert, M

    Corr \^e a, A. B.; Franc \`e s, G.; Pommerening, F.; and Helmert, M. 2021. Delete-Relaxation Heuristics for Lifted Classical Planning. In Goldman, R. P.; Biundo, S.; and Katz, M., eds., Proceedings of the Thirty-First International Conference on Automated Planning and Scheduling (ICAPS 2021), 94--102. AAAI Press

  6. [6]

    B.; Pommerening, F.; Helmert, M.; and Franc \`e s, G

    Corr \^e a, A. B.; Pommerening, F.; Helmert, M.; and Franc \`e s, G. 2020. Lifted Successor Generation using Query Optimization Techniques. In Beck, J. C.; Karpas, E.; and Sohrabi, S., eds., Proceedings of the Thirtieth International Conference on Automated Planning and Scheduling (ICAPS 2020), 80--89. AAAI Press

  7. [7]

    B.; Pommerening, F.; Helmert, M.; and Franc \`e s, G

    Corr \^e a, A. B.; Pommerening, F.; Helmert, M.; and Franc \`e s, G. 2022. The FF Heuristic for Lifted Classical Planning. In Honavar, V.; and Spaan, M., eds., Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence ( AAAI 2022) , 9716--9723. AAAI Press

  8. [8]

    Das, A.; and Zaniolo, C. 2019. A Case for Stale Synchronous Distributed Model for Declarative Recursive Computation. Theory and Practice of Logic Programming, 19: 1056--1072

  9. [9]

    Drexler, D. 2026. Loki: A PDDL Parser and Normalizer. https://doi.org/10.5281/zenodo.20081136

  10. [10]

    Gerevini, A.; Howe, A.; Cesta, A.; and Refanidis, I., eds. 2009. Proceedings of the Nineteenth International Conference on Automated Planning and Scheduling (ICAPS 2009). AAAI Press

  11. [11]

    Helmert, M. 2006. The Fast Downward Planning System. Journal of Artificial Intelligence Research, 26: 191--246

  12. [12]

    Helmert, M. 2009. Concise Finite-Domain Representations for PDDL Planning Tasks. Artificial Intelligence, 173: 503--535

  13. [13]

    Hoffmann, J.; and Nebel, B. 2001. The FF Planning System: Fast Plan Generation Through Heuristic Search. Journal of Artificial Intelligence Research, 14: 253--302

  14. [14]

    Jordan, H.; Scholz, B.; and Suboti \'c , P. 2016. Souffl \'e : On Synthesis of Program Analyzers. In International Conference on Computer Aided Verification

  15. [15]

    Kishimoto, A.; Fukunaga, A.; and Botea, A. 2009. Scalable, Parallel Best-First Search for Optimal Sequential Planning. In icaps2009 , 201--208

  16. [16]

    Kishimoto, A.; Fukunaga, A.; and Botea, A. 2013. Evaluation of a simple, scalable, parallel best-first search strategy. Artif. Intell., 195: 222--248

  17. [17]

    Q.; Porat, E.; R \' e , C.; and Rudra, A

    Ngo, H. Q.; Porat, E.; R \' e , C.; and Rudra, A. 2018. Worst-case Optimal Join Algorithms. J. ACM , 65(3): 16:1--16:40

  18. [18]

    A.; Wang, K.; Baker, E

    Phillips, C. A.; Wang, K.; Baker, E. J.; Bubier, J. A.; Chesler, E. J.; and Langston, M. A. 2019. On Finding and Enumerating Maximal and Maximum k-Partite Cliques in k-Partite Graphs. Algorithms, 12

  19. [19]

    Richter, S.; and Helmert, M. 2009. Preferred Operators and Deferred Evaluation in Satisficing Planning. In icaps2009 , 273--280

  20. [20]

    Shkapsky, A.; Yang, M.; Interlandi, M.; Chiu, H.; Condie, T.; and Zaniolo, C. 2016. Big Data Analytics with Datalog Queries on Spark. In \" O zcan, F.; Koutrika, G.; and Madden, S., eds., Proceedings of the 2016 International Conference on Management of Data, SIGMOD Conference 2016 , 1135--1149. ACM

  21. [21]

    St hlberg, S. 2023. Lifted Successor Generation by Maximum Clique Enumeration. In Gal, K.; Now \'e , A.; Nalepa, G. J.; Fairstein, R.; and R a dulescu, R., eds., Proceedings of the 26th European Conference on Artificial Intelligence ( ECAI 2023) , 2194--2201. IOS Press

  22. [22]

    Wang, K.; Yu, K.; and Long, C. 2024. Efficient k-Clique Listing: An Edge-Oriented Branching Strategy