HALO trains an orchestrator policy on verifier-approved refinement trajectories across 11 PDDL domains, matching GPT-5-mini success rates at roughly 45x lower orchestration cost and cutting LLM calls by 40-50%.
arXiv preprint arXiv:2403.00092 , year=
Cited by 1 Pith paper (field: cs.AI; year: 2026; verdict: UNVERDICTED).
Representative citing paper:
Training the Orchestrator: A Supervised Approach to End-to-End PDDL Planning with LLM Agents