Co-pi-tree distills LLM reasoning into a dual policy tree refined via interaction feedback, reporting 35.4% higher rewards, 77.7% fewer LLM queries, and 97.1% lower latency than baselines in Overcooked-AI.
In Advances in Neural Information Processing Systems
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Distilling LLM Reasoning into an Interpretable Policy Tree for Human-AI Collaboration