Language models as zero-shot planners: Extracting actionable knowledge for embodied agents
2 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
verdicts
UNVERDICTED 2
roles
background 2
polarities
background 2
representative citing papers
Voyager achieves superior lifelong learning in Minecraft by combining an automatic exploration curriculum, a library of executable skills, and iterative LLM prompting with environment feedback, yielding 3.3x more unique items and 15.3x faster milestone unlocks than prior methods while generalizing skills
citing papers explorer
- MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation
MTR-Bench is a new automated benchmark for multi-turn reasoning in LLMs covering diverse tasks and difficulty levels with 3600 instances.
- Voyager: An Open-Ended Embodied Agent with Large Language Models
Voyager achieves superior lifelong learning in Minecraft by combining an automatic exploration curriculum, a library of executable skills, and iterative LLM prompting with environment feedback, yielding 3.3x more unique items and 15.3x faster milestone unlocks than prior methods while generalizing skills