pith. sign in

super hub Canonical reference

Voyager: An Open-Ended Embodied Agent with Large Language Models

Canonical reference. 94% of citing Pith papers cite this work as background.

357 Pith papers citing it
Background 94% of classified citations
abstract

We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components: 1) an automatic curriculum that maximizes exploration, 2) an ever-growing skill library of executable code for storing and retrieving complex behaviors, and 3) a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement. Voyager interacts with GPT-4 via blackbox queries, which bypasses the need for model parameter fine-tuning. The skills developed by Voyager are temporally extended, interpretable, and compositional, which compounds the agent's abilities rapidly and alleviates catastrophic forgetting. Empirically, Voyager shows strong in-context lifelong learning capability and exhibits exceptional proficiency in playing Minecraft. It obtains 3.3x more unique items, travels 2.3x longer distances, and unlocks key tech tree milestones up to 15.3x faster than prior SOTA. Voyager is able to utilize the learned skill library in a new Minecraft world to solve novel tasks from scratch, while other techniques struggle to generalize. We open-source our full codebase and prompts at https://voyager.minedojo.org/.

hub tools

citation-role summary

background 59 method 2 dataset 1 other 1

citation-polarity summary

claims ledger

  • abstract We introduce Voyager, the first LLM-powered embodied lifelong learning agent in Minecraft that continuously explores the world, acquires diverse skills, and makes novel discoveries without human intervention. Voyager consists of three key components: 1) an automatic curriculum that maximizes exploration, 2) an ever-growing skill library of executable code for storing and retrieving complex behaviors, and 3) a new iterative prompting mechanism that incorporates environment feedback, execution errors, and self-verification for program improvement. Voyager interacts with GPT-4 via blackbox querie

authors

co-cited works

clear filters

representative citing papers

Continual Harness: Online Adaptation for Self-Improving Foundation Agents

cs.LG · 2026-05-11 · conditional · novelty 8.0

Continual Harness automates online self-improvement for foundation-model embodied agents by refining prompts, sub-agents, skills, and memory within one run, cutting button-press costs on Pokemon Red and Emerald and closing much of the gap to expert harnesses.

SEVerA: Verified Synthesis of Self-Evolving Agents

cs.LG · 2026-03-26 · unverdicted · novelty 8.0

SEVerA uses Formally Guarded Generative Models and a three-stage Search-Verification-Learning process to synthesize self-evolving agents that satisfy hard formal constraints while improving task performance.

Generative Skill Composition for LLM Agents

cs.CL · 2026-06-30 · unverdicted · novelty 7.0

SkillComposer performs task-conditioned skill sequence prediction with a constrained autoregressive decoder to jointly output skill subset, count, and order, raising pass rates by 23.1 and 18.2 percentage points on two production coding agents over no-skill baselines.

Agentic Abstention: Do Agents Know When to Stop Instead of Act?

cs.AI · 2026-06-27 · unverdicted · novelty 7.0

LLM agents often fail to abstain at the right time in uncertain multi-turn tasks, and the CONVOLVE context engineering method raises timely abstention rates on WebShop from 26.7 to 57.4 without parameter updates.

Co-Evolving Skill Generation and Policy Optimization

cs.CL · 2026-06-07 · unverdicted · novelty 7.0

Framework estimates context-dependent marginal utility of candidate skills via reward gaps in matched base vs. skill-augmented rollouts to filter skills and co-train policy as generator.

PACE: Anytime-Valid Acceptance Tests for Self-Evolving Agents

cs.AI · 2026-06-06 · unverdicted · novelty 7.0

PACE is a training-free anytime-valid commit gate using testing-by-betting e-processes that controls per-candidate false-commit probability for self-evolving agents and reduces spurious edits compared to greedy acceptance.

Rosetta Memory: Adaptive Memory for Cross-LLM Agents

cs.LG · 2026-06-05 · unverdicted · novelty 7.0

Rosetta Memory trains two profile-conditioned operators with a minimum-gain sampling curriculum and performance-gap reward to enable memory transfer between LLMs, showing gains on multi-hop QA benchmarks and robustness to unseen models.

AIP: A Graph Representation for Learning and Governing Agent Skills

cs.AI · 2026-06-03 · unverdicted · novelty 7.0

AIP models skills as graphs of discrete steps connected by typed I/O edges under a validated schema, raising agent mean reward from 0.60 to 0.71 and pass rate from 53% to 67% on 27 SkillsBench tasks while enabling node-level fixes.

citing papers explorer

Showing 7 of 7 citing papers after filters.

  • Language Agent Tree Search Unifies Reasoning Acting and Planning in Language Models cs.AI · 2023-10-06 · unverdicted · none · ref 9 · internal anchor

    LATS integrates Monte Carlo Tree Search with language models using in-context learning, value functions, and self-reflection to achieve 92.7% pass@1 on HumanEval and competitive web navigation performance.

  • Mastering Diverse Domains through World Models cs.AI · 2023-01-10 · unverdicted · none · ref 54 · internal anchor

    DreamerV3 uses world models and robustness techniques to solve over 150 tasks across domains with a single configuration, including Minecraft diamond collection from scratch.

  • Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations cs.AI · 2023-12-14 · conditional · none · ref 83 · internal anchor

    Math-Shepherd is an automatically trained process reward model that scores solution steps to verify and reinforce LLMs, lifting Mistral-7B from 77.9% to 89.1% on GSM8K and 28.6% to 43.5% on MATH.

  • SGLang: Efficient Execution of Structured Language Model Programs cs.AI · 2023-12-12 · conditional · none · ref 52 · internal anchor

    SGLang is a new system that speeds up structured LLM programs by up to 6.4x using RadixAttention for KV cache reuse and compressed finite state machines for output decoding.

  • Cognitive Architectures for Language Agents cs.AI · 2023-09-05 · accept · none · ref 81 · internal anchor

    CoALA is a modular cognitive architecture for language agents that organizes memory components, action spaces for internal and external interaction, and a generalized decision-making loop to support more systematic development of capable agents.

  • A Survey on Large Language Model based Autonomous Agents cs.AI · 2023-08-22 · accept · none · ref 38 · internal anchor

    A survey of LLM-based autonomous agents that proposes a unified framework for their construction and reviews applications in social science, natural science, and engineering along with evaluation methods and future directions.

  • The Rise and Potential of Large Language Model Based Agents: A Survey cs.AI · 2023-09-14 · accept · none · ref 191 · internal anchor

    The paper surveys the origins, frameworks, applications, and open challenges of AI agents built on large language models.