AAWM builds training targets for world models by retrieving and synthesizing transition evidence based on the policy's self-identified decision needs at each state.
Reinforcement world model learning for llm-based agents
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 6roles
background 2polarities
background 2representative citing papers
PriorZero uses root-only LLM prior injection in MCTS and alternating world-model training with LLM fine-tuning to raise exploration efficiency and final performance on Jericho text games and BabyAI gridworlds.
In configurable enterprise systems, runtime discovery of transition dynamics from system configuration is more robust to deployment shifts than offline-trained world models.
A survey that organizes existing work on LLM-based agents around code as the central harness, structured in three layers of interfaces, mechanisms, and multi-agent scaling, with applications across domains and listed open challenges.
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.
A comprehensive reference book organizing existing techniques for agentic AI systems across LLM substrate, reasoning, agent design patterns, inter-agent coordination, and production deployment.
citing papers explorer
-
Beyond Next-Observation Prediction: Agent-Authored World Modeling for Sequential Decision Making
AAWM builds training targets for world models by retrieving and synthesizing transition evidence based on the policy's self-identified decision needs at each state.
-
PriorZero: Bridging Language Priors and World Models for Decision Making
PriorZero uses root-only LLM prior injection in MCTS and alternating world-model training with LLM fine-tuning to raise exploration efficiency and final performance on Jericho text games and BabyAI gridworlds.
-
Do Enterprise Systems Need Learned World Models? The Importance of Context to Infer Dynamics
In configurable enterprise systems, runtime discovery of transition dynamics from system configuration is more robust to deployment shifts than offline-trained world models.
-
World Action Models: The Next Frontier in Embodied AI
The paper introduces World Action Models as a new paradigm unifying predictive world modeling with action generation in embodied foundation models and provides a taxonomy of existing approaches.
-
The Hitchhiker's Guide to Agentic AI: From Foundations to Systems
A comprehensive reference book organizing existing techniques for agentic AI systems across LLM substrate, reasoning, agent design patterns, inter-agent coordination, and production deployment.