Webshop: Towards scalable real-world web interaction with grounded language agents

Shunyu Yao, Howard Chen, John Yang, Karthik Narasimhan · 2023

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

EnvSimBench: A Benchmark for Evaluating and Improving LLM-Based Environment Simulation

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

EnvSimBench reveals that state-of-the-art LLMs exhibit a universal state change cliff in environment simulation, with a new constraint-driven pipeline raising synthesis yield by 6.8% and cutting costs over 90%.

General Agentic Planning Through Simulative Reasoning with World Models

cs.AI · 2025-07-31 · conditional · novelty 6.0

SiRA uses LLM world models for simulative reasoning to achieve up to 124% higher task completion and 32.2% navigation success versus reactive baselines in web environments.

citing papers explorer

Showing 2 of 2 citing papers.

EnvSimBench: A Benchmark for Evaluating and Improving LLM-Based Environment Simulation cs.AI · 2026-05-08 · unverdicted · none · ref 22
EnvSimBench reveals that state-of-the-art LLMs exhibit a universal state change cliff in environment simulation, with a new constraint-driven pipeline raising synthesis yield by 6.8% and cutting costs over 90%.
General Agentic Planning Through Simulative Reasoning with World Models cs.AI · 2025-07-31 · conditional · none · ref 43
SiRA uses LLM world models for simulative reasoning to achieve up to 124% higher task completion and 32.2% navigation success versus reactive baselines in web environments.

Webshop: Towards scalable real-world web interaction with grounded language agents

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer