Understanding html with large language models

Izzeddin Gur, Ofir Nachum, Yingjie Miao, Mustafa Safdari, Austin Huang, Aakanksha Chowdhery, Sharan Narang, Noah Fiedel, Aleksandra Faust · 2022 · arXiv 2210.03945

4 Pith papers cite this work. Polarity classification is still indexing.

4 Pith papers citing it

read on arXiv browse 4 citing papers

representative citing papers

Skim: Speculative Execution for Fast and Efficient Web Agents

cs.AI · 2026-05-15 · unverdicted · novelty 7.0

Skim profiles website patterns offline to enable fast-path speculative execution for web agents, cutting median cost by 1.9x and latency by 33.4% with no accuracy loss on benchmarks.

AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

cs.AI · 2024-05-23 · accept · novelty 7.0

AndroidWorld is a dynamic, reproducible Android benchmark that generates unlimited natural-language tasks for autonomous agents and shows current agents succeed on only 30.6 percent of them.

Language Models can Solve Computer Tasks

cs.CL · 2023-03-30 · accept · novelty 6.0

Pre-trained LLMs using recursive criticism and improvement prompting achieve state-of-the-art results on the MiniWoB++ computer task benchmark with only a handful of demonstrations and no task-specific reward function.

Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks

cs.CL · 2025-03-12 · unverdicted · novelty 5.0

Plan-and-Act trains a dedicated Planner on synthetic plan-annotated trajectories to generate high-level plans that an Executor follows, reaching 57.58% success on WebArena-Lite and 81.36% on WebVoyager.

citing papers explorer

Showing 4 of 4 citing papers.

Skim: Speculative Execution for Fast and Efficient Web Agents cs.AI · 2026-05-15 · unverdicted · none · ref 12
Skim profiles website patterns offline to enable fast-path speculative execution for web agents, cutting median cost by 1.9x and latency by 33.4% with no accuracy loss on benchmarks.
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents cs.AI · 2024-05-23 · accept · none · ref 11
AndroidWorld is a dynamic, reproducible Android benchmark that generates unlimited natural-language tasks for autonomous agents and shows current agents succeed on only 30.6 percent of them.
Language Models can Solve Computer Tasks cs.CL · 2023-03-30 · accept · none · ref 24
Pre-trained LLMs using recursive criticism and improvement prompting achieve state-of-the-art results on the MiniWoB++ computer task benchmark with only a handful of demonstrations and no task-specific reward function.
Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks cs.CL · 2025-03-12 · unverdicted · none · ref 12
Plan-and-Act trains a dedicated Planner on synthetic plan-annotated trajectories to generate high-level plans that an Executor follows, reaching 57.58% success on WebArena-Lite and 81.36% on WebVoyager.

Understanding html with large language models

fields

years

verdicts

representative citing papers

citing papers explorer