arxiv: 2602.17245 · v2 · pith:75AIDIZAnew · submitted 2026-02-19 · 💻 cs.AI

Web Agents Should Use Typed Actions Instead of Click-Based Browsing

keywords agents, behavior, layer, produce, typed, actions, brittle, execution

This position paper argues that building a reliable agentic Web requires shifting from low-level interaction primitives to typed actions supported by a semantic layer. Today's web agents primarily operate through clicks, keystrokes, and DOM manipulation, which leads to brittle long-horizon behavior, high execution cost, and limited auditability. We propose web verbs as a concrete design for this layer. A verb exposes a web operation as a typed function with structured inputs, structured outputs, and documented behavior, whether it is backed by a server-side Web API or a maintained client-side workflow. Verb calls can carry preconditions, postconditions, policy tags, and logging hooks, allowing agents to synthesize concise programs with explicit control flow and data flow and to produce checkable execution traces. Using representative case studies, we illustrate how verb-level composition can produce correct, reproducible outcomes, while browser agents using low-level interaction primitives may produce brittle behavior or incorrect reasoning. We conclude with a call to action on standardization, developer tooling, and community processes needed to make this semantic layer deployable and trustworthy at web scale.
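To make the idea of a verb more concrete, here is a minimal sketch of what a typed action with a precondition, a postcondition, policy tags, and a logging hook might look like. It is not taken from the paper: the names `Verb`, `SearchFlightsInput`, `Flight`, `search_flights`, and the in-memory flight list are all hypothetical, and the backend is a stub standing in for a Web API or a maintained client-side workflow.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class SearchFlightsInput:
    """Structured input for the hypothetical search_flights verb."""
    origin: str
    destination: str
    max_price: float


@dataclass(frozen=True)
class Flight:
    """Structured output item for the hypothetical search_flights verb."""
    flight_id: str
    price: float


@dataclass(frozen=True)
class Verb:
    """A web operation exposed as a typed function (illustrative only)."""
    name: str
    input_type: type
    run: Callable[[Any], Any]
    precondition: Callable[[Any], bool]
    postcondition: Callable[[Any, Any], bool]
    policy_tags: Tuple[str, ...] = ()

    def call(self, inp: Any, log: Callable[[Dict[str, Any]], None]) -> Any:
        # Reject inputs of the wrong type before touching the backend.
        if not isinstance(inp, self.input_type):
            raise TypeError(f"{self.name}: expected {self.input_type.__name__}")
        if not self.precondition(inp):
            raise ValueError(f"{self.name}: precondition failed for {inp}")
        out = self.run(inp)
        if not self.postcondition(inp, out):
            raise RuntimeError(f"{self.name}: postcondition failed")
        # Logging hook: one record per successful call forms the trace.
        log({"verb": self.name, "input": inp, "output": out,
             "policy_tags": self.policy_tags})
        return out


# Stub data standing in for a real server-side API (assumption).
_FAKE_FLIGHTS = [Flight("F1", 120.0), Flight("F2", 480.0), Flight("F3", 95.0)]


def _search(inp: SearchFlightsInput) -> List[Flight]:
    return [f for f in _FAKE_FLIGHTS if f.price <= inp.max_price]


search_flights = Verb(
    name="search_flights",
    input_type=SearchFlightsInput,
    run=_search,
    precondition=lambda i: i.origin != i.destination and i.max_price > 0,
    postcondition=lambda i, out: all(f.price <= i.max_price for f in out),
    policy_tags=("read_only",),
)

# A short agent "program": explicit data flow plus a checkable trace.
trace: List[Dict[str, Any]] = []
flights = search_flights.call(SearchFlightsInput("SEA", "SFO", 150.0), trace.append)
cheapest = min(flights, key=lambda f: f.price)
```

In this sketch the agent's plan is ordinary code over structured values, and `trace` records each verb call with its inputs, outputs, and policy tags, which is the kind of execution trace the abstract describes as checkable. A real design would also need versioning, authentication, and error reporting, which are omitted here.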

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Tool Illusion: Rethinking Tool Use in Web Agents

    cs.CL 2026-04 unverdicted novelty 5.0

    A broad empirical study finds tool use in web agents yields inconsistent benefits and requires careful design to avoid drawbacks.