Speculative Actions: A Lossless Framework for Faster Agentic Systems

2025 · cs.AI · arXiv 2510.04371

4 Pith papers cite this work. Polarity classification is still indexing.

abstract

AI agents are increasingly deployed in complex, interactive environments, yet their runtime remains a major bottleneck for training, evaluation, and real-world use. Typical agent behavior unfolds sequentially, with each action requiring an API call that can incur substantial latency. For example, a game of chess between two state-of-the-art agents can take hours. We introduce Speculative Actions, a lossless acceleration framework for general agentic systems. Inspired by speculative execution in microprocessors and speculative decoding in LLM inference, our method uses faster models to predict likely future actions and execute them in parallel, committing only when predictions match. We evaluate speculative actions across gaming, e-commerce, and web search environments, and additionally study a lossy extension in an operating systems setting. Across domains, we achieve up to 55% next-action prediction accuracy, translating into up to 20% latency reductions. Finally, we present a cost-latency analysis that formalizes the tradeoff between speculative breadth and time savings. This analysis enables principled tuning and selective branch launching to ensure that multi-branch speculation delivers practical speedups without prohibitive cost growth.
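
The predict, execute in parallel, and commit-on-match loop described in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' implementation: `actor`, `guesser`, `env_step`, the integer state, and all latencies are hypothetical stand-ins, and the speculated environment step is assumed to be side-effect free so that a wrong guess can simply be discarded.

```python
import asyncio

# Made-up latencies in seconds; real API calls would be far slower.
ACTOR_DELAY, GUESS_DELAY, ENV_DELAY = 0.02, 0.002, 0.01

async def actor(state: int) -> int:
    """Hypothetical slow, authoritative agent: always chooses action 1."""
    await asyncio.sleep(ACTOR_DELAY)
    return 1

async def guesser(state: int) -> int:
    """Hypothetical fast predictor: wrong whenever state is a multiple of 3."""
    await asyncio.sleep(GUESS_DELAY)
    return 1 if state % 3 else 2

async def env_step(state: int, action: int) -> int:
    """Hypothetical environment call, assumed free of side effects."""
    await asyncio.sleep(ENV_DELAY)
    return state + action

async def run_sequential(state: int, steps: int) -> list[int]:
    trace = [state]
    for _ in range(steps):
        action = await actor(state)
        state = await env_step(state, action)
        trace.append(state)
    return trace

async def run_speculative(state: int, steps: int) -> tuple[list[int], int]:
    trace, hits = [state], 0
    for _ in range(steps):
        truth = asyncio.create_task(actor(state))           # slow call starts now
        guess = await guesser(state)                        # cheap prediction
        spec = asyncio.create_task(env_step(state, guess))  # runs alongside actor
        action = await truth
        if action == guess:
            state = await spec                              # commit speculated result
            hits += 1
        else:
            spec.cancel()                                   # discard the wrong branch
            state = await env_step(state, action)           # redo with the true action
        trace.append(state)
    return trace, hits

def demo(steps: int = 6) -> tuple[list[int], list[int], int]:
    seq = asyncio.run(run_sequential(0, steps))
    spec, hits = asyncio.run(run_speculative(0, steps))
    return seq, spec, hits
```

In this toy setup the speculative run always produces the same trajectory as the sequential one, which is the sense in which the scheme is lossless; only correctly guessed steps hide the environment latency behind the slow actor call. Launching several guesses per step would raise the hit rate at higher cost, which is the breadth versus savings tradeoff the abstract refers to.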

representative citing papers

Skim: Speculative Execution for Fast and Efficient Web Agents

cs.AI · 2026-05-15 · unverdicted · novelty 7.0

Skim profiles website patterns offline to enable fast-path speculative execution for web agents, cutting median cost by 1.9x and latency by 33.4% with no accuracy loss on benchmarks.

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes

cs.OS · 2026-04-30 · unverdicted · novelty 7.0

Crab bridges the agent-OS semantic gap with an eBPF inspector, turn-aligned coordinator, and host engine to deliver 100% recovery correctness while cutting checkpoint traffic up to 87% and adding under 2% overhead.

SpecHop: Continuous Speculation for Accelerating Multi-Hop Retrieval Agents

cs.CL · 2026-05-21 · unverdicted · novelty 5.0

SpecHop accelerates multi-hop LLM tool use via continuous multi-threaded speculation with asynchronous verification, approaching oracle latency gains and reducing latency up to 40% on retrieval tasks.

AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent

cs.LG · 2026-04-07 · unverdicted · novelty 5.0

AgentOpt introduces a framework-agnostic package that uses algorithms like UCB-E to find cost-effective model assignments in multi-step LLM agent pipelines, cutting evaluation budgets by 62-76% while maintaining near-optimal accuracy on benchmarks.

citing papers explorer

Skim: Speculative Execution for Fast and Efficient Web Agents
cs.AI · 2026-05-15 · unverdicted · none · ref 32 · internal anchor

Crab: A Semantics-Aware Checkpoint/Restore Runtime for Agent Sandboxes
cs.OS · 2026-04-30 · unverdicted · none · ref 55 · internal anchor

SpecHop: Continuous Speculation for Accelerating Multi-Hop Retrieval Agents
cs.CL · 2026-05-21 · unverdicted · none · ref 36 · internal anchor

AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
cs.LG · 2026-04-07 · unverdicted · none · ref 29 · internal anchor
