pith.
Pith Number

pith:ZW7EPVPM

pith:2026:ZW7EPVPMPWXCIKVVDYLGAOW7AF
not attested · not anchored · not stored · refs resolved

Speculative Interaction Agents: Building Real-Time Agents with Asynchronous I/O and Speculative Tool Calling

Amir Gholami, Coleman Hooper, Eric Wen, John Wawrzynek, Kurt Keutzer, Michael W. Mahoney, Minwoo Kang, Nicholas Lee, Suhong Moon, Yakun Sophia Shao

Speculative Interaction Agents reduce real-time tool-calling latency by overlapping external waits with reasoning and executing tools on partial information.

arxiv:2605.13360 v2 · 2026-05-13 · cs.LG

Add to your LaTeX paper
\usepackage{pith}
\pithnumber{ZW7EPVPMPWXCIKVVDYLGAOW7AF}

Prints a linked badge after your title and injects PDF metadata. Compiles on arXiv.

Record completeness

1 Bitcoin timestamp
2 Internet Archive
3 Author claim open
4 Citations open
5 Replications open
Portable graph bundle live
The bundle contains the canonical record plus signed events. A mirror can host it anywhere and recompute the same current state with the deterministic merge algorithm.
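
The page does not describe the merge algorithm itself. As a rough illustration of what "deterministic" means here, the Python sketch below folds a set of events into a state after sorting them by a total order, so any mirror holding the same events computes the same state regardless of arrival order. The event fields (`ts`, `event_id`, `field`, `value`) and the last-writer-wins rule are assumptions made for this example, not Pith's actual schema or algorithm.

```python
import random


def merge(record, events):
    """Fold events into a copy of the record in a fixed total order.

    Assumption: events are dicts with `ts`, `event_id`, `field`, `value`,
    and later events overwrite earlier ones (last-writer-wins).
    """
    state = dict(record)
    # Sorting by (timestamp, event id) removes any dependence on arrival order.
    for ev in sorted(events, key=lambda e: (e["ts"], e["event_id"])):
        state[ev["field"]] = ev["value"]
    return state


# Hypothetical record and events, for illustration only.
record = {"id": "2605.13360", "citations": "open"}
events = [
    {"ts": 1, "event_id": "a", "field": "citations", "value": "open"},
    {"ts": 2, "event_id": "b", "field": "author_claim", "value": "open"},
    {"ts": 2, "event_id": "c", "field": "citations", "value": "resolved"},
]

shuffled = events[:]
random.shuffle(shuffled)
print(merge(record, events) == merge(record, shuffled))
```

Signature checking is left out of the sketch; a real mirror would presumably verify each signed event before folding it in.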

Claims

C1 strongest claim

For strong cloud models, our method can be applied out-of-the-box to existing real-time cloud APIs, providing 1.3-1.7× speedups with minor accuracy loss. ... Altogether, this approach provides 1.6-2.2× speedups with the Qwen2.5-3B-Instruct and Llama-3.2-3B-Instruct models across multiple tool calling benchmarks.

C2 weakest assumption

The work assumes that speculative tool calling incurs only minor accuracy loss, and that clock-based training on synthetic data generalizes to real user interactions without introducing errors from premature tool calls.

C3 one line summary

Speculative Interaction Agents achieve 1.3-2.2× speedups for real-time tool-calling agents by decoupling asynchronous I/O from reasoning and issuing speculative tool calls, with clock-based training for small edge models.
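
To make the idea of executing tools on partial information concrete, here is a minimal `asyncio` sketch. It is a toy illustration, not the authors' implementation: `slow_tool`, `stream_user_input`, `guess_query`, and the city-matching rule are all hypothetical. The agent starts a tool call as soon as a partial transcript suggests a query, keeps reading input while the call runs, and reuses the result only if the final input confirms the guess.

```python
import asyncio

CITIES = ("Paris", "Tokyo")  # hypothetical vocabulary for the toy rule


async def slow_tool(query):
    """Hypothetical external tool with a fixed latency."""
    await asyncio.sleep(0.1)
    return f"result for {query!r}"


async def stream_user_input(chunks, delay=0.05):
    """Hypothetical input stream yielding growing partial transcripts."""
    text = ""
    for chunk in chunks:
        await asyncio.sleep(delay)
        text += chunk
        yield text


def guess_query(partial):
    """Toy speculation rule: use the most recently mentioned city, if any."""
    found = [(partial.rfind(c), c) for c in CITIES if c in partial]
    return max(found)[1] if found else None


async def run_agent(chunks):
    """Return (tool_result, speculation_hit)."""
    speculative = None  # (query, task) once a guess has been issued
    final_text = ""
    async for partial in stream_user_input(chunks):
        final_text = partial
        query = guess_query(partial)
        if query is not None and speculative is None:
            # Launch the tool early; it runs while more input arrives.
            speculative = (query, asyncio.create_task(slow_tool(query)))
    final_query = guess_query(final_text)
    if speculative is not None and speculative[0] == final_query:
        return await speculative[1], True  # guess confirmed, reuse the result
    if speculative is not None:
        speculative[1].cancel()  # guess was wrong, discard it
    if final_query is None:
        return None, False
    return await slow_tool(final_query), False


print(asyncio.run(run_agent(["Weather in ", "Paris ", "tomorrow, ", "please"])))
print(asyncio.run(run_agent(["Weather in Paris? ", "No, ", "Tokyo"])))
```

The second call shows the cost side of speculation: the early guess is cancelled and the tool is called again, so a wrong guess saves nothing and may waste a tool invocation.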

References

17 extracted · 17 resolved · 5 Pith anchors

[1] Speakrl: Synergizing reasoning, speaking, and acting in language models with reinforcement learning. arXiv preprint arXiv:2512.13159.
[2] Stream rag: Instant and accurate spoken dialogue systems with streaming tool usage. arXiv preprint arXiv:2510.02044.
[3] Whisperx: Time-accurate speech transcription of long-form audio. INTERSPEECH 2023, 2023.
[4] Alexandre Défossez, Laurent Mazaré, Manu Orsini, Amélie Royer, Patrick Pérez, Hervé Jégou, Edouard Grave, and Neil Zeghidour
[5] E., Lee, N., Jha, S., Kim, S., Tabrizi, R., Moon, S., Hooper, C., Anumanchipalli, G., Keutzer, K., and Gholami, A
Receipt and verification
First computed 2026-05-18T02:44:48.166571Z
Builder pith-number-builder-2026-05-17-v1
Signature Pith Ed25519 (pith-v1-2026-05) · public key
Schema pith-number/v1.0

Canonical hash

cdbe47d5ec7dae242ab51e16603adf016f94b41cba7193b9662e67420dd513a9

Aliases

arxiv: 2605.13360 · arxiv_version: 2605.13360v2 · doi: 10.48550/arxiv.2605.13360 · pith_short_12: ZW7EPVPMPWXC · pith_short_16: ZW7EPVPMPWXCIKVV · pith_short_8: ZW7EPVPM
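
On this page, the three `pith_short_*` aliases are prefixes of the full Pith Number, and the full number matches the first 26 characters of the RFC 4648 Base32 encoding of the canonical hash listed above. The Python sketch below checks that relationship for this record only. It is an observation drawn from the values shown here, not a documented derivation rule, and the function name `pith_from_hash` is hypothetical.

```python
import base64

CANONICAL_HASH = "cdbe47d5ec7dae242ab51e16603adf016f94b41cba7193b9662e67420dd513a9"
PITH_NUMBER = "ZW7EPVPMPWXCIKVVDYLGAOW7AF"


def pith_from_hash(hex_digest, length=26):
    """Base32-encode a hex SHA-256 digest and keep the first `length` characters.

    Assumption: inferred from the values on this page, not from a published spec.
    """
    encoded = base64.b32encode(bytes.fromhex(hex_digest)).decode("ascii")
    return encoded[:length]


print(pith_from_hash(CANONICAL_HASH) == PITH_NUMBER)
for n in (8, 12, 16):
    # Short aliases are leading slices of the full number.
    print(f"pith_short_{n}: {PITH_NUMBER[:n]}")
```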
Agent API
Verify this Pith Number yourself
curl -sH 'Accept: application/ld+json' https://pith.science/pith/ZW7EPVPMPWXCIKVVDYLGAOW7AF \
  | jq -c '.canonical_record' \
  | python3 -c "import sys,json,hashlib; b=json.dumps(json.loads(sys.stdin.read()), sort_keys=True, separators=(',',':'), ensure_ascii=False).encode(); print(hashlib.sha256(b).hexdigest())"
# expect: cdbe47d5ec7dae242ab51e16603adf016f94b41cba7193b9662e67420dd513a9
Canonical record JSON
{
  "metadata": {
    "abstract_canon_sha256": "ef7d15d1c85b772b12482d2eac7e4a4fb8110381d6e89bf553e2b4cd7e9cd9b6",
    "cross_cats_sorted": [],
    "license": "http://creativecommons.org/licenses/by/4.0/",
    "primary_cat": "cs.LG",
    "submitted_at": "2026-05-13T11:20:52Z",
    "title_canon_sha256": "fa1e1725ee3ae24693c11dfbabdb77c0a6072cb5c06c8b2bdefb8a42cd853285"
  },
  "schema_version": "1.0",
  "source": {
    "id": "2605.13360",
    "kind": "arxiv",
    "version": 2
  }
}
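
The shell one-liner under "Verify this Pith Number yourself" can also be written as a small standalone script. The Python sketch below applies the same canonicalization (sorted keys, compact separators, no ASCII escaping, UTF-8 encoding) to the canonical record shown above, embedded as a literal so that no network access is needed. `canonical_sha256` is a hypothetical helper name, and whether the printed digest equals the expected hash depends on the embedded record matching the served one exactly.

```python
import hashlib
import json

EXPECTED = "cdbe47d5ec7dae242ab51e16603adf016f94b41cba7193b9662e67420dd513a9"

# Canonical record copied from this page.
RECORD = {
    "metadata": {
        "abstract_canon_sha256": "ef7d15d1c85b772b12482d2eac7e4a4fb8110381d6e89bf553e2b4cd7e9cd9b6",
        "cross_cats_sorted": [],
        "license": "http://creativecommons.org/licenses/by/4.0/",
        "primary_cat": "cs.LG",
        "submitted_at": "2026-05-13T11:20:52Z",
        "title_canon_sha256": "fa1e1725ee3ae24693c11dfbabdb77c0a6072cb5c06c8b2bdefb8a42cd853285",
    },
    "schema_version": "1.0",
    "source": {"id": "2605.13360", "kind": "arxiv", "version": 2},
}


def canonical_sha256(record):
    """Serialize with the same options as the shell snippet, then hash."""
    blob = json.dumps(
        record, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


digest = canonical_sha256(RECORD)
print(digest)
print("matches expected:", digest == EXPECTED)
```

Because keys are sorted before hashing, the digest does not depend on the order in which the JSON fields happen to be written.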