Concurrency without Model Changes: Future-based Asynchronous Function Calling for LLMs

Guangyu Feng; Huanzhi Mao; Joseph E. Gonzalez; Prabal Dutta

arxiv: 2605.15077 · v1 · pith:MQTZIXSInew · submitted 2026-05-14 · 💻 cs.CL · cs.AI · cs.LG


keywords: function, execution, asyncfc, decoding, asynchronous, benchmarks, calling, capability


Function calling, also known as tool use, is a core capability of modern LLM agents but is typically constrained by synchronous execution semantics. Under these semantics, LLM decoding is blocked until each function call completes, which increases end-to-end latency. In this work, we introduce AsyncFC, a pure execution-layer framework that decouples LLM decoding from function execution, enabling overlap between model decoding and function execution as well as inter-function parallelism when dependencies permit. AsyncFC layers over existing models and unmodified function implementations, requiring no fine-tuning or changes to the standard synchronous function-calling protocol. Across standard function-calling benchmarks and adapted software engineering benchmarks, AsyncFC significantly reduces end-to-end task completion time while preserving task accuracy. Furthermore, these results reveal that LLMs possess a native capability to reason over symbolic futures that represent unresolved execution results, enabling an asynchronous paradigm for model-tool interaction.
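The idea of a symbolic future can be illustrated with a small sketch. This is not the AsyncFC implementation, which the abstract does not describe in detail; it is a minimal illustration, assuming a thread pool as the execution layer. The class `AsyncToolRunner`, the `$f1`-style identifiers, and the three toy tools are all hypothetical names introduced here. Each call returns a placeholder id immediately, so the caller (standing in for the decoder) can keep going, and a later call may take an earlier placeholder as an argument.

```python
import time
from concurrent.futures import ThreadPoolExecutor


class AsyncToolRunner:
    """Hypothetical execution layer: starts tool calls and returns symbolic futures."""

    def __init__(self, tools, max_workers=4):
        self._tools = tools            # tool name -> plain synchronous function
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}             # symbolic id (e.g. "$f1") -> Future

    def call(self, name, **kwargs):
        """Submit a tool call and return its symbolic id without waiting."""
        future_id = f"$f{len(self._futures) + 1}"
        self._futures[future_id] = self._pool.submit(self._run, name, kwargs)
        return future_id

    def _run(self, name, kwargs):
        # Arguments that are symbolic ids are replaced by their results first,
        # so a dependent call waits only on the calls it actually needs.
        resolved = {key: self.resolve(value) for key, value in kwargs.items()}
        return self._tools[name](**resolved)

    def resolve(self, value):
        """Return the result behind a symbolic id; pass other values through."""
        if isinstance(value, str) and value in self._futures:
            return self._futures[value].result()
        return value

    def shutdown(self):
        self._pool.shutdown(wait=True)


# Toy, unmodified synchronous tools (assumptions for illustration only).
def get_weather(city):
    time.sleep(0.2)  # simulated I/O latency
    return {"city": city, "temp_c": 21}


def get_calendar(day):
    time.sleep(0.2)  # simulated I/O latency
    return {"day": day, "events": ["standup"]}


def summarize(weather, calendar):
    return (f"{weather['city']}: {weather['temp_c']}C, "
            f"{len(calendar['events'])} event(s) on {calendar['day']}")


def demo():
    runner = AsyncToolRunner({
        "get_weather": get_weather,
        "get_calendar": get_calendar,
        "summarize": summarize,
    })
    start = time.perf_counter()
    f1 = runner.call("get_weather", city="Berkeley")   # returns "$f1" at once
    f2 = runner.call("get_calendar", day="Monday")     # runs in parallel with f1
    f3 = runner.call("summarize", weather=f1, calendar=f2)  # depends on f1, f2
    # A real decoder would keep generating tokens here instead of blocking.
    result = runner.resolve(f3)
    elapsed = time.perf_counter() - start
    runner.shutdown()
    return (f1, f2, f3), result, elapsed


if __name__ == "__main__":
    ids, result, elapsed = demo()
    print(ids, result, f"{elapsed:.2f}s")
```

In this sketch the two independent calls overlap, and the dependent call starts as soon as both inputs are ready. Blocking inside worker threads is a simplification that suits a toy example; a production execution layer would more likely schedule dependent calls with callbacks or an event loop.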


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Ghost Tool Calls: Issue-Time Privacy for Speculative Agent Tools
cs.CR 2026-06 unverdicted novelty 6.0

Ghost tool calls from speculative dispatch create persistent intent leaks that only issue-time policies changing or suppressing call arguments or destinations can reduce, per evaluations of twelve policies on three corpora.