An LLM Compiler for Parallel Function Calling

Sehoon Kim, Suhong Moon, Ryan Tabrizi, Nicholas Lee, Michael W Mahoney, Kurt Keutzer, Amir Gholami · 2024

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference

cs.DC · 2026-01-19 · unverdicted · novelty 6.0

Sutradhara co-designs orchestrator and LLM serving to overlap tool execution with prefill, stream tool dispatch during decode, and use semantic hints for cache management, yielding up to 77% higher load at fixed median FTR latency or 15% lower median FTR at fixed load.

citing papers explorer

Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference

cs.DC · 2026-01-19 · unverdicted · none · ref 8
