Practical Considerations for Agentic LLM Systems
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
citing papers explorer
- Sutradhara: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference
  Sutradhara co-designs orchestrator and LLM serving to overlap tool execution with prefill, stream tool dispatch during decode, and use semantic hints for cache management, yielding up to 77% higher load at fixed median FTR latency or 15% lower median FTR at fixed load.
- LLM-Based Multi-Agent Systems for Code Generation: A Multi-Vocal Literature Review
  A review of 114 studies classifies motivations into nine categories, analyzes common models and benchmarks, synthesizes challenges into six categories with 26 subcategories and solutions, and identifies six future research directions with 18 subcategories.