AgentCom- press: Task-aware compression for affordable large language model agents.arXiv preprint arXiv:2601.05191, 2026

Zuhair Ahmed Khan Taha, Mohammed Mudassir Uddin, Shahnawaz Alam · 2026 · arXiv 2601.05191

1 Pith paper cite this work. Polarity classification is still indexing.

1 Pith paper citing it

read on arXiv browse 1 citing papers

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project

cs.LG · 2026-03-22 · unverdicted · novelty 5.0

The Workload-Router-Pool architecture is a 3D framework for LLM inference optimization that synthesizes prior vLLM work into a 3x3 interaction matrix and proposes 21 research directions at the intersections.

citing papers explorer

Showing 1 of 1 citing paper.

The Workload-Router-Pool Architecture for LLM Inference Optimization: A Vision Paper from the vLLM Semantic Router Project cs.LG · 2026-03-22 · unverdicted · none · ref 80
The Workload-Router-Pool architecture is a 3D framework for LLM inference optimization that synthesizes prior vLLM work into a 3x3 interaction matrix and proposes 21 research directions at the intersections.

AgentCom- press: Task-aware compression for affordable large language model agents.arXiv preprint arXiv:2601.05191, 2026

citation-role summary

citation-polarity summary

fields

years

verdicts

roles

polarities

representative citing papers

citing papers explorer