CATP-LLM: Empowering large language models for cost-aware tool planning
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.AI (1)
Years: 2025 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Cost-Aware Model Orchestration for LLM-based Systems
A cost-aware orchestration method using quantitative model performance metrics improves task accuracy by 0.90% to 11.92% and energy efficiency by up to 54%, and reduces model selection latency from 4.51 s to 7.2 ms.