Joint optimization of service function placement and flow distribution for service function chaining. IEEE Journal on Selected Areas in Communications, 35(11):2532–2541, 2017.
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.DC (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)

Representative citing papers
Serving Chain-structured Jobs with Large Memory Footprints with Application to Large Foundation Model Serving
The paper defines server chain composition for large-memory jobs as an NP-hard problem and gives scalable algorithms with performance guarantees that reduce response times in foundation model serving.