A Survey of Multi-tenant Deep Learning Inference on GPU
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted
citing papers explorer
- Strait: Perceiving Priority and Interference in ML Inference Serving
  Strait cuts high-priority deadline violations in ML inference serving by 1-11 percentage points through contention modeling and priority scheduling under high GPU load.
- THEMIS: Time, Heterogeneity, and Energy Minded Scheduling for Fair Multi-Tenant Use in FPGAs
  THEMIS improves multi-tenant FPGA scheduling fairness by 24.2-98.4% over prior methods via spatiotemporal metrics, energy-aware intervals, and heterogeneous region handling, evaluated on Xilinx Zedboard XC7Z020.