Thunderserve: High-performance and cost-efficient LLM serving in cloud environments
2 Pith papers cite this work. Polarity classification is still indexing.
fields
cs.DC 2
verdicts
UNVERDICTED 2
representative citing papers
HexiScale enables LLM training on heterogeneous GPUs via asymmetric parallelism and graph partitioning, matching homogeneous performance at equal FLOPS and delivering 1.5-2.4x higher throughput than prior heterogeneous systems.
citing papers explorer
- HetRL: Efficient Reinforcement Learning for LLMs in Heterogeneous Environments
HetRL delivers up to 9.17x higher throughput for LLM RL training on heterogeneous GPUs by using hybrid and ILP-based schedulers to solve a joint optimization problem over computation and data dependencies.
- HexiScale: Facilitating Large Language Model Training over Heterogeneous Hardware
HexiScale enables LLM training on heterogeneous GPUs via asymmetric parallelism and graph partitioning, matching homogeneous performance at equal FLOPS and delivering 1.5-2.4x higher throughput than prior heterogeneous systems.