PyTorch Distributed: Experiences on Accelerating Data Parallel Training, 2020
2 Pith papers cite this work.
citing papers explorer
- LiveR: Fine-Grained Elasticity via Live Reconfiguration for Model Training
LiveR enables live reconfiguration for elastic LLM training by asynchronously preparing new parallel worlds and streaming reshaped model state over interconnects, reducing downtime to seconds and achieving 14-23x faster reconfiguration than checkpoint/restart.
- WhisperRT -- Turning Whisper into a Causal Streaming Model
WhisperRT converts Whisper to a causal streaming ASR model via encoder causality, decoder synchronization on partial states, and fine-tuning, achieving better performance than non-fine-tuned streaming methods on sub-300ms chunks with lower complexity.