DynaTrain introduces a Virtual Parameter Space abstraction to enable sub-second online parallelism reconfiguration for elastic LLM training on models up to 235B parameters.
Antman: Dynamic scaling on GPU clusters for deep learning
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.LG (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
DynaTrain: Fast Online Parallelism Switching for Elastic LLM Training