ZeRO: Memory Optimizations Toward Training Trillion Parameter Models

Samyam Rajbhandari, Jeff Rasley, Olatunji Ruwase, Yuxiong He · 2020

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models

cs.CV · 2026-04-14 · unverdicted · novelty 6.0

CoM-PT trains vision foundation models in ascending size order using inverse knowledge transfer, allowing larger models to achieve superior performance with significantly reduced overall computational cost compared to individual training.

citing papers explorer

Showing 1 of 1 citing paper.

Chain-of-Models Pre-Training: Rethinking Training Acceleration of Vision Foundation Models

cs.CV · 2026-04-14 · unverdicted · none · ref 53

CoM-PT trains vision foundation models in ascending size order using inverse knowledge transfer, allowing larger models to achieve superior performance with significantly reduced overall computational cost compared to individual training.
