veScale-FSDP uses RaggedShard and structure-aware planning to support block-wise quantization and non-element-wise optimizers while delivering 5-66% higher throughput and 16-30% lower memory than prior FSDP systems at massive scale.
veScale: Consistent and Efficient Tensor Programming with Eager-Mode SPMD, 2025
1 Pith paper cites this work. Polarity classification is still being indexed.
Fields: cs.DC (1)
Years: 2026 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers
veScale-FSDP: Flexible and High-Performance FSDP at Scale