
Deep optimizer states: Towards scalable training of transformer models using interleaved offloading

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields: cs.DC (1)

years: 2026 (1)

verdicts: UNVERDICTED (1)

representative citing papers

Runtime-Orchestrated Second-Order Optimization for Scalable LLM Training

cs.DC · 2026-05-15 · unverdicted · novelty 6.0

Asteria is a runtime system that enables second-order optimization for LLMs by dynamically distributing optimizer state across GPU, CPU, and NVMe while using asynchronous inverse-root computations and bounded-staleness synchronization.

citing papers explorer

Showing 1 of 1 citing paper.

  • Runtime-Orchestrated Second-Order Optimization for Scalable LLM Training cs.DC · 2026-05-15 · unverdicted · none · ref 23
