Moonwalk enables memory-efficient training of deep networks through mixed-mode gradient computation: it uses vector-inverse-Jacobian products for submersive layers and fragmental checkpointing for all other layers, matching backpropagation's runtime on networks more than twice as deep.
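As a rough illustration of the vector-inverse-Jacobian idea (a toy sketch, not the paper's implementation), consider a chain of two square, invertible linear layers. Once the gradient with respect to the input is known, the gradient at each later layer can be recovered in forward order by solving against the transposed Jacobian, with no stored activations. The matrices `W1` and `W2`, the quadratic loss, and the helper names below are all assumptions made for this example; here the input gradient is obtained by ordinary backpropagation purely as a reference.

```python
# Toy example: layers h1 = W1 @ x, h2 = W2 @ h1, loss = 0.5 * ||h2||^2.
# All names and values are hypothetical and chosen for illustration.

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def solve2(m, b):
    """Solve the 2x2 linear system m @ x = b by Cramer's rule."""
    (a, c), (d, e) = m
    det = a * e - c * d
    return [(b[0] * e - c * b[1]) / det, (a * b[1] - b[0] * d) / det]

W1 = [[2.0, 1.0], [0.0, 1.0]]
W2 = [[1.0, -1.0], [1.0, 3.0]]
x = [1.0, 2.0]

# Forward pass.
h1 = matvec(W1, x)
h2 = matvec(W2, h1)

# Reference gradients by ordinary backpropagation (reverse order).
g2_ref = h2                             # dL/dh2 for the quadratic loss
g1_ref = matvec(transpose(W2), g2_ref)  # dL/dh1
g0_ref = matvec(transpose(W1), g1_ref)  # dL/dx

# Forward-order recovery: since g_in = J^T g_out for each layer,
# g_out = J^{-T} g_in, which is a vector-inverse-Jacobian product.
g1_fwd = solve2(transpose(W1), g0_ref)
g2_fwd = solve2(transpose(W2), g1_fwd)

print(g1_fwd, g1_ref)
print(g2_fwd, g2_ref)
```

For a linear layer the Jacobian is the weight matrix itself, so the inverse product reduces to a linear solve. Layers whose Jacobian is not surjective admit no such inverse, which is the case the summary assigns to fragmental checkpointing.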
1 Pith paper cites this work. Polarity classification is still indexing.
Fields: cs.LG (1)
Years: 2024 (1)
Verdicts: UNVERDICTED (1)
Representative citing papers:
Moonwalk: Inverse-Forward Differentiation