The best performing run among all of these achieved a final loss of 3.12, while the best Shampoo run achieved a final loss of 3.10.

1 Pith paper cites this work. Polarity classification is still indexing.

1 Pith paper citing it

fields

cs.LG 1

years

2024 1

verdicts

ACCEPT 1

representative citing papers

SOAP: Improving and Stabilizing Shampoo using Adam

cs.LG · 2024-09-17 · accept · novelty 8.0

SOAP runs Adam in the eigenbasis of Shampoo's preconditioner, cutting iterations by over 40% versus AdamW on 360M-660M language models while adding only one hyperparameter.

citing papers explorer

Showing 1 of 1 citing paper.

  • SOAP: Improving and Stabilizing Shampoo using Adam cs.LG · 2024-09-17 · accept · none · ref 17
