Title resolution pending

2025 · arXiv 2511.11163

2 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

Exploiting weight-space symmetries for approximating curvature

cs.LG · 2026-05-30 · unverdicted · novelty 7.0

A framework that builds tractable structured Hessian approximations by averaging over user-chosen weight-space symmetry groups, recovering Shampoo-like estimates for one choice of group.

Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers

math.OC · 2026-05-18 · unverdicted · novelty 6.0 · 2 refs

Proposes equivariant optimizer updates matched to layer symmetries for embeddings, SwiGLU MLPs, and MoE routers, with reported gains in validation loss and training stability on several language model architectures.

citing papers explorer

Showing 1 of 1 citing paper after filters.

Symmetry-Compatible Principle for Optimizer Design: Embeddings, LM Heads, SwiGLU MLPs, and MoE Routers

math.OC · 2026-05-18 · unverdicted · none · ref 121 · 2 links

Proposes equivariant optimizer updates matched to layer symmetries for embeddings, SwiGLU MLPs, and MoE routers, with reported gains in validation loss and training stability on several language model architectures.
