nGPT: Normalized transformer with representation learning on the hypersphere

Canonical reference. Every classified citation from Pith papers cites this work as background.

10 Pith papers citing it
Background 100% of classified citations

citation-role summary

roles: background 6

polarities: background 6

years: 2026 (7), 2025 (3)

representative citing papers

Demystifying Manifold Constraints in LLM Pre-training

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

Manifold constraints via the new MACRO optimizer independently bound activation scales and enforce rotational equilibrium in LLM pre-training, subsuming RMS normalization and decoupled weight decay while delivering competitive performance with convergence guarantees.

Polaris: Coupled Orbital Polar Embeddings for Hierarchical Concept Learning

cs.LG · 2026-04-30 · unverdicted · novelty 6.0

Polaris learns hierarchical concepts via coupled orbital polar embeddings on hyperspheres that separate meaning from structure using tangent projections, exponential maps, and asymmetric objectives, yielding up to 19-point gains in top-K retrieval.

Superposition Yields Robust Neural Scaling

cs.LG · 2025-05-15 · conditional · novelty 6.0

Strong superposition causes neural loss to scale as the inverse of model dimension due to geometric feature overlaps, explaining scaling laws for broad frequency distributions.

Normalized Matching Transformer

cs.CV · 2025-03-22 · unverdicted · novelty 6.0

Normalized Matching Transformer enforces unit-norm embeddings at every Transformer layer and trains with InfoNCE plus hyperspherical uniformity loss, reaching new state-of-the-art accuracy on PascalVOC and SPair-71k while converging faster than prior matching networks.
