Pythia: A suite for analyzing large language models across training and scaling

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

baseline: 1 · method: 1

representative citing papers

ANO: A Principled Approach to Robust Policy Optimization

cs.AI · 2026-05-04 · unverdicted · novelty 6.0

ANO is a robust policy optimizer derived from geometric principles. It replaces clipping with a smooth redescending gradient and shows better performance and stability than PPO, SPO, and GRPO in MuJoCo, Atari, and RLHF experiments.

Superposition Yields Robust Neural Scaling

cs.LG · 2025-05-15 · conditional · novelty 6.0

Strong superposition causes neural loss to scale as the inverse of model dimension due to geometric feature overlaps, explaining scaling laws for broad frequency distributions.
