ShortGPT: Layers in Large Language Models Are More Redundant Than You Expect
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.LG (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)

Representative citing papers:
- Layer-wise Representation Dynamics: An Empirical Investigation Across Embedders and Base LLMs
  The LRD framework, with its Frenet, NRS, and GFMI metrics, shows that layer-wise structure across 31 models provides a usable signal for model selection and pruning on MTEB tasks.
- Compact SO(3) Equivariant Atomistic Foundation Models via Structural Pruning
  Structural pruning of SO(3) equivariant atomistic models from large checkpoints yields 1.5-4x fewer parameters and 2.5-4x less pre-training compute than small models trained from scratch, while outperforming them on most Matbench Discovery metrics and downstream tasks.
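Both citing papers build on the layer-redundancy premise of the indexed paper. ShortGPT scores each transformer layer with Block Influence (BI): one minus the average cosine similarity between the hidden states entering and leaving the layer, computed on a calibration set; the lowest-scoring layers are removed. A minimal PyTorch sketch of that metric follows; the usage names (h, num_layers, k) are illustrative, not an API from the paper.

    import torch
    import torch.nn.functional as F

    def block_influence(hidden_in: torch.Tensor, hidden_out: torch.Tensor) -> float:
        """Block Influence (BI) from ShortGPT: 1 minus the mean cosine
        similarity between a layer's input and output hidden states.
        A low BI means the layer barely transforms its input, i.e. it
        is a candidate for pruning.

        hidden_in, hidden_out: (num_tokens, hidden_dim) activations
        collected before and after the layer on a calibration set.
        """
        cos = F.cosine_similarity(hidden_in, hidden_out, dim=-1)  # per-token similarity
        return 1.0 - cos.mean().item()

    # Hypothetical usage: given per-layer hidden states h[0..num_layers]
    # from one calibration pass, drop the k layers with the lowest BI.
    # scores = [block_influence(h[i], h[i + 1]) for i in range(num_layers)]
    # to_prune = sorted(range(num_layers), key=scores.__getitem__)[:k]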