
Why do small language models underperform? Studying language model saturation via the softmax bottleneck. arXiv preprint arXiv:2404.07647

1 Pith paper cites this work. Polarity classification is still being indexed.

1 Pith paper citing it

fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

When Less is More: The LLM Scaling Paradox in Context Compression

cs.LG · 2026-02-10 · unverdicted · novelty 6.0

Larger LLM compressors in lossy setups often yield less faithful context reconstructions due to knowledge overwriting and semantic drift, with mid-sized models outperforming larger ones across 27 tested configurations.

citing papers explorer

Showing 1 of 1 citing paper.

  • When Less is More: The LLM Scaling Paradox in Context Compression
    cs.LG · 2026-02-10 · unverdicted · none · ref 4
