Llama-3.1-8B. At B≤100, calibration-free SLEB wins by removing fewer layers; the budget ceiling acts as an implicit stopping rule

1 Pith paper cites this work. Polarity classification is still indexing.

fields

cs.LG 1

years

2026 1

verdicts

UNVERDICTED 1

representative citing papers

No Free Swap: Protocol-Dependent Layer Redundancy in Transformers

cs.LG · 2026-05-15 · unverdicted · novelty 5.0

Replacement and interchange swap-KL protocols for layer redundancy in transformers disagree on pruning safety, with the gap growing during training on Pythia models and producing different removal costs on Qwen3-8B versus Llama-3.1-8B.

citing papers explorer

  • No Free Swap: Protocol-Dependent Layer Redundancy in Transformers cs.LG · 2026-05-15 · unverdicted · none · ref 16