Performance collapse in layer-pruned LLMs stems from disrupting the Silent Phase of decision-making, which blocks the transition to correct predictions, while the later Decisive Phase is robust to pruning.
Proceedings of the AAAI Conference on Artificial Intelligence
2 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
Task-aware pruning improves OOD performance by removing layers that distort task-adapted representation profiles, realigning OOD inputs with the geometry observed on ID data.
Citing papers explorer
- Understanding Performance Collapse in Layer-Pruned Large Language Models via Decision Representation Transitions
Performance collapse in layer-pruned LLMs stems from disrupting the Silent Phase of decision-making, which blocks the transition to correct predictions, while the later Decisive Phase is robust to pruning.
- TAPIOCA: Why Task-Aware Pruning Improves OOD model Capability
Task-aware pruning improves OOD performance by removing layers that distort task-adapted representation profiles, realigning OOD inputs with the geometry observed on ID data.