Analysis of 144 task-model pairs finds mathematical reasoning produces the highest attention entropy in all architectures while decoder models show significantly higher sparsity than encoders.
Petroni et al., ”Language models as knowledge bases?” in Proceedings of EMNLP-IJCNLP, 2019, pp
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.CL 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance
Analysis of 144 task-model pairs finds mathematical reasoning produces the highest attention entropy in all architectures while decoder models show significantly higher sparsity than encoders.