Wolf et al., "Transformers: State-of-the-art natural language processing," in Proceedings of EMNLP: System Demonstrations, 2020, pp

1 Pith paper cites this work. Polarity classification is still indexing.

representative citing papers

Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance

cs.CL · 2026-05-14 · unverdicted · novelty 3.0

Analysis of 144 task-model pairs finds mathematical reasoning produces the highest attention entropy in all architectures while decoder models show significantly higher sparsity than encoders.

citing papers explorer

Showing 1 of 1 citing paper.

Neural Activation Patterns Across Language Model Architectures: A Comprehensive Analysis of Cognitive Task Performance

cs.CL · 2026-05-14 · unverdicted · none · ref 33

Analysis of 144 task-model pairs finds mathematical reasoning produces the highest attention entropy in all architectures while decoder models show significantly higher sparsity than encoders.
