LongBench is the first bilingual multi-task benchmark for long context understanding in LLMs, containing 21 datasets in 6 categories with average lengths of 6711 words (English) and 13386 characters (Chinese).
Efficient transformers: A survey
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
roles
background 1polarities
background 1representative citing papers
Kuramoto synchronization dynamics implement a provably unique and globally attractive attention mechanism that replaces softmax for physical substrates and shows competitive empirical performance.
Sparsity-guided distillation enables replacing attention layers in ViTs with simpler sequential modules, with sparser layers showing smaller performance drops.
The paper calls for life cycle assessment to capture embodied hardware costs and full pipeline operational costs in AI development and deployment.
This survey organizes LLM optimizer literature into categories and argues the field is shifting toward rigorous, multi-factor comparisons of convergence, memory, stability, and complexity.
citing papers explorer
-
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
LongBench is the first bilingual multi-task benchmark for long context understanding in LLMs, containing 21 datasets in 6 categories with average lengths of 6711 words (English) and 13386 characters (Chinese).