Efficient transformers: A survey

Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler · 2022 · DOI 10.1145/3530811

6 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1

citation-polarity summary: background 1

representative citing papers

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

cs.CL · 2023-08-28 · unverdicted · novelty 8.0

LongBench is the first bilingual multi-task benchmark for long context understanding in LLMs, containing 21 datasets in 6 categories with average lengths of 6711 words (English) and 13386 characters (Chinese).

Attention by Synchronization in Coupled Oscillator Networks

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

Kuramoto synchronization dynamics implement a provably unique and globally attractive attention mechanism that replaces softmax for physical substrates and shows competitive empirical performance.

From Sparsity to Simplicity: Enabling Simpler Sequential Replacements via Sparse Attention Distillation

cs.LG · 2026-05-15 · unverdicted · novelty 5.0

Sparsity-guided distillation enables replacing attention layers in ViTs with simpler sequential modules, with sparser layers showing smaller performance drops.

Evaluation of ML Resource Utilization Requires Model Life Cycle Assessment

cs.LG · 2026-05-31 · unverdicted · novelty 4.0

The paper calls for life cycle assessment to capture embodied hardware costs and full pipeline operational costs in AI development and deployment.

Navigating LLM Valley: From AdamW to Memory-Efficient and Matrix-Based Optimizers

cs.LG · 2026-05-09 · unverdicted · novelty 3.0

This survey organizes LLM optimizer literature into categories and argues the field is shifting toward rigorous, multi-factor comparisons of convergence, memory, stability, and complexity.

Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

cs.AR · 2026-05-12

citing papers explorer

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

cs.CL · 2023-08-28 · unverdicted · none · ref 121
