Efficient transformers: A survey

Yi Tay, Mostafa Dehghani, Dara Bahri, Donald Metzler · 2022 · DOI 10.1145/3530811

6 Pith papers cite this work. Polarity classification is still in progress.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding

cs.CL · 2023-08-28 · unverdicted · novelty 8.0

LongBench is the first bilingual multi-task benchmark for long context understanding in LLMs, containing 21 datasets in 6 categories with average lengths of 6,711 words (English) and 13,386 characters (Chinese).

Attention by Synchronization in Coupled Oscillator Networks

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

Kuramoto synchronization dynamics implement a provably unique and globally attractive attention mechanism that replaces softmax for physical substrates and shows competitive empirical performance.

From Sparsity to Simplicity: Enabling Simpler Sequential Replacements via Sparse Attention Distillation

cs.LG · 2026-05-15 · unverdicted · novelty 5.0

Sparsity-guided distillation enables replacing attention layers in ViTs with simpler sequential modules, with sparser layers showing smaller performance drops.

Hardware-Software Co-Design of Scalable, Energy-Efficient Analog Recurrent Computations

cs.AR · 2026-05-12 · unverdicted · novelty 5.0 · 2 refs

BMRUs enable analog recurrent neural network hardware via discrete outputs that suppress noise 20-fold, with one-to-one parameter-to-circuit mapping and linear power scaling for recurrence.

Evaluation of ML Resource Utilization Requires Model Life Cycle Assessment

cs.LG · 2026-05-31 · unverdicted · novelty 4.0

The paper calls for life cycle assessment to capture embodied hardware costs and full pipeline operational costs in AI development and deployment.

Navigating LLM Valley: From AdamW to Memory-Efficient and Matrix-Based Optimizers

cs.LG · 2026-05-09 · unverdicted · novelty 3.0

This survey organizes LLM optimizer literature into categories and argues the field is shifting toward rigorous, multi-factor comparisons of convergence, memory, stability, and complexity.
