A fast optimization view: Reformulating single layer attention in LLM based on tensor and SVM trick, and solving it in matrix multiplication time

Gao, Y · 2023 · arXiv 2309.07418

2 Pith papers cite this work. Polarity classification is still indexing.


citation-role summary

background 1

citation-polarity summary

unclear 1

representative citing papers

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

cs.LG · 2026-06-05 · unverdicted · novelty 6.0

MoA framework derives a denotational normal form for attention that eliminates all intermediate arrays by algebraic construction, yielding $O(n d_k + n d_v)$ memory traffic with numerical verification against PyTorch.

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

cs.LG · 2023-06-24 · unverdicted · novelty 6.0

H2O evicts non-heavy-hitter tokens from the KV cache using a dynamic submodular policy, retaining recent and frequent-co-occurrence tokens to reduce memory while preserving accuracy.

citing papers explorer

Showing 2 of 2 citing papers.

Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels

cs.LG · 2026-06-05 · unverdicted · none · ref 8

H$_2$O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

cs.LG · 2023-06-24 · unverdicted · none · ref 111
