Hardware considerations for tensor implementation and analysis using the field programmable gate array
1 Pith paper cites this work. Polarity classification is still indexing.
fields: cs.LG (1)
years: 2026 (1)
verdicts: UNVERDICTED (1)
representative citing papers
Attention at the Theoretical Minimum: A Mathematics of Arrays Framework for Memory-Optimal Transformer Kernels
The MoA framework derives a denotational normal form for attention that eliminates all intermediate arrays by algebraic construction, yielding O(n d_k + n d_v) memory traffic, with numerical verification against PyTorch.
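
To illustrate what computing attention without intermediate arrays can look like, here is a minimal sketch. It is not the paper's MoA derivation: it uses the well-known online-softmax recurrence as a stand-in, and the names `attention_naive` and `attention_streaming` are hypothetical. Pure Python is used for clarity rather than speed, and the comparison is against a naive reference, not PyTorch.

```python
import math


def attention_naive(Q, K, V):
    """Reference version: stores a full list of scores for each query."""
    out = []
    for q in Q:
        scale = 1.0 / math.sqrt(len(q))
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
        m = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[j] for w, v in zip(weights, V)) / total
                    for j in range(len(V[0]))])
    return out


def attention_streaming(Q, K, V):
    """Online softmax: one pass over the keys, no score list is stored."""
    d_v = len(V[0])
    out = []
    for q in Q:
        scale = 1.0 / math.sqrt(len(q))
        running_max = -math.inf
        denom = 0.0
        acc = [0.0] * d_v  # running weighted sum of value rows
        for k, v in zip(K, V):
            s = scale * sum(a * b for a, b in zip(q, k))
            new_max = max(running_max, s)
            # Rescale everything accumulated so far to the new maximum.
            correction = math.exp(running_max - new_max)
            w = math.exp(s - new_max)
            denom = denom * correction + w
            acc = [a * correction + w * x for a, x in zip(acc, v)]
            running_max = new_max
        out.append([a / denom for a in acc])
    return out
```

In this sketch the per-query working state is one accumulator of length d_v plus two scalars, so nothing of size n x n is ever held. Both functions should agree to floating-point tolerance on the same inputs.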