FlashAttention reduces GPU high-bandwidth memory accesses in self-attention via tiling, delivering exact attention with lower IO complexity, 2-3x wall-clock speedups on models like GPT-2, and the ability to train on sequences up to 64K long.
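The tiling idea the abstract refers to can be illustrated with an online-softmax sketch: instead of materializing the full N x N attention matrix, keys/values are processed in blocks while running max and normalizer statistics are maintained per query row, so the result is exact. This is a minimal NumPy illustration of the tiling/online-softmax principle only; the paper's actual contribution (fused CUDA kernels managing GPU SRAM vs. HBM traffic) is not modeled here, and the function names and block size are illustrative.

```python
import numpy as np

def attention_reference(Q, K, V):
    """Standard attention: materializes the full N x N score matrix."""
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def attention_tiled(Q, K, V, block=16):
    """Exact attention computed over K/V blocks with an online softmax.

    Only running per-row statistics (max m, denominator l) and the output
    accumulator are kept, never the full score matrix -- the core idea
    behind FlashAttention's reduced memory traffic.
    """
    scale = 1.0 / np.sqrt(Q.shape[-1])
    N = K.shape[0]
    O = np.zeros_like(Q)
    m = np.full(Q.shape[0], -np.inf)   # running row-wise max of scores
    l = np.zeros(Q.shape[0])           # running softmax denominator
    for start in range(0, N, block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        S = Q @ Kb.T * scale                      # scores for this block only
        m_new = np.maximum(m, S.max(axis=-1))
        correction = np.exp(m - m_new)            # rescale old accumulators
        P = np.exp(S - m_new[:, None])
        l = l * correction + P.sum(axis=-1)
        O = O * correction[:, None] + P @ Vb
        m = m_new
    return O / l[:, None]
```

Because the softmax is renormalized incrementally, the tiled version matches the reference to floating-point precision for any block size.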
Citing papers explorer

1 Pith paper cites this work; polarity classification is still indexing.

- Nvidia H100 Tensor Core GPU Architecture (cs.LG, 2022; verdict: ACCEPT; citation role: background)
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness