Fu, Stefano Ermon, Atri Rudra, and Christopher Ré

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

fields: cs.DS (1), cs.LG (1)

years: 2026 (1), 2023 (1)

representative citing papers

Nearly Optimal Attention Coresets

cs.DS · 2026-05-07 · unverdicted · novelty 8.0

ε-coresets for attention exist of size O(√d e^{ρ+o(ρ)}/ε) for unit-norm keys/values and queries of norm ≤ρ, nearly matching the Ω(√d e^ρ/ε) lower bound.

citing papers explorer

  • Nearly Optimal Attention Coresets cs.DS · 2026-05-07 · unverdicted · none · ref 17

    ε-coresets for attention exist of size O(√d e^{ρ+o(ρ)}/ε) for unit-norm keys/values and queries of norm ≤ρ, nearly matching the Ω(√d e^ρ/ε) lower bound.

  • FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning cs.LG · 2023-07-17 · accept · none · ref 5

    FlashAttention-2 achieves roughly 2x speedup over FlashAttention by parallelizing attention across thread blocks and distributing work within blocks, reaching 50-73% of theoretical peak FLOPs/s on A100 GPUs.