arxiv: 2605.08913 · 2 revisions
Non-Monotonic Latency in Apple MPS Decoding: KV Cache Interactions and Execution Regimes