In2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)

Splitwise: Efficient generative llm inference using phase splitting

2 Pith papers cite this work. Polarity classification is still indexing.

2 Pith papers citing it

browse 2 citing papers

representative citing papers

WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching

cs.DC · 2026-01-15 · unverdicted · novelty 6.0

WISP suppresses wasted drafting time and verification interference in edge-cloud speculative LLM serving through dynamic drafting and SLO-aware batching, delivering up to 2.1x capacity and 1.94x goodput gains over centralized and prior baselines.

AlignedServe: Orchestrating Prefix-aware Batching to Build a High-throughput and Computing-efficient LLM Serving System

cs.DC · 2026-05-22 · unverdicted · novelty 5.0

AlignedServe uses prefix-aware batching, large CPU in-flight request pools, batch scheduling, and GPU-to-GPU KV prefetching to raise decoding throughput up to 1.98x and cut latency up to 7.4x versus prior serving systems.

citing papers explorer

Showing 2 of 2 citing papers.

WISP: Waste- and Interference-Suppressed Distributed Speculative LLM Serving at the Edge via Dynamic Drafting and SLO-Aware Batching cs.DC · 2026-01-15 · unverdicted · none · ref 28
WISP suppresses wasted drafting time and verification interference in edge-cloud speculative LLM serving through dynamic drafting and SLO-aware batching, delivering up to 2.1x capacity and 1.94x goodput gains over centralized and prior baselines.
AlignedServe: Orchestrating Prefix-aware Batching to Build a High-throughput and Computing-efficient LLM Serving System cs.DC · 2026-05-22 · unverdicted · none · ref 28
AlignedServe uses prefix-aware batching, large CPU in-flight request pools, batch scheduling, and GPU-to-GPU KV prefetching to raise decoding throughput up to 1.98x and cut latency up to 7.4x versus prior serving systems.

In2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)

fields

years

verdicts

representative citing papers

citing papers explorer