pith. sign in

V oxserve: Streaming-centric serving system for speech language models

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

citation-role summary

background 1

citation-polarity summary

fields

cs.LG 2 cs.AI 1

years

2026 3

roles

background 1

polarities

background 1

clear filters

representative citing papers

VibeServe: Can AI Agents Build Bespoke LLM Serving Systems?

cs.AI · 2026-05-07 · unverdicted · novelty 8.0

VibeServe demonstrates that AI agents can synthesize bespoke LLM serving systems end-to-end, remaining competitive with vLLM in standard settings while outperforming it in six non-standard scenarios involving unusual models, workloads, or hardware.

M*: A Modular, Extensible, Serving System for Multimodal Models

cs.LG · 2026-06-10 · unverdicted · novelty 7.0

M* introduces the Walk Graph abstraction to serve arbitrary compositions of multimodal model components and reports latency and throughput gains over vLLM-Omni and other baselines on text-to-image, text-to-speech, and robotic planning workloads.

MURMUR: An Efficient Inference System for Long-Form ASR

cs.LG · 2026-05-31 · conditional · novelty 6.0

Murmur matches single-pass long-context ASR accuracy on AMI-IHM while cutting latency 4.2x by tuning chunk size and using intra-chunk attention sparsity via KV eviction.

citing papers explorer

Showing 1 of 1 citing paper after filters.

  • VibeServe: Can AI Agents Build Bespoke LLM Serving Systems? cs.AI · 2026-05-07 · unverdicted · none · ref 34

    VibeServe demonstrates that AI agents can synthesize bespoke LLM serving systems end-to-end, remaining competitive with vLLM in standard settings while outperforming it in six non-standard scenarios involving unusual models, workloads, or hardware.