arXiv preprint arXiv:2507.10069

ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism · 2025 · arXiv 2507.10069

3 Pith papers cite this work. Polarity classification is still being indexed.



citation-role summary

method 1

citation-polarity summary

use method 1

representative citing papers

KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving

cs.DC · 2026-05-13 · conditional · novelty 7.0

KVServe delivers up to 9.13x job completion time speedup and 32.8x time-to-first-token reduction by making KV cache compression service-aware and adaptive in disaggregated LLM serving.

AdaptiveLoad: Towards Efficient Video Diffusion Transformer Training

cs.DC · 2026-05-18 · unverdicted · novelty 5.0

AdaptiveLoad cuts computational imbalance in video DiT training from 39% to 18.9% and raises throughput 27.2% via memory-compute constraints and a custom LayerNorm-Modulate kernel.

Position: LLM Serving Needs Mathematical Optimization and Algorithmic Foundations, Not Just Heuristics

cs.DC · 2026-05-02 · accept · novelty 4.0

LLM serving requires mathematical optimization and algorithms with provable guarantees rather than generic heuristics that fail unpredictably on LLM workloads.

citing papers explorer

Showing 3 of 3 citing papers.

KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving
cs.DC · 2026-05-13 · conditional · none · ref 31
AdaptiveLoad: Towards Efficient Video Diffusion Transformer Training
cs.DC · 2026-05-18 · unverdicted · none · ref 21
Position: LLM Serving Needs Mathematical Optimization and Algorithmic Foundations, Not Just Heuristics
cs.DC · 2026-05-02 · accept · none · ref 45
