SLO-aware GPU Frequency Scaling for Energy Efficient LLM Inference Serving

Andreas Kosmas Kakolyris, Dimosthenis Masouros, Petros Vavaroutsos, Sotirios Xydis, Dimitrios Soudris · 2024 · arXiv 2408.05235

2 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary

background 1

citation-polarity summary

background 1

representative citing papers

LLMSpace: Carbon Footprint Modeling for Large Language Model Inference on LEO Satellites

cs.LG · 2026-05-07 · unverdicted · novelty 7.0 · 2 refs

LLMSpace is the first framework to jointly model operational and embodied carbon for LLM inference on LEO satellites, incorporating radiation-hardened hardware, peripheral systems, and workload patterns such as prefill-decode behavior.

KAIROS: Stateful, Context-Aware Power-Efficient Agentic Inference Serving

cs.DC · 2026-04-17 · unverdicted · novelty 6.0

KAIROS reduces power by 27% on average (up to 39.8%) for agentic AI inference by using long-lived context to jointly manage GPU frequency, concurrency, and request routing across instances.

citing papers explorer

Showing 2 of 2 citing papers.

LLMSpace: Carbon Footprint Modeling for Large Language Model Inference on LEO Satellites · cs.LG · 2026-05-07 · unverdicted · none · ref 51 · 2 links
KAIROS: Stateful, Context-Aware Power-Efficient Agentic Inference Serving · cs.DC · 2026-04-17 · unverdicted · none · ref 26
