GreenCache dynamically manages LLM KV cache resources to reduce carbon emissions by 15.1% on average (up to 25.3%) while meeting latency constraints for over 90% of requests on real traces.
The sunk carbon fallacy: Rethinking carbon footprint metrics for effective carbon-aware scheduling
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary: method 1
citation-polarity summary: use method 1
representative citing papers
UCCL-Zip adds lossless compression to GPU communication to reduce LLM bottlenecks while preserving exact numerical correctness.
vMODB co-designs a Virtual Micro Service programming model with a system that unifies event and data management to enforce ACID properties in distributed asynchronous applications, outperforming eventual-consistency frameworks by up to 3x on two benchmarks.