The sunk carbon fallacy: Rethinking carbon footprint metrics for effective carbon-aware scheduling
4 Pith papers cite this work.
citation-role summary
roles: method 1
citation-polarity summary
polarities: use method 1
representative citing papers
UCCL-Zip adds lossless compression to GPU communication to reduce LLM bottlenecks while preserving exact numerical correctness.
vMODB co-designs a Virtual Micro Service programming model with a system that unifies event and data management to enforce ACID properties in distributed asynchronous applications, outperforming eventual-consistency frameworks by up to 3x on two benchmarks.
citing papers explorer
- Cache Your Prompt When It's Green: Carbon-Aware Caching for Large Language Model Serving
  GreenCache dynamically manages LLM KV cache resources to reduce carbon emissions by 15.1% on average (up to 25.3%) while meeting latency constraints for over 90% of requests on real traces.
- vMODB: Unifying Event and Data Management for Distributed Asynchronous Applications
  vMODB co-designs a Virtual Micro Service programming model with a system that unifies event and data management to enforce ACID properties in distributed asynchronous applications, outperforming eventual-consistency frameworks by up to 3x on two benchmarks.