Caching for Dollars, Not Hits: An Exact Offline Reference for Cloud-Egress Caching and the Crossover That Decides When It Pays
Pith reviewed 2026-06-26 14:57 UTC · model grok-4.3
The pith
For uniform-size caches the dollar-optimal offline policy can be computed exactly in polynomial time with an integral interval linear program.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For uniform-size page caches with heterogeneous miss costs the offline dollar-optimum is exact in polynomial time via an integral interval linear program validated against brute force; variable sizes are NP-hard, so the paper extends the flow-based offline bound from the hit-ratio objective to dollars as cost-FOO, tight to about four percent. Against this reference the work identifies a heterogeneity-regret law, a contention frontier, and the crossover s* = GET_fee/egress_rate that decides when dollar-aware caching pays.
What carries the argument
The integral interval linear program that encodes the minimum-dollar-cost caching decisions for uniform page sizes.
If this is right
- LRU dollar regret rises with miss-cost dispersion across the workload.
- Cost-aware GreedyDual reduces dollar regret to roughly one tenth of LRU's.
- GreedyDual residual regret reaches near zero precisely when the cache budget covers the expensive working set.
- The closed-form crossover s* = GET_fee/egress_rate predicts the regime in which dollar-aware caching becomes necessary.
- Real traces can be shifted across the crossover by the price vector alone.
Where Pith is reading between the lines
- The exact uniform-size algorithm could be used to audit the dollar gap of any online heuristic on production traces without needing to change the production system.
- The four-percent tightness of the cost-FOO bound for variable sizes suggests it could serve as a practical stopping criterion for heuristic search even when exact solutions remain unavailable.
- Because the crossover depends only on the two price parameters, operators can decide whether to switch policies by measuring their GET fee and egress rate rather than running full simulations.
Load-bearing premise
The per-GET fee and per-byte egress rate are known, fixed, and accurately reflect the actual cloud bill with no other billing dimensions affecting the optimum.
What would settle it
Running the linear program on a trace instance and checking whether its computed dollar cost equals the minimum cost found by exhaustive enumeration on the same small uniform-size instance.
Figures
read the original abstract
When a cache miss fetches from cloud object storage, the bill is per GET request and per byte of egress, not latency. Classic caching minimizes the miss rate, the wrong objective: a rarely but expensively fetched object can cost thousands of times more dollars than a frequently but cheaply fetched one. Generalized-caching theory bounds the miss-cost objective, but no reported benchmark measures how far deployed heuristics sit from the dollar-optimal offline policy on real cloud prices. We supply that reference. For uniform-size page caches with heterogeneous miss costs the offline dollar-optimum is exact in polynomial time via an integral interval linear program -- validated against brute force; variable sizes are NP-hard, so we extend the flow-based offline bound from the hit-ratio objective to dollars (cost-FOO), tight to about four percent. Against this reference we find: (i) a heterogeneity-regret law -- LRU's dollar-regret rises with miss-cost dispersion (Spearman 0.87) while cost-aware GreedyDual cuts it to roughly a tenth; (ii) a contention frontier -- GreedyDual's residual regret collapses to near zero exactly when the budget fits the expensive working set, and is the open slice otherwise; and (iii) a closed-form crossover s* = GET_fee/egress_rate (about 4 KB on S3, 330 B on GCS) that predicts which deployments need dollar-aware caching. On real memcache and CDN traces the price vector alone moves the workload across s*, shifting the regime as predicted. The artifact is a reproducible billing-faithful benchmark; heuristics and bounds it builds on are prior work, credited.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that classic hit-ratio caching is the wrong objective for cloud object storage, where misses incur a two-term dollar cost (per-GET fee plus per-byte egress). For uniform-size objects it supplies an exact polynomial-time offline optimum via an integral interval linear program, validated against brute force; for variable sizes it extends the prior flow-based offline bound (cost-FOO) and reports it is tight to roughly 4 %. Using this reference on real memcache and CDN traces it identifies a heterogeneity-regret law (LRU regret rises with cost dispersion), a contention frontier for GreedyDual, and a closed-form crossover s* = GET_fee / egress_rate that predicts when dollar-aware caching is required.
Significance. If the central claims hold, the work supplies a reproducible, billing-faithful benchmark that shifts evaluation from hit ratio to dollar cost and gives practitioners a simple price-based test (s*) for whether cost-aware policies are needed. The exact ILP for uniform sizes, the cost-FOO extension, and the artifact are concrete strengths.
major comments (1)
- [Abstract] Abstract: the statements that the ILP is 'validated against brute force' and that cost-FOO is 'tight to about four percent' are load-bearing for the central claims, yet the manuscript provides no derivation details, instance-generation procedure, or data-exclusion rules that would allow verification that post-hoc choices do not affect the reported exactness or tightness.
minor comments (2)
- The model is explicitly restricted to the two-term miss cost (GET_fee + size * egress_rate); the paper does not claim completeness with respect to PUT fees, storage, or tiering, so the skeptic concern does not affect the internal validity of the stated results.
- The artifact is described as reproducible and the prior heuristics and flow bounds are credited as external work; these are positive features.
Simulated Author's Rebuttal
We thank the referee for the careful review and the minor-revision recommendation. The single major comment is addressed below.
read point-by-point responses
-
Referee: [Abstract] Abstract: the statements that the ILP is 'validated against brute force' and that cost-FOO is 'tight to about four percent' are load-bearing for the central claims, yet the manuscript provides no derivation details, instance-generation procedure, or data-exclusion rules that would allow verification that post-hoc choices do not affect the reported exactness or tightness.
Authors: We agree that these validation results are load-bearing and that the manuscript currently lacks the procedural details needed for independent verification. In the revised manuscript we will add a dedicated appendix (with forward reference from the abstract) that specifies: (1) the exact brute-force instance generator and enumeration procedure used for the uniform-size ILP validation, (2) the data-exclusion rules applied to the generated instances, and (3) the derivation and instance set underlying the reported 4 % cost-FOO tightness bound. This will allow readers to reproduce the exactness and tightness claims. revision: yes
Circularity Check
No circularity: offline optimum derived from first-principles ILP and flow extension
full rationale
The paper constructs the dollar-optimal offline reference directly from an integral interval LP (uniform sizes) and a cost-FOO flow extension (variable sizes), both defined on the explicit two-term miss-cost model and validated against brute force. These steps are independent of the evaluation data or fitted parameters; the closed-form crossover and regret laws follow from the same model without self-referential reduction. Prior heuristics and hit-ratio bounds are credited as external. No quoted step matches any enumerated circularity pattern.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Cloud billing is exactly the sum of per-GET fees and per-byte egress charges with no other dimensions.
Reference graph
Works this paper leans on
-
[1]
SIAM Journal on Computing , volume =
Bansal, Nikhil and Buchbinder, Niv and Naor, Joseph (Seffi) , title =. SIAM Journal on Computing , volume =
-
[2]
USENIX Symposium on Internet Technologies and Systems (USITS) , year =
Cao, Pei and Irani, Sandy , title =. USENIX Symposium on Internet Technologies and Systems (USITS) , year =
-
[3]
and Beckmann, Nathan and Harchol-Balter, Mor , title =
Berger, Daniel S. and Beckmann, Nathan and Harchol-Balter, Mor , title =. Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS / SIGMETRICS) , volume =
-
[4]
Journal of the ACM , volume =
Lykouris, Thodoris and Vassilvitskii, Sergei , title =. Journal of the ACM , volume =
-
[5]
Online Metric Algorithms with Untrusted Predictions , booktitle =
Antoniadis, Antonios and Coester, Christian and Eli. Online Metric Algorithms with Untrusted Predictions , booktitle =
-
[6]
General Caching Is Hard: Even with Small Pages , booktitle =
Folwarczn. General Caching Is Hard: Even with Small Pages , booktitle =
-
[7]
ACM SIGMOD International Conference on Management of Data , year =
Dageville, Benoit and Cruanes, Thierry and Zukowski, Marcin and others , title =. ACM SIGMOD International Conference on Management of Data , year =
-
[8]
Yang, Juncheng and Yue, Yao and Rashmi, K. V. , title =. USENIX Symposium on Operating Systems Design and Implementation (OSDI) , year =
-
[9]
and Li, Kai and Lloyd, Wyatt , title =
Song, Zhenyu and Berger, Daniel S. and Li, Kai and Lloyd, Wyatt , title =. USENIX Symposium on Networked Systems Design and Implementation (NSDI) , year =
-
[10]
and Manasse, Mark S
Karlin, Anna R. and Manasse, Mark S. and McGeoch, Lyle A. and Owicki, Susan , title =. Algorithmica , year =
-
[11]
Belady, L. A. , title =. IBM Systems Journal , volume =
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.