Caching for Dollars, Not Hits: An Exact Offline Reference for Cloud-Egress Caching and the Crossover That Decides When It Pays

Madhulatha Mandarapu; Sandeep Kunkunuru

arxiv: 2606.20539 · v2 · pith:C3J6KH5Ynew · submitted 2026-06-18 · 💻 cs.DB · cs.DS

Caching for Dollars, Not Hits: An Exact Offline Reference for Cloud-Egress Caching and the Crossover That Decides When It Pays

Madhulatha Mandarapu , Sandeep Kunkunuru This is my paper

Pith reviewed 2026-06-26 14:57 UTC · model grok-4.3

classification 💻 cs.DB cs.DS

keywords cachingcloud egressmiss costoffline optimallinear programGreedyDualLRUdollar objective

0 comments

The pith

For uniform-size caches the dollar-optimal offline policy can be computed exactly in polynomial time with an integral interval linear program.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Cloud bills for object storage charge per GET request and per byte of egress, so minimizing miss rate can leave large dollar costs on the table when some objects are far more expensive to miss than others. The paper supplies the missing exact offline reference for the true dollar objective. For uniform page sizes it reduces the problem to an integral interval linear program that runs in polynomial time and matches brute force on small cases. For variable sizes it extends an existing flow-based bound to the dollar setting and shows the bound stays within about four percent. Using this reference the work derives a regret law for LRU, a contention frontier for GreedyDual, and a closed-form crossover size that tells when dollar-aware caching is worth the effort.

Core claim

For uniform-size page caches with heterogeneous miss costs the offline dollar-optimum is exact in polynomial time via an integral interval linear program validated against brute force; variable sizes are NP-hard, so the paper extends the flow-based offline bound from the hit-ratio objective to dollars as cost-FOO, tight to about four percent. Against this reference the work identifies a heterogeneity-regret law, a contention frontier, and the crossover s* = GET_fee/egress_rate that decides when dollar-aware caching pays.

What carries the argument

The integral interval linear program that encodes the minimum-dollar-cost caching decisions for uniform page sizes.

If this is right

LRU dollar regret rises with miss-cost dispersion across the workload.
Cost-aware GreedyDual reduces dollar regret to roughly one tenth of LRU's.
GreedyDual residual regret reaches near zero precisely when the cache budget covers the expensive working set.
The closed-form crossover s* = GET_fee/egress_rate predicts the regime in which dollar-aware caching becomes necessary.
Real traces can be shifted across the crossover by the price vector alone.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The exact uniform-size algorithm could be used to audit the dollar gap of any online heuristic on production traces without needing to change the production system.
The four-percent tightness of the cost-FOO bound for variable sizes suggests it could serve as a practical stopping criterion for heuristic search even when exact solutions remain unavailable.
Because the crossover depends only on the two price parameters, operators can decide whether to switch policies by measuring their GET fee and egress rate rather than running full simulations.

Load-bearing premise

The per-GET fee and per-byte egress rate are known, fixed, and accurately reflect the actual cloud bill with no other billing dimensions affecting the optimum.

What would settle it

Running the linear program on a trace instance and checking whether its computed dollar cost equals the minimum cost found by exhaustive enumeration on the same small uniform-size instance.

Figures

Figures reproduced from arXiv: 2606.20539 by Madhulatha Mandarapu, Sandeep Kunkunuru.

**Figure 2.** Figure 2: Contention frontier: GDSF residual regret collapses at B = Nexp. Real small-object trace (Twitter memcache; [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗

**Figure 4.** Figure 4: Real Wikipedia CDN trace (H = 12– 18): GDSF/LRU regret ratio falls as s ⋆ drops. 5 Related work Cost-aware caching by miss-cost-over-size weighting is GreedyDual-Size [Cao and Irani, 1997], the practical workhorse we measure. The online competitive theory for generalized (size- and cost-) caching is O(log k) randomized [Bansal et al., 2012]; ski-rental / rent-or-buy [Karlin et al., 1994] governs the per-re… view at source ↗

read the original abstract

When a cache miss fetches from cloud object storage, the bill is per GET request and per byte of egress, not latency. Classic caching minimizes the miss rate, the wrong objective: a rarely but expensively fetched object can cost thousands of times more dollars than a frequently but cheaply fetched one. Generalized-caching theory bounds the miss-cost objective, but no reported benchmark measures how far deployed heuristics sit from the dollar-optimal offline policy on real cloud prices. We supply that reference. For uniform-size page caches with heterogeneous miss costs the offline dollar-optimum is exact in polynomial time via an integral interval linear program -- validated against brute force; variable sizes are NP-hard, so we extend the flow-based offline bound from the hit-ratio objective to dollars (cost-FOO), tight to about four percent. Against this reference we find: (i) a heterogeneity-regret law -- LRU's dollar-regret rises with miss-cost dispersion (Spearman 0.87) while cost-aware GreedyDual cuts it to roughly a tenth; (ii) a contention frontier -- GreedyDual's residual regret collapses to near zero exactly when the budget fits the expensive working set, and is the open slice otherwise; and (iii) a closed-form crossover s* = GET_fee/egress_rate (about 4 KB on S3, 330 B on GCS) that predicts which deployments need dollar-aware caching. On real memcache and CDN traces the price vector alone moves the workload across s*, shifting the regime as predicted. The artifact is a reproducible billing-faithful benchmark; heuristics and bounds it builds on are prior work, credited.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a new exact reference for dollar-optimal cloud caching and a crossover rule, but rests on a narrow billing model.

read the letter

The main contribution is an exact offline reference for dollar-optimal caching under cloud egress prices. For uniform sizes they solve it in polynomial time with an integral interval LP, validated against brute force. For variable sizes they extend the flow bound to the cost objective and get within 4% of optimal. From this they extract a heterogeneity-regret law showing LRU regret rises with cost dispersion, a contention frontier where GreedyDual regret drops to zero when budget fits the expensive set, and a closed-form crossover s* = GET_fee / egress_rate that predicts when dollar-aware policies pay off. Real traces show the price vector can move a workload across that point as predicted.

They do a good job building a reproducible benchmark that uses actual cloud prices and credits the prior work on generalized caching. The separation between the new reference and the evaluated heuristics is clean. The artifact should let others reproduce the dollar measurements.

The central assumption is that the miss cost is only the per-GET fee plus bytes times egress rate. The paper does not appear to test what happens when storage fees, PUT costs, or tiering enter the bill. That would change the optimum and could invalidate the crossover formula, so the stress-test concern holds. If the full paper shows they addressed this, that would strengthen the claims.

Without the full experimental setup it is difficult to verify how robust the 4% tightness is or whether post-hoc choices affected the results. The abstract mentions 4% tightness but leaves the derivation and data rules unclear.

This is useful for anyone designing or evaluating caches for cloud object stores who wants to know when cost awareness beats standard policies like LRU. It is worth a serious referee because the reference is new and the decision rule is simple enough to apply in practice.

I would recommend sending it for peer review.

Referee Report

1 major / 2 minor

Summary. The paper claims that classic hit-ratio caching is the wrong objective for cloud object storage, where misses incur a two-term dollar cost (per-GET fee plus per-byte egress). For uniform-size objects it supplies an exact polynomial-time offline optimum via an integral interval linear program, validated against brute force; for variable sizes it extends the prior flow-based offline bound (cost-FOO) and reports it is tight to roughly 4 %. Using this reference on real memcache and CDN traces it identifies a heterogeneity-regret law (LRU regret rises with cost dispersion), a contention frontier for GreedyDual, and a closed-form crossover s* = GET_fee / egress_rate that predicts when dollar-aware caching is required.

Significance. If the central claims hold, the work supplies a reproducible, billing-faithful benchmark that shifts evaluation from hit ratio to dollar cost and gives practitioners a simple price-based test (s*) for whether cost-aware policies are needed. The exact ILP for uniform sizes, the cost-FOO extension, and the artifact are concrete strengths.

major comments (1)

[Abstract] Abstract: the statements that the ILP is 'validated against brute force' and that cost-FOO is 'tight to about four percent' are load-bearing for the central claims, yet the manuscript provides no derivation details, instance-generation procedure, or data-exclusion rules that would allow verification that post-hoc choices do not affect the reported exactness or tightness.

minor comments (2)

The model is explicitly restricted to the two-term miss cost (GET_fee + size * egress_rate); the paper does not claim completeness with respect to PUT fees, storage, or tiering, so the skeptic concern does not affect the internal validity of the stated results.
The artifact is described as reproducible and the prior heuristics and flow bounds are credited as external work; these are positive features.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the careful review and the minor-revision recommendation. The single major comment is addressed below.

read point-by-point responses

Referee: [Abstract] Abstract: the statements that the ILP is 'validated against brute force' and that cost-FOO is 'tight to about four percent' are load-bearing for the central claims, yet the manuscript provides no derivation details, instance-generation procedure, or data-exclusion rules that would allow verification that post-hoc choices do not affect the reported exactness or tightness.

Authors: We agree that these validation results are load-bearing and that the manuscript currently lacks the procedural details needed for independent verification. In the revised manuscript we will add a dedicated appendix (with forward reference from the abstract) that specifies: (1) the exact brute-force instance generator and enumeration procedure used for the uniform-size ILP validation, (2) the data-exclusion rules applied to the generated instances, and (3) the derivation and instance set underlying the reported 4 % cost-FOO tightness bound. This will allow readers to reproduce the exactness and tightness claims. revision: yes

Circularity Check

0 steps flagged

No circularity: offline optimum derived from first-principles ILP and flow extension

full rationale

The paper constructs the dollar-optimal offline reference directly from an integral interval LP (uniform sizes) and a cost-FOO flow extension (variable sizes), both defined on the explicit two-term miss-cost model and validated against brute force. These steps are independent of the evaluation data or fitted parameters; the closed-form crossover and regret laws follow from the same model without self-referential reduction. Prior heuristics and hit-ratio bounds are credited as external. No quoted step matches any enumerated circularity pattern.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

No free parameters are introduced; the crossover is algebraically derived from the two price components. Relies on standard ILP integrality and flow techniques from prior literature.

axioms (1)

domain assumption Cloud billing is exactly the sum of per-GET fees and per-byte egress charges with no other dimensions.
Used to define the objective function throughout.

pith-pipeline@v0.9.1-grok · 5847 in / 1251 out tokens · 21788 ms · 2026-06-26T14:57:33.947414+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

11 extracted references

[1]

SIAM Journal on Computing , volume =

Bansal, Nikhil and Buchbinder, Niv and Naor, Joseph (Seffi) , title =. SIAM Journal on Computing , volume =
[2]

USENIX Symposium on Internet Technologies and Systems (USITS) , year =

Cao, Pei and Irani, Sandy , title =. USENIX Symposium on Internet Technologies and Systems (USITS) , year =
[3]

and Beckmann, Nathan and Harchol-Balter, Mor , title =

Berger, Daniel S. and Beckmann, Nathan and Harchol-Balter, Mor , title =. Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS / SIGMETRICS) , volume =
[4]

Journal of the ACM , volume =

Lykouris, Thodoris and Vassilvitskii, Sergei , title =. Journal of the ACM , volume =
[5]

Online Metric Algorithms with Untrusted Predictions , booktitle =

Antoniadis, Antonios and Coester, Christian and Eli. Online Metric Algorithms with Untrusted Predictions , booktitle =
[6]

General Caching Is Hard: Even with Small Pages , booktitle =

Folwarczn. General Caching Is Hard: Even with Small Pages , booktitle =
[7]

ACM SIGMOD International Conference on Management of Data , year =

Dageville, Benoit and Cruanes, Thierry and Zukowski, Marcin and others , title =. ACM SIGMOD International Conference on Management of Data , year =
[8]

Yang, Juncheng and Yue, Yao and Rashmi, K. V. , title =. USENIX Symposium on Operating Systems Design and Implementation (OSDI) , year =
[9]

and Li, Kai and Lloyd, Wyatt , title =

Song, Zhenyu and Berger, Daniel S. and Li, Kai and Lloyd, Wyatt , title =. USENIX Symposium on Networked Systems Design and Implementation (NSDI) , year =
[10]

and Manasse, Mark S

Karlin, Anna R. and Manasse, Mark S. and McGeoch, Lyle A. and Owicki, Susan , title =. Algorithmica , year =
[11]

Belady, L. A. , title =. IBM Systems Journal , volume =

[1] [1]

SIAM Journal on Computing , volume =

Bansal, Nikhil and Buchbinder, Niv and Naor, Joseph (Seffi) , title =. SIAM Journal on Computing , volume =

[2] [2]

USENIX Symposium on Internet Technologies and Systems (USITS) , year =

Cao, Pei and Irani, Sandy , title =. USENIX Symposium on Internet Technologies and Systems (USITS) , year =

[3] [3]

and Beckmann, Nathan and Harchol-Balter, Mor , title =

Berger, Daniel S. and Beckmann, Nathan and Harchol-Balter, Mor , title =. Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS / SIGMETRICS) , volume =

[4] [4]

Journal of the ACM , volume =

Lykouris, Thodoris and Vassilvitskii, Sergei , title =. Journal of the ACM , volume =

[5] [5]

Online Metric Algorithms with Untrusted Predictions , booktitle =

Antoniadis, Antonios and Coester, Christian and Eli. Online Metric Algorithms with Untrusted Predictions , booktitle =

[6] [6]

General Caching Is Hard: Even with Small Pages , booktitle =

Folwarczn. General Caching Is Hard: Even with Small Pages , booktitle =

[7] [7]

ACM SIGMOD International Conference on Management of Data , year =

Dageville, Benoit and Cruanes, Thierry and Zukowski, Marcin and others , title =. ACM SIGMOD International Conference on Management of Data , year =

[8] [8]

Yang, Juncheng and Yue, Yao and Rashmi, K. V. , title =. USENIX Symposium on Operating Systems Design and Implementation (OSDI) , year =

[9] [9]

and Li, Kai and Lloyd, Wyatt , title =

Song, Zhenyu and Berger, Daniel S. and Li, Kai and Lloyd, Wyatt , title =. USENIX Symposium on Networked Systems Design and Implementation (NSDI) , year =

[10] [10]

and Manasse, Mark S

Karlin, Anna R. and Manasse, Mark S. and McGeoch, Lyle A. and Owicki, Susan , title =. Algorithmica , year =

[11] [11]

Belady, L. A. , title =. IBM Systems Journal , volume =