Support-aware offline policy selection for advertising marketplaces

Caroline Howard; Prashant Shekhar

arXiv: 2605.21736 · v1 · pith:QKLZHCKLnew · submitted 2026-05-20 · stat.ML · cs.AI · cs.LG

Pith reviewed 2026-05-22 08:33 UTC · model grok-4.3

classification stat.ML · cs.AI · cs.LG

keywords offline policy selection · advertising marketplaces · reserve price · support estimation · off-policy evaluation · regret certification · replay evaluation

The pith

A support-aware framework turns logged auction data into certified reserve-policy decisions rather than point estimates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Logged advertising auctions let marketplaces evaluate reserve-price policies offline, yet standard replay often overstates gains when data support is thin or uncertainty is ignored. This paper builds a decision framework that processes the logs into three groups: policies that pass conservative support and uncertainty tests, alternatives that are statistically dominated, and candidates still needing live checks. The central guarantee keeps the strongest policy that clears the gates while removing only those shown to carry certified regret. A sympathetic reader cares because the method replaces risky single-winner rankings with an operational shortlist that limits exposure across many advertiser segments.
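To make the replay idea concrete, here is a minimal sketch of how one reserve price might be replayed over logged auctions. It assumes each log row holds the winning bid and the price actually paid, and that a static second-price-style rule applies with no bidder response. That static assumption is exactly the one the paper treats as risky. The function name `replay_reserve` and the toy rows are hypothetical and are not taken from the paper.

```python
def replay_reserve(logs, reserve):
    """Replay one reserve price over logged (winning_bid, paid_price) pairs.

    Assumes a static second-price-style rule with no bidder response.
    Returns (yield_lift, retained_impression_share).
    """
    if not logs:
        raise ValueError("logs must be non-empty")
    baseline = sum(paid for _, paid in logs)
    if baseline <= 0:
        raise ValueError("baseline yield must be positive")
    # An impression is retained only if the logged winning bid clears the reserve.
    retained = [(bid, paid) for bid, paid in logs if bid >= reserve]
    # Retained impressions pay the larger of the logged price and the reserve.
    replayed = sum(max(paid, reserve) for _, paid in retained)
    return replayed / baseline - 1.0, len(retained) / len(logs)


# Made-up log rows, for illustration only.
toy_logs = [(100, 40), (80, 60), (30, 20), (120, 50)]
lift, share = replay_reserve(toy_logs, reserve=70)
print(f"yield lift = {lift:.4f}, retained share = {share:.2f}")
```

A point estimate like this carries no uncertainty or support information, which is the gap the framework is meant to close.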

Core claim

The main theoretical result gives a unified finite-catalog guarantee showing that, under simultaneous uncertainty control and conservative support gates, the framework preserves the best gate-passing policy while eliminating only policies with certified regret.
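One way to picture the three-way decision object is the sketch below. It assumes every policy arrives with simultaneous lower and upper bounds on its lift plus a boolean support-gate result. The class `PolicyEvidence`, the function `classify_catalog`, and the specific elimination rule (upper bound below the best gate-passing lower bound) are illustrative assumptions, not the paper's definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyEvidence:
    name: str
    lower: float       # simultaneous lower bound on lift (assumed input)
    upper: float       # simultaneous upper bound on lift (assumed input)
    passes_gate: bool  # outcome of a conservative support gate (assumed input)


def classify_catalog(catalog, tolerance=0.0):
    """Split a finite catalog into certified, dominated, and unresolved names."""
    result = {"certified": [], "dominated": [], "unresolved": []}
    gate_passing = [p for p in catalog if p.passes_gate]
    if not gate_passing:
        result["unresolved"] = [p.name for p in catalog]
        return result
    best_lower = max(p.lower for p in gate_passing)
    for p in catalog:
        if p.upper < best_lower - tolerance:
            # Even the optimistic value trails the best conservative value.
            result["dominated"].append(p.name)
        elif p.passes_gate and p.lower > 0.0:
            result["certified"].append(p.name)
        else:
            result["unresolved"].append(p.name)
    return result


# Made-up bounds, for illustration only.
toy_catalog = [
    PolicyEvidence("A", 0.40, 0.55, True),
    PolicyEvidence("B", 0.10, 0.30, True),
    PolicyEvidence("C", 0.20, 0.60, False),
    PolicyEvidence("D", -0.05, 0.45, True),
]
decision = classify_catalog(toy_catalog)
print(decision)
```

Under this rule, if all intervals hold at once, the gate-passing policy with the highest true lift cannot be eliminated, because its upper bound is at least its true lift, which is at least the best gate-passing lower bound.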

What carries the argument

The support-aware offline decision framework that converts logged evidence into a conservative decision object of certified policies, statistically dominated alternatives, and unresolved candidates.

If this is right

A 19-policy catalog shrinks to a two-policy validation shortlist.
Non-harm is certified across 44 advertiser, exchange, and region segments.
The leading reserve rule shows 47.66 percent replay lift together with a 40.71 percent simultaneous lower bound.
Information-theoretic limits on threshold resolution are characterized for the catalog.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same support-gate logic could be tested in other logged-policy domains such as recommendation or pricing where coverage varies by context.
Experiments that deliberately increase bidder heterogeneity beyond the levels studied here would check whether the supporting results on response uncertainty continue to hold.
The emphasis on producing a shortlist of unresolved candidates points toward hybrid systems that combine this offline filter with targeted online A/B tests.

Load-bearing premise

The logged auction data supplies representative samples that permit accurate support estimation and uncertainty control, with bidder-response heterogeneity not overturning localized replay rankings.
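Support near a reserve threshold can be pictured as a simple count. The sketch below assumes support means the number of logged observations whose price signal lies within a window of half-width `h` around the threshold, and that a gate passes when the count reaches a minimum. The function names, the `min_count` rule, and the toy values are hypothetical; the paper's exact gate is not visible here.

```python
def boundary_support(signals, threshold, half_width):
    """Return (count, share) of observations within half_width of threshold."""
    if half_width < 0:
        raise ValueError("half_width must be non-negative")
    count = sum(1 for g in signals if abs(g - threshold) <= half_width)
    share = count / len(signals) if signals else 0.0
    return count, share


def passes_support_gate(signals, threshold, half_width, min_count):
    """Assumed gate: enough logged observations sit near the threshold."""
    count, _ = boundary_support(signals, threshold, half_width)
    return count >= min_count


# Made-up price signals, for illustration only.
toy_signals = [10, 48, 50, 52, 55, 90, 120, 51]
for h in (2, 5, 40):
    print(h, boundary_support(toy_signals, threshold=50, half_width=h))
```

Widening the window can only increase the count, so a panel that is large overall may still fail a gate at a narrow window.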

What would settle it

Fresh auction logs in which a policy the framework eliminated yields higher revenue than the certified set would falsify the guarantee.

Figures

Figures reproduced from arXiv: 2605.21736 by Caroline Howard, Prashant Shekhar.

**Figure 1.** Offline reserve-policy selection from logged advertising auctions. Logged marketplace data and a finite reserve-policy catalog make offline replay evaluation possible, but naive replay rankings can be misleading because apparent gains may hide weak threshold support, multiple-comparison effects, subgroup harm, or bidder-response uncertainty. The figure illustrates the central decision problem of determinin…

**Figure 2.** Conservative shortlist construction on season two. Panel (a) shows the replay frontier for non-baseline reserve policies, with replay yield lift plotted against retained impression share. Panel (b) shows simultaneous lower-bound ranking. Colored points indicate whether the decision rule certifies the policy, eliminates it as dominated, or leaves it unresolved. Black points show support-adjusted lower bound…

**Figure 3.** Support-localized threshold resolution. Panel (a) reports effective boundary sample size n_boundary(h) as the diagnostic boundary window expands. Panel (b) reports support-adjusted lower-bound lift for the leading policies. The same season-two panel can be statistically large overall while remaining locally thin near narrow reserve-threshold bands.

**Figure 4.** Validation readiness through transfer and subgroup safety. Panel (a) compares season-two and season-three replay lifts under the frozen catalog. Panel (b) reports mean segment-level replay lifts for the covered segments with the smallest lower endpoints, with 95% normal confidence bars computed from daily segment-level replay variation.

**Figure 5.** Additional replay diagnostics. Panel (a) shows daily replay-lift dispersion for the leading policies. Panel (b) shows how the Bonferroni critical value and the leader's simultaneous lower bound change as the policy catalog grows.

**Figure 6.** Pairwise boundary-support diagnostics. Panel (a) reports the empirical distribution of pairwise boundary-support shares across policy pairs. Panel (b) relates mean candidate-floor distance to absolute replay-lift gaps, with marker size proportional to boundary support.

**Figure 7.** q-localized replay selection. Panel (a) reports localized boundary lift over localization levels q. Panel (b) reports day-bootstrap winner frequencies. P11 is the stable q-localized boundary-lift leader, while P18 remains the aggregate replay leader.

**Figure 8.** Out-of-time transfer for q-localized selections. Panel (a) compares season-two localized boundary lift with season-three aggregate replay lift for policies selected by the q-localized rule. Panel (b) reports season-three aggregate replay lift across localization levels. The localized winner P11 transfers positively but does not exceed P18's aggregate season-three replay performance.

**Figure 9.** Shortlist and decision-rule robustness. Panel (a) reports retained shortlist size as the elimination tolerance varies. Panel (b) compares simpler decision rules with the support-aware elimination shortlist.
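The day-bootstrap winner frequencies mentioned in the Figure 7 caption can be illustrated with a short sketch. It assumes each policy has one replay lift per day, resamples whole days with replacement, and records which policy has the highest mean lift in each resample. The function name, the policy labels, and the numbers are hypothetical; the paper's bootstrap settings are not visible here.

```python
import random
from collections import Counter


def day_bootstrap_winner_frequencies(daily_lift, n_boot=1000, seed=0):
    """daily_lift maps policy name -> list of per-day lifts (equal lengths)."""
    policies = list(daily_lift)
    n_days = len(daily_lift[policies[0]])
    if n_days == 0 or any(len(v) != n_days for v in daily_lift.values()):
        raise ValueError("every policy needs the same non-zero number of days")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    wins = Counter()
    for _ in range(n_boot):
        # Resample whole days so within-day dependence across policies is kept.
        days = [rng.randrange(n_days) for _ in range(n_days)]
        means = {p: sum(daily_lift[p][d] for d in days) / n_days for p in policies}
        wins[max(policies, key=means.get)] += 1
    return {p: wins[p] / n_boot for p in policies}


# Made-up daily lifts, for illustration only.
toy_daily = {
    "policy_a": [0.50, 0.60, 0.55],
    "policy_b": [0.10, 0.20, 0.15],
}
print(day_bootstrap_winner_frequencies(toy_daily, n_boot=200))
```

A winner frequency well below one signals that the replay ranking is unstable under day-level resampling.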

Original abstract

Logged advertising auctions make offline reserve-price evaluation attractive but risky. Replay tables can identify policies with large apparent yield gains, yet they can also hide weak threshold support, multiple-comparison effects, subgroup harm, and bidder-response uncertainty. Existing replay and off-policy evaluation methods estimate or rank policy values, but they do not directly answer the operational question of whether the available evidence is strong enough to justify validation. This paper develops a support-aware offline decision framework for reserve-policy selection. Rather than outputting a single point-estimate winner, the framework converts logged evidence into a conservative decision object consisting of certified policies, statistically dominated alternatives, and unresolved candidates requiring further validation. The main theoretical result gives a unified finite-catalog guarantee showing that, under simultaneous uncertainty control and conservative support gates, the framework preserves the best gate-passing policy while eliminating only policies with certified regret. Supporting results characterize support-localized replay generalization, establish information-theoretic threshold-resolution limits, and quantify when heterogeneous bidder response can overturn localized replay rankings. Experiments on iPinYou real-time-bidding logs show that the leading reserve rule achieves a 47.66% replay lift in season two, a 40.71% simultaneous lower-bound lift, and a 43.87% frozen out-of-time replay lift in season three. The framework reduces a 19-policy catalog to a two-policy validation shortlist while certifying non-harm across 44 advertiser, exchange, and region segments. The results support the central claim that offline reserve-policy evaluation should produce certified validation decisions rather than point-estimate rankings alone.
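A segment-level non-harm check of the kind the abstract describes might look like the sketch below. It assumes each segment has a series of daily replay lifts and uses a normal interval built from day-to-day variation, in the spirit of the confidence bars in Figure 4. The function names, the `harm_tolerance` parameter, and the data are hypothetical, and the paper's actual certification rule may differ (for example, by adjusting for the number of segments).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def segment_lower_endpoint(daily_lifts, level=0.95):
    """Lower endpoint of a two-sided normal interval for the mean daily lift."""
    n = len(daily_lifts)
    if n < 2:
        raise ValueError("need at least two days to estimate variation")
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return mean(daily_lifts) - z * stdev(daily_lifts) / sqrt(n)


def non_harm_segments(segments, harm_tolerance=0.0, level=0.95):
    """Flag each segment whose lower endpoint stays above -harm_tolerance."""
    return {
        name: segment_lower_endpoint(lifts, level) >= -harm_tolerance
        for name, lifts in segments.items()
    }


# Made-up daily lifts, for illustration only.
toy_segments = {
    "segment_a": [0.10, 0.12, 0.11, 0.09],
    "segment_b": [0.30, -0.40, 0.20, -0.25],
}
print(non_harm_segments(toy_segments))
```

Segments with a volatile or negative mean lift fail the check and would need further validation before any non-harm claim.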

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper gives a support-aware framework that turns logged ad auction data into certified policy decisions with a finite-catalog guarantee, though the guarantee's reliability depends on how well support estimation holds up.

The main thing here is a framework for offline reserve-price selection that outputs certified policies, statistically dominated alternatives, and unresolved candidates instead of a single replay winner. It adds conservative support gates and uncertainty control to produce a unified finite-catalog guarantee that keeps the best gate-passing policy while dropping only those with certified regret. Supporting analysis covers support-localized replay generalization and cases where heterogeneous bidder response can overturn localized rankings.

On iPinYou logs the leading rule shows 47.66% replay lift, 40.71% lower-bound lift, and 43.87% out-of-time lift, while shrinking a 19-policy catalog to two and certifying non-harm across 44 segments. That practical reduction and the segment-level checks are useful. The work does well by focusing on the operational question of whether the evidence justifies validation rather than just ranking values.

The soft spots sit in the support estimation step. The guarantee requires the logged auctions to give accurate localized coverage, and the stress-test note is right to flag that bidder heterogeneity or non-stationarity not captured by the season splits could either eliminate the true best policy or leave weak ones uncertified. The paper quantifies some overturning risk under its modeling assumptions, but sensitivity to the free support thresholds and uncertainty parameters is not fully detailed in what is visible.

This is for researchers and practitioners working on offline policy selection in advertising marketplaces or similar replay-heavy settings who need something more conservative than point estimates. A reader dealing with risky offline evaluations would get concrete value from the decision objects and the guarantee. It deserves a serious referee because the combination of the theoretical result and the real-log experiments addresses a clear gap, even if the robustness checks could be expanded.

Referee Report

2 major / 2 minor

Summary. The paper develops a support-aware offline decision framework for reserve-price policy selection in advertising marketplaces. Instead of point-estimate rankings, it produces a conservative decision object of certified policies, statistically dominated alternatives, and unresolved candidates. The central theoretical result is a unified finite-catalog guarantee that, under simultaneous uncertainty control and conservative support gates, preserves the best gate-passing policy while eliminating only policies with certified regret. Supporting results address support-localized replay generalization, information-theoretic limits, and heterogeneous bidder response. Experiments on iPinYou RTB logs report a 47.66% replay lift, 40.71% lower-bound lift, and 43.87% out-of-time lift for the leading rule, reducing a 19-policy catalog to a two-policy shortlist while certifying non-harm across 44 segments.

Significance. If the theoretical guarantee holds and the support-estimation assumptions are satisfied, the framework offers a meaningful shift from standard OPE rankings toward certifiable, risk-aware decisions suitable for operational deployment. The empirical demonstration of catalog reduction and segment-level non-harm certification on real auction logs is practically relevant. The work also provides characterizations of localized generalization and bidder-response heterogeneity that could inform future OPE methods. These elements, if substantiated, strengthen the case for conservative offline selection in marketplaces.

major comments (2)

[Main theoretical result] Unified finite-catalog guarantee: The preservation of the best gate-passing policy while eliminating only certified-regret policies is stated to hold under simultaneous uncertainty control and conservative support gates. However, the construction of these gates and their robustness to non-stationarity or cross-segment dependence in logged auctions is not shown to be independent of the target guarantee; if support estimation fails to certify true coverage, the guarantee can either drop the optimal policy or retain weak alternatives. A concrete counter-example or additional robustness theorem under temporal shifts would be required to substantiate the claim.
[Supporting results on heterogeneous bidder response] The quantification of overturning risk for localized replay rankings is performed under specific modeling assumptions and iPinYou season splits. The paper does not demonstrate that these results extend to unmodeled temporal shifts or bidder heterogeneity patterns outside the observed splits; such patterns could overturn the support-gate decisions and thereby invalidate the catalog-reduction and non-harm certification reported in the experiments.

minor comments (2)

[Abstract and introduction] The abstract and introduction introduce several new decision objects (certified policies, statistically dominated alternatives) without an early table or diagram that maps them to standard OPE quantities; adding such a mapping would improve readability.
[Experiments] The reported lifts (47.66% replay, 40.71% lower-bound, 43.87% out-of-time) are given without explicit confidence intervals or details on how the simultaneous lower bound is computed; including these would strengthen the empirical section.
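For readers unfamiliar with simultaneous bounds, one standard construction is a Bonferroni adjustment, which the Figure 5 caption also mentions. The sketch below assumes a one-sided normal approximation and splits the error budget `alpha` evenly across the catalog. The function names and the example inputs are hypothetical, and whether the paper uses exactly this form is not confirmed by the visible text.

```python
from statistics import NormalDist


def bonferroni_critical_value(n_policies, alpha=0.05):
    """One-sided normal critical value with alpha split across n_policies."""
    if n_policies < 1:
        raise ValueError("n_policies must be at least 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - alpha / n_policies)


def simultaneous_lower_bound(estimate, std_error, n_policies, alpha=0.05):
    """Lower bound that holds jointly for all policies under the union bound."""
    return estimate - bonferroni_critical_value(n_policies, alpha) * std_error


# The critical value grows as the catalog grows, so each bound gets looser.
for k in (1, 5, 19):
    print(k, round(bonferroni_critical_value(k), 3))

# Made-up estimate and standard error, for illustration only.
print(simultaneous_lower_bound(estimate=0.50, std_error=0.10, n_policies=19))
```

The gap between a replay lift and its simultaneous lower bound therefore depends on both the standard error and the catalog size.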

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments and for recognizing the potential of our support-aware framework to shift offline policy selection toward certifiable decisions in advertising marketplaces. We address each major comment below with clarifications on the assumptions underlying our results and commit to targeted revisions that strengthen the discussion of robustness without altering the core claims.

Referee: [Main theoretical result] Unified finite-catalog guarantee: The preservation of the best gate-passing policy while eliminating only certified-regret policies is stated to hold under simultaneous uncertainty control and conservative support gates. However, the construction of these gates and their robustness to non-stationarity or cross-segment dependence in logged auctions is not shown to be independent of the target guarantee; if support estimation fails to certify true coverage, the guarantee can either drop the optimal policy or retain weak alternatives. A concrete counter-example or additional robustness theorem under temporal shifts would be required to substantiate the claim.

Authors: The unified finite-catalog guarantee is derived under the explicit joint conditions of uncertainty control and conservative support gates, as formalized in the main theorem; it does not claim independence from support-estimation quality. The gates are deliberately conservative, requiring empirical coverage thresholds that favor retaining unresolved candidates over risking the elimination of the best-supported policy. Supporting results on support-localized replay generalization already incorporate coverage estimation error. The iPinYou experiments include out-of-time replay on season-three data as an empirical check against temporal effects. We agree that a dedicated robustness subsection would improve clarity. In revision we will add a discussion of sensitivity to non-stationarity and cross-segment dependence, together with a brief illustrative example showing gate behavior under mild shifts, while preserving the conditional nature of the guarantee.
Revision: partial

Referee: [Supporting results on heterogeneous bidder response] The quantification of overturning risk for localized replay rankings is performed under specific modeling assumptions and iPinYou season splits. The paper does not demonstrate that these results extend to unmodeled temporal shifts or bidder heterogeneity patterns outside the observed splits; such patterns could overturn the support-gate decisions and thereby invalidate the catalog-reduction and non-harm certification reported in the experiments.

Authors: The characterizations of overturning risk and bidder-response heterogeneity are explicitly tied to the modeling assumptions and the observed season splits in the iPinYou logs; we do not assert universal extension. The conservative support gates and the 44-segment non-harm certification are computed directly on the available data, and the reported catalog reduction (19 to 2 policies) and lifts are likewise dataset-specific. The framework outputs unresolved candidates precisely when heterogeneity may threaten gate decisions. In revision we will expand the relevant section to state the scope of these supporting results more explicitly and to note that further validation would be required for environments exhibiting substantially different heterogeneity patterns.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; theoretical guarantee is self-contained under explicitly defined controls

The paper's central claim is a unified finite-catalog guarantee that, under simultaneous uncertainty control and conservative support gates, preserves the best gate-passing policy while eliminating only policies with certified regret. This is presented as a derived result from the framework's construction rather than a reduction to fitted parameters or self-cited premises by definition. The abstract and supporting results on support-localized replay and heterogeneous bidder response characterize the conditions without evidence of self-definitional loops, fitted inputs renamed as predictions, or load-bearing self-citations that collapse the argument. The derivation remains independent, with the decision object built from logged evidence and explicit gates, qualifying as self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 2 invented entities

The framework rests on domain assumptions about data representativeness and bidder response, with new decision constructs introduced; specific free parameters such as support thresholds are implied but not quantified in the abstract.

free parameters (1)

support thresholds and uncertainty control parameters
Conservative support gates and simultaneous uncertainty controls are central to the guarantee but their specific values or selection method are not detailed.

axioms (2)

domain assumption: Logged auction data is representative for estimating policy support and performance under the gates.
Implicit in the use of replay tables and the finite-catalog guarantee.
domain assumption: Heterogeneous bidder responses can be quantified without overturning localized rankings.
Referenced in the supporting result on when bidder response can overturn rankings.

invented entities (2)

certified policies (no independent evidence)
purpose: Policies that pass support and uncertainty checks for safe validation.
New decision category introduced by the framework.
statistically dominated alternatives (no independent evidence)
purpose: Policies eliminated as provably inferior under the guarantee.
New decision category introduced by the framework.

Corresponding author: shekharp@erau.edu

A Proofs

A.1 Proof of Theorem 4.1

Proof. For each $\pi \in \mathcal{P}$, define the centered replay difference $Z_i^{\pi} := Y_i^{\pi} - Y_i^{0}$, with $\mu_{Z,\pi} := \mathbb{E}[Z_i^{\pi}]$. Then $\Delta_{\pi} = \mu_{Z,\pi} / \mu_0$. Fix $\pi$ and $q \in (0,1)$. Let $A_{\pi,q} := \{ |G_i - \tau_{\pi}| \le r_{\pi}(q) \}$ and $m_{\pi,q} = \mathbb{P}(A_{\pi,q})$. By assumption, the replay difference admits the decomposition $Z$ ...
