Generative Bid Shading in Real-Time Bidding Advertising

arxiv: 2508.06550 · v3 · submitted 2025-08-06 · 💻 cs.GT · cs.LG

Yinqiu Huang, Hao Ma, Wenshuai Chen, Zongwei Wang, Shuli Wang, Yongqiang Zhang, Xue Wei, Yinhua Zhu, Haitao Wang, Xingxing Wang

Pith reviewed 2026-05-19 01:18 UTC · model grok-4.3

keywords: bid shading · real-time bidding · generative model · autoregressive generation · surplus optimization · reinforcement learning · advertising auction

The pith

Generative Bid Shading generates shading ratios autoregressively via stepwise residuals to optimize surplus without unimodal assumptions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents Generative Bid Shading as a direct response to limitations in existing real-time bidding methods. Traditional two-stage pipelines first model bid landscapes under unimodal assumptions then apply operations research steps, which break down on non-convex surplus curves and accumulate errors across stages. GBS instead trains a single autoregressive generative model that builds shading ratios incrementally from residuals, modeling dependencies between value intervals directly from data. A separate reward preference alignment module uses a channel-aware hierarchical network and group relative policy optimization to balance immediate and future surplus gains. The approach is validated through offline experiments and online deployment on a large advertising platform.

Core claim

Generative Bid Shading comprises an end-to-end generative model that utilizes an autoregressive approach to generate shading ratios by stepwise residuals, capturing complex value dependencies without relying on predefined priors, and a reward preference alignment system with CHNet and GRPO that optimizes both short-term and long-term surplus.

What carries the argument

Autoregressive generative model that produces shading ratios through successive residual steps to encode value dependencies directly.
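
One way to picture stepwise residual generation is coarse-to-fine quantization: each token picks a bucket inside the interval left over by the previous token, so the shading ratio is a running sum of shrinking residuals. The sketch below is illustrative only. The bucket count `BASE`, the depth `STEPS`, and both function names are assumptions made for this example, not the paper's vocabulary or model.

```python
BASE = 4    # hypothetical number of buckets per residual step
STEPS = 3   # hypothetical number of residual steps

def encode_ratio(ratio, base=BASE, steps=STEPS):
    """Turn a shading ratio in [0, 1) into coarse-to-fine residual tokens."""
    tokens, residual, width = [], ratio, 1.0
    for _ in range(steps):
        width /= base                          # bucket width at this level
        token = min(int(residual / width), base - 1)
        tokens.append(token)
        residual -= token * width              # remainder left for later steps
    return tokens

def decode_tokens(tokens, base=BASE):
    """Rebuild the ratio as the sum of per-step residual contributions."""
    ratio, width = 0.0, 1.0
    for token in tokens:
        width /= base
        ratio += token * width
    return ratio
```

In a trained model each token would be predicted conditioned on the earlier ones rather than computed from a known ratio. Under this particular encoding, a round trip loses at most one finest-level bucket width, `BASE ** -STEPS`.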

If this is right

Handles non-convex surplus curves that defeat unimodal landscape models.
Eliminates cascading errors by replacing sequential two-stage workflows with a single generative pass.
Captures dependencies across discrete value intervals that independent discretization ignores.
Balances short-term and long-term surplus through explicit reward alignment and exploration terms.
Supports deployment at scale for billions of daily requests without intermediate model handoffs.
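
To make the non-convexity point concrete, recall the standard first-price expected surplus, (value − bid) × P(win | bid). The sketch below uses a made-up win-rate curve with two clusters of competing bids; the value, the cluster centers, and the grid are all assumptions chosen for illustration, not data from the paper.

```python
import math

VALUE = 10.0  # hypothetical value of the impression to the advertiser

def win_rate(bid):
    """Toy win probability: two clusters of competing bids near 2 and 7."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    return 0.5 * sigmoid((bid - 2.0) / 0.2) + 0.5 * sigmoid((bid - 7.0) / 0.2)

def expected_surplus(bid, value=VALUE):
    """First-price expected surplus: margin times probability of winning."""
    return (value - bid) * win_rate(bid)

bids = [i / 20 for i in range(201)]            # grid from 0.00 to 10.00
surplus = [expected_surplus(b) for b in bids]

# Interior grid points that beat both neighbours are local maxima.
peaks = [bids[i] for i in range(1, len(bids) - 1)
         if surplus[i - 1] < surplus[i] > surplus[i + 1]]

best_bid = max(bids, key=expected_surplus)
best_ratio = best_bid / VALUE                  # shading ratio = bid / value
```

On this toy landscape the surplus curve has two separate local maxima, so a method that assumes a single mode, or a local search started near the higher bid cluster, can settle on the worse peak.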

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The residual autoregressive structure may transfer to other continuous decision problems in auctions or dynamic pricing where value surfaces are non-convex.
Removing the need for hand-crafted priors could reduce maintenance costs when bidding environments shift across markets or channels.
The CHNet-GRPO alignment loop suggests a template for incorporating long-horizon rewards in other real-time optimization systems.

Load-bearing premise

That an autoregressive model trained on historical bidding data can accurately represent non-convex surplus curves and extract generalizable features without new errors from discretization or selection bias.

What would settle it

An online A/B test on the production platform showing no statistically significant lift in advertiser surplus when GBS replaces the existing two-stage baseline across millions of bid requests.
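
A minimal sketch of how such a comparison might be scored, assuming per-request surplus is logged for a control arm (two-stage baseline) and a treatment arm (GBS). It uses a large-sample normal approximation with unequal variances; the function name and inputs are hypothetical, and a production analysis would also need to handle heavy tails and correlation within advertisers.

```python
from statistics import NormalDist, mean, variance

def surplus_lift_test(control, treatment):
    """Two-sided large-sample test for a difference in mean surplus."""
    n_c, n_t = len(control), len(treatment)
    diff = mean(treatment) - mean(control)
    # Standard error of the difference, allowing unequal variances.
    std_err = (variance(control) / n_c + variance(treatment) / n_t) ** 0.5
    z = diff / std_err
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return diff, z, p_value
```

A p-value above the chosen threshold, at a sample size large enough to detect the lift of interest, is the outcome that would count against the claim.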

Figures

Figures reproduced from arXiv: 2508.06550 by Haitao Wang, Hao Ma, Shuli Wang, Wenshuai Chen, Xingxing Wang, Xue Wei, Yinhua Zhu, Yinqiu Huang, Yongqiang Zhang, Zongwei Wang.

**Figure 2.** The winning rate and surplus curves derived from …

**Figure 3.** The proposed generative model employs an encoder-decoder architecture to predict the token sequence autoregres…

**Figure 4.** The architecture of CHNet. It outputs the PDF dis…

**Figure 5.** The architecture of post-training with GRPO.

**Figure 6.** The performance of CHNet. The experimental results are shown in …

**Figure 7.** Architecture of the online deployment with GBS.

Original abstract

Bid shading plays a crucial role in Real-Time Bidding (RTB) by adaptively adjusting the bid to avoid advertisers overspending. Existing mainstream two-stage methods, which first model bid landscapes and then optimize surplus using operations research techniques, are constrained by unimodal assumptions that fail to adapt for non-convex surplus curves and are vulnerable to cascading errors in sequential workflows. Additionally, existing discretization models of continuous values ignore the dependence between discrete intervals, reducing the model's error correction ability, while sample selection bias in bidding scenarios presents further challenges for prediction. To address these issues, this paper introduces Generative Bid Shading (GBS), which comprises two primary components: 1) an end-to-end generative model that utilizes an autoregressive approach to generate shading ratios by stepwise residuals, capturing complex value dependencies without relying on predefined priors; and 2) a reward preference alignment system, which incorporates a channel-aware hierarchical dynamic network (CHNet) as the reward model to extract fine-grained features, along with modules for surplus optimization and exploration utility reward alignment, ultimately optimizing both short-term and long-term surplus using group relative policy optimization (GRPO). Extensive experiments on both offline and online A/B tests validate GBS's effectiveness. Moreover, GBS has been deployed on the Meituan DSP platform, serving billions of bid requests daily.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

The paper offers a generative autoregressive approach to bid shading with CHNet and GRPO alignment, but the abstract shows no results or comparisons so the gains stay unproven.

The main point on this GBS paper is that it tries to replace two-stage bid shading with an end-to-end autoregressive generator that builds shading ratios through stepwise residuals, then aligns the policy using a channel-aware network and group relative optimization for short- and long-term surplus. That combination is new in this domain even if the pieces come from elsewhere. They correctly flag the unimodal limits and discretization problems in prior work, and the deployment note on Meituan serving billions of requests suggests the system runs at scale in practice. That counts as real-world grounding worth noting.

The soft spots sit in the missing evidence. The abstract claims offline and online A/B tests plus deployment but reports no lifts, baselines, error bars, or checks on error accumulation in the residual steps or on sample bias. Without those numbers it is hard to know whether the autoregressive chain actually handles non-convex curves better than the methods it criticizes or whether the hierarchical features generalize under shift. The internal reward definitions also raise the usual circularity question until external benchmarks appear.

This work is aimed at engineers and applied researchers who optimize real-time bidding systems. A reader already working on continuous-action bidding or ad surplus maximization could pick up the architecture details and the GRPO framing.

It deserves a serious referee because the problem matters at industry scale and the proposed fix is coherent enough to test, even if the current draft is light on validation. Send it out so the authors can supply the metrics and ablations that are needed to judge the claims.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes Generative Bid Shading (GBS) for real-time bidding in advertising. It consists of an end-to-end autoregressive generative model that generates shading ratios through stepwise residuals to capture complex value dependencies without predefined priors, and a reward preference alignment system incorporating a Channel-aware Hierarchical Dynamic Network (CHNet) as the reward model along with modules for surplus optimization and exploration utility reward alignment, optimized using Group Relative Policy Optimization (GRPO) for both short-term and long-term surplus. The approach is validated through offline and online A/B tests and has been deployed on the Meituan DSP platform serving billions of bid requests daily.

Significance. If the empirical results hold under rigorous scrutiny, this work could represent a meaningful advance in bid shading techniques by addressing limitations of two-stage methods and discretization approaches in handling non-convex surplus curves. The integration of generative modeling with preference alignment for multi-term optimization offers a fresh perspective in the game theory and advertising auction literature, potentially leading to more efficient bidding strategies in large-scale RTB systems.

major comments (2)

[Abstract (validation paragraph)] The abstract states that 'Extensive experiments on both offline and online A/B tests validate GBS's effectiveness' but provides no quantitative results, error bars, baseline comparisons, or details on how non-convexity or sample bias were handled. This leaves the central empirical claim without visible supporting evidence in the provided text and undermines the ability to assess the claimed improvements.
[Generative model component (described in abstract)] The autoregressive generation of shading ratios by stepwise residuals is presented as capturing complex dependencies without cascading errors. However, sequential conditional predictions in autoregressive decoding can still accumulate per-step residual errors, especially for continuous ratios. This risks undermining the asserted advantage over unimodal or discretized baselines when modeling non-convex surplus curves.

minor comments (1)

[Notation and terminology] The introduction of new entities such as CHNet and GRPO would benefit from clearer definitions or references to prior work on similar hierarchical networks or policy optimization methods to aid reader comprehension.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment point by point below, clarifying our approach and indicating planned revisions to strengthen the presentation of results and technical details.

Referee: [Abstract (validation paragraph)] The abstract states that 'Extensive experiments on both offline and online A/B tests validate GBS's effectiveness' but provides no quantitative results, error bars, baseline comparisons, or details on how non-convexity or sample bias were handled. This leaves the central empirical claim without visible supporting evidence in the provided text and undermines the ability to assess the claimed improvements.

Authors: We acknowledge that the abstract, due to its brevity, summarizes the validation at a high level without specific metrics. The full manuscript details the offline and online A/B test results in the Experiments section, including quantitative surplus improvements over baselines, error bars from repeated trials, and explicit handling of non-convex surplus curves through the generative residual modeling as well as sample bias mitigation via end-to-end training and the CHNet reward model. To make the central claims more immediately verifiable, we will revise the abstract to include key quantitative highlights (e.g., relative surplus gains and statistical significance) while remaining within length limits.

revision: yes

Referee: [Generative model component (described in abstract)] The autoregressive generation of shading ratios by stepwise residuals is presented as capturing complex dependencies without cascading errors. However, sequential conditional predictions in autoregressive decoding can still accumulate per-step residual errors, especially for continuous ratios. This risks undermining the asserted advantage over unimodal or discretized baselines when modeling non-convex surplus curves.

Authors: We appreciate the referee highlighting this potential limitation of autoregressive approaches in general. Our design uses stepwise residual generation specifically to model inter-value dependencies and enable per-step correction, which we contrast with discretization methods that discard interval dependencies entirely. The manuscript provides both theoretical motivation and empirical ablations showing reduced error propagation and better performance on non-convex curves relative to unimodal and discretized baselines. To directly address the concern, we will expand the discussion of the generative component with additional analysis of residual error accumulation and further ablation results demonstrating the mitigation effect.

revision: partial

Circularity Check

0 steps flagged

No circularity: model trained end-to-end on external bidding data with independent validation

The paper introduces an autoregressive generative model for shading ratios and a CHNet+GRPO alignment system, both trained on observed bidding data and evaluated via offline/online A/B tests plus production deployment. No equations or sections reduce a claimed prediction or surplus objective to a fitted parameter or self-citation by construction; the surplus optimization uses standard policy-gradient methods on externally measured outcomes rather than redefining the target as the model's own output. The derivation chain therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 2 invented entities

The central claim rests on the unstated premise that bidding data contains learnable stepwise residual dependencies that an autoregressive model can exploit without external priors, plus the assumption that CHNet features and GRPO rewards align with true advertiser surplus.

free parameters (2)

autoregressive residual steps: Number of stepwise residuals used to generate shading ratios; chosen to capture dependencies, but the value is not specified.
GRPO group size and reward weights: Hyperparameters controlling the short-term versus long-term surplus trade-off in the alignment stage.
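
To show where these two hyperparameters would enter, here is a minimal sketch of a group-relative advantage in the style of GRPO: several candidate shading ratios are sampled for one request, each is scored, and every score is normalized against its own group. The reward mix, its weights, and the function names are hypothetical; the population standard deviation is used here, and implementations differ on that detail.

```python
from statistics import mean, pstdev

def combined_reward(surplus, exploration, w_surplus=1.0, w_explore=0.1):
    """Hypothetical weighted mix of a surplus term and an exploration term."""
    return w_surplus * surplus + w_explore * exploration

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and spread of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]
```

The group size is simply `len(rewards)`. Candidates that score above their group mean receive positive advantages and are reinforced, with no separate value network needed.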

axioms (1)

domain assumption: Bidding surplus curves can be non-convex, and autoregressive generation can model them without unimodal assumptions.
Invoked when contrasting with existing two-stage methods that rely on unimodal assumptions.

invented entities (2)

Channel-aware Hierarchical Dynamic Network (CHNet) · no independent evidence
purpose: Reward model to extract fine-grained features for surplus optimization.
New component introduced to support the alignment system; no independent evidence outside this work.

Group Relative Policy Optimization (GRPO) · no independent evidence
purpose: Optimization method for aligning short- and long-term surplus rewards.
Introduced or adapted for the bid-shading task; no external verification cited.



[1] [1]

2017.Demystifying Auction Dynamics for Digital Buyers and Sell- ers.https://www.appnexus.com/sites/default/files/whitepapers/49344-CM- Auction-Type-Whitepaper-V9.pdf

AppNexus. 2017.Demystifying Auction Dynamics for Digital Buyers and Sell- ers.https://www.appnexus.com/sites/default/files/whitepapers/49344-CM- Auction-Type-Whitepaper-V9.pdf

work page 2017

[2] [2]

Mohammad Gheshlaghi Azar, Zhaohan Daniel Guo, Bilal Piot, Remi Munos, Mark Rowland, Michal Valko, and Daniele Calandriello. 2024. A general theoret- ical paradigm to understand learning from human preferences. InInternational Conference on Artificial Intelligence and Statistics. PMLR, 4447–4455

work page 2024

[3] [3]

2019.Rolling out first price auctions to Google Ad Manager part- ners

Jason Bigler. 2019.Rolling out first price auctions to Google Ad Manager part- ners. https://blog.google/products/admanager/rolling-out-first-price-auctions- google-ad-manager-partners/

work page 2019

[4] [4]

Olivier Chapelle. 2015. Offline evaluation of response prediction in online advertising auctions. InProceedings of the 24th international conference on world wide web. 919–922

work page 2015

[5] [5]

Sayak Ray Chowdhury, Anush Kini, and Nagarajan Natarajan. [n. d.]. Provably Robust DPO: Aligning Language Models with Noisy Feedback. InForty-first International Conference on Machine Learning

work page

[6] [6]

John M Crespi and Richard J Sexton. 2005. A Multinomial logit framework to estimate bid shading in procurement auctions: Application to cattle sales in the Texas Panhandle.Review of industrial organization27, 3 (2005), 253–278

work page 2005

[7] [7]

Ying Cui, Ruofei Zhang, Wei Li, and Jianchang Mao. 2011. Bid landscape forecast- ing in online ad exchange marketplace. InProceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining. 265–273

work page 2011

[8] [8]

Jiaxin Deng, Shiyao Wang, Kuo Cai, Lejian Ren, Qigen Hu, Weifeng Ding, Qiang Luo, and Guorui Zhou. 2025. Onerec: Unifying retrieve and rank with generative recommender and iterative preference alignment.arXiv preprint arXiv:2502.18965 (2025)

work page internal anchor Pith review Pith/arXiv arXiv 2025

[9] [9]

Jingtong Gao, Bo Chen, Menghui Zhu, Xiangyu Zhao, Xiaopeng Li, Yuhao Wang, Yichao Wang, Huifeng Guo, and Ruiming Tang. 2023. Scenario-Aware Hierar- chical Dynamic Network for Multi-Scenario Recommendation.arXiv preprint arXiv:2309.02061(2023)

work page arXiv 2023

[10] [10]

2017.RTB Auctions: Fair Play?https://blog.getintent.com/rtb- auctions-fair-play-3b372d505089

Getintent. 2017.RTB Auctions: Fair Play?https://blog.getintent.com/rtb- auctions-fair-play-3b372d505089

work page 2017

[11] [11]

Djordje Gligorijevic, Tian Zhou, Bharatbhushan Shetty, Brendan Kitts, Shengjun Pan, Junwei Pan, and Aaron Flores. 2020. Bid shading in the brave new world of first-price auctions. InProceedings of the 29th ACM International Conference on Information & Knowledge Management. 2453–2460

work page 2020

[12] [12]

Zhen Gong, Lvyin Niu, Yang Zhao, Miao Xu, Haoqi Zhang, Zhenzhe Zheng, Zhilin Zhang, Rongquan Bai, Chuan Yu, Jian Xu, et al. 2023. MEBS: Multi-task End-to-end Bid Shading for Multi-slot Display Advertising. InProceedings of the 32nd ACM International Conference on Information and Knowledge Management. 4588–4594

work page 2023

[13] [13]

Xian Guo, Ben Chen, Siyuan Wang, Ying Yang, Chenyi Lei, Yuqing Ding, and Han Li. 2025. OneSug: The Unified End-to-End Generative Framework for E-commerce Query Suggestion.arXiv preprint arXiv:2506.06913(2025)

work page arXiv 2025

[14] [14]

Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising diffusion probabilistic models.Advances in neural information processing systems33 (2020), 6840–6851

work page 2020

[15] [15]

Jiani Huang, Zhenzhe Zheng, Yanrong Kang, and Zixiao Wang. 2024. From Sec- ond to First: Mixed Censored Multi-Task Learning for Winning Price Prediction. InProceedings of the 17th ACM International Conference on Web Search and Data Mining. 295–303

work page 2024

[16] [16]

Xu Li, Michelle Ma Zhang, Zhenya Wang, and Youjun Tong. 2022. Arbitrary distribution modeling with censorship in real-time bidding advertising. InPro- ceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 3250–3258

work page 2022

[17] [17]

Hairen Liao, Lingxiao Peng, Zhenchuan Liu, and Xuehua Shen. 2014. iPinYou global rtb bidding algorithm competition dataset. InProceedings of the Eighth International Workshop on Data Mining for Online Advertising. 1–6

work page 2014

[18]

Xiao Lin, Xiaokai Chen, Linfeng Song, Jingwei Liu, Biao Li, and Peng Jiang. 2023. Tree based progressive regression model for watch-time prediction in short-video recommendation. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 4497–4506.

[20]

Zhutian Lin, Junwei Pan, Shangyu Zhang, Ximei Wang, Xi Xiao, Shudong Huang, Lei Xiao, and Jie Jiang. 2024. Understanding the ranking loss for recommendation with sparse user feedback. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 5409–5418.

[21]

Hongxu Ma, Kai Tian, Tao Zhang, Xuefeng Zhang, Han Zhou, Chunjie Chen, Han Li, Jihong Guan, and Shuigeng Zhou. 2024. Generative Regression Based Watch Time Prediction for Short-Video Recommendation. arXiv preprint arXiv:2412.20211 (2024).

[22]

Long Ouyang, Jeffrey Wu, Xu Jiang, Diogo Almeida, Carroll Wainwright, Pamela Mishkin, Chong Zhang, Sandhini Agarwal, Katarina Slama, Alex Ray, et al. 2022. Training language models to follow instructions with human feedback. Advances in Neural Information Processing Systems 35 (2022), 27730–27744.

[23]

Shengjun Pan, Brendan Kitts, Tian Zhou, Hao He, Bharatbhushan Shetty, Aaron Flores, Djordje Gligorijevic, Junwei Pan, Tingyu Mao, San Gultekin, et al. 2020. Bid shading by win-rate estimation and surplus maximization. arXiv preprint arXiv:2009.09259 (2020).

[24]

Rafael Rafailov, Archit Sharma, Eric Mitchell, Christopher D Manning, Stefano Ermon, and Chelsea Finn. 2023. Direct preference optimization: Your language model is secretly a reward model. Advances in Neural Information Processing Systems 36 (2023), 53728–53741.

[25]

Shashank Rajput, Nikhil Mehta, Anima Singh, Raghunandan Hulikal Keshavan, Trung Vu, Lukasz Heldt, Lichan Hong, Yi Tay, Vinh Tran, Jonah Samost, et al. 2023. Recommender systems with generative retrieval. Advances in Neural Information Processing Systems 36 (2023), 10299–10315.

[27]

Kan Ren, Jiarui Qin, Lei Zheng, Zhengyu Yang, Weinan Zhang, and Yong Yu. 2019. Deep landscape forecasting for real-time bidding advertising. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 363–372.

[28]

Kan Ren, Weinan Zhang, Ke Chang, Yifei Rong, Yong Yu, and Jun Wang. 2017. Bidding machine: Learning to bid for directly optimizing profits in display advertising. IEEE Transactions on Knowledge and Data Engineering 30, 4 (2017), 645–659.

[29]

Kan Ren, Weinan Zhang, Yifei Rong, Haifeng Zhang, Yong Yu, and Jun Wang. 2016. User response learning for directly optimizing campaign performance in display advertising. In Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. 679–688.

[31]

Burr Settles and Mark Craven. 2008. An analysis of active learning strategies for sequence labeling tasks. In Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing. 1070–1079.

[32]

Zhihong Shao, Peiyi Wang, Qihao Zhu, Runxin Xu, Junxiao Song, Xiao Bi, Haowei Zhang, Mingchuan Zhang, YK Li, et al. 2024. DeepSeekMath: Pushing the limits of mathematical reasoning in open language models. arXiv preprint arXiv:2402.03300 (2024).

[33]

Nian Si, San Gultekin, Jose Blanchet, and Aaron Flores. 2023. Optimal bidding and experimentation for multi-layer auctions in online advertising. Available at SSRN 4358914 (2023).

[34]

Sarah Sluis. 2017. Explainer: More On The Widespread Fee Practice Behind The Guardian’s Lawsuit Vs. Rubicon Project. https://www.adexchanger.com/ad-exchange-news/explainer-widespread-fee-practice-behind-guardians-lawsuit-vs-rubicon-project/

[35]

Jake Snell, Kevin Swersky, and Richard Zemel. 2017. Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems 30 (2017).

[36]

Kihyuk Sohn, Honglak Lee, and Xinchen Yan. 2015. Learning structured output representation using deep conditional generative models. Advances in Neural Information Processing Systems 28 (2015).

[37]

Jie Sun, Zhaoying Ding, Xiaoshuang Chen, Qi Chen, Yincheng Wang, Kaiqiao Zhan, and Ben Wang. 2024. CREAD: A classification-restoration framework with error adaptive discretization for watch time prediction in video recommender systems. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38. 9027–9034.

[38]

Zixiao Wang, Zhenzhe Zheng, Yanrong Kang, and Jiani Huang. 2024. Cost-Effective Active Learning for Bid Exploration in Online Advertising. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining. 788–796.

[39]

Wush Chi-Hsuan Wu, Mi-Yen Yeh, and Ming-Syan Chen. 2015. Predicting winning price in real time bidding with censored data. In Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 1305–1314.

[40]

Wei Zhang, Brendan Kitts, Yanjun Han, Zhengyuan Zhou, Tingyu Mao, Hao He, Shengjun Pan, Aaron Flores, San Gultekin, and Tsachy Weissman. 2021. MEOW: A space-efficient nonparametric bid shading algorithm. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 3928–3936.

[41]

Tian Zhou, Hao He, Shengjun Pan, Niklas Karlsson, Bharatbhushan Shetty, Brendan Kitts, Djordje Gligorijevic, San Gultekin, Tingyu Mao, Junwei Pan, et al. 2021. An efficient deep distribution network for bid shading in first-price auctions. In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 3996–4004.

[43]

Wen-Yuan Zhu, Wen-Yueh Shih, Ying-Hsuan Lee, Wen-Chih Peng, and Jiun-Long Huang. 2017. A gamma-based regression for winning price estimation in real-time bidding advertising. In 2017 IEEE International Conference on Big Data (Big Data). IEEE, 1610–1619.

A Vocabulary Construction

We use the shading ratios generated by the two-stage meth...