GEM-Bench: A Benchmark for Ad-Injected Response Generation within Generative Engine Marketing

Shiqi Zhang; Silan Hu; Xiaokui Xiao; Yimin Shi

arxiv: 2509.14221 · v3 · pith:7PTSSX35new · submitted 2025-09-17 · 💻 cs.IR · cs.CL

Silan Hu, Shiqi Zhang, Yimin Shi, Xiaokui Xiao

keywords: ad-injected, gem-bench, responses, benchmark, generation, generative, engagement, engine

Generative Engine Marketing (GEM) is an emerging ecosystem for monetizing generative engines, such as LLM-based chatbots, by seamlessly integrating relevant advertisements into their responses. At the core of GEM lies the generation and evaluation of ad-injected responses. However, existing benchmarks are not specifically designed for this purpose, which limits future research. To address this gap, we propose GEM-Bench, the first comprehensive benchmark for ad-injected response generation in GEM. GEM-Bench includes three curated datasets covering both chatbot and search scenarios, a metric ontology that captures multiple dimensions of user satisfaction and engagement, and several baseline solutions implemented within an extensible multi-agent framework. Our preliminary results indicate that, while simple prompt-based methods achieve reasonable engagement such as click-through rate, they often reduce user satisfaction. In contrast, approaches that insert ads based on pre-generated ad-free responses help mitigate this issue but introduce additional overhead. These findings highlight the need for future research on designing more effective and efficient solutions for generating ad-injected responses in GEM. The benchmark and all related resources are publicly available at https://gem-bench.org/.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

NaiAD: Initiate Data-Driven Research for LLM Advertising
cs.LG 2026-05 unverdicted novelty 7.0

NaiAD is a new dataset and framework for LLM-native advertising that uses decoupled generation and calibrated scoring to identify four semantic strategies for balancing user and commercial utilities.

Mechanism Design for Quality-Preserving LLM Advertising
cs.GT 2026-05 unverdicted novelty 6.0

A quality-preserving auction framework for LLM advertising uses RAG-based endogenous reserves and KL-regularized or screened VCG mechanisms to achieve DSIC, IR, higher revenue, and better semantic fidelity than baselines.

Generative AI Advertising as a Problem of Trustworthy Commercial Intervention
cs.CY 2026-05 unverdicted novelty 5.0

Generative AI advertising is reframed as a problem of trustworthy commercial intervention on the generative process, with a taxonomy of influence tiers from product mentions to long-term preference shaping.