Characterizing Web Search in The Age of Generative AI
The advent of LLMs has given rise to generative search, a new search paradigm in which LLMs retrieve information from the web related to a query and synthesize it into a single, coherent response. This paradigm differs fundamentally from traditional web search, where results are returned as a ranked list of independent web pages. In this paper, we ask: along what dimensions does generative search differ from traditional search? We conduct a systematic comparison between Google organic search and five generative search systems from three providers: Google, OpenAI, and Perplexity. Our analysis reveals substantial variation among engines in their reliance on internal versus external knowledge, their source diversity, and their stability. While generative systems often achieve topical coverage comparable to traditional search, they do so using markedly different retrieval footprints and synthesis strategies. We further show that the outputs of generative search can vary across time and across executions, raising new challenges for robustness. Our findings demonstrate that generative search introduces new dimensions that are not captured by existing evaluation paradigms, motivating the development of evaluations that explicitly account for retrieval behavior, synthesis, and stability in generative search systems.
Forward citations
Cited by 3 Pith papers
- How Generative AI Disrupts Search: An Empirical Study of Google Search, Gemini, and AI Overviews
  AI Overviews and Gemini retrieve substantially different sources than traditional Google search (Jaccard similarity <0.2), favor Google-owned content, appear for 51.5% of queries especially controversial ones, and are...
- To Call or Not to Call: A Framework to Assess and Optimize LLM Tool Calling
  LLMs often misalign their self-perceived need for tools with true need and utility, but lightweight estimators trained on hidden states can improve tool-calling decisions and task performance across multiple models and tasks.
- From Citation Selection to Citation Absorption: A Measurement Framework for Generative Engine Optimization Across AI Search Platforms
  A measurement study of 602 prompts across ChatGPT, Google AI Overview, and Perplexity finds that citation selection breadth and absorption depth diverge, with high-influence pages being longer, structured, and evidence-rich.
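The Jaccard similarity mentioned in the first citing paper's summary can be illustrated with a minimal sketch. It assumes that sources are compared at the domain level and that each engine's results are available as a list of URLs; neither assumption is confirmed by the summaries above. The helper names and example URLs are hypothetical.

```python
from urllib.parse import urlparse


def domain(url: str) -> str:
    """Return the lowercase host of a URL, dropping a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def jaccard(a, b) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections."""
    set_a, set_b = set(a), set(b)
    if not set_a and not set_b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical result lists for one query (placeholder URLs, not real data).
organic = [
    "https://www.example.com/a",
    "https://example.org/b",
    "https://example.net/c",
]
generative = [
    "https://example.com/x",
    "https://example.edu/y",
]

# One shared domain (example.com) out of four distinct domains overall.
overlap = jaccard(map(domain, organic), map(domain, generative))
print(overlap)  # 0.25
```

A value below 0.2, as reported in that summary, would mean fewer than one in five of the distinct sources is shared between the two result sets.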