How Large Language Models Source Brand Reputation Across Languages and Markets

Dmitrij Zatuchin

arXiv: 2606.25787 · v1 · pith:GAUPS3TRnew · submitted 2026-06-24 · cs.IR · cs.CL · cs.CY

Pith reviewed 2026-06-25 19:03 UTC · model grok-4.3

classification cs.IR · cs.CL · cs.CY

keywords large language models · brand reputation · web citations · third-party sources · Wikipedia · information sourcing · market analysis · AI visibility

The pith

When large language models answer questions about brands, 85.7 percent of their citations point to third-party sites rather than company-owned pages.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper measures the web domains that large language models cite when they answer questions about specific companies. It merges citation data from three datasets covering 128 brands in 12 markets and 13 languages, yielding 167,551 URL-grounded citations. The results show heavy reliance on external sites, strong concentration on a small number of domains, and Wikipedia as the most-cited domain in 11 of 12 languages. These patterns matter because the cited sources shape the facts and tone the models use to describe each brand. The work therefore maps the information supply chain behind AI-generated corporate reputation.

Core claim

When large language models answer brand questions, 85.7 percent of their citations go to third-party domains and only 14.3 percent to domains the brand itself controls. The domain distribution is long-tailed and fits a Zipf law with alpha = 0.86 (R^2 = 0.983). Wikipedia is the single most-cited domain in 11 of the 12 languages studied; the exception is Lithuanian, where the business daily vz.lt edges it. For the 46 Polish national brands, YouTube is the most-cited domain, and four HR and careers portals together supply 637 citations against 297 for Polish Wikipedia, about twice as many.

What carries the argument

Classification of each citation URL as owned by the brand or third-party, followed by aggregation by language and market to reveal concentration and dominance.
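The classify-then-aggregate step can be sketched roughly as follows. This is a minimal illustration, not the paper's procedure: the paper does not publish its decision rules, so the `OWNED_DOMAINS` map, the brand name `examplebrand`, and the rule "a host is owned if it equals a listed domain or is a subdomain of one" are all assumptions made here.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical brand-to-domain map; the paper's actual list is not published.
OWNED_DOMAINS = {
    "examplebrand": {"examplebrand.com", "examplebrand.pl"},
}


def host_of(url: str) -> str:
    """Return the lower-cased hostname without a leading 'www.'."""
    host = (urlsplit(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def is_owned(brand: str, url: str) -> bool:
    """Assumed rule: the host equals an owned domain or is a subdomain of one."""
    host = host_of(url)
    return any(
        host == domain or host.endswith("." + domain)
        for domain in OWNED_DOMAINS.get(brand, ())
    )


def owned_share_by_language(rows):
    """rows: iterable of (brand, language, url). Returns {language: owned share}."""
    owned, total = Counter(), Counter()
    for brand, language, url in rows:
        total[language] += 1
        owned[language] += is_owned(brand, url)  # bool counts as 0 or 1
    return {language: owned[language] / total[language] for language in total}
```

A rule this simple would treat subsidiaries, regional sites on unlisted domains, and brand-run social profiles as third-party, which is exactly the kind of choice the headline split depends on.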

If this is right

Brand reputation inside AI answers is shaped mainly by how external sites describe the company.
Wikipedia supplies the largest single share of AI brand information in almost every language.
Because about 18 percent of domains supply 80 percent of citations, a modest shift in that head of the distribution affects the majority of outputs.
Market-specific source preferences, such as YouTube leading for Polish national brands, produce citation mixes that differ from the cross-language pattern.
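The concentration figure behind these consequences (80 percent of citations from about 18 percent of domains) is a cumulative-share statistic. A minimal sketch of how such a number could be computed from per-domain citation counts is given below; the function name `head_fraction` and the input shape are assumptions for illustration, not taken from the paper.

```python
def head_fraction(domain_counts, target=0.80):
    """Smallest fraction of domains, most-cited first, that covers
    at least `target` of all citations."""
    counts = sorted(domain_counts.values(), reverse=True)
    total = sum(counts)
    running = 0
    for k, count in enumerate(counts, start=1):
        running += count
        if running >= target * total:
            return k / len(counts)
    return 1.0  # reached only for empty input
```

Applied to the paper's domain counts, a function like this would be expected to return a value near 0.18.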

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Companies could gain more AI visibility by ensuring accurate information appears on Wikipedia and other top-cited domains than by focusing only on their own sites.
Small edits or content changes on the leading third-party domains could alter AI descriptions of many brands at once.
The observed patterns may differ for non-brand queries or for models trained after the datasets were collected.

Load-bearing premise

The Rankfor.AI datasets and the owned-versus-third-party labels accurately capture the actual grounding behavior of the models without bias from query wording or model choice.

What would settle it

A new run of the same brand queries on additional models or with rephrased prompts that produces a third-party citation share below 70 percent would falsify the central proportion.

Original abstract

When a large language model (LLM) answers a question about a company, it grounds the answer in retrieved web sources, and those sources decide what the model says. Most analysis of AI brand visibility looks at the answer text. This study looks one step earlier, at the citations. We merge three Rankfor.AI datasets covering 128 brands across 12 home markets and 13 languages, and analyse 167,551 URL-grounded citations (189,974 total attribution rows). We classify each citation by domain and source type and measure where AI gets its brand information, by language and by market. Four patterns hold. First, AI grounds brand answers overwhelmingly in third-party sources: 85.7% of citations point to sites the brand does not own, against 14.3% owned. Second, the source base is concentrated and long-tailed: 80% of citations come from about 18% of domains, fitting a Zipf law (alpha = 0.86, R^2 = 0.983). Third, one reference site dominates almost everywhere: Wikipedia is the most-cited domain in 11 of 12 languages, the exception being Lithuanian, where the business daily vz.lt edges it (4.38%). Fourth, the source mix is market-specific at the margin: for 46 Polish national brands the most-cited domain is YouTube, and four HR and careers portals supply 637 citations against 297 for Polish Wikipedia, about twice as many.
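The abstract reports a Zipf exponent and an R^2 but not the estimator. One common way to obtain both is ordinary least squares on the log-log rank-frequency curve; the sketch below assumes that approach, and the paper may have used a different fitting method.

```python
import math


def fit_zipf(frequencies):
    """Fit log f = c - alpha * log r by ordinary least squares,
    where r is the rank of a domain and f its citation count.

    Returns (alpha, r_squared).
    """
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive frequencies")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    if syy == 0:
        raise ValueError("all frequencies are equal; R^2 is undefined")
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return -slope, r_squared
```

Least squares on log-log data is simple but weights the sparse tail heavily, so an exponent obtained this way can differ from a maximum-likelihood estimate on the same counts.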

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper counts LLM citations for 128 brands and finds 85.7% come from third-party domains, with Wikipedia dominant and some market differences, but the owned-versus-third-party labels lack any described validation.

The core result is straightforward: across roughly 167k citations, LLMs ground brand answers in third-party sites 85.7% of the time. Wikipedia leads in 11 of 12 languages, the source distribution follows a Zipf law with alpha = 0.86, and for Polish brands YouTube and HR sites outrank Wikipedia.

What is new is the scale and framing. Prior citation work in IR exists, but this one ties the counts specifically to brand reputation queries, runs them across 13 languages and 12 markets, and reports the owned share explicitly. The Polish pattern is a concrete observation that stands out.

The numbers are reported cleanly and the sample is large. The Zipf fit is given with an R-squared, which is better than many descriptive papers.

The soft spot is the classification step. The 85.7% figure requires every citation to be labeled owned or not. The abstract says domains were classified by source type but gives no rules, no list of brand domains, no handling of subdomains or subsidiaries, and no spot checks. A few percent error on high-frequency domains would move the headline split noticeably. No error bars appear on the percentages either.
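To make the sensitivity concrete, the sketch below recomputes the third-party share after some number of citations is relabelled as owned. The third-party count is not reported directly; the value used here is back-calculated from the rounded 85.7% figure and the 167,551 total, so it is approximate, and the relabelled counts are purely illustrative.

```python
TOTAL = 167_551
# Approximate count implied by the reported 85.7% (assumption: derived, not reported).
THIRD_PARTY = round(0.857 * TOTAL)


def third_party_share_after_relabel(total, third_party, moved_to_owned):
    """Third-party share after relabelling `moved_to_owned` citations as owned."""
    if not 0 <= moved_to_owned <= third_party <= total:
        raise ValueError("need 0 <= moved_to_owned <= third_party <= total")
    return (third_party - moved_to_owned) / total


if __name__ == "__main__":
    for moved in (0, 1_000, 5_000, 10_000):
        share = third_party_share_after_relabel(TOTAL, THIRD_PARTY, moved)
        print(f"{moved:>6} relabelled -> third-party share {share:.1%}")
```

Each relabelled block of about 1,676 citations, one percent of the total, moves the split by one percentage point, so a single high-frequency domain on the wrong side of the line is enough to shift the headline figure.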

The markets and languages are presented without discussion of how representative they are. The work stays descriptive; there are no predictions or tests that would let someone falsify the patterns beyond re-running the same queries.

This is useful for people who track how LLMs surface brand information in practice, such as marketing analysts or IR researchers working on grounding. It does not reorganize a field. A serious referee could check the classification procedure and ask for robustness checks on the labels. I would send it to review rather than desk reject, mainly because the raw counts are large and the cross-market variation is observable even if the exact split needs tighter documentation.

Referee Report

2 major / 2 minor

Summary. The manuscript merges three Rankfor.AI datasets covering 128 brands across 12 markets and 13 languages and examines 167,551 URL-grounded citations (189,974 attribution rows). It reports that 85.7% of citations come from third-party domains versus 14.3% from brand-owned domains, that the domain distribution is long-tailed and fits a Zipf law (alpha = 0.86, R^2 = 0.983), that Wikipedia is the most-cited domain in 11 of 12 languages, and that market-specific patterns exist (e.g., YouTube leads for Polish brands while HR portals outrank Wikipedia).

Significance. If the domain classifications are reliable, the work supplies a large-scale descriptive baseline on LLM grounding behavior for brands. The explicit counts, concentration statistic, and cross-language/market comparisons are concrete and could inform studies of AI-mediated brand visibility. The sample size and reported R^2 value are strengths of the descriptive analysis.

major comments (2)

[Abstract] The central 85.7% third-party claim requires every one of the 167,551 citations to be correctly labeled owned versus third-party. The text states only that citations were classified by domain and source type; no decision rules, brand-domain lists, handling of subsidiaries/subdomains, or validation (inter-rater reliability, spot-checks) are described. Systematic misclassification of even a few high-frequency domains would move the reported split by several points.
[Abstract] No justification is offered for the representativeness of the 12 markets and 13 languages, and no analysis addresses possible biases arising from query phrasing or the particular models underlying the Rankfor.AI datasets.

minor comments (2)

The reported percentages and Zipf parameters are given without error bars, confidence intervals, or other uncertainty measures.
The exact breakdown of brands per market/language and any filtering steps applied to the 189974 attribution rows are not stated.
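As an illustration of the uncertainty measure the first minor comment asks for, the sketch below computes a Wilson score interval for a proportion. It treats every citation as an independent draw, which is an assumption that does not hold here: citations cluster by brand, query, and model, so the true uncertainty is wider than this interval suggests. The third-party count in the usage note is back-calculated from the rounded 85.7% figure, not reported by the paper.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width
```

Calling `wilson_interval(143_591, 167_551)` gives a naive interval only a fraction of a percentage point wide. A bootstrap that resamples whole brands rather than single citations would be a more defensible choice, since it respects the clustering.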

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback. We address each major comment below and will revise the manuscript to improve transparency and limitations discussion.

Referee: [Abstract] The central 85.7% third-party claim requires every one of the 167,551 citations to be correctly labeled owned versus third-party. The text states only that citations were classified by domain and source type; no decision rules, brand-domain lists, handling of subsidiaries/subdomains, or validation (inter-rater reliability, spot-checks) are described. Systematic misclassification of even a few high-frequency domains would move the reported split by several points.

Authors: We agree that the classification methodology must be described in detail to support the 85.7% figure. In the revised manuscript we will add a dedicated subsection in Methods that specifies the decision rules for owned vs. third-party domains, the brand-domain lists employed, handling of subsidiaries and subdomains, and any validation performed (including spot-checks).
revision: yes

Referee: [Abstract] No justification is offered for the representativeness of the 12 markets and 13 languages, and no analysis addresses possible biases arising from query phrasing or the particular models underlying the Rankfor.AI datasets.

Authors: The 12 markets and 13 languages are those present in the merged Rankfor.AI datasets; we will add a short justification paragraph in Methods or Limitations explaining the data-driven selection and its diversity. We will also add an explicit limitations statement acknowledging possible biases from query phrasing and model-specific retrieval behavior, while clarifying that the work is descriptive rather than a generalizability study. A full bias analysis would require new experiments outside the current scope.
revision: partial

Circularity Check

0 steps flagged

No circularity: purely descriptive empirical counts and one data fit

The paper merges existing citation datasets, classifies URLs by domain ownership, reports raw percentages (85.7% third-party), identifies the most-cited domain per language/market, and fits a Zipf exponent to the observed frequency distribution. None of these steps derive a new quantity from prior outputs of the same analysis; the Zipf fit is a post-hoc statistical summary of the collected data rather than a prediction that reduces to the fit by construction. No self-citations, uniqueness theorems, or ansatzes are invoked to support the central claims. The work is self-contained against external citation benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The work is observational and contains no free parameters, invented entities, or non-standard axioms beyond routine statistical assumptions about domain classification and Zipf fitting.
