Search Arena: Analyzing Search-Augmented LLMs
3 Pith papers cite this work.
Citing papers by year: 2026 (3)
Citing papers
- Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources
  Audit of ChatGPT, Copilot, Gemini and Perplexity finds ~16% of cited sources are AI-generated across 712 queries on politics, health and environment.
- Capturing LLM Capabilities via Evidence-Calibrated Query Clustering
  ECC calibrates semantic embeddings with posterior model comparisons and Bradley-Terry capability profiles to create flexible, mixed-membership query clusters that improve LLM capability ranking.
- MARCA: A Checklist-Based Benchmark for Multilingual Web Search
  MARCA is a bilingual benchmark using 52 questions and validated checklists to evaluate LLM web-search completeness and correctness in English and Portuguese.