arXiv:2401.06730
5 Pith papers cite this work.
citing papers explorer
- Synthetic Sources?: Auditing Generative Search Engine Citations for Evidence of AI-Generated Sources
  Audit of ChatGPT, Copilot, Gemini and Perplexity finds ~16% of cited sources are AI-generated across 712 queries on politics, health and environment.
- Beyond Semantic Relevance: Counterfactual Risk Minimization for Robust Retrieval-Augmented Generation
  CoRM-RAG uses a cognitive perturbation protocol to simulate biases and trains an Evidence Critic to retrieve documents that support correct decisions even under adversarial query changes.
- A Roadmap to Pluralistic Alignment
  The paper formalizes three types of pluralistic AI models and three benchmark classes, arguing that current alignment techniques may reduce rather than increase distributional pluralism.
- Calibrating Model-Based Evaluation Metrics for Summarization
  A reference-free proxy scoring framework combined with GIRB calibration produces better-aligned evaluation metrics for summarization and outperforms baselines across seven datasets.
- Measuring and mitigating overreliance to build human-compatible AI
  The paper consolidates risks of overreliance on LLMs, identifies gaps in current measurement approaches, and proposes mitigation strategies to keep AI as a human-compatible thought partner.