Conversational AI increases political knowledge as effectively as self-directed internet search
Pith reviewed 2026-05-18 18:31 UTC · model grok-4.3
The pith
Task-directed conversations with AI increase political knowledge as much as self-directed Google search.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
In a series of randomised controlled trials, task-directed conversations with AI to research specific political topics increase political knowledge to the same extent as self-directed Google search. This equivalence is observed across issues, models, and prompting strategies, with knowledge gauged by belief in true information rising and belief in misinformation falling.
What carries the argument
The randomised controlled trials that directly compare belief changes after AI-assisted versus self-directed search-based research on political questions.
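To make the idea of "equivalence" between two experimental arms concrete, the following is a minimal sketch of a two one-sided tests (TOST) procedure on belief-change scores. This is a simpler frequentist stand-in chosen for illustration, not the paper's analysis, which is described elsewhere on this page only as Bayesian generalized linear models with an ordered-logistic likelihood. The function name `tost_equivalence`, the equivalence margin of 0.2, the group sizes, and the simulated scores are all hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def tost_equivalence(a, b, margin, alpha=0.05):
    """Two one-sided tests (large-sample normal approximation).

    Declares the groups equivalent when the difference in means is
    significantly above -margin AND significantly below +margin.
    """
    diff = mean(a) - mean(b)
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = NormalDist()
    # Null 1: diff <= -margin. Small p means diff is credibly above -margin.
    p_lower = 1 - z.cdf((diff + margin) / se)
    # Null 2: diff >= +margin. Small p means diff is credibly below +margin.
    p_upper = z.cdf((diff - margin) / se)
    p = max(p_lower, p_upper)  # both nulls must be rejected
    return {"diff": diff, "se": se, "p": p, "equivalent": p < alpha}


# Hypothetical simulated belief-change scores (NOT data from the paper).
rng = random.Random(0)
ai_gain = [rng.gauss(0.5, 1.0) for _ in range(1400)]
search_gain = [rng.gauss(0.5, 1.0) for _ in range(1400)]
print(tost_equivalence(ai_gain, search_gain, margin=0.2))
```

The design point is that failing to find a difference is not the same as showing equivalence: TOST requires the estimated difference to sit credibly inside a pre-specified margin, so the conclusion depends on how that margin is chosen.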
Load-bearing premise
The knowledge measures used in the RCTs validly capture lasting changes in belief rather than temporary responses shaped by the experimental setting or demand characteristics.
What would settle it
Re-testing the same participants on political facts after a delay of several weeks to see if the knowledge gains persist equally in the AI and search groups.
Original abstract
Conversational AI systems are increasingly being used in place of traditional search engines to help users complete information-seeking tasks. This has raised concerns in the political domain, where biased or hallucinated outputs could misinform voters or distort public opinion. However, in spite of these concerns, the extent to which conversational AI is used for political information-seeking, as well as the potential impact of this use on users' political knowledge, remains uncertain. Here, we address these questions: First, in a representative national survey of the UK public (N = 2,499), we find that in the week before the 2024 election as many as 32% of chatbot users - and 13% of eligible UK voters - have used conversational AI to seek political information relevant to their electoral choice. Second, in a series of randomised controlled trials (N = 2,858 total) we find that across issues, models, and prompting strategies, task-directed conversations with AI to research specific political topics increase political knowledge (increase belief in true information and decrease belief in misinformation) to the same extent as self-directed Google search. Taken together, our results suggest that people in the UK are increasingly turning to conversational AI for information about politics. These findings substantially extend prior work by demonstrating that conversational AI's effects on political knowledge generalise across multiple topics, political perspectives, and model families, suggesting that the shift toward AI-assisted political information-seeking may not lead to increased public belief in political misinformation.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper reports that in a representative UK survey (N=2,499) conducted the week before the 2024 election, up to 32% of chatbot users (13% of eligible voters) used conversational AI for politically relevant information. Across a series of RCTs (total N=2,858), task-directed conversations with AI on specific political topics increased belief in true statements and decreased belief in misinformation to the same extent as self-directed Google search, with this equivalence holding across issues, models, and prompting strategies.
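The two survey percentages quoted above are linked by simple arithmetic, sketched here. The sketch assumes that both figures are point estimates from the same weighted sample and that everyone who used conversational AI for political information counts as a chatbot user; neither assumption is confirmed by this page, and the "up to" wording suggests the figures may be upper estimates. The variable names are illustrative.

```python
# Figures quoted in the summary (treated here as point estimates: an assumption).
share_among_chatbot_users = 0.32   # used AI for political info, among chatbot users
share_among_eligible_voters = 0.13  # used AI for political info, among eligible voters

# If every political-AI user is a chatbot user, then
#   share_among_eligible_voters = share_among_chatbot_users * chatbot_user_share
# so the implied share of eligible voters who use chatbots at all is:
implied_chatbot_user_share = share_among_eligible_voters / share_among_chatbot_users

print(f"Implied chatbot-user share of eligible voters: {implied_chatbot_user_share:.1%}")
```

Under those assumptions the two figures imply that roughly two in five eligible voters were chatbot users, which is a useful sanity check on how the 32% and 13% relate.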
Significance. If the equivalence result holds under scrutiny of the outcome measures, the work provides large-scale comparative evidence that conversational AI does not appear to worsen political knowledge relative to conventional search, extending prior findings on AI information-seeking to the political domain with multi-topic, multi-model coverage. The representative survey component adds timely descriptive data on adoption rates.
major comments (1)
- [RCT design and outcome measures] The central equivalence claim rests on immediate post-task belief measures in the RCTs. The manuscript provides no delayed retest, no explicit pre-registration details on retention checks, and limited abstract-level description of item wording or distractors; this leaves open that observed parity could reflect single-session demand characteristics, social desirability, or verbatim recall rather than durable knowledge gains (see skeptic note on weakest assumption).
minor comments (1)
- [Abstract] The abstract states results 'across issues, models, and prompting strategies' but does not enumerate the specific topics or models tested; adding this detail would improve transparency without altering the main text.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our manuscript. We address the major comment regarding RCT design and outcome measures below, with clarifications on what the study can and cannot support.
Point-by-point responses
-
Referee: [RCT design and outcome measures] The central equivalence claim rests on immediate post-task belief measures in the RCTs. The manuscript provides no delayed retest, no explicit pre-registration details on retention checks, and limited abstract-level description of item wording or distractors; this leaves open that observed parity could reflect single-session demand characteristics, social desirability, or verbatim recall rather than durable knowledge gains (see skeptic note on weakest assumption).
Authors: We acknowledge that the RCTs rely on immediate post-task belief measures rather than delayed retests. This design matches the acute nature of the information-seeking task and enables a direct, controlled comparison to self-directed Google search, consistent with prior experimental work on search effects. A delayed retest was not included because the study protocol focused on immediate belief updating following the task; data collection is now complete and such a follow-up cannot be added. The study was pre-registered, though retention checks were outside the original scope. In revision we will expand the methods section with full item wording, distractor details, and selection criteria, and we will add explicit discussion of potential demand characteristics, noting that both AI and search conditions used parallel instructions with participants unaware of the equivalence hypothesis. These changes will clarify the scope of the equivalence finding without overstating durability. revision: partial
- Absence of delayed retest data to evaluate long-term retention of belief changes.
Circularity Check
No circularity: purely empirical RCT and survey results
The paper presents a national survey (N=2499) and series of RCTs (N=2858) comparing task-directed AI conversations to self-directed Google search on political knowledge outcomes. No equations, derivations, fitted parameters, or predictive models are defined or used. The core equivalence claim is a direct statistical comparison of independent experimental conditions. No self-citations are invoked to justify uniqueness or load-bearing premises, and the design does not rename or smuggle in prior results by construction. This is a standard empirical comparison with no reduction of outputs to inputs via definition or fitting.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The national survey sample is representative of the UK public for estimating chatbot usage rates.
- domain assumption: The belief measures in the RCTs reflect genuine changes in political knowledge rather than experimental artifacts.
Lean theorems connected to this paper
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
across issues, models, and prompting strategies, task-directed conversations with AI ... increase political knowledge ... to the same extent as self-directed Google search
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
unclear: Relation between the paper passage and the cited Recognition theorem.
Bayesian Generalized Linear Models (GLM1-3 ... ordered-logistic likelihood
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 2 Pith papers
-
People readily follow personal advice from AI but it does not improve their well-being
Large longitudinal RCT finds high rates of following AI personal advice but no sustained well-being gains versus a hobbies control condition.
-
What Is The Political Content in LLMs' Pre- and Post-Training Data?
Training data for open LLMs is systematically left-leaning, with pre-training corpora containing more political material than post-training data and model stances aligning with data distributions.
Supplementary Information
Survey Demographics
For each variable we show the percentages in the weighted sample with the raw percentages in parentheses.
- Gender:
  - Male: 48.34% (51.62%)
  - Female: 51.46% (48.18%)
- Age group:
  - 18 to 24: 11.60% (12.61%)
  - 25 to 34: 14.01% (19.37%)
  - 35 to 54: 37.41% (34.73%)
  - 55 to 64: 14.33% (14.17%)
  - 65+: 22.65% (19.13%)
...
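The weighted and raw percentages listed above can be compared directly to see which groups the survey weights scaled up or down. The sketch below computes the ratio of weighted to raw share for each group shown; a ratio below 1 means the group was over-represented in the raw sample and down-weighted. The weighting scheme itself is not described in this excerpt, so the ratio is only an implied average adjustment, and the dictionary layout and names are illustrative.

```python
# (weighted %, raw %) pairs copied from the demographics listed above.
demographics = {
    "Male": (48.34, 51.62),
    "Female": (51.46, 48.18),
    "18 to 24": (11.60, 12.61),
    "25 to 34": (14.01, 19.37),
    "35 to 54": (37.41, 34.73),
    "55 to 64": (14.33, 14.17),
    "65+": (22.65, 19.13),
}

# Implied average weight per group: weighted share divided by raw share.
weight_ratios = {
    group: weighted / raw for group, (weighted, raw) in demographics.items()
}

for group, ratio in weight_ratios.items():
    direction = "up-weighted" if ratio > 1 else "down-weighted"
    print(f"{group:>9}: ratio {ratio:.2f} ({direction})")
```

On these figures the 25 to 34 group receives the largest downward adjustment and the 65+ group the largest upward one, which is the usual pattern when an online sample skews younger than the target population.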