S., Turc, I., and Reitter, D
2 Pith papers cite this work. Polarity classification is still indexing.
verdicts
CONDITIONAL 2
representative citing papers
Active Indexing with synthetic data augmentation for bidirectional fact-source binding during pretraining yields up to 30.2% higher citation precision than passive identifier appending on CitePretrainBench for Qwen models.
citing papers explorer
- Evaluating Commercial AI Chatbots as News Intermediaries
Commercial AI chatbots reach over 90% multiple-choice accuracy on recent news facts but lose 11-17% in free response and drop to 19-70% on subtle false-premise questions, with retrieval failures causing most errors and clear Anglophone bias.
- Cite Pretrain: Retrieval-Free Knowledge Attribution for Large Language Models
Active Indexing with synthetic data augmentation for bidirectional fact-source binding during pretraining yields up to 30.2% higher citation precision than passive identifier appending on CitePretrainBench for Qwen models.