Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation
Pith reviewed 2026-05-21 15:07 UTC · model grok-4.3
The pith
Multimodal LLMs show uneven resistance to cognitive biases in Chinese short-video health misinformation: in the multimodal setting, Gemini-2.5-Pro scores 71.5 and o3 scores 35.2 on a 100-point belief score.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
A dataset of 200 Chinese short videos spanning four health domains supplies fine-grained annotations for three deceptive patterns—experimental errors, logical fallacies, and fabricated claims—each checked against national standards and academic sources. When eight frontier multimodal LLMs are tested in five modality settings, Gemini-2.5-Pro records the highest belief score of 71.5 out of 100 while o3 records the lowest at 35.2, and the models prove vulnerable to social cues such as authoritative channel IDs that trigger false beliefs.
What carries the argument
The manually annotated dataset of 200 short videos with labels for deceptive patterns, together with a belief score that quantifies model resistance across modality settings.
If this is right
- Gemini-2.5-Pro maintains higher resistance than other models when given full video input.
- Authoritative channel IDs reliably increase false belief rates across tested models.
- Performance varies with the modality setting, from text-only to complete video.
- The three deceptive patterns produce measurable differences in model belief scores.
- The framework supplies a repeatable benchmark for tracking progress on video misinformation.
Where Pith is reading between the lines
- Models trained on longer-form content may need targeted fine-tuning on short, fast-paced video formats to reduce these biases.
- Platforms that rely on MLLMs for moderation would still need separate checks for channel authority signals.
- The same evaluation could be repeated on non-health topics to test whether bias patterns generalize.
- Human viewers exposed to the same videos may exhibit parallel vulnerabilities, offering a comparison point for model behavior.
Load-bearing premise
The manually annotated dataset of 200 short videos supplies accurate fine-grained labels for deceptive patterns verified by national standards and literature, and the belief score accurately reflects each model's susceptibility to cognitive biases.
What would settle it
Running the identical eight models on a fresh collection of 200 short videos that preserve the same distribution of deceptive patterns and social cues and obtaining a reversed model ranking or loss of susceptibility to channel IDs.
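The check described above could be sketched as follows. This is a minimal illustration, not the paper's procedure: it computes Kendall's tau between the model rankings of an original run and a replication run. Only the Gemini-2.5-Pro and o3 scores come from the paper; the third model name and every replication score are hypothetical placeholders.

```python
from itertools import combinations


def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two {model: score} dicts, over shared models."""
    models = sorted(set(scores_a) & set(scores_b))
    concordant = discordant = 0
    for m1, m2 in combinations(models, 2):
        # Sign of the product tells us whether both runs order the pair alike.
        product = (scores_a[m1] - scores_a[m2]) * (scores_b[m1] - scores_b[m2])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(models) * (len(models) - 1) // 2
    return (concordant - discordant) / n_pairs if n_pairs else 0.0


# Original run: the two reported scores plus one HYPOTHETICAL placeholder model.
original = {"Gemini-2.5-Pro": 71.5, "o3": 35.2, "hypothetical-model": 50.0}
# HYPOTHETICAL replication scores, invented only to exercise the function.
replication = {"Gemini-2.5-Pro": 40.0, "o3": 60.0, "hypothetical-model": 50.0}

tau = kendall_tau(original, replication)
print(f"Kendall tau = {tau:+.2f}")
```

A tau near -1 would indicate a reversed ranking on the fresh videos, while a tau near +1 would indicate that the original ordering held.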
Original abstract
Short-video platforms have become major channels for misinformation, where deceptive claims frequently leverage visual experiments and social cues. While Multimodal Large Language Models (MLLMs) have demonstrated impressive reasoning capabilities, their robustness against misinformation entangled with cognitive biases remains under-explored. In this paper, we introduce a comprehensive evaluation framework using a high-quality, manually annotated dataset of 200 short videos spanning four health domains. This dataset provides fine-grained annotations for three deceptive patterns (experimental errors, logical fallacies, and fabricated claims), each verified by evidence such as national standards and academic literature. We evaluate eight frontier MLLMs across five modality settings. Experimental results demonstrate that Gemini-2.5-Pro achieves the highest performance in the multimodal setting with a belief score of 71.5/100, while o3 performs the worst at 35.2. Furthermore, we investigate social cues that induce false beliefs in videos and find that models are susceptible to biases like authoritative channel IDs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a manually annotated dataset of 200 Chinese short videos across four health domains with fine-grained labels for three deceptive patterns (experimental errors, logical fallacies, fabricated claims) verified against national standards and literature. It evaluates eight frontier MLLMs across five modality settings using a belief score to quantify susceptibility to cognitive biases in misinformation, reporting Gemini-2.5-Pro at 71.5/100 (highest) and o3 at 35.2 (lowest) in the multimodal setting, while also analyzing social cues such as authoritative channel IDs that induce false beliefs.
Significance. If the measurement pipeline is validated, the work would offer timely empirical evidence on MLLM vulnerabilities to visually and socially entangled misinformation in short-video platforms, with direct relevance to public health communication and AI safety in non-English contexts.
major comments (3)
- [Abstract and Evaluation Framework] Abstract and §4 (Evaluation): The headline model rankings rest on the belief score (e.g., 71.5 vs 35.2), yet the manuscript supplies no explicit definition, formula, or mapping from raw model outputs to the 0-100 scale, nor any inter-annotator agreement or statistical significance for the 200-video results; this directly undermines verification of the susceptibility claims.
- [Dataset Construction] §3 (Dataset): The claim that annotations are 'fine-grained' and 'verified by national standards' is load-bearing for all downstream results, but no annotation protocol, annotator count, agreement metric, or validation procedure is described, leaving open the possibility that label noise could reverse the reported ordering between models.
- [Social Cue Investigation] §5 (Social Cue Analysis): The finding that models are susceptible to biases like authoritative channel IDs inherits the same unverified belief-score pipeline; without details on how social cues are isolated or scored, the causal attribution to specific cues cannot be assessed.
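The agreement statistic requested in the first two major comments is often reported as Cohen's kappa when two annotators label the same items. The sketch below is an illustration under assumptions: the paper does not state which metric or how many annotators were used, and the label lists here are invented for the example.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    categories = set(count_a) | set(count_b)
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL labels for four videos, using the paper's three pattern names.
annotator_1 = ["experimental_error", "logical_fallacy",
               "fabricated_claim", "logical_fallacy"]
annotator_2 = ["experimental_error", "logical_fallacy",
               "fabricated_claim", "fabricated_claim"]

kappa = cohens_kappa(annotator_1, annotator_2)
print(f"Cohen's kappa = {kappa:.3f}")
```

With more than two annotators, a multi-rater statistic such as Fleiss' kappa or Krippendorff's alpha would be the usual substitute.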
minor comments (2)
- [Experimental Setup] Clarify the precise definitions of the five modality settings and how prompts differ across them.
- [Results] Add a table summarizing per-model, per-modality belief scores with standard deviations or confidence intervals.
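One way to produce the intervals requested in the second minor comment is a percentile bootstrap over per-video scores. The sketch below assumes that a per-video belief score on the 0-100 scale is available for each model and setting, which the paper does not confirm; the function name, parameters, and score values are hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_resamples=2000, alpha=0.05, seed=0):
    """Return (mean, lower, upper) using a percentile bootstrap of the mean."""
    rng = random.Random(seed)  # fixed seed keeps the interval reproducible
    n = len(values)
    # Resample the videos with replacement and record each resample's mean.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))]
    return mean(values), lower, upper


# HYPOTHETICAL per-video belief scores for one model in one modality setting.
per_video_scores = [80, 65, 90, 40, 75, 55, 70, 85]

point, low, high = bootstrap_ci(per_video_scores)
print(f"mean = {point:.1f}, 95% CI = [{low:.1f}, {high:.1f}]")
```

Resampling at the video level treats videos as the unit of variation; if each model is also run several times per video, the run-to-run variance would need to be reported separately.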
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below and have revised the manuscript to incorporate additional methodological details where the comments correctly identify gaps in the current version.
Point-by-point responses
- Referee: [Abstract and Evaluation Framework] Abstract and §4 (Evaluation): The headline model rankings rest on the belief score (e.g., 71.5 vs 35.2), yet the manuscript supplies no explicit definition, formula, or mapping from raw model outputs to the 0-100 scale, nor any inter-annotator agreement or statistical significance for the 200-video results; this directly undermines verification of the susceptibility claims.
  Authors: We agree that an explicit definition and formula for the belief score are required for reproducibility and verification. In the revised manuscript we have expanded §4 to include the precise definition of the belief score, the formula mapping raw model outputs (e.g., assessed belief level or probability) to the 0-100 scale, inter-annotator agreement statistics for the underlying annotations, and statistical significance tests for the reported model differences.
  Revision: yes
- Referee: [Dataset Construction] §3 (Dataset): The claim that annotations are 'fine-grained' and 'verified by national standards' is load-bearing for all downstream results, but no annotation protocol, annotator count, agreement metric, or validation procedure is described, leaving open the possibility that label noise could reverse the reported ordering between models.
  Authors: The referee is correct that the current description of the annotation process is incomplete. We have revised §3 to provide the full annotation protocol, the number of annotators, the inter-annotator agreement metric employed, and the specific validation steps used to cross-check labels against national standards and academic literature.
  Revision: yes
- Referee: [Social Cue Investigation] §5 (Social Cue Analysis): The finding that models are susceptible to biases like authoritative channel IDs inherits the same unverified belief-score pipeline; without details on how social cues are isolated or scored, the causal attribution to specific cues cannot be assessed.
  Authors: We acknowledge that the social-cue analysis requires additional methodological transparency to support causal claims. In the revised §5 we have added explicit descriptions of how social cues (such as channel IDs) are isolated within the videos, the scoring procedure for their influence on model belief scores, and any controls applied to attribute effects to individual cues.
  Revision: yes
Circularity Check
No circularity: direct empirical evaluation on new annotated dataset
The paper constructs a new dataset of 200 short videos with manual annotations for deceptive patterns, then directly evaluates eight MLLMs across modality settings to produce belief scores. No equations, fitted parameters, predictions derived from subsets, or self-citations are invoked to justify the core results. The reported scores (71.5 for Gemini-2.5-Pro, 35.2 for o3) are outputs of the evaluation pipeline rather than inputs renamed or forced by definition. The framework is self-contained against external benchmarks with no load-bearing self-referential steps.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: manual annotations by experts using national standards and academic literature accurately capture the three deceptive patterns in the videos.