arxiv: 2601.06600 · v3 · pith:I4QXRJI5new · submitted 2026-01-10 · 💻 cs.CL

Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

Pith reviewed 2026-05-21 15:07 UTC · model grok-4.3

keywords: multimodal large language models · cognitive biases · misinformation · short videos · health claims · deceptive patterns · belief score · social cues

The pith

Multimodal LLMs show uneven resistance to cognitive biases in Chinese short-video health misinformation: in the multimodal setting, Gemini-2.5-Pro scores 71.5 and o3 scores 35.2 on a 100-point belief score.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds an evaluation framework to measure how multimodal large language models respond to short videos that mix visual demonstrations with deceptive claims and social signals. It supplies 200 manually labeled videos drawn from four health topics, each tagged for experimental errors, logical fallacies, or fabricated claims, with the labels verified against national standards and academic literature. Tests across eight frontier models and five modality settings reveal consistent performance gaps and specific weaknesses to social cues such as authoritative channel names. These findings matter because short-video platforms have become major channels for health misinformation and models are increasingly used to interpret or filter such content.

Core claim

A dataset of 200 Chinese short videos spanning four health domains supplies fine-grained annotations for three deceptive patterns (experimental errors, logical fallacies, and fabricated claims), each checked against national standards and academic sources. Eight frontier multimodal LLMs are tested in five modality settings. In the multimodal setting, Gemini-2.5-Pro records the highest belief score, 71.5 out of 100, while o3 records the lowest at 35.2. The models also prove vulnerable to social cues, such as authoritative channel IDs, that induce false beliefs.

What carries the argument

The manually annotated dataset of 200 short videos with labels for deceptive patterns, together with a belief score that quantifies model resistance across modality settings.
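
The source does not state how raw model outputs are mapped to the 0-100 belief score (the referee report below raises the same gap). As a purely illustrative sketch, the following assumes each model returns a rating on a 1 to 7 scale, like the one the paper's prompt templates appear to request, and that the score is a linear rescale averaged over videos. The function names `rescale` and `belief_score`, the scale bounds, and the averaging step are assumptions, not the authors' definition.

```python
from statistics import mean
from typing import Iterable


def rescale(rating: float, lo: float = 1.0, hi: float = 7.0) -> float:
    """Linearly map a rating in [lo, hi] onto [0, 100].

    The 1-7 bounds are an assumption for illustration only.
    """
    if not lo <= rating <= hi:
        raise ValueError(f"rating {rating} outside [{lo}, {hi}]")
    return (rating - lo) / (hi - lo) * 100.0


def belief_score(ratings: Iterable[float]) -> float:
    """Average the rescaled ratings over a set of videos (hypothetical aggregation)."""
    return mean(rescale(r) for r in ratings)


if __name__ == "__main__":
    # Made-up ratings for three videos, not data from the paper.
    print(belief_score([2, 5, 7]))
```

Whether a high rating should count toward or against a model depends on the video's ground-truth label, which this sketch deliberately leaves out.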

If this is right

  • Gemini-2.5-Pro maintains higher resistance than other models when given full video input.
  • Authoritative channel IDs reliably increase false belief rates across tested models.
  • Performance varies with the modality setting, from text-only to complete video.
  • The three deceptive patterns produce measurable differences in model belief scores.
  • The framework supplies a repeatable benchmark for tracking progress on video misinformation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Models trained on longer-form content may need targeted fine-tuning on short, fast-paced video formats to reduce these biases.
  • Platforms that rely on MLLMs for moderation would still need separate checks for channel authority signals.
  • The same evaluation could be repeated on non-health topics to test whether bias patterns generalize.
  • Human viewers exposed to the same videos may exhibit parallel vulnerabilities, offering a comparison point for model behavior.

Load-bearing premise

The manually annotated dataset of 200 short videos supplies accurate fine-grained labels for deceptive patterns verified by national standards and literature, and the belief score accurately reflects each model's susceptibility to cognitive biases.

What would settle it

Running the identical eight models on a fresh collection of 200 short videos that preserve the same distribution of deceptive patterns and social cues and obtaining a reversed model ranking or loss of susceptibility to channel IDs.
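
One way to make "a reversed model ranking" concrete is a rank correlation between the original and the replication scores. The sketch below computes Kendall's tau (the tau-a variant, written by hand so no third-party library is assumed). Only the 71.5 and 35.2 values come from the source; `model_x` and every replication score are hypothetical placeholders.

```python
from itertools import combinations
from typing import Dict


def kendall_tau(scores_a: Dict[str, float], scores_b: Dict[str, float]) -> float:
    """Kendall's tau-a over the models present in both score tables.

    +1 means identical ordering, -1 means a fully reversed ordering.
    """
    models = sorted(set(scores_a) & set(scores_b))
    if len(models) < 2:
        raise ValueError("need at least two shared models")
    concordant = discordant = 0
    for m1, m2 in combinations(models, 2):
        sign = (scores_a[m1] - scores_a[m2]) * (scores_b[m1] - scores_b[m2])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(models) * (len(models) - 1) // 2
    return (concordant - discordant) / n_pairs


if __name__ == "__main__":
    original = {"Gemini-2.5-Pro": 71.5, "o3": 35.2, "model_x": 50.0}  # model_x is hypothetical
    replication = {"Gemini-2.5-Pro": 30.0, "o3": 70.0, "model_x": 50.0}  # hypothetical scores
    print(kendall_tau(original, replication))
```

A tau near -1 on the fresh collection would correspond to the reversal described above; a tau near +1 would indicate the ranking held.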

Figures

Figures reproduced from arXiv: 2601.06600 by Chang Chen, Jen-tse Huang, Mark Dredze, Michelle R. Kaufman, Shiyang Lai, Wenxuan Wang.

Figure 1: Overview of the data structure. Upper-left: The high-quality dataset consists of twelve fields. Upper-right: Videos from Douyin and Kuaishou are processed into visual, textual, and aural modalities, with histograms depicting token length distributions. Lower-left: Misinformation is annotated with detailed error reasons, supporting evidence, and error types. Lower-right: The dataset is categorized into four…

Figure 2: Belief Scores (BS) of eight models across three error types on the false video subset. Yellow “Score” lines…

Figure 3: Belief Scores (BS) of four models with dif…

Figure 4: The difference (r_with_ID − r_without_ID), rescaled by 100/3, across four verification statuses for eight models using the Claim setting and all video data. Results on the two subsets are provided in…

Figure 5: Average score decrease after using channel…

Figure 7: The [CLAIM] is replaced by actual claim of each video.

Figure 8: The [TEXT] is replaced by actual textual text of each video.

Figure 9: The [TRANSCRIPT] is replaced by actual aural transcript of each video.

Figure 10: The [IMAGE] is replaced by actual visual frames of each video.

Figure 11: The [TRANSCRIPT] and [IMAGE] are replaced by actual aural transcripts and visual frames of each video.

Figure 12: The [REASONING] and [REASON] are replaced by actual CoT output and the error reason for each video.

Figure 13: Popularity Effect prompt template: "[Original Claim or Multimodal Prompt]", then "[Corresponding Data Input]", then "This short video has received [A] views, [B] likes, [C] shares, and [D] comments". The [Corresponding Data Input] is replaced by actual input data along with (views, likes, shares, comments) popularity statistics.

Figure 14: Channel ID Effect prompt template: "[Original Claim or Multimodal Prompt]", then "[Corresponding Data Input]", then "This short video was uploaded by [CHANNEL ID]". The [Corresponding Data Input] is replaced by actual input data along with [CHANNEL ID].

Original abstract

Short-video platforms have become major channels for misinformation, where deceptive claims frequently leverage visual experiments and social cues. While Multimodal Large Language Models (MLLMs) have demonstrated impressive reasoning capabilities, their robustness against misinformation entangled with cognitive biases remains under-explored. In this paper, we introduce a comprehensive evaluation framework using a high-quality, manually annotated dataset of 200 short videos spanning four health domains. This dataset provides fine-grained annotations for three deceptive patterns (experimental errors, logical fallacies, and fabricated claims), each verified by evidence such as national standards and academic literature. We evaluate eight frontier MLLMs across five modality settings. Experimental results demonstrate that Gemini-2.5-Pro achieves the highest performance in the multimodal setting with a belief score of 71.5/100, while o3 performs the worst at 35.2. Furthermore, we investigate social cues that induce false beliefs in videos and find that models are susceptible to biases like authoritative channel IDs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper introduces a manually annotated dataset of 200 Chinese short videos across four health domains with fine-grained labels for three deceptive patterns (experimental errors, logical fallacies, fabricated claims) verified against national standards and literature. It evaluates eight frontier MLLMs across five modality settings using a belief score to quantify susceptibility to cognitive biases in misinformation, reporting Gemini-2.5-Pro at 71.5/100 (highest) and o3 at 35.2 (lowest) in the multimodal setting, while also analyzing social cues such as authoritative channel IDs that induce false beliefs.

Significance. If the measurement pipeline is validated, the work would offer timely empirical evidence on MLLM vulnerabilities to visually and socially entangled misinformation in short-video platforms, with direct relevance to public health communication and AI safety in non-English contexts.

major comments (3)
  1. [Abstract and Evaluation Framework] Abstract and §4 (Evaluation): The headline model rankings rest on the belief score (e.g., 71.5 vs 35.2), yet the manuscript supplies no explicit definition, formula, or mapping from raw model outputs to the 0-100 scale, nor any inter-annotator agreement or statistical significance for the 200-video results; this directly undermines verification of the susceptibility claims.
  2. [Dataset Construction] §3 (Dataset): The claim that annotations are 'fine-grained' and 'verified by national standards' is load-bearing for all downstream results, but no annotation protocol, annotator count, agreement metric, or validation procedure is described, leaving open the possibility that label noise could reverse the reported ordering between models.
  3. [Social Cue Investigation] §5 (Social Cue Analysis): The finding that models are susceptible to biases like authoritative channel IDs inherits the same unverified belief-score pipeline; without details on how social cues are isolated or scored, the causal attribution to specific cues cannot be assessed.
minor comments (2)
  1. [Experimental Setup] Clarify the precise definitions of the five modality settings and how prompts differ across them.
  2. [Results] Add a table summarizing per-model, per-modality belief scores with standard deviations or confidence intervals.
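
The confidence intervals requested in the minor comments could be obtained in several ways. A percentile bootstrap over per-video scores is one common choice, sketched below. It is not a procedure the paper reports using; the function name `bootstrap_ci`, the resample count, and the example scores are all assumptions for illustration.

```python
import random
from statistics import mean
from typing import Sequence, Tuple


def bootstrap_ci(
    values: Sequence[float],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile-bootstrap confidence interval for the mean of `values`."""
    if not values:
        raise ValueError("values must be non-empty")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_resamples))
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_resamples) - 1)
    return means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    # Hypothetical per-video scores on a 0-100 scale for one model.
    per_video = [0.0, 50.0, 100.0, 66.7, 83.3, 33.3, 100.0, 50.0]
    low, high = bootstrap_ci(per_video)
    print(f"mean={mean(per_video):.1f}, 95% CI=({low:.1f}, {high:.1f})")
```

With only 200 videos split across error types and modality settings, intervals of this kind would show whether gaps between adjacent models exceed sampling noise.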

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point by point below and have revised the manuscript to incorporate additional methodological details where the comments correctly identify gaps in the current version.

Point-by-point responses
  1. Referee: [Abstract and Evaluation Framework] Abstract and §4 (Evaluation): The headline model rankings rest on the belief score (e.g., 71.5 vs 35.2), yet the manuscript supplies no explicit definition, formula, or mapping from raw model outputs to the 0-100 scale, nor any inter-annotator agreement or statistical significance for the 200-video results; this directly undermines verification of the susceptibility claims.

    Authors: We agree that an explicit definition and formula for the belief score are required for reproducibility and verification. In the revised manuscript we have expanded §4 to include the precise definition of the belief score, the formula mapping raw model outputs (e.g., assessed belief level or probability) to the 0-100 scale, inter-annotator agreement statistics for the underlying annotations, and statistical significance tests for the reported model differences.
    Revision: yes

  2. Referee: [Dataset Construction] §3 (Dataset): The claim that annotations are 'fine-grained' and 'verified by national standards' is load-bearing for all downstream results, but no annotation protocol, annotator count, agreement metric, or validation procedure is described, leaving open the possibility that label noise could reverse the reported ordering between models.

    Authors: The referee is correct that the current description of the annotation process is incomplete. We have revised §3 to provide the full annotation protocol, the number of annotators, the inter-annotator agreement metric employed, and the specific validation steps used to cross-check labels against national standards and academic literature.
    Revision: yes

  3. Referee: [Social Cue Investigation] §5 (Social Cue Analysis): The finding that models are susceptible to biases like authoritative channel IDs inherits the same unverified belief-score pipeline; without details on how social cues are isolated or scored, the causal attribution to specific cues cannot be assessed.

    Authors: We acknowledge that the social-cue analysis requires additional methodological transparency to support causal claims. In the revised §5 we have added explicit descriptions of how social cues (such as channel IDs) are isolated within the videos, the scoring procedure for their influence on model belief scores, and any controls applied to attribute effects to individual cues.
    Revision: yes
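
The channel-ID comparison discussed in the third exchange amounts to a paired design: the same video is rated once with the cue appended and once without, and the per-video difference is averaged. A minimal sketch under that assumption follows. The function name `cue_effect`, the `scale` parameter, and the example ratings are hypothetical, and the paper's actual rescaling factor is left as a parameter because the source does not spell it out.

```python
from statistics import mean
from typing import Sequence


def cue_effect(
    with_cue: Sequence[float],
    without_cue: Sequence[float],
    scale: float = 1.0,
) -> float:
    """Mean paired difference (with cue minus without cue), times `scale`.

    Both sequences must list ratings for the same videos in the same order.
    """
    if len(with_cue) != len(without_cue):
        raise ValueError("paired inputs must have equal length")
    if not with_cue:
        raise ValueError("inputs must be non-empty")
    diffs = [w - wo for w, wo in zip(with_cue, without_cue)]
    return mean(diffs) * scale


if __name__ == "__main__":
    # Hypothetical ratings for three videos, with and without a channel ID line.
    print(cue_effect([5, 6, 4], [4, 4, 4]))
```

Because each video serves as its own control, a nonzero mean difference can be attributed to the added line only if nothing else in the prompt changes between the two runs.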

Circularity Check

0 steps flagged

No circularity: direct empirical evaluation on new annotated dataset

full rationale

The paper constructs a new dataset of 200 short videos with manual annotations for deceptive patterns, then directly evaluates eight MLLMs across modality settings to produce belief scores. No equations, fitted parameters, predictions derived from subsets, or self-citations are invoked to justify the core results. The reported scores (71.5 for Gemini-2.5-Pro, 35.2 for o3) are outputs of the evaluation pipeline rather than inputs renamed or forced by definition. The framework is self-contained against external benchmarks with no load-bearing self-referential steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the accuracy of manual annotations for deceptive patterns and the validity of the belief score as a measure of bias susceptibility; these are domain assumptions not independently verified in the provided abstract.

axioms (1)
  • Domain assumption: Manual annotations by experts using national standards and academic literature accurately capture the three deceptive patterns in the videos.
    Invoked when describing the dataset construction and verification process in the abstract.
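
This assumption could be tested with an inter-annotator agreement statistic. The sketch below computes Cohen's kappa for two annotators labeling the same videos with the three deceptive patterns. The source does not say how many annotators were used or which metric, if any, was applied, so the two-annotator setup, the function name `cohens_kappa`, and the example labels are assumptions.

```python
from collections import Counter
from typing import Sequence


def cohens_kappa(labels_a: Sequence[str], labels_b: Sequence[str]) -> float:
    """Cohen's kappa: chance-corrected agreement between two annotators."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    n = len(labels_a)
    if n == 0:
        raise ValueError("no items to compare")
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled at random
    # according to their own label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical labels for four videos.
    a = ["experimental_error", "logical_fallacy", "fabricated_claim", "logical_fallacy"]
    b = ["experimental_error", "logical_fallacy", "logical_fallacy", "logical_fallacy"]
    print(cohens_kappa(a, b))
```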

pith-pipeline@v0.9.0 · 5715 in / 1350 out tokens · 65013 ms · 2026-05-21T15:07:58.251939+00:00 · methodology

