arxiv: 2605.16953 · v1 · pith:JRY554RWnew · submitted 2026-05-16 · 💻 cs.AI · cs.CL

How do Humans Process AI-generated Hallucination Contents: a Neuroimaging Study

Pith reviewed 2026-05-19 20:29 UTC · model grok-4.3

classification 💻 cs.AI cs.CL
keywords EEG · hallucinations · AI-generated content · neuroimaging · fact verification · event-related potentials · multimodal LLMs · cognitive processing

The pith

Misjudged AI hallucinations fail to activate the brain's standard fact verification pathway, as shown by distinct EEG responses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper investigates the neural processes involved when humans encounter AI-generated hallucinations in image descriptions. By recording EEG from 27 participants judging the accuracy of these descriptions, it finds that hallucinated content triggers different patterns in semantic integration, inference, memory retrieval, and cognitive load compared to accurate content. The key finding is that when participants incorrectly accept hallucinations as true, their brain activity lacks the typical responses associated with fact checking, suggesting the verification mechanism is not engaged. This matters because it reveals why people sometimes accept false AI outputs without realizing it.

Core claim

In a verification task on multi-modal LLM outputs, event-related potentials differ between hallucinated and non-hallucinated descriptions. Crucially, misjudged hallucinations elicit neural responses that deviate from those of correctly identified hallucinations, implying that they do not engage the usual neurocognitive fact-verification processes.

What carries the argument

Event-related potential (ERP) analysis from EEG recordings during a correctness judgment task, highlighting differences in cognitive processing pathways for detected versus undetected hallucinations.
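
To make the ERP step concrete, here is a minimal sketch of condition-wise averaging: baseline-correct each epoch, average the epochs sample by sample, then take the mean amplitude in a time window. The condition labels come from the Figure 2 caption. The helper names, the toy single-channel values, the baseline length, and the window indices are all assumptions for illustration, not the authors' pipeline.

```python
from statistics import mean

def baseline_correct(epoch, n_baseline):
    """Subtract the mean of the first n_baseline samples from every sample."""
    base = mean(epoch[:n_baseline])
    return [v - base for v in epoch]

def average_erp(epochs, n_baseline):
    """Average baseline-corrected epochs sample by sample."""
    corrected = [baseline_correct(e, n_baseline) for e in epochs]
    return [mean(samples) for samples in zip(*corrected)]

def mean_amplitude(erp, start, stop):
    """Mean amplitude of the ERP over the sample window [start, stop)."""
    return mean(erp[start:stop])

# Hypothetical single-channel epochs in microvolts:
# 2 baseline samples followed by 4 post-stimulus samples.
epochs_by_condition = {
    "NoHallu": [
        [1.0, 1.0, 1.0, 0.0, 0.0, 1.0],
        [3.0, 3.0, 3.0, 2.0, 2.0, 3.0],
    ],
    "HalluCorrect": [
        [0.0, 0.0, 0.0, -3.0, -3.0, 0.0],
        [2.0, 2.0, 2.0, -2.0, -2.0, 2.0],
    ],
}

erps = {c: average_erp(e, n_baseline=2) for c, e in epochs_by_condition.items()}
# Window of samples 3..4, chosen arbitrarily for this toy example.
window_means = {c: mean_amplitude(erp, 3, 5) for c, erp in erps.items()}
```

In a real analysis the same computation would run per electrode and per participant before any group-level statistics.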

Load-bearing premise

The observed ERP differences specifically reflect processing of hallucinated content rather than confounding factors such as varying task difficulty, content complexity, or individual participant differences.

What would settle it

A replication EEG study finding no significant differences in neural responses between misjudged and correctly judged hallucinations would falsify the claim that undetected hallucinations bypass the standard verification pathway.
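
One way such a replication could test the misjudged versus correctly judged contrast is a paired comparison across participants. The sketch below computes a paired t statistic; for a two-level within-subject factor, the squared t equals the F statistic with (1, n-1) degrees of freedom. The function name and the per-participant amplitudes are hypothetical, and nothing here is claimed to be the authors' statistical procedure.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(cond_a, cond_b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

# Hypothetical mean amplitudes (microvolts), one value per participant.
hallu_correct = [2.0, 4.0, 6.0, 8.0, 10.0]
hallu_wrong = [1.0, 2.0, 3.0, 4.0, 5.0]

t_stat, df = paired_t(hallu_correct, hallu_wrong)
f_equivalent = t_stat ** 2  # F(1, df) for a two-level within-subject factor
```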

Figures

Figures reproduced from arXiv: 2605.16953 by Bangde Du, Qingyao Ai, Shuqi Zhu, Yiqun Liu, Yi Zhong, Yujia Zhou, Ziyi Ye.

Figure 1: The overall procedure of our data collection. A) The procedure of stimulus selection. B) The experimental trial flow consists of five stages: presenting an image (S1), showing a fixation cross (S2), displaying a sentence word-by-word (S3), the participant making a judgment about the sentence’s match to the image (S4), and finally proceeding to the next image (S5).
Figure 2: A) Comparison of ERP waveforms elicited by different stimulus word types in the central brain region, with shaded areas indicating the 95% confidence intervals. B) Time-resolved topographic difference maps comparing HalluCorrect with NoHallu and HalluWrong words, respectively; highlighted electrodes denote brain regions showing significant effects in the post-hoc analysis. (F[1,26]=8.271, p<0.05, η²p=0.2…)
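
The effect size in the Figure 2 caption is truncated in this rendering. As a consistency check only, partial eta squared for a single effect can be recovered from the F statistic and its degrees of freedom using the standard identity η²p = F·df_effect / (F·df_effect + df_error). The function name below is an assumption, and the result is a derived check, not a value quoted from the paper.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# F and degrees of freedom as printed in the Figure 2 caption.
eta_p2 = partial_eta_squared(8.271, 1, 26)
```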
Original abstract

While AI-generated hallucinations pose considerable risks, the underlying cognitive mechanisms by which humans can successfully recognize or be misled by these hallucinations remain unclear. To address this problem, this paper explores humans' neural dynamics to characterize how the brain processes hallucinated content. We record EEG signals from 27 participants while they are performing a verification task to judge the correctness of image descriptions generated by a multi-modal large language model (MLLM). Based on an averaged event-related potential (ERP) study, we reveal that multiple cognitive processes, e.g., semantic integration, inferential processing, memory retrieval, and cognitive load, exhibit distinct patterns when humans process hallucinated versus non-hallucinated content. Notably, neural responses to hallucinations that were misjudged versus correctly judged by human participants showed significant differences. This indicates that misjudged AI-generated hallucinations failed to trigger the standard neurocognitive fact verification pathway.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper reports an EEG/ERP study with 27 participants who performed a verification task on image descriptions generated by a multi-modal large language model. It claims that hallucinated versus non-hallucinated content elicits distinct patterns across semantic integration, inferential processing, memory retrieval, and cognitive load, and that misjudged hallucinations produce reliably different neural responses from correctly judged ones, indicating failure to engage the standard neurocognitive fact-verification pathway.

Significance. If the central claim survives controls for stimulus properties, the work would supply the first direct neural evidence on how humans process AI hallucinations in a verification setting, with clear relevance to AI safety and human-AI interaction research. The empirical approach is straightforward and the sample size is reasonable for an initial ERP study.

major comments (2)
  1. [Abstract / Results] The interpretation that ERP differences between misjudged and correctly judged hallucinations demonstrate failure to trigger the standard fact-verification pathway assumes the two sets of items are matched on cloze probability, visual-semantic mismatch magnitude, and lexical complexity. These variables are known to modulate the same N400 and late-positive components referenced in the abstract; without stimulus matching, regression controls, or post-hoc checks, the load-bearing claim remains vulnerable to item-difficulty confounds.
  2. [Methods] The manuscript provides no description of how the hallucinated and non-hallucinated stimuli were generated, how they were matched or counterbalanced, what statistical thresholds were applied, or the EEG preprocessing pipeline and artifact-rejection criteria. These omissions prevent evaluation of whether the reported differences can be attributed to hallucination processing rather than uncontrolled stimulus or analysis factors.
minor comments (1)
  1. [Abstract] The abstract would benefit from a brief statement of the exact number of trials per condition and the time windows used for the reported ERP effects.
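
The regression control named in the first major comment can be sketched as follows. The example estimates the condition effect on ERP amplitude after adjusting for one covariate (here a hypothetical cloze-probability score), using the Frisch-Waugh-Lovell approach: residualize both amplitude and the condition indicator on the covariate, then regress one set of residuals on the other. All names and numbers are invented for illustration. The toy data are built so that the true condition effect is 1.5 while the unadjusted difference in means is inflated by the covariate.

```python
from statistics import mean

def residualize(y, x):
    """Residuals of y after a least-squares fit on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return [(yi - my) - slope * (xi - mx) for xi, yi in zip(x, y)]

def adjusted_effect(amplitude, condition, covariate):
    """Condition coefficient from amplitude ~ intercept + covariate + condition."""
    ra = residualize(amplitude, covariate)
    rc = residualize(condition, covariate)
    return sum(a * c for a, c in zip(ra, rc)) / sum(c * c for c in rc)

def naive_effect(amplitude, condition):
    """Unadjusted difference in mean amplitude between the two conditions."""
    on = [a for a, c in zip(amplitude, condition) if c == 1]
    off = [a for a, c in zip(amplitude, condition) if c == 0]
    return mean(on) - mean(off)

# Hypothetical item-level data: misjudged items tend to have higher cloze scores.
cloze = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
misjudged = [0, 0, 0, 1, 0, 1, 1, 1]
amplitude = [2.0 * c + 1.5 * m - 1.0 for c, m in zip(cloze, misjudged)]

naive = naive_effect(amplitude, misjudged)
adjusted = adjusted_effect(amplitude, misjudged, cloze)
```

With several covariates (lexical complexity, visual-semantic mismatch) the same idea extends to a multiple regression or a mixed-effects model with item and participant terms.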

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. The comments highlight important issues regarding stimulus controls and methodological transparency, which we address below. We will revise the manuscript to strengthen the claims and improve reproducibility.

Point-by-point responses
  1. Referee: [Abstract / Results] The interpretation that ERP differences between misjudged and correctly judged hallucinations demonstrate failure to trigger the standard fact-verification pathway assumes the two sets of items are matched on cloze probability, visual-semantic mismatch magnitude, and lexical complexity. These variables are known to modulate the same N400 and late-positive components referenced in the abstract; without stimulus matching, regression controls, or post-hoc checks, the load-bearing claim remains vulnerable to item-difficulty confounds.

    Authors: We agree that the current interpretation would be strengthened by explicit controls for these potential confounds. The original manuscript did not report stimulus matching or regression analyses on cloze probability, lexical complexity, or visual-semantic mismatch. In the revision we will add post-hoc stimulus property checks, include regression models controlling for these variables in the ERP analyses, and qualify the interpretation of the misjudged vs. correctly judged contrast accordingly. If the key differences remain significant after controls, we will retain the claim with supporting evidence; otherwise we will revise the discussion to reflect the limitations.

    Revision: yes

  2. Referee: [Methods] The manuscript provides no description of how the hallucinated and non-hallucinated stimuli were generated, how they were matched or counterbalanced, what statistical thresholds were applied, or the EEG preprocessing pipeline and artifact-rejection criteria. These omissions prevent evaluation of whether the reported differences can be attributed to hallucination processing rather than uncontrolled stimulus or analysis factors.

    Authors: We acknowledge that these details were omitted and are critical for evaluation. In the revised Methods section we will add: (1) full description of MLLM stimulus generation including prompts, model version, and parameters used to create hallucinated vs. non-hallucinated descriptions; (2) criteria and procedures for matching or counterbalancing items across conditions; (3) exact statistical thresholds and multiple-comparison corrections applied; and (4) the complete EEG preprocessing pipeline, including filtering, artifact rejection criteria (e.g., amplitude thresholds, eye-blink detection), and ICA component removal. These additions will allow readers to assess the robustness of the reported effects.

    Revision: yes
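
As an illustration of the amplitude-threshold artifact rejection mentioned in item (4), the sketch below drops any epoch whose peak-to-peak amplitude exceeds a threshold. The 100 µV default, the function names, and the toy single-channel epochs are assumptions for illustration; the paper's actual criteria are not described in the material reviewed here.

```python
def peak_to_peak(epoch):
    """Difference between the largest and smallest sample in an epoch."""
    return max(epoch) - min(epoch)

def reject_artifacts(epochs, threshold_uv=100.0):
    """Keep epochs whose peak-to-peak amplitude is within the threshold.

    threshold_uv is a hypothetical default, not a value from the paper.
    Returns the kept epochs and the number of rejected epochs.
    """
    kept = [e for e in epochs if peak_to_peak(e) <= threshold_uv]
    return kept, len(epochs) - len(kept)

# Hypothetical single-channel epochs in microvolts.
epochs = [
    [0.0, 5.0, -4.0, 3.0],     # peak-to-peak 9: kept
    [0.0, 80.0, -60.0, 2.0],   # peak-to-peak 140: rejected
    [1.0, -2.0, 2.5, 0.5],     # peak-to-peak 4.5: kept
]

kept, n_rejected = reject_artifacts(epochs)
```

Reporting the number of rejected epochs per condition alongside the threshold is what lets a reader judge whether trial counts stayed balanced.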

Circularity Check

0 steps flagged

No circularity: purely empirical neuroimaging study

Full rationale

This paper reports an EEG/ERP experiment with 27 participants performing a verification task on MLLM-generated image descriptions. All central claims rest on measured differences in averaged event-related potentials between hallucinated vs. non-hallucinated content and between misjudged vs. correctly judged hallucinations. No equations, fitted parameters, self-referential definitions, or derivation chains appear in the reported results. The interpretation that misjudged hallucinations failed to trigger a standard neurocognitive pathway is an inference from observed data patterns against established ERP literature, not a reduction to the paper's own inputs or self-citations. The study is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard neuroscientific assumptions for interpreting ERP components as markers of semantic integration, memory retrieval, and cognitive load, with no free parameters or invented entities introduced.

axioms (1)
  • Domain assumption: Standard ERP analysis assumptions hold, including that averaged event-related potentials across trials and participants reliably reflect distinct cognitive processes.
    The study interprets differences in ERP patterns as evidence of specific cognitive mechanisms without detailing deviations from standard practices.
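
The averaging assumption can be stated as a short worked derivation. It assumes, as an idealization not taken from the paper, that each trial is the same stimulus-locked signal s(t) plus zero-mean noise that is independent across trials with equal variance σ². Under those assumptions the noise in the average shrinks with the number of trials N, and the amplitude signal-to-noise ratio grows as √N.

```latex
\begin{align}
x_i(t) &= s(t) + n_i(t), \qquad i = 1, \dots, N \\
\bar{x}(t) &= \frac{1}{N} \sum_{i=1}^{N} x_i(t) = s(t) + \frac{1}{N} \sum_{i=1}^{N} n_i(t) \\
\operatorname{Var}\!\left[\bar{x}(t)\right] &= \frac{\sigma^2}{N}
\quad\Longrightarrow\quad
\mathrm{SNR}_N = \sqrt{N}\,\mathrm{SNR}_1
\end{align}
```

If the signal itself varies across trials, or differs systematically between items that end up in different conditions, the average no longer isolates a single process, which is where the item-difficulty concern in the referee report applies.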

pith-pipeline@v0.9.0 · 5699 in / 1192 out tokens · 42268 ms · 2026-05-19T20:29:32.988803+00:00 · methodology
