Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs

Baoliang Lu; Dongsheng Li; Enze Zhang; Junyu Pan; Weilong Zheng; Yansen Wang

arxiv: 2605.18172 · v1 · pith:H3T5GARMnew · submitted 2026-05-18 · 💻 cs.AI

Junyu Pan, Yansen Wang, Enze Zhang, Baoliang Lu, Weilong Zheng, Dongsheng Li

Pith reviewed 2026-05-20 10:20 UTC · model grok-4.3

classification 💻 cs.AI

keywords generative visual grounding · EEG understanding · multimodal large language models · proxy images · visual alignment · brain signals · clinical interpretation · neural representations

The pith

Generating proxy images from EEG signals lets MLLMs use visual priors to interpret brain activity more effectively than text alone.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Generative Visual Grounding to overcome limited visually-evoked EEG data by turning neural signals into instance-specific images. Rather than mapping brain activity only to abstract text, which risks losing perceptual details, the framework uses an EEG-to-image model to create visual proxies. These images supply structured contexts that let multimodal large language models draw on their existing visual knowledge for clinical interpretation tasks. Tests on two backbones show image-only alignment already competes with larger text-based systems while tuning far fewer parameters, and combining images with text yields further gains in understanding and generation. If correct, the work points toward brain foundation models that retain richer information from raw neural signals.

Core claim

Generative Visual Grounding employs an EEG-to-image generative model as a visual translator to produce instance-specific proxy images for non-visual EEG. These proxies supply structured visual contexts that allow MLLMs to exploit their visual priors for clinical-state interpretation, delivering competitive results with image-only alignment and consistent improvements when extended to trimodal image-plus-text alignment.

What carries the argument

Generative Visual Grounding (GVG), the framework that uses an EEG-to-image generative model to create instance-specific proxy images serving as visual contexts for MLLM alignment.

If this is right

Image-only alignment using the generated proxies matches the performance of larger text-aligned baselines while tuning only a small fraction of parameters on a frozen backbone.
Trimodal alignment that adds the visual proxies to text supplies both categorical semantic anchors and perceptual details for richer neural representations.
The method produces measurable gains in EEG understanding tasks as well as in visual generation from brain signals.
Visual proxy grounding functions as a direct complement to textual alignment for building more capable EEG foundation models.
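
The trimodal alignment in the list above is quoted elsewhere on this page as a weighted sum of three pairwise terms, L_tri = λ_ei L_ei + λ_et L_et + λ_it L_it. The sketch below shows one way such an objective could be assembled. The page does not state the form of each pairwise term, so the symmetric InfoNCE loss, the temperature of 0.07, the unit default weights, and every function name here are assumptions made for illustration, not the authors' implementation.

```python
import math

def _normalize(v):
    # Assumes a non-zero vector; a zero vector would divide by zero.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE over a batch of paired embeddings a[i] <-> b[i].

    ASSUMPTION: the pairwise terms L_ei, L_et, L_it are contrastive losses
    of this kind; the summary only names them.
    """
    a = [_normalize(v) for v in a]
    b = [_normalize(v) for v in b]
    n = len(a)
    # logits[i][j] = cosine similarity of a[i] and b[j], scaled by temperature
    logits = [[sum(x * y for x, y in zip(a[i], b[j])) / temperature
               for j in range(n)] for i in range(n)]

    def cross_entropy(rows):
        # Mean negative log-softmax of the diagonal (the true pair).
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    cols = [list(col) for col in zip(*logits)]  # transpose: b -> a direction
    return 0.5 * (cross_entropy(logits) + cross_entropy(cols))

def trimodal_loss(eeg, img, txt, lam_ei=1.0, lam_et=1.0, lam_it=1.0):
    """L_tri = lam_ei * L_ei + lam_et * L_et + lam_it * L_it."""
    return (lam_ei * info_nce(eeg, img)
            + lam_et * info_nce(eeg, txt)
            + lam_it * info_nce(img, txt))

# Toy embeddings (made up): three samples, three dimensions.
eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rotated = [eye[1], eye[2], eye[0]]  # every image paired with the wrong EEG

print("aligned:   ", trimodal_loss(eye, eye, eye))
print("misaligned:", trimodal_loss(eye, rotated, eye))
```

Setting `lam_et` and `lam_it` to zero reduces this to an EEG-image-only loss, which is one way to read the image-only setting in the first point of the list.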

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar proxy generation could extend visual grounding to other non-visual sensor data such as audio or wearable signals.
The approach may support more interpretable brain-computer interfaces by linking raw neural activity to concrete visual outputs users can inspect.
Testing whether the generated images recover specific perceptual experiences encoded in EEG would provide a direct check on information preservation.
Combining this grounding with other modalities could produce more robust multimodal models for scarce brain-signal datasets.

Load-bearing premise

EEG-to-image generative models can accurately translate neural signals into meaningful visual representations that preserve fine-grained perceptual information without introducing misleading artifacts.

What would settle it

A controlled experiment showing that MLLMs achieve equal or lower accuracy on clinical-state prediction tasks when given the generated proxy images versus text-only alignments would falsify the central claim.
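
One way such a comparison might be scored is a paired bootstrap over per-item correctness, since both conditions would be evaluated on the same EEG test items. The sketch below is an assumption about how the check could be run, not a procedure from the paper; the function name, the 95% percentile interval, and the toy 0/1 flags are all made up for illustration.

```python
import random

def paired_bootstrap(correct_a, correct_b, n_resamples=2000, seed=0):
    """Bootstrap the accuracy difference (A - B) over the same test items.

    correct_a / correct_b: lists of 0/1 flags, one per test item, for the
    two conditions (for example proxy-image input vs text-only alignment).
    Returns (observed_diff, lower, upper) with a 95% percentile interval.
    """
    if len(correct_a) != len(correct_b):
        raise ValueError("both conditions must be scored on the same items")
    n = len(correct_a)
    rng = random.Random(seed)
    observed = (sum(correct_a) - sum(correct_b)) / n

    diffs = []
    for _ in range(n_resamples):
        # Resample item indices with replacement, keeping pairs together.
        sample = [rng.randrange(n) for _ in range(n)]
        diffs.append(sum(correct_a[i] - correct_b[i] for i in sample) / n)
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return observed, lower, upper

# Toy flags, invented for the example; these are not results from the paper.
toy_proxy = [1, 1, 0, 1, 1, 0, 1, 1]
toy_text = [1, 0, 0, 1, 1, 0, 1, 0]
observed, lower, upper = paired_bootstrap(toy_proxy, toy_text)
print(f"accuracy difference: {observed:+.3f} (95% interval {lower:+.3f} to {upper:+.3f})")
```

Under this reading, an interval that sits at or below zero would correspond to the falsifying outcome described above, while an interval clearly above zero would be consistent with the paper's claim.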

Figures

Figures reproduced from arXiv: 2605.18172 by Baoliang Lu, Dongsheng Li, Enze Zhang, Junyu Pan, Weilong Zheng, Yansen Wang.

**Figure 1.** Overview of our core idea and proxy-image strategy. Left: GVG converts EEG into a visual-like language, allowing ...

**Figure 2.** Overview of the Generative Visual Grounding (GVG) Training Framework. The proposed GVG pipeline consists of ...

**Figure 3.** Qualitative Results of EEG-based Visual Reconstruction. We visualize the decoding capabilities of our two instantiations.

Original abstract

Leveraging the universal representations of pre-trained LLMs and MLLMs offers a promising path toward brain foundation models. However, visually-evoked EEG datasets remain scarce, leading existing methods to align neural signals mainly with abstract text, a lossy translation that may discard fine-grained perceptual information encoded in brain activity. We propose Generative Visual Grounding (GVG), a framework that visualizes the invisible by using an EEG-to-image generative model as a visual translator. Instead of forcing EEG into text alone, GVG hallucinates instance-specific proxy images for non-visual EEG, providing structured visual contexts that allow MLLMs to exploit their visual priors for clinical-state interpretation. We validate this idea on two MLLM backbones, GVG-X-Omni and GVG-Janus. Image-only alignment is already competitive: the lightweight GVG-X-Omni matches 1.7B-parameter text-aligned baselines while tuning only 170M parameters on a frozen 7B backbone. We further extend GVG-Janus with trimodal Image+Text alignment, where text supplies categorical semantic anchors and visual proxies enrich neural representations with perceptual details. Experiments show consistent gains in EEG understanding and visual generation, suggesting visual proxy grounding as an effective complement to textual alignment.
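
The efficiency claim in the abstract comes down to two ratios, worked out below using only the three parameter counts it quotes. The abstract does not say whether the 7B figure includes the 170M tuned parameters, or how many of the baselines' 1.7B parameters are trained, so the ratios are rough orientation rather than an exact accounting. The variable names are chosen here for illustration.

```python
# Figures quoted in the abstract; everything else is plain arithmetic.
TUNED_PARAMS = 170e6      # parameters tuned in GVG-X-Omni
BACKBONE_PARAMS = 7e9     # size of the frozen backbone
BASELINE_PARAMS = 1.7e9   # size of the text-aligned baselines

def ratio(part: float, whole: float) -> float:
    """Return part / whole as a plain fraction."""
    return part / whole

tuned_vs_backbone = ratio(TUNED_PARAMS, BACKBONE_PARAMS)
tuned_vs_baseline = ratio(TUNED_PARAMS, BASELINE_PARAMS)

print(f"tuned / frozen backbone: {tuned_vs_backbone:.1%}")  # 2.4%
print(f"tuned / 1.7B baseline:   {tuned_vs_baseline:.1%}")  # 10.0%
```

In other words, the tuned portion is roughly one fortieth of the frozen backbone and one tenth of the size of the baselines it is compared against.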

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

GVG adds EEG-to-image proxies as a complement to text alignment in MLLMs and shows parameter-efficient competitive results, but the abstract leaves the fidelity of those proxies unverified.

The main thing to know is that this paper introduces Generative Visual Grounding to turn EEG signals into instance-specific proxy images for MLLMs, letting the models draw on visual priors instead of relying only on lossy text translations. They test the idea on two backbones and report that image-only alignment can match larger text baselines while tuning far fewer parameters, with further gains when text and visual proxies are combined.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes Generative Visual Grounding (GVG), a framework that uses an EEG-to-image generative model to hallucinate instance-specific proxy images from non-visual EEG signals. These proxies supply structured visual context to MLLMs, enabling them to leverage visual priors for clinical-state interpretation instead of relying solely on lossy text alignment. The approach is validated on two backbones (GVG-X-Omni and GVG-Janus), with claims that image-only alignment is competitive with larger text baselines using only 170M tunable parameters on a frozen 7B model, and that trimodal (Image+Text) alignment yields further gains in EEG understanding and visual generation.

Significance. If the generated visual proxies faithfully encode fine-grained perceptual details from EEG without introducing artifacts, the framework could meaningfully advance brain foundation models by complementing textual alignment with visual priors in MLLMs. The parameter-efficient tuning (170M parameters) and the explicit separation of categorical semantic anchors (text) from perceptual enrichment (images) are strengths. However, the absence of direct fidelity metrics or controls for non-visual EEG cases limits the assessed impact, as gains might stem from added modality capacity rather than meaningful neural-to-visual translation.

major comments (3)

[Abstract / Experiments] Abstract and validation sections: The central claim that visual proxies 'enrich neural representations with perceptual details' and enable 'consistent gains' requires evidence that EEG-to-image outputs preserve fine-grained information rather than spurious features. No direct fidelity checks, image quality metrics, or comparisons against ground-truth perceptual content for non-visual EEG are described, leaving open whether reported improvements track proxy quality or simply reflect extra input capacity.
[Validation on GVG-X-Omni] GVG-X-Omni description: The claim that the lightweight model 'matches 1.7B-parameter text-aligned baselines' while tuning only 170M parameters on a frozen 7B backbone is load-bearing for the efficiency argument, yet no specific baseline models, datasets, tasks, or numerical performance values (e.g., accuracy, F1) are provided to support the comparison.
[GVG-Janus trimodal alignment] Trimodal extension: Extending GVG-Janus with Image+Text alignment is presented as yielding further gains, but without ablation isolating the contribution of the generated visual proxies versus text alone, or versus random visual inputs, it is unclear whether the perceptual enrichment is the operative factor.

minor comments (2)

[Abstract] The abstract uses 'hallucinates' to describe the generative process; a more neutral term such as 'generates' would avoid unintended connotations in a scientific context.
[Methods] Notation for the two backbones (GVG-X-Omni, GVG-Janus) is introduced without an explicit definition of how GVG is integrated into each architecture.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment point-by-point below, clarifying our approach and outlining revisions to strengthen the evidence and presentation.

Referee: [Abstract / Experiments] Abstract and validation sections: The central claim that visual proxies 'enrich neural representations with perceptual details' and enable 'consistent gains' requires evidence that EEG-to-image outputs preserve fine-grained information rather than spurious features. No direct fidelity checks, image quality metrics, or comparisons against ground-truth perceptual content for non-visual EEG are described, leaving open whether reported improvements track proxy quality or simply reflect extra input capacity.

Authors: We agree that direct fidelity evidence would be ideal. However, non-visual EEG inherently lacks ground-truth images, rendering standard metrics such as FID or LPIPS inapplicable without artificial references. Our primary validation relies on consistent downstream gains in EEG understanding and generation tasks, which serve as indirect but task-relevant indicators that the proxies capture meaningful perceptual structure rather than noise. In revision we will add a dedicated subsection discussing evaluation challenges for non-visual signals, include qualitative examples of generated proxies with corresponding model attention maps, and report correlation analysis between proxy characteristics and task performance to better address this concern. revision: yes
Referee: [Validation on GVG-X-Omni] GVG-X-Omni description: The claim that the lightweight model 'matches 1.7B-parameter text-aligned baselines' while tuning only 170M parameters on a frozen 7B backbone is load-bearing for the efficiency argument, yet no specific baseline models, datasets, tasks, or numerical performance values (e.g., accuracy, F1) are provided to support the comparison.

Authors: The experimental section of the full manuscript contains these comparisons, but we acknowledge that the high-level claim in the abstract and introduction would benefit from explicit anchoring. In the revised manuscript we will insert a concise table or paragraph that names the specific 1.7B-parameter text-aligned baselines, lists the EEG datasets and clinical interpretation tasks used, and reports the numerical results (accuracy and F1 scores) demonstrating that GVG-X-Omni remains competitive while tuning only 170M parameters on the frozen 7B backbone. revision: yes
Referee: [GVG-Janus trimodal alignment] Trimodal extension: Extending GVG-Janus with Image+Text alignment is presented as yielding further gains, but without ablation isolating the contribution of the generated visual proxies versus text alone, or versus random visual inputs, it is unclear whether the perceptual enrichment is the operative factor.

Authors: We have already compared text-only, image-only, and trimodal alignments and observed incremental gains for the trimodal setting. To more rigorously isolate the role of the generated proxies, we will add a new ablation experiment in the revision that replaces the EEG-conditioned proxies with random or noise-based images while keeping all other factors fixed. This control will clarify whether the observed improvements stem from semantically relevant visual content rather than simply the addition of an extra modality. revision: yes
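
The control proposed in the last response can be sketched as a small helper that builds the alternative image conditions while leaving the EEG samples untouched. This is a hypothetical illustration: the function name, the flat pixel lists standing in for images, and the uniform noise are assumptions. The "shuffled" condition is one possible reading of "random" images (real proxies paired with the wrong EEG sample) and is not spelled out in the rebuttal.

```python
import random

def make_controls(proxies, seed=0):
    """Return matched, shuffled, and noise variants of a proxy list.

    `proxies` is a list of flat pixel lists (hypothetical stand-ins for
    EEG-conditioned proxy images), one per EEG sample.
    """
    n = len(proxies)
    if n < 2:
        raise ValueError("need at least two samples to build a shuffled control")
    rng = random.Random(seed)

    # Matched: each EEG sample keeps its own generated proxy.
    matched = list(proxies)

    # Shuffled: real proxies, but paired with the wrong EEG sample.
    # Reshuffle until no index maps to itself (a derangement).
    idx = list(range(n))
    while any(i == j for i, j in enumerate(idx)):
        rng.shuffle(idx)
    shuffled = [proxies[j] for j in idx]

    # Noise: same length as each proxy, uniform random pixel values in [0, 1).
    noise = [[rng.random() for _ in img] for img in proxies]

    return {"matched": matched, "shuffled": shuffled, "noise": noise}

# Toy "images": five samples of four pixels each, invented for the example.
proxies = [[float(i)] * 4 for i in range(5)]
controls = make_controls(proxies, seed=1)
for name, images in controls.items():
    print(name, len(images), "images")
```

Training or evaluating the same model on each condition, with everything else fixed, would separate gains due to EEG-specific visual content from gains due to simply having an extra image input.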

Circularity Check

0 steps flagged

No circularity: new framework proposal validated via independent experiments

The paper proposes Generative Visual Grounding (GVG) as a method that uses an EEG-to-image generative model to create instance-specific visual proxies for non-visual EEG signals, which are then fed into MLLMs for improved clinical-state interpretation. The derivation consists of describing this translator role, applying it to two specific backbones (GVG-X-Omni with 170M tunable parameters on a frozen 7B model, and trimodal GVG-Janus), and reporting empirical gains in alignment and generation tasks. No equations, fitted parameters renamed as predictions, or self-citation chains are invoked to force the central claims; the results are presented as outcomes of external validation on GVG-X-Omni and GVG-Janus rather than reducing tautologically to the inputs by construction. The approach remains self-contained against the described benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that EEG encodes visualizable perceptual details and that generative models can produce useful proxies; no free parameters or invented entities beyond the proposed framework are evident from the abstract.

axioms (1)

domain assumption: EEG signals contain fine-grained perceptual information that can be translated into instance-specific visual images via generative models.
Invoked to justify using visual proxies instead of text-only alignment for non-visual EEG.

invented entities (1)

Generative Visual Grounding (GVG) framework · no independent evidence
purpose: to generate visual proxy images from EEG for enhanced MLLM interpretation.
Newly introduced method in the paper.

pith-pipeline@v0.9.0 · 5772 in / 1157 out tokens · 54770 ms · 2026-05-20T10:20:15.322648+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "We employ an EEG-to-Image generative model (AVDE) as a visual translator to hallucinate instance-specific proxy images for non-visual EEG data... trimodal objective L_tri = λ_ei L_ei + λ_et L_et + λ_it L_it"

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Paper passage: "mapping raw EEG signals into discrete image tokens... similarity-based prediction over codebook V"
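
The second quoted passage, similarity-based prediction over a codebook V, can be illustrated with a nearest-code lookup. The passage does not say which similarity measure or decision rule the paper uses, so cosine similarity with an argmax, the function names, and the tiny two-dimensional codebook below are assumptions for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return dot / (nu * nv)

def predict_tokens(eeg_features, codebook):
    """Map each EEG feature vector to the index of its most similar code.

    ASSUMPTION: cosine similarity plus argmax; the quoted passage only
    says the prediction is similarity-based.
    """
    tokens = []
    for feat in eeg_features:
        sims = [cosine(feat, code) for code in codebook]
        tokens.append(max(range(len(codebook)), key=sims.__getitem__))
    return tokens

# Toy codebook with three entries and three made-up EEG feature vectors.
codebook = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
eeg_features = [[0.9, 0.1], [-0.8, 0.2], [0.1, 0.7]]

print(predict_tokens(eeg_features, codebook))  # [0, 2, 1]
```

Each returned index plays the role of a discrete image token; a real system would pass such tokens on to an image decoder, which is outside the scope of this sketch.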

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.


[1] Josh Achiam, Steven Adler, Sandhini Agarwal, Lama Ahmad, Ilge Akkaya, Florencia Leoni Aleman, Diogo Almeida, Janko Altenschmidt, Sam Altman, Shyamal Anadkat, et al. 2023. GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023)

[2] Diego Alvarez-Estevez and Roselyne M Rijsman. 2021. Inter-database validation of a deep learning approach for automatic sleep scoring. PloS one 16, 8 (2021), e0256111

[3] Yunpeng Bai, Xintao Wang, Yan-pei Cao, Yixiao Ge, Chun Yuan, and Ying Shan. 2023. Dreamdiffusion: Generating high-quality images from brain eeg signals. arXiv preprint arXiv:2306.16934 (2023)

[5] Hubert Banville, Yohann Benchetrit, Stéphane d’Ascoli, Jérémy Rapin, and Jean-Rémi King. 2025. Scaling laws for decoding images from brain activity. arXiv preprint arXiv:2501.15322 (2025)

[6] Donghong Cai, Junru Chen, Yang Yang, Teng Liu, and Yafeng Li. 2023. Mbrain: A multi-channel self-supervised learning framework for brain signals. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 130–141

[7] Josue Ortega Caro, Antonio H de O Fonseca, Christopher Averill, Syed A Rizvi, Matteo Rosati, James L Cross, Prateek Mittal, Emanuele Zappala, Daniel Levine, Rahul M Dhodapkar, et al. 2023. BrainLM: A foundation model for brain activity recordings. bioRxiv (2023), 2023–09

[8] Xuhang Chen, Baiying Lei, Chi-Man Pun, and Shuqiang Wang. 2023. Brain diffuser: An end-to-end brain image to brain network pipeline. In Chinese Conference on Pattern Recognition and Computer Vision (PRCV). Springer, 16–26

[9] Zijiao Chen, Jiaxin Qing, and Juan Helen Zhou. 2023. Cinematic mindscapes: High-quality video reconstruction from brain activity. Advances in Neural Information Processing Systems 36 (2023), 24841–24858

[10] Zhisheng Chen, Yingwei Zhang, Qizhen Lan, Tianyu Liu, Huacan Wang, Yi Ding, Ziyu Jia, Ronghao Chen, Kun Wang, and Xinliang Zhou. 2025. Uni-NTFM: A Unified Foundation Model for EEG Signal Representation Learning. arXiv preprint arXiv:2509.24222 (2025)

[11] Wenhui Cui, Woojae Jeong, Philipp Thölke, Takfarinas Medani, Karim Jerbi, Anand A Joshi, and Richard M Leahy. 2024. Neuro-gpt: Towards a foundation model for eeg. In 2024 IEEE International Symposium on Biomedical Imaging (ISBI). IEEE, 1–5

[12] Sicheng Dai, Hongwang Xiao, Shan Yu, and Qiwei Ye. 2026. Autoregressive Visual Decoding from EEG Signals. arXiv preprint arXiv:2602.22555 (2026)

[13] Alexandru Dimofte, Glenn Anta Bucagu, Thorir Mar Ingolfsson, Xiaying Wang, Andrea Cossettini, Luca Benini, and Yawei Li. 2025. Cerebro: Compact encoder for representations of brain oscillations using efficient alternating attention. arXiv preprint arXiv:2501.10885 (2025)

[14] Ruo-Nan Duan, Jia-Yi Zhu, and Bao-Liang Lu. 2013. Differential entropy feature for EEG-based emotion classification. In 6th International IEEE/EMBS Conference on Neural Engineering (NER). IEEE, 81–84

[15] Zitao Fang, Chenxuan Li, Hongting Zhou, Shuyang Yu, Guodong Du, Ashwaq Qasem, Yang Lu, Jing Li, Junsong Zhang, and Sim Kuan Goh. 2025. Neuript: Foundation model for neural interfaces. arXiv preprint arXiv:2510.16548 (2025)

[16] Zigang Geng, Yibing Wang, Yeyao Ma, Chen Li, Yongming Rao, Shuyang Gu, Zhao Zhong, Qinglin Lu, Han Hu, Xiaosong Zhang, et al. 2025. X-omni: Reinforcement learning makes discrete autoregressive image generative models great again. arXiv preprint arXiv:2507.22058 (2025)

[17] Amir Harati, Meysam Golmohammadi, Silvia Lopez, Iyad Obeid, and Joseph Picone. 2015. Improved EEG event classification using differential energy. In 2015 IEEE Signal Processing in Medicine and Biology Symposium (SPMB). IEEE, 1–4

[18] Shuai Huang, Yongxiong Wang, Huan Luo, Haodong Jing, Chendong Qin, and Jingqun Tang. 2025. MINDEV: Multi-modal Integrated Diffusion Framework for Video Reconstruction from EEG Signals. In Proceedings of the 33rd ACM International Conference on Multimedia. 3350–3359

[19] Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola. 2024. The platonic representation hypothesis. arXiv preprint arXiv:2405.07987 (2024)

[20] Wei-Bang Jiang, Xuan-Hao Liu, Wei-Long Zheng, and Bao-Liang Lu. 2025. SEED-VII: A Multimodal Dataset of Six Basic Emotions With Continuous Labels for Emotion Recognition. IEEE Transactions on Affective Computing 16, 2 (2025), 969–985. doi:10.1109/TAFFC.2024.3485057

[21] Wei-Bang Jiang, Yansen Wang, Bao-Liang Lu, and Dongsheng Li. 2024. NeuroLM: A universal multi-task foundation model for bridging the gap between language and EEG signals. arXiv preprint arXiv:2409.00101 (2024)

[22] Wei-Bang Jiang, Li-Ming Zhao, and Bao-Liang Lu. 2024. Large brain model for learning generic representations with tremendous EEG data in BCI. arXiv preprint arXiv:2405.18765 (2024)

[23] Jin Jing, Wendong Ge, Shenda Hong, Marta Bento Fernandes, Zhen Lin, Chaoqi Yang, Sungtae An, Aaron F Struck, Aline Herlopian, Ioannis Karakis, et al. 2023. Development of expert-level classification of seizures and rhythmic and periodic patterns during EEG interpretation. Neurology 100, 17 (2023), e1750–e1762

[24] Isaak Kavasidis, Simone Palazzo, Concetto Spampinato, Daniela Giordano, and Mubarak Shah. 2017. Brain2image: Converting brain signals into images. In Proceedings of the 25th ACM international conference on Multimedia. 1809–1817

[25] Jonathan W Kim, Ahmed Alaa, and Danilo Bernardo. 2024. EEG-GPT: exploring capabilities of large language models for EEG classification and interpretation. arXiv preprint arXiv:2401.18006 (2024)

[26] Demetres Kostas, Stephane Aroca-Ouellette, and Frank Rudzicz. 2021. BENDR: Using transformers and a contrastive self-supervised learning task to learn from massive amounts of EEG data. Frontiers in Human Neuroscience 15 (2021), 653659

[27] Black Forest Labs, Stephen Batifol, Andreas Blattmann, Frederic Boesel, Saksham Consul, Cyril Diagne, Tim Dockhorn, Jack English, Zion English, Patrick Esser, Sumith Kulal, Kyle Lacey, Yam Levi, Cheng Li, Dominik Lorenz, Jonas Müller, Dustin Podell, Robin Rombach, Harry Saini, Axel Sauer, and Luke Smith. 2025. FLUX.1 Kontext: Flow Matching for In-Context ...

[28] Yu-Ting Lan, Kan Ren, Yansen Wang, Wei-Long Zheng, Dongsheng Li, Bao-Liang Lu, and Lili Qiu. 2023. Seeing through the brain: image reconstruction of visual perception from human brain signals. arXiv preprint arXiv:2308.02510 (2023)

[29] Hongli Li, Man Ding, Ronghua Zhang, and Chunbo Xiu. 2022. Motor imagery EEG classification algorithm based on CNN-LSTM feature fusion network. Biomedical signal processing and control 72 (2022), 103342

[30] Chenyu Liu, Yuqiu Deng, Tianyu Liu, Jinan Zhou, Xinliang Zhou, Ziyu Jia, and Yi Ding. 2025. ECHO: Toward Contextual Seq2Seq Paradigms in Large EEG Models. arXiv preprint arXiv:2509.22556 (2025)

[31] Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. 2023. Visual instruction tuning. Advances in neural information processing systems 36 (2023), 34892–34916

[32] Xuan-Hao Liu, Yan-Kai Liu, Yansen Wang, Kan Ren, Hanwen Shi, Zilong Wang, Dongsheng Li, Bao-Liang Lu, and Wei-Long Zheng. 2024. EEG2video: Towards decoding dynamic visual perception from EEG signals. Advances in Neural Information Processing Systems 37 (2024), 72245–72273

[33] Xuan-Hao Liu, Bao-Liang Lu, and Wei-Long Zheng. 2025. Eegmirror: Leveraging eeg data in the wild via montage-agnostic self-supervision for eeg to video decoding. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 18273–18283

[34] Weiheng Lu, Chunfeng Song, Jiamin Wu, Pengyu Zhu, Yuchen Zhou, Weijian Mai, Qihao Zheng, and Wanli Ouyang. 2025. UniMind: Unleashing the Power of LLMs for Unified Multi-Task Brain Decoding. arXiv preprint arXiv:2506.18962 (2025)

[35] Fei Ma, Han Lin, Yifan Xie, Hongwei Ren, Xiaoyu Shen, Wenbo Ding, and Qi Tian. 2026. E²-LLM: Bridging Neural Signals and Interpretable Affective Analysis. arXiv preprint arXiv:2601.07877 (2026)

[37] Wei Yan Peh, Yuanyuan Yao, and Justin Dauwels. 2022. Transformer convolutional neural networks for automated artifact detection in scalp EEG. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC). IEEE, 3599–3602

[38] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. 2021. Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748–8763

[40] Yonghao Song, Xueyu Jia, Lie Yang, and Longhan Xie. 2021. Transformer-based spatial-temporal feature learning for EEG decoding. arXiv preprint arXiv:2106.11170 (2021)

[41] Concetto Spampinato, Simone Palazzo, Isaak Kavasidis, Daniela Giordano, Nasim Souly, and Mubarak Shah. 2017. Deep learning human mind for automated visual classification. In Proceedings of the IEEE conference on computer vision and pattern recognition. 6809–6817

[42] Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M Dai, Anja Hauth, Katie Millican, et al. 2023. Gemini: a family of highly capable multimodal models. arXiv preprint arXiv:2312.11805 (2023)

[43] Keyu Tian, Yi Jiang, Zehuan Yuan, Bingyue Peng, and Liwei Wang. 2024. Visual autoregressive modeling: Scalable image generation via next-scale prediction. Advances in neural information processing systems 37 (2024), 84839–84865

[44] Michael Tschannen, Alexey Gritsenko, Xiao Wang, Muhammad Ferjad Naeem, Ibrahim Alabdulmohsin, Nikhil Parthasarathy, Talfan Evans, Lucas Beyer, Ye Xia, Basil Mustafa, et al. 2025. Siglip 2: Multilingual vision-language encoders with improved semantic understanding, localization, and dense features. arXiv preprint arXiv:2502.14786 (2025)

[45] Christopher Wang, Vighnesh Subramaniam, Adam Uri Yaari, Gabriel Kreiman, Boris Katz, Ignacio Cases, and Andrei Barbu. 2023. BrainBERT: Self-supervised representation learning for intracranial recordings. arXiv preprint arXiv:2302.14367 (2023)

[46] Guangyu Wang, Wenchao Liu, Yuhong He, Cong Xu, Lin Ma, and Haifeng Li. 2024. Eegpt: Pretrained transformer for universal and reliable representation of eeg signals. Advances in Neural Information Processing Systems 37 (2024), 39249–39280

[48] Jiquan Wang, Sha Zhao, Zhiling Luo, Yangxuan Zhou, Haiteng Jiang, Shijian Li, Tao Li, and Gang Pan. 2024. Cbramod: A criss-cross brain foundation model for eeg decoding. arXiv preprint arXiv:2412.07236 (2024)

[49] Peng Wang, Shuai Bai, Sinan Tan, Shijie Wang, Zhihao Fan, Jinze Bai, Keqin Chen, Xuejing Liu, Jialin Wang, Wenbin Ge, et al. 2024. Qwen2-vl: Enhancing vision-language model’s perception of the world at any resolution. arXiv preprint arXiv:2409.12191 (2024)

[50] Chengyue Wu, Xiaokang Chen, Zhiyu Wu, Yiyang Ma, Xingchao Liu, Zizheng Pan, Wen Liu, Zhenda Xie, Xingkai Yu, Chong Ruan, et al. 2024. Janus: Decoupling visual encoding for unified multimodal understanding and generation. arXiv preprint arXiv:2410.13848 (2024)

[51] An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, Guanting Dong, Haoran Wei, Huan Lin, Jialong Tang, Jialin Wang, Jian Yang, Jianhong Tu, Jianwei Zhang, Jianxin Ma, Jin Xu, Jingren Zhou, Jinze Bai, Jinzheng He, Junyang Lin, Kai Dang, Keming Lu, Keqin Chen, Kexin Yang, Mei Li, Mingfen...

[52] Chaoqi Yang, M Westover, and Jimeng Sun. 2023. Biot: Biosignal transformer for cross-data learning in the wild. Advances in Neural Information Processing Systems 36 (2023), 78240–78260

[53] Chaoqi Yang, Cao Xiao, M Brandon Westover, and Jimeng Sun. 2023. Self-supervised electroencephalogram representation learning for automatic sleep staging: model development and evaluation study. JMIR AI 2, 1 (2023), e46769

[54] Yifan Yang, Yutong Mao, Xufu Liu, and Xiao Liu. 2024. Brainmae: a region-aware self-supervised learning framework for brain signals. arXiv preprint arXiv:2406.17086 (2024)

[55] Ke Yi, Yansen Wang, Kan Ren, and Dongsheng Li. 2023. Learning topology-agnostic EEG representations with geometry-aware modeling. Advances in Neural Information Processing Systems 36 (2023), 53875–53891

[56] Zhizhang Yuan, Fanqi Shen, Meng Li, Yuguo Yu, Chenhao Tan, and Yang Yang. 2024. Brainwave: A brain signal foundation model for clinical applications. arXiv preprint arXiv:2402.10251 (2024)

[58] Tongtian Yue, Shuning Xue, Xuange Gao, Yepeng Tang, Longteng Guo, Jie Jiang, and Jing Liu. 2024. Eegpt: Unleashing the potential of eeg generalist foundation model by autoregressive pre-training. arXiv preprint arXiv:2410.19779 (2024)

[59] Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, and Lucas Beyer. 2023. Sigmoid loss for language image pre-training. In Proceedings of the IEEE/CVF international conference on computer vision. 11975–11986

[60] Daoze Zhang, Zhizhang Yuan, Yang Yang, Junru Chen, Jingjing Wang, and Yafeng Li. 2023. Brant: Foundation model for intracranial neural signal. Advances in Neural Information Processing Systems 36 (2023), 26304–26321

[61] Xiang Zhang, Ziyuan Zhao, Theodoros Tsiligkaridis, and Marinka Zitnik. 2022. Self-supervised contrastive pre-training for time series via time-frequency consistency. Advances in neural information processing systems 35 (2022), 3988–4003

[62] W. Zheng, W. Liu, Y. Lu, B. Lu, and A. Cichocki. 2018. EmotionMeter: A Multimodal Framework for Recognizing Human Emotions. IEEE Transactions on Cybernetics (2018), 1–13. doi:10.1109/TCYB.2018.2797176

[63] Wei-Long Zheng and Bao-Liang Lu. 2015. Investigating Critical Frequency Bands and Channels for EEG-based Emotion Recognition with Deep Neural Networks. IEEE Transactions on Autonomous Mental Development 7, 3 (2015), 162–175. doi:10.1109/TAMD.2015.2431497