What Are We Actually Decoding? Source Attribution for Non-Invasive Brain-to-Language Retrieval

Alexandra Woolgar; Lihui Wang; Runhao Lu; Sichao Liu; Xinyu Zhang

arXiv: 2605.24524 · v1 · pith:ZFM6GCUHnew · submitted 2026-05-23 · 💻 cs.LG · cs.CL · q-bio.NC

Xinyu Zhang, Sichao Liu, Runhao Lu, Alexandra Woolgar, Lihui Wang

Pith reviewed 2026-06-30 14:45 UTC · model grok-4.3

classification 💻 cs.LG · cs.CL · q-bio.NC

keywords brain-to-language decoding · MEG retrieval · source attribution · neural decoding audit · context bias · structural confounds

The pith

Brain-to-language retrieval performance must be broken down by source instead of reported as a single number.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that apparent success in retrieving audio from MEG signals can arise from non-neural factors such as signal duration or embedding properties rather than stimulus-evoked brain activity. It introduces an auditing approach that isolates three distinct sources: structural shortcuts, window-level neural evidence, and cross-window contextual aggregation. Fixed-duration windows and stimulus-identity splits drive signal-blind baselines to chance while leaving measurable discriminability in real MEG data. An additive logit bias called Group Context Bias then demonstrates that pooling sentence-consistent evidence across windows raises retrieval scores, but only when local evidence is present and under controls that rule out random grouping effects.

Core claim

By recasting MEG-to-audio retrieval as a source-auditing task, the work separates performance into structural shortcuts that collapse under fixed-duration and identity-split controls, window-level stimulus-locked evidence that remains measurable once those controls are applied, and cross-window contextual aggregation that can be isolated with an inference-time additive logit bias whose effect vanishes under random perturbations or when local evidence is weak.

What carries the argument

An auditing framework that decomposes apparent retrieval performance into structural shortcuts, window-level stimulus-locked evidence, and cross-window contextual aggregation, with Group Context Bias serving as the measurable intervention for the contextual component.

If this is right

Structural factors such as variable signal length can produce high retrieval scores in the absence of any neural signal.
Once structural shortcuts are removed, sentence-level competition remains the main remaining bottleneck even when window-level evidence is present.
Contextual pooling improves scores only when genuine local evidence exists and disappears under random grouping or weak local signals.
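
The first point above can be illustrated with a toy sketch. It assumes a hypothetical "decoder" that looks only at input length; the function names (`make_noise_query`, `length_only_retrieve`, `rank1`) and all sizes are illustrative and are not taken from the paper.

```python
import random

def make_noise_query(n_samples, rng):
    """Signal-blind query: Gaussian noise with the same length as the recording."""
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

def length_only_retrieve(query, candidate_lengths):
    """Hypothetical shortcut decoder: pick the candidate with the closest duration."""
    return min(range(len(candidate_lengths)),
               key=lambda i: abs(candidate_lengths[i] - len(query)))

def rank1(queries, targets, candidate_lengths):
    """Fraction of queries whose top-1 candidate is the target."""
    hits = sum(length_only_retrieve(q, candidate_lengths) == t
               for q, t in zip(queries, targets))
    return hits / len(queries)

rng = random.Random(0)
targets = list(range(20))

# Variable-length setting: each candidate sentence has its own duration.
var_lengths = [100 + 10 * i for i in range(20)]
var_queries = [make_noise_query(n, rng) for n in var_lengths]
r1_variable = rank1(var_queries, targets, var_lengths)

# Fixed-duration setting: every window has the same length (value is arbitrary here).
fix_lengths = [360] * 20
fix_queries = [make_noise_query(360, rng) for _ in range(20)]
r1_fixed = rank1(fix_queries, targets, fix_lengths)

print(r1_variable, r1_fixed)
```

In this toy setting the noise queries carry no stimulus information, yet the length-only rule retrieves perfectly when durations differ and falls to 1 in 20 once all windows share one length.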

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same source-separation logic could be applied to EEG or fMRI language decoding to check whether similar non-neural contributions are present.
Benchmark reporting standards might shift toward requiring explicit attribution of gains to one of the three sources.
If contextual aggregation proves robust, future decoders could incorporate explicit sentence-level models to target that source directly.

Load-bearing premise

Fixed-duration windows, stimulus-identity splits, and random-grouping perturbations are enough to isolate the three performance sources without residual confounds from unmodeled interactions between structural and neural components.

What would settle it

If signal-blind noise still retrieves well above chance after the fixed-duration and identity-split controls are applied, the separation into neural versus structural sources would be shown to be incomplete.

Figures

Figures reproduced from arXiv: 2605.24524 by Alexandra Woolgar, Lihui Wang, Runhao Lu, Sichao Liu, Xinyu Zhang.

**Figure 1.** Three performance sources in stimulus-locked M/EEG-to-audio retrieval. (a) Representative structural shortcut. Variable-duration inputs provide one concrete example of a structural shortcut: when neural activity is replaced by duration-preserving Gaussian noise, above-chance decoding indicates reliance on signal length or padding structure rather than neural content. (b) Fixed-duration, stimulus-locked M/…

**Figure 2.** Diagnostics for evaluation artefacts and contextual headroom. (a) Under variable-length sentence inputs, duration cues support next-token ranking even with signal-blind neural input; fixed-length windows remove this shortcut, collapsing the noise baseline towards chance and revealing the weaker neural signal under leakage control. (b) Global closed-pool retrieval errors concentrate in cross-sentence compet…

**Figure 3.** Grouping perturbation robustness. Baseline retrieval is invariant to perturbations. With GCB enabled, neighbour_once perturbations can slightly improve performance, whereas random_within_story perturbations monotonically reduce performance as p increases. Additional operating-regime characterisations. Appendices E.5 and E.2 show that the contextual effect is bounded by frozen-logit quality: gains concen…

**Figure 4.** How the Base–GCB contrast varies with sentence length on Gwilliams zero-shot retrieval. Sentence length is measured by the number of evaluation windows in the sentence. The analysis should be interpreted as an operating-regime characterisation rather than as a causal estimate of sentence length alone, because sentence length is coupled with the number of query windows, candidate-bucket size, within…

**Figure 5.** Contextual gain tracks available local evidence under within-Gwilliams attenuation. (a) Real-MEG stimulus-mismatched attenuation. Each attenuated query mixes the original MEG window with a stimulus-mismatched real MEG window from the same evaluation set. As the target-evidence fraction α decreases, base retrieval, +GCB retrieval, and the contextual gain all collapse. (b) Evidence attenuation across surroga…

**Figure 6.** Brennan EEG retrieval performance across candidate-pool sizes. Absolute R@1 and MRR improve as the candidate pool shrinks, but ∆R@1 = GCB − baseline remains negative across pool sizes for both backbones. This supports interpreting Brennan as an out-of-distribution evidence-limited regime: reducing pool difficulty raises absolute retrieval scores, but does not create reliable sentence-bucket support for con…

**Figure 7.** Score-space visualisation of GCB while keeping embeddings fixed. Left: A t-SNE visualisation of a query and its top candidates before and after GCB. The visualisation illustrates that GCB does not modify the embedding geometry; it only changes the ranking through a post-hoc logit correction. Right: Retrieval-logit distributions for positive and negative candidates before and after GCB. The positive candida…

**Figure 8.** Learned spatial mixing weights (qualitative). Top: Gwilliams; bottom: MOUS. Left: Dense-TCNN; right: exp-dilated CNN. Colours indicate relative spatial mixing weight in the shared coordinate-conditioned frontend (red: high; blue: low). E.12 Token-level case study of GCB-induced retrieval changes: We provide a token-level case study of how GCB modifies retrieval-decoded reconstructions on Gwilliams. Each que…
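
The attenuation described in the Figure 5 caption mixes a target MEG window with a stimulus-mismatched one, controlled by a target-evidence fraction α. The exact mixing rule is not given on this page; the sketch below assumes a simple convex combination, and the function name `attenuate` is hypothetical.

```python
def attenuate(target_window, mismatched_window, alpha):
    """Assumed mixing rule: alpha * target + (1 - alpha) * mismatched, sample by sample.

    alpha = 1.0 keeps the original window; alpha = 0.0 replaces it entirely
    with the stimulus-mismatched window.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(target_window) != len(mismatched_window):
        raise ValueError("windows must have the same (fixed) duration")
    return [alpha * x + (1.0 - alpha) * y
            for x, y in zip(target_window, mismatched_window)]

# Toy single-channel windows (illustrative values only).
target = [1.0, 2.0, 3.0]
mismatch = [3.0, 2.0, 1.0]

full_evidence = attenuate(target, mismatch, 1.0)
half_evidence = attenuate(target, mismatch, 0.5)
no_evidence = attenuate(target, mismatch, 0.0)
print(full_evidence, half_evidence, no_evidence)
```

Under this assumption, sweeping α from 1 to 0 gradually removes the stimulus-locked component while keeping the query a plausible recording rather than synthetic noise.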

Original abstract

In non-invasive neural language decoding, results can be inflated by sources that are not stimulus-evoked neural evidence: decoder priors, embedding-based metrics, and non-neural structural nuisances such as signal duration. The methodological challenge is therefore attribution: a reported gain is more informative when it can be traced to a specific source. We recast stimulus-locked MEG-to-audio retrieval as an auditing framework that separates apparent performance into three sources - structural shortcuts, window-level stimulus-locked evidence, and cross-window contextual aggregation - and provides a diagnostic for each. Signal-blind Gaussian noise reaches 66.3% Rank@1 (R@1) under variable-length decoding but collapses to near chance once fixed-duration windows and stimulus-identity splits are enforced, isolating structural leakage. Under these controls, fixed-window retrieval recovers measurable MEG-audio discriminability, while an oracle sentence-bucket diagnostic shows that 95.7% of Top-1 errors select the wrong sentence, localising the residual bottleneck to sentence-level competition. We audit this contextual source with Group Context Bias (GCB), an inference-time additive logit bias that pools sentence-consistent evidence across windows while leaving the base retrieval scores and candidate pool fixed. Used as a score-space intervention, GCB makes the contextual source measurable: R@1 shifts from 44% to 52% on Gwilliams and from 22% to 29% on MOUS under the same fixed setting. GCB is auditable under this design: its effect collapses under random-grouping perturbations and vanishes when local evidence is attenuated in MEG or is near chance in EEG, supporting its use as a controlled source-attribution intervention. These results suggest that brain-to-language performance should be source-attributed, not merely reported.
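
The abstract describes GCB as an inference-time additive logit bias that pools sentence-consistent evidence across windows while leaving the base scores and candidate pool fixed. The paper's exact formula is not reproduced on this page, so the following is only a sketch under stated assumptions: per-query evidence for a sentence bucket is taken as the maximum logit in that bucket, pooling is a mean over query windows in the same group, and `lam` is a hypothetical bias weight.

```python
def group_context_bias(logits, query_group, cand_bucket, lam=1.0):
    """Return biased logits; the input logits are not modified.

    logits[q][c]   : base retrieval score of candidate c for query window q
    query_group[q] : group (e.g. sentence) that query window q belongs to
    cand_bucket[c] : sentence bucket of candidate c
    """
    n_q, n_c = len(logits), len(cand_bucket)
    buckets = sorted(set(cand_bucket))
    # Assumed per-query evidence for a bucket: best candidate logit in it.
    per_query = [
        {b: max(logits[q][c] for c in range(n_c) if cand_bucket[c] == b)
         for b in buckets}
        for q in range(n_q)
    ]
    # Assumed pooling: mean evidence over all query windows in the same group.
    pooled = {}
    for g in set(query_group):
        members = [q for q in range(n_q) if query_group[q] == g]
        pooled[g] = {b: sum(per_query[q][b] for q in members) / len(members)
                     for b in buckets}
    # Additive bias in score space; the candidate pool is unchanged.
    return [
        [logits[q][c] + lam * pooled[query_group[q]][cand_bucket[c]]
         for c in range(n_c)]
        for q in range(n_q)
    ]

def argmax(row):
    return max(range(len(row)), key=row.__getitem__)

# Two query windows from one sentence; candidates 0-1 are bucket 0, 2-3 bucket 1.
base = [[2.0, 0.5, 0.1, 0.0],   # target 0: already correct
        [0.2, 1.0, 1.2, 0.1]]   # target 1: base top-1 is candidate 2 (wrong sentence)
biased = group_context_bias(base, query_group=[0, 0], cand_bucket=[0, 0, 1, 1])
print(argmax(base[1]), argmax(biased[1]))
```

In this toy case the strong bucket-0 evidence from the first window is enough to flip the second window's top-1 from a wrong-sentence candidate to its target, which is the kind of cross-window effect the abstract attributes to contextual aggregation.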

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper gives a workable auditing setup to separate structural leakage, local neural evidence, and contextual aggregation in MEG/EEG retrieval, backed by noise baselines and perturbation checks that mostly hold up.

The main thing to know is that a lot of reported brain-to-language retrieval numbers come from non-neural sources like variable signal length and decoder priors, and this work supplies a three-way split plus a controlled intervention to measure each piece.

They treat the task as source separation. Gaussian noise hits 66.3% R@1 with variable lengths but drops to near chance once fixed windows and stimulus-identity splits are applied, which cleanly flags the structural shortcut. Under those controls they recover real window-level MEG-audio discriminability. The oracle sentence-bucket check then shows that 95.7% of top-1 errors are wrong-sentence picks, pointing to the contextual bottleneck.
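
One way to read the wrong-sentence statistic is as a simple error classification. The sketch below assumes that reading; the paper's oracle sentence-bucket diagnostic may be defined differently, and `wrong_sentence_error_fraction` is a hypothetical name.

```python
def wrong_sentence_error_fraction(logits, targets, cand_bucket):
    """Among top-1 errors, the fraction whose prediction lies in another sentence."""
    errors = 0
    wrong_sentence = 0
    for row, target in zip(logits, targets):
        pred = max(range(len(row)), key=row.__getitem__)
        if pred != target:
            errors += 1
            if cand_bucket[pred] != cand_bucket[target]:
                wrong_sentence += 1
    # Undefined when there are no errors at all.
    return wrong_sentence / errors if errors else float("nan")

# Toy example: candidates 0-1 belong to sentence 0, candidates 2-3 to sentence 1.
toy_logits = [[0.9, 0.1, 0.0, 0.0],   # correct
              [0.2, 0.1, 0.8, 0.0],   # error, wrong sentence
              [0.0, 0.0, 0.3, 0.7]]   # error, right sentence but wrong window
toy_targets = [0, 1, 2]
fraction = wrong_sentence_error_fraction(toy_logits, toy_targets, [0, 0, 1, 1])
print(fraction)
```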

Group Context Bias is the new diagnostic: an inference-time logit bias that pools sentence-consistent evidence across windows while keeping the base scores and candidate set fixed. It lifts R@1 from 44% to 52% on Gwilliams and 22% to 29% on MOUS. The effect disappears under random grouping and when local evidence is weak or absent, which gives the attribution some external grounding.
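
The random-grouping control can be pictured as corrupting the group labels that the contextual bias relies on. The Figure 3 caption names a `random_within_story` perturbation with rate p but this page does not define it; the sketch below is a guess at its behaviour, and the function and argument names are hypothetical.

```python
import random

def perturb_groups_within_story(query_group, query_story, p, seed=0):
    """With probability p, reassign a query window to a different group from the same story."""
    rng = random.Random(seed)
    groups_by_story = {}
    for g, s in zip(query_group, query_story):
        groups_by_story.setdefault(s, set()).add(g)
    perturbed = []
    for g, s in zip(query_group, query_story):
        others = sorted(groups_by_story[s] - {g})
        # Only perturb when the story offers an alternative group.
        if others and rng.random() < p:
            perturbed.append(rng.choice(others))
        else:
            perturbed.append(g)
    return perturbed

groups = [0, 0, 1, 1, 2, 2]
stories = ["a", "a", "a", "a", "b", "b"]
unchanged = perturb_groups_within_story(groups, stories, p=0.0)
fully_perturbed = perturb_groups_within_story(groups, stories, p=1.0)
print(unchanged, fully_perturbed)
```

If pooled evidence is what drives the gain, feeding increasingly corrupted labels into the bias should erode it, while the base scores, which never use the labels, stay the same.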

The soft spot is the possibility of unmodeled interactions between duration-dependent features and neural topography, which Gaussian noise might not fully replicate. The paper's random perturbations and MEG-vs-EEG checks reduce that risk, but the claim that fixed windows plus identity splits isolate everything still rests on those controls being exhaustive. Minor, but worth a methods check.

This is for labs doing non-invasive decoding who want to stop reporting raw numbers that mix sources. It deserves peer review because the interventions are falsifiable and the numbers are concrete enough to evaluate.

Referee Report

1 major / 1 minor

Summary. The manuscript proposes an auditing framework for non-invasive MEG/EEG-to-language retrieval that decomposes reported performance into three sources: structural shortcuts (isolated via signal-blind Gaussian noise, fixed-duration windows, and stimulus-identity splits), window-level stimulus-locked neural evidence, and cross-window contextual aggregation (measured via the Group Context Bias or GCB intervention). It reports concrete diagnostics including noise collapsing to chance under controls, an oracle showing 95.7% of top-1 errors are sentence-level, and GCB lifts of 8 and 7 R@1 points on Gwilliams and MOUS that vanish under random-grouping perturbations or when local evidence is weak.
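
For readers less familiar with the metrics, R@1 and MRR can be computed from a score matrix as sketched below. This is a generic illustration, not the paper's evaluation code; ties are resolved optimistically here, which is an assumption.

```python
def target_ranks(logits, targets):
    """1-based rank of each target: one plus the number of strictly higher-scored candidates."""
    return [1 + sum(score > row[t] for score in row)
            for row, t in zip(logits, targets)]

def recall_at_1(logits, targets):
    ranks = target_ranks(logits, targets)
    return sum(r == 1 for r in ranks) / len(ranks)

def mean_reciprocal_rank(logits, targets):
    ranks = target_ranks(logits, targets)
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy scores for three queries over three candidates.
scores = [[0.9, 0.1, 0.0],   # target 0 -> rank 1
          [0.2, 0.5, 0.3],   # target 2 -> rank 2
          [0.1, 0.7, 0.4]]   # target 0 -> rank 3
gold = [0, 2, 0]
r1 = recall_at_1(scores, gold)
mrr = mean_reciprocal_rank(scores, gold)
print(r1, mrr)
```

The reported lifts are differences of such R@1 values in percentage points: 52 − 44 = 8 on Gwilliams and 29 − 22 = 7 on MOUS.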

Significance. If the separation holds, the framework supplies a practical, perturbation-auditable toolkit for source attribution in brain-to-language decoding, directly supporting the claim that gains should be traced rather than aggregated. The external grounding via independent perturbations and cross-modality (MEG/EEG) checks is a strength that reduces circularity risk.

major comments (1)

[GCB evaluation and perturbation results] The central attribution claim (that GCB isolates contextual aggregation) rests on the assumption that fixed-duration windows plus stimulus-identity splits eliminate all structural-neural interactions. However, the manuscript does not report explicit tests for residual confounds such as duration-dependent neural feature correlations or sentence-bucket statistics interacting with MEG topography; without these, the observed R@1 shifts (44% to 52% on Gwilliams) could partly reflect unmodeled structural leakage rather than pure context.

minor comments (1)

[Methods] Provide the precise algorithmic definition and hyperparameter settings for the stimulus-identity splits and the oracle sentence-bucket diagnostic to support replication.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback and recommendation for major revision. We address the single major comment below.

Referee: The central attribution claim (that GCB isolates contextual aggregation) rests on the assumption that fixed-duration windows plus stimulus-identity splits eliminate all structural-neural interactions. However, the manuscript does not report explicit tests for residual confounds such as duration-dependent neural feature correlations or sentence-bucket statistics interacting with MEG topography; without these, the observed R@1 shifts (44% to 52% on Gwilliams) could partly reflect unmodeled structural leakage rather than pure context.

Authors: We appreciate this concern regarding potential residual confounds. The stimulus-identity splits are constructed such that each unique stimulus appears in only one partition, which by design prevents sentence-bucket statistics from influencing cross-split evaluation. Fixed-duration windows further eliminate duration as a confounding variable. While the manuscript does not include separate correlation analyses between neural features and duration or MEG topography, the signal-blind Gaussian noise control (which preserves all original duration, topography, and any associated correlations) collapses to near-chance performance under the identical fixed-window and identity-split regime. This provides evidence that unmodeled structural-neural interactions are not driving the base retrieval scores or the GCB-induced lifts. We will expand the discussion section to explicitly address these controls and their implications for residual confounds.

revision: partial
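
The split property invoked in the rebuttal, that each unique stimulus appears in only one partition, can be sketched as a grouped split over stimulus IDs. The function name, the `test_fraction` default, and the ID format are assumptions for illustration; the paper's actual splitting procedure is not given on this page.

```python
import random

def stimulus_identity_split(stimulus_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no stimulus ID occurs in both partitions."""
    unique = sorted(set(stimulus_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_test = max(1, round(test_fraction * len(unique)))
    test_stimuli = set(unique[:n_test])
    train_idx = [i for i, s in enumerate(stimulus_ids) if s not in test_stimuli]
    test_idx = [i for i, s in enumerate(stimulus_ids) if s in test_stimuli]
    return train_idx, test_idx

# The same stimulus can recur, for example across subjects or repetitions.
ids = ["s1", "s1", "s2", "s3", "s3", "s4", "s5", "s5"]
train_idx, test_idx = stimulus_identity_split(ids)
train_stimuli = {ids[i] for i in train_idx}
test_stimuli = {ids[i] for i in test_idx}
print(sorted(train_stimuli), sorted(test_stimuli))
```

Splitting by stimulus rather than by recording means a model cannot score well at test time simply by having seen the same audio paired with another recording during training.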

Circularity Check

0 steps flagged

No significant circularity; derivation chain relies on independent controls and perturbations

full rationale

The paper separates performance sources using signal-blind Gaussian noise under fixed-duration windows and stimulus-identity splits to isolate structural leakage, then introduces GCB as an additive logit bias at inference time to measure contextual aggregation. Validation occurs via independent perturbations (random grouping, MEG vs EEG attenuation) that cause GCB effects to collapse, without any step reducing to a fitted parameter renamed as prediction or depending on self-citation chains. The framework is self-contained against external benchmarks like chance-level baselines and oracle sentence-bucket diagnostics.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

Ledger constructed from abstract only; full paper may contain additional parameters or assumptions not visible here.

axioms (2)

domain assumption: Fixed-duration windows and stimulus-identity splits isolate structural leakage from neural evidence without introducing new confounds.
Invoked to demonstrate that noise performance collapses and residual MEG-audio discriminability remains.
domain assumption: The oracle sentence-bucket diagnostic and perturbation tests accurately localize the contextual bottleneck.
Used to claim that 95.7% of errors are sentence-level and that GCB specifically measures cross-window aggregation.

invented entities (1)

Group Context Bias (GCB) · no independent evidence
purpose: Inference-time additive logit bias that pools sentence-consistent evidence across windows
Newly introduced controlled intervention to make the contextual source measurable and auditable.

pith-pipeline@v0.9.1-grok · 5869 in / 1531 out tokens · 58968 ms · 2026-06-30T14:45:50.287173+00:00 · methodology


Reference graph

Works this paper leans on

50 extracted references · 16 canonical work pages

[1]

Open vocabulary electroencephalography-to-text decoding and zero-shot sentiment classification

Zhenhailong Wang and Heng Ji. Open vocabulary electroencephalography-to-text decoding and zero-shot sentiment classification. InProceedings of the AAAI Conference on Artificial Intelligence, volume 36, pages 5350–5358, 2022

2022
[2]

DeWave: Discrete encoding of EEG waves for EEG-to-text translation

Yiqun Duan, Jinzhao Zhou, Zhen Wang, Yu-Kai Wang, and Chin-Teng Lin. DeWave: Discrete encoding of EEG waves for EEG-to-text translation. InAdvances in Neural Information Processing Systems, volume 36, pages 9907–9918, 2023

2023
[3]

UniCoRN: Unified cognitive signal reconstruction bridging cognitive signals and human language

Nuwa Xi, Sendong Zhao, Haochun Wang, Chi Liu, Bing Qin, and Ting Liu. UniCoRN: Unified cognitive signal reconstruction bridging cognitive signals and human language. InProceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 2023

2023
[4]

MAD: Multi-alignment MEG-to-text decoding, 2024

Yiqian Yang, Hyejeong Jo, Yiqun Duan, Qiang Zhang, Jinni Zhou, Won Hee Lee, Renjing Xu, and Hui Xiong. MAD: Multi-alignment MEG-to-text decoding, 2024

2024
[5]

BrainECHO: Semantic brain signal decoding through vector-quantized spectrogram reconstruction for whisper-enhanced text generation

Jilong Li, Zhenxi Song, Jiaqi Wang, Meishan Zhang, Honghai Liu, Min Zhang, and Zhiguo Zhang. BrainECHO: Semantic brain signal decoding through vector-quantized spectrogram reconstruction for whisper-enhanced text generation. InFindings of the Association for Computa- tional Linguistics: ACL 2025, pages 2762–2778, 2025. doi: 10.18653/v1/2025.findings-acl.142

work page doi:10.18653/v1/2025.findings-acl.142 2025
[6]

Nature Machine Intelligence , author =

Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, and Felix A. Wichmann. Shortcut learning in deep neural networks.Nature Machine Intelligence, 2(11):665–673, 2020. doi: 10.1038/s42256-020-00257-z

work page doi:10.1038/s42256-020-00257-z 2020
[7]

Assessing the impact of sequence length learn- ing on classification tasks for transformer encoder models.arXiv preprint arXiv:2212.08399, 2022

Jean-Thomas Baillargeon and Luc Lamontagne. Assessing the impact of sequence length learn- ing on classification tasks for transformer encoder models.arXiv preprint arXiv:2212.08399, 2022

work page arXiv 2022
[8]

Evaluating EEG-to-text models through noise-based performance analysis.Scientific Reports, 16(1):350,

Hyejeong Jo, Yiqian Yang, Juhyeok Han, Yiqun Duan, Hui Xiong, and Won Hee Lee. Evaluating EEG-to-text models through noise-based performance analysis.Scientific Reports, 16(1):350,
[9]

doi: 10.1038/s41598-025-29587-x

work page doi:10.1038/s41598-025-29587-x
[10]

Weinberger, and Yoav Artzi

Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. BERTScore: Evaluating text generation with BERT. InProceedings of the 8th International Conference on Learning Representations, 2020

2020
[11]

A fine-grained analysis of BERTScore

Michael Hanna and Ondˇrej Bojar. A fine-grained analysis of BERTScore. InProceedings of the Sixth Conference on Machine Translation, pages 507–517, 2021

2021
[12]

Blauch, Yunan Charles Wu, Ryan Glatt, David A

Geoffrey Brookshire, Jake Kasper, Nicholas M. Blauch, Yunan Charles Wu, Ryan Glatt, David A. Merrill, Spencer Gerrol, Keith J. Yoder, Colin Quirk, and Ché Lucero. Data leakage in deep learning studies of translational EEG.Frontiers in Neuroscience, 18:1373515, 2024

2024
[13]

In: Christodoulopoulos, C., Chakraborty, T., Rose, C., Peng, V

Congchi Yin, Qian Yu, Zhiwei Fang, Changping Peng, and Piji Li. Rethinking cross-subject data splitting for brain-to-text decoding. In Christos Christodoulopoulos, Tanmoy Chakraborty, Carolyn Rose, and Violet Peng, editors,Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 5675–5689, Suzhou, China, November 2025....

work page doi:10.18653/v1/2025 2025
[14]

Decoding speech perception from non-invasive brain recordings.Nature Machine Intelligence, 5(10):1097–1107, 2023

Alexandre Défossez, Charlotte Caucheteux, Jérémy Rapin, Ori Kabeli, and Jean-Rémi King. Decoding speech perception from non-invasive brain recordings.Nature Machine Intelligence, 5(10):1097–1107, 2023

2023
[15]

Towards decoding individual words from non-invasive brain recordings.Nature Communications, 16(1):10521, 2025

Stéphane d’Ascoli, Corentin Bel, Jérémy Rapin, Hubert Banville, Yohann Benchetrit, Christophe Pallier, and Jean-Rémi King. Towards decoding individual words from non-invasive brain recordings.Nature Communications, 16(1):10521, 2025. doi: 10.1038/s41467-025-65499-0

work page doi:10.1038/s41467-025-65499-0 2025
[16]

Meyer, and Andrea E

Sophie Slaats, Hugo Weissbart, Jan-Mathijs Schoffelen, Antje S. Meyer, and Andrea E. Martin. Delta-band neural responses to individual words are modulated by sentence processing.Journal of Neuroscience, 43(26):4867–4883, 2023. 11

2023
[17]

Christian Brodbeck, Shohini Bhattasali, Aura A. L. Cruz Heredia, Philip Resnik, Jonathan Z. Simon, and Ellen Lau. Parallel processing in speech perception with local and global represen- tations of linguistic context.eLife, 11:e72056, 2022

2022
[18]

Raghavan and Lucas C

Vinay S. Raghavan and Lucas C. Parra. Neural encoding of linguistic features during natural sentence reading.iScience, 28(7):112798, 2025. doi: 10.1016/j.isci.2025.112798

work page doi:10.1016/j.isci.2025.112798 2025
[19]

Aligning semantic in brain and language: A curriculum contrastive method for electroencephalography-to-text generation

Xiachong Feng, Xiaocheng Feng, Bing Qin, and Ting Liu. Aligning semantic in brain and language: A curriculum contrastive method for electroencephalography-to-text generation. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31:3874–3883, 2023

2023
[20]

Belt: boot- strapped eeg-to-language training by natural language supervision.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 32:3278–3288, 2024

Jinzhao Zhou, Yiqun Duan, Yu-Cheng Chang, Yu-Kai Wang, and Chin-Teng Lin. Belt: boot- strapped eeg-to-language training by natural language supervision.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 32:3278–3288, 2024

2024
[21]

EEG2TEXT: Open vocabulary EEG-to-text decoding with EEG pre-training and multi-view transformer, 2024

Hanwen Liu, Daniel Hajialigol, Benny Antony, Aiguo Han, and Xuan Wang. EEG2TEXT: Open vocabulary EEG-to-text decoding with EEG pre-training and multi-view transformer, 2024

2024
[22]

Thomas and Pavlick, Ellie and Linzen, Tal

R. Thomas McCoy, Ellie Pavlick, and Tal Linzen. Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference. InProceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 3428–3448. Association for Computational Linguistics, 2019. doi: 10.18653/v1/P19-1334

work page doi:10.18653/v1/p19-1334 2019
[23]

Bowman, and Noah A

Suchin Gururangan, Swabha Swayamdipta, Omer Levy, Roy Schwartz, Samuel R. Bowman, and Noah A. Smith. Annotation artifacts in natural language inference data. InProceedings of the 2018 Conference of the North American Chapter of the Association for Computational Lin- guistics: Human Language Technologies, Volume 2 (Short Papers), pages 107–112. Association...

2018
[24]

Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. InProceedings of the 38th International Conference on Machine Learning, volume 139 ofProceedings of Machine Learning Research, page...

2021
[25]

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InProceedings of the 37th International Conference on Machine Learning, volume 119 ofProceedings of Machine Learning Research, pages 1597–1607. PMLR, 2020

2020
[26]

MEGformer: Enhancing speech decoding from brain activity through extended semantic representations

Maria Boyko, Polina Druzhinina, Georgii Kormakov, Aleksandra Beliaeva, and Maxim Sharaev. MEGformer: Enhancing speech decoding from brain activity through extended semantic representations. InInternational Conference on Medical Image Computing and Computer- Assisted Intervention, pages 281–290, 2024

2024
[27]

The brain’s bitter lesson: Scaling speech decoding with self-supervised learning

Dulhan Jayalath, Gilad Landau, Brendan Shillingford, Mark Woolrich, and Oiwi Parker Jones. The brain’s bitter lesson: Scaling speech decoding with self-supervised learning. In Aarti Singh, Maryam Fazel, Daniel Hsu, Simon Lacoste-Julien, Felix Berkenkamp, Tegan Maharaj, Kiri Wagstaff, and Jerry Zhu, editors,Proceedings of the 42nd International Conference ...

2025
[28]

Towards linguistic neural representation learning and sentence retrieval from electroencephalogram recordings

Jinzhao Zhou, Yiqun Duan, Ziyi Zhao, Yu-Cheng Chang, Yu-Kai Wang, Thomas Do, and Chin-Teng Lin. Towards linguistic neural representation learning and sentence retrieval from electroencephalogram recordings. InProceedings of the 1st International Workshop on Brain- Computer Interfaces (BCI) for Multimedia Understanding, pages 19–28, 2024

2024
[29]

Brain decoding: Toward real-time reconstruction of visual perception, 2023

Yohann Benchetrit, Hubert Banville, and Jean-Rémi King. Brain decoding: Toward real-time reconstruction of visual perception, 2023

2023
[30]

Decoding natural images from eeg for object recognition.arXiv preprint arXiv:2308.13234, 2023

Yonghao Song, Bingchuan Liu, Xiang Li, Nanlin Shi, Yijun Wang, and Xiaorong Gao. Decoding natural images from eeg for object recognition.arXiv preprint arXiv:2308.13234, 2023. 12

work page arXiv 2023
[31]

Ribeiro, Miguel Vasco, Farzaneh Taleb, Mårten Björkman, and Danica Kragic

Nona Rajabi, Antonio H. Ribeiro, Miguel Vasco, Farzaneh Taleb, Mårten Björkman, and Danica Kragic. Human-aligned image models improve visual decoding from the brain. InProceedings of the 42nd International Conference on Machine Learning, volume 267 ofProceedings of Machine Learning Research, pages 51009–51038. PMLR, 2025

2025
[32]

Broderick, Andrew J

Michael P. Broderick, Andrew J. Anderson, Giovanni M. Di Liberto, Michael J. Crosse, and Edmund C. Lalor. Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech.Current Biology, 28(5):803–809.e3, 2018. doi: 10.1016/j.cub.2018.01.080

work page doi:10.1016/j.cub.2018.01.080 2018
[33]

Cormack, Charles L

Gordon V . Cormack, Charles L. A. Clarke, and Stefan Buettcher. Reciprocal rank fusion outperforms condorcet and individual rank learning methods. InProceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, pages 758–759, 2009

2009
[34]

Hu-wbi at bioasq12b phase a: Exploring rank fusion of dense retrievers and re-rankers

O˘guz ¸ Serbetçi, Xing David Wang, and Ulf Leser. Hu-wbi at bioasq12b phase a: Exploring rank fusion of dense retrievers and re-rankers. InProceedings of the Conference and Labs of the Evaluation Forum, Grenoble, France, pages 9–12, 2024

2024
[35]

Enhancing retrieval systems with inference-time logical reasoning

Felix Faltings, Wei Wei, and Yujia Bao. Enhancing retrieval systems with inference-time logical reasoning. In Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher Pilehvar, editors,Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 449–463, Vienna, Austria, July 2025. Assoc...

work page doi:10.18653/v1/2025.acl-short.34 2025
[36]

Luck.An Introduction to the Event-Related Potential Technique

Steven J. Luck.An Introduction to the Event-Related Potential Technique. MIT Press, Cam- bridge, MA, 2 edition, 2014. ISBN 9780262525855

2014
[37]

Anna M. Beres. Time is of the essence: A review of electroencephalography (EEG) and event-related brain potentials (ERPs) in language research.Applied Psychophysiology and Biofeedback, 42(4):247–255, 2017. doi: 10.1007/s10484-017-9371-3

work page doi:10.1007/s10484-017-9371-3 2017
[38]

Introducing meg-masc a high-quality magneto-encephalography dataset for evaluating natural speech processing.Scientific data, 10(1):862, 2023

Laura Gwilliams, Graham Flick, Alec Marantz, Liina Pylkkänen, David Poeppel, and Jean-Rémi King. Introducing meg-masc a high-quality magneto-encephalography dataset for evaluating natural speech processing.Scientific data, 10(1):862, 2023

2023
[39]

wav2vec 2.0: A framework for self-supervised learning of speech representations

Alexei Baevski, Yuhao Zhou, Abdelrahman Mohamed, and Michael Auli. wav2vec 2.0: A framework for self-supervised learning of speech representations. InAdvances in Neural Information Processing Systems, volume 33, pages 12449–12460, 2020

2020
[40]

Unlocking non-invasive brain-to-text

Dulhan Jayalath, Gilad Landau, and Oiwi Parker Jones. Unlocking non-invasive brain-to-text. In2nd Generative AI for Biology Workshop at ICML 2025, 2025. Poster

2025
[41]

Jan-Mathijs Schoffelen, Robert Oostenveld, Ngoc H. L. Lam, Julia Uddén, Annika Hultén, and Peter Hagoort. A 204-subject multimodal neuroimaging dataset to study language processing. Scientific Data, 6(1):17, 2019. doi: 10.1038/s41597-019-0020-y

work page doi:10.1038/s41597-019-0020-y 2019
[42]

Brennan and John T

Jonathan R. Brennan and John T. Hale. Hierarchical structure guides rapid linguistic predictions during naturalistic listening.PLOS ONE, 14(1):e0207741, 2019

2019
[43]

Robust speech recognition via large-scale weak supervision

Alec Radford, Jong Wook Kim, Tao Xu, Greg Brockman, Christine McLeavey, and Ilya Sutskever. Robust speech recognition via large-scale weak supervision. InProceedings of the 40th International Conference on Machine Learning, volume 202 ofProceedings of Machine Learning Research, pages 28492–28518. PMLR, 2023

2023
[44] Samuel Sutton, Michael Braren, Joseph Zubin, and E. Roy John. Evoked-potential correlates of stimulus uncertainty. Science, 150(3700):1187–1188, 1965. doi: 10.1126/science.150.3700.1187
[45] John Polich. Updating P300: An integrative theory of P3a and P3b. Clinical Neurophysiology, 118(10):2128–2148, 2007. doi: 10.1016/j.clinph.2007.04.019
[46] Marta Kutas and Steven A. Hillyard. Reading senseless sentences: Brain potentials reflect semantic incongruity. Science, 207(4427):203–205, 1980. doi: 10.1126/science.7350657
[47] Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, and Thomas Kipf. Object-centric learning with slot attention. Advances in Neural Information Processing Systems, 33:11525–11538, 2020.
[48] Andrew Jaegle, Felix Gimeno, Andy Brock, Oriol Vinyals, Andrew Zisserman, and Joao Carreira. Perceiver: General perception with iterative attention. In International Conference on Machine Learning, pages 4651–4664. PMLR, 2021.
[49] Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katherine Millican, Malcolm Reynolds, et al. Flamingo: A visual language model for few-shot learning. In Advances in Neural Information Processing Systems, volume 35, pages 23716–23736, 2022.

A Data, assets, and preprocessing

A.1 Sentence-...
[50]

the short-eared people didn’t really know the secrets of the long ears and had pulled down all the moai statues and destroyed some of the tablets

For each 3 s audio window, we extract hidden activations from layers 14–18, average across layers, and retain the time-resolved representation. We interpolate or resample along time to match the evaluation temporal resolution T=360, yielding a D×T target with D=1024. Standard waveform preprocessing, including resampling to 16 kHz and waveform normalisati...
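
The target construction described above (average the hidden activations of layers 14–18, then resample along time to T=360) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the function names `average_layers` and `resample_time`, the `[layer][time][feature]` layout, zero-based layer indexing, linear interpolation, and the placeholder sizes (25 layers, 149 frames, 8 features instead of D=1024) are all hypothetical and are not taken from the paper.

```python
import random


def average_layers(hidden_states, first=14, last=18):
    """Mean over layers first..last inclusive; input is [layer][time][feature]."""
    chosen = hidden_states[first:last + 1]
    n_time, n_feat = len(chosen[0]), len(chosen[0][0])
    return [[sum(layer[t][j] for layer in chosen) / len(chosen)
             for j in range(n_feat)] for t in range(n_time)]


def resample_time(frames, target_len=360):
    """Linearly interpolate [time][feature] frames to target_len steps.

    Returns the result feature-major: D rows, each of length target_len.
    Assumes at least two input frames and target_len >= 2.
    """
    n_time, n_feat = len(frames), len(frames[0])
    out = [[0.0] * target_len for _ in range(n_feat)]
    for k in range(target_len):
        # Position of output step k on the input time axis.
        pos = k * (n_time - 1) / (target_len - 1)
        lo = min(int(pos), n_time - 2)
        frac = pos - lo
        for j in range(n_feat):
            out[j][k] = (1.0 - frac) * frames[lo][j] + frac * frames[lo + 1][j]
    return out


random.seed(0)
# Random stand-in for per-layer activations of one 3 s window.
# Sizes are placeholders chosen to keep the demo fast.
fake_hidden = [[[random.gauss(0.0, 1.0) for _ in range(8)]
                for _ in range(149)] for _ in range(25)]
target = resample_time(average_layers(fake_hidden))
print(len(target), len(target[0]))  # rows = features, columns = time steps
```

Averaging before resampling keeps the interpolation cost independent of how many layers are pooled; because both steps are linear, the order should not change the result for linear interpolation.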
