The Wisdom of a Crowd of Brains: A Universal Brain Encoder

Amit Zalcher; Michal Irani; Navve Wasserman; Roman Beliy

arxiv: 2406.12179 · v4 · pith:7BV4HKZPnew · submitted 2024-06-18 · 💻 cs.CV

The Wisdom of a Crowd of Brains: A Universal Brain Encoder

Roman Beliy , Navve Wasserman , Amit Zalcher , Michal Irani This is my paper

classification 💻 cs.CV

keywords brain-voxelbraindataencodersubjectsarchitecturebrainscross-attention

0 comments

read the original abstract

Image-to-fMRI encoding is important for both neuroscience research and practical applications. However, such "Brain-Encoders" have been typically trained per-subject and per fMRI-dataset, thus restricted to very limited training data. In this paper we propose a Universal Brain-Encoder, which can be trained jointly on data from many different subjects/datasets/machines. What makes this possible is our new voxel-centric Encoder architecture, which learns a unique "voxel-embedding" per brain-voxel. Our Encoder trains to predict the response of each brain-voxel on every image, by directly computing the cross-attention between the brain-voxel embedding and multi-level deep image features. This voxel-centric architecture allows the functional role of each brain-voxel to naturally emerge from the voxel-image cross-attention. We show the power of this approach to (i) combine data from multiple different subjects (a "Crowd of Brains") to improve each individual brain-encoding, (ii) quick & effective Transfer-Learning across subjects, datasets, and machines (e.g., 3-Tesla, 7-Tesla), with few training examples, and (iii) use the learned voxel-embeddings as a powerful tool to explore brain functionality (e.g., what is encoded where in the brain).

This paper has not been read by Pith yet.

discussion (0)

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain
cs.CV 2026-05 unverdicted novelty 7.0

BrainCause recovers known visual localizations and finds new candidate representations by validating causal specificity via counterfactual stimuli and encoding models, showing activation alone produces many false positives.
A foundation model of vision, audition, and language for in-silico neuroscience
q-bio.NC 2026-05 unverdicted novelty 7.0

TRIBE v2 is a multimodal AI model that predicts human brain activity more accurately than linear encoding models and recovers established neuroscientific findings through in-silico testing.
Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding
cs.LG 2026-04 unverdicted novelty 6.0

A meta-optimized in-context learning approach enables training-free cross-subject semantic visual decoding from fMRI by inferring individual neural encoding patterns via hierarchical inference on a few examples.
ViBE: Visual-to-M/EEG Brain Encoding via Spatio-Temporal VAE and Distribution-Aligned Projection
cs.CV 2026-04 unverdicted novelty 4.0

ViBE generates M/EEG signals from visual stimuli by reconstructing neural responses with a TSC-VAE and aligning CLIP image features to its latent space via Q-Former, MSE, and sliced Wasserstein losses.