hub
Visual decoding and reconstruction via EEG embeddings with guided diffusion
15 Pith papers cite this work. Polarity classification is still indexing.
verdicts: UNVERDICTED 15
roles: background 3
polarities: background 3
citing papers explorer
- Mind-Omni: A Unified Multi-Task Framework for Brain-Vision-Language Modeling via Discrete Diffusion
  Mind-Omni unifies seven brain-vision-language tasks in one discrete-diffusion framework with a brain tokenizer and a new BQA dataset, claiming SOTA multi-task performance competitive with larger single-task models.
- NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity
  NeuroFlow is the first unified flow model for bidirectional visual encoding and decoding from neural activity using NeuroVAE and cross-modal flow matching.
- MindAlign: Bridging EEG, Vision, and Language for Zero-Shot Visual Decoding
  A tri-modal contrastive learning method for EEG-based zero-shot visual decoding reports 54.1% top-1 accuracy on the Things-EEG2 200-way benchmark, outperforming prior baselines of 32.4%.
- Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction
  CineNeuron improves fMRI-to-video reconstruction by combining bottom-up semantic enrichment with top-down Mixture-of-Memories integration and outperforms prior methods on benchmarks.
- Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding
  A meta-optimized in-context learning approach enables training-free cross-subject semantic visual decoding from fMRI by inferring individual neural encoding patterns via hierarchical inference on a few examples.
- Hypergraph Multi-Modal Learning for EEG-based Emotion Recognition in Conversation
  Hyper-MML integrates EEG, audio, and video using an Adaptive Brain Encoder with Mutual-cross Attention (ABEMA) and Adaptive Hypergraph Fusion Module (AHFM) to outperform prior methods on EAV and AFFEC datasets for conversational emotion recognition.
- MindAU: EEG-Conditioned Facial Action Unit Editing via Dual-Stream Manifold Alignment
  MindAU is a dual-stream manifold alignment system that conditions a multimodal diffusion editor on EEG signals to perform fine-grained, identity-preserving facial action unit edits.
- Brain-to-Image Retrieval and Reconstruction via Multimodal EEG Alignment
  A multimodal alignment pipeline decodes EEG signals recorded during natural image viewing for image retrieval (86.3% Top-1) and image reconstruction (CLIP 0.903).
- Multi-Level Bidirectional Biomimetic Learning for EEG-Based Visual Decoding
  MB2L achieves 80.5% top-1 and 97.6% top-5 accuracy on zero-shot EEG-to-image retrieval by using biomimetic modules and bidirectional contrastive learning to align neural and visual features.
- EEG2Vision: A Multimodal EEG-Based Framework for 2D Visual Reconstruction in Cognitive Neuroscience
  EEG2Vision reconstructs images from EEG using diffusion models plus LLM-guided boosting, with reconstruction quality holding up reasonably as electrode count drops from 128 to 24 channels.
- BRAIN: Bias-Mitigation Continual Learning Approach to Vision-Brain Understanding
  BRAIN uses bias-mitigation continual learning with a new de-bias contrastive loss and angular forgetting mitigation to achieve SOTA performance on vision-brain understanding benchmarks despite brain signal inconsistencies across sessions.
- Dual-Stream EEG Decoding for 3D Visual Perception
  A dual-stream EEG decoder separates identity and orientation to support 3D reconstruction from neural signals via circular regression and conditioned diffusion.
- SUP-MCRL: Subject-aware Unified Pseudo-feature Coded Multimodal Contrastive Representation Learning for EEG Visual Decoding
  SUP-MCRL reports 66.0%/91.9% intra-subject and 24.0%/52.9% LOSO zero-shot top-1/top-5 accuracy on THINGS-EEG by combining semantic visual encoding, multi-scale EEG enhancement, and EMA-updated pseudo-feature augmentation.
- ViBE: Visual-to-M/EEG Brain Encoding via Spatio-Temporal VAE and Distribution-Aligned Projection
  ViBE generates M/EEG signals from visual stimuli by reconstructing neural responses with a TSC-VAE and aligning CLIP image features to its latent space via Q-Former, MSE, and sliced Wasserstein losses.
- Robotic Grasping and Placement Controlled by EEG-Based Hybrid Visual and Motor Imagery
  A hybrid visual-motor imagery EEG decoder controls a robot for grasping and placement at 40% and 63% accuracy, respectively, yielding 21% end-to-end task success in cue-free online use.