A 1000-hour EEG-EMG-audio dataset of Japanese speech production

Atsushi Yamamoto; Eri Hatakeyama; Ilya Horiguchi; Ippei Fujisawa; Kenichi Tomeoka; Masakazu Inoue; Motoshige Sato; Shuntaro Sasai; Yuya Kita

arxiv: 2606.01264 · v1 · pith:STSBERMEnew · submitted 2026-05-31 · q-bio.NC · cs.HC · cs.SD · eess.AS · eess.SP

Motoshige Sato, Ilya Horiguchi, Masakazu Inoue, Kenichi Tomeoka, Eri Hatakeyama, Yuya Kita, Atsushi Yamamoto, Ippei Fujisawa, Shuntaro Sasai

classification q-bio.NC · cs.HC · cs.SD · eess.AS · eess.SP

keywords dataset · speech · audio · facial · japanese · multimodal · spectral · three

We present a multimodal dataset of 1020 hours of simultaneously recorded scalp electroencephalography (EEG), facial electromyography (EMG), and speech audio from three healthy native Japanese speakers during open-vocabulary overt speech. Recordings were acquired across many sessions over several months with three EEG systems spanning 62-128 channels: an ultra-high-density system (g.Pangolin) and two cap-type systems (g.SCARABEO and eegosports). Each session provides time-synchronized EEG, facial EMG, and audio, together with speech-event annotations and transcriptions. Although collected with speech decoding as a primary motivation, the dataset also supports work on multimodal signal processing, artifact modeling, longitudinal and cross-device adaptation, and EEG representation learning. Technical validation included power spectral density and event-related potential analyses across participants, devices, and tasks, which showed the expected 1/f spectral profile, task-related alpha-band attenuation, and time-locked evoked responses. The dataset is released in Brain Imaging Data Structure (BIDS) format via OpenNeuro under a CC0 waiver to support both speech-related and broader EEG research.
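The spectral checks named in the abstract (a 1/f profile and task-related alpha-band attenuation) can be illustrated with a minimal Python sketch. This is not the authors' validation pipeline: the signal is synthetic, and the 250 Hz sampling rate, the 2-40 Hz fitting range, the 8-13 Hz alpha band, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def pink_noise(n, rng):
    """Synthetic 1/f background: shape white noise so that power falls as 1/f."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    f = np.fft.rfftfreq(n)
    f[0] = f[1]  # avoid division by zero at DC
    x = np.fft.irfft(spectrum / np.sqrt(f), n)
    return (x - x.mean()) / x.std()  # normalise to unit variance

def welch_psd(x, fs, nperseg=1024):
    """Average Hann-windowed periodograms over 50%-overlapping segments."""
    step = nperseg // 2
    window = np.hanning(nperseg)
    scale = fs * np.sum(window ** 2)
    segments = [x[i:i + nperseg] for i in range(0, len(x) - nperseg + 1, step)]
    spectra = [np.abs(np.fft.rfft((s - s.mean()) * window)) ** 2 / scale
               for s in segments]
    freqs = np.fft.rfftfreq(nperseg, d=1.0 / fs)
    return freqs, np.mean(spectra, axis=0)

def spectral_exponent(freqs, psd, fmin=2.0, fmax=40.0):
    """Slope of log10(PSD) against log10(f); close to -1 for a 1/f spectrum."""
    band = (freqs >= fmin) & (freqs <= fmax)
    slope, _ = np.polyfit(np.log10(freqs[band]), np.log10(psd[band]), 1)
    return slope

def band_power(freqs, psd, lo=8.0, hi=13.0):
    """Approximate power in [lo, hi] Hz by summing PSD bins times bin width."""
    band = (freqs >= lo) & (freqs <= hi)
    return np.sum(psd[band]) * (freqs[1] - freqs[0])

fs = 250.0                      # assumed sampling rate, not taken from the dataset
n = int(60 * fs)                # one minute of synthetic signal
t = np.arange(n) / fs
rng = np.random.default_rng(0)

background = pink_noise(n, rng)
# Hypothetical "rest" and "task" channels: same kind of background,
# with a weaker 10 Hz alpha rhythm during the task.
rest = pink_noise(n, rng) + 1.0 * np.sin(2 * np.pi * 10.0 * t)
task = pink_noise(n, rng) + 0.3 * np.sin(2 * np.pi * 10.0 * t)

freqs, psd_bg = welch_psd(background, fs)
slope = spectral_exponent(freqs, psd_bg)
rest_alpha = band_power(*welch_psd(rest, fs))
task_alpha = band_power(*welch_psd(task, fs))

print(f"spectral exponent of background: {slope:.2f}")
print(f"alpha power, rest vs task: {rest_alpha:.3f} vs {task_alpha:.3f}")
```

On real recordings the same functions would be applied per channel to the EEG samples loaded from the BIDS files; the alpha peak would normally be excluded from, or modelled separately in, the slope fit.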
