CIPHER: Conformer-based Inference of Phonemes from High-density EEG

Varshith Madishetty

arXiv: 2604.02362 · v1 · submitted 2026-03-22 · cs.CL · cs.AI · cs.SD

Pith reviewed 2026-05-15 07:02 UTC · model grok-4.3

keywords: EEG speech decoding · phoneme classification · conformer model · ERP features · DDA coefficients · brain-computer interface · LOSO validation · CVC phonemes

The pith

High-density EEG supports binary articulatory decoding but shows limited fine-grained discriminability for 11-class CVC phonemes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CIPHER, a dual-pathway model that processes high-density EEG through ERP features and broadband DDA coefficients to infer phonemes. Binary articulatory tasks reach near-ceiling accuracy yet remain vulnerable to confounds such as acoustic onset timing and TMS-target blocking. On the core 11-class CVC phoneme task with full leave-one-subject-out (LOSO) validation across 16 held-out participants, word error rates (WER) on real words stay high at 0.671 for the ERP pathway and 0.688 for the DDA pathway. The authors therefore frame the contribution as a controlled benchmark and feature-comparison study rather than a ready EEG-to-text decoder, restricting any neural-representation claims to the confound-controlled evidence.
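To make the leave-one-subject-out protocol concrete, the following is a minimal sketch of how such folds can be generated. The function name `loso_folds` and the toy subject labels are hypothetical and not taken from the paper; the sketch only illustrates the general idea that every trial from one participant is held out while the model is trained on all remaining participants.

```python
from collections import defaultdict

def loso_folds(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    subject_ids: one subject label per trial (hypothetical input format).
    """
    by_subject = defaultdict(list)
    for idx, subj in enumerate(subject_ids):
        by_subject[subj].append(idx)
    for held_out in sorted(by_subject):
        test_idx = by_subject[held_out]
        # Training data: every trial that does NOT belong to the held-out subject.
        train_idx = [i for s, idxs in by_subject.items() if s != held_out for i in idxs]
        yield held_out, sorted(train_idx), test_idx

# Toy example: 3 hypothetical subjects with 2 trials each.
subjects = ["s01", "s01", "s02", "s02", "s03", "s03"]
folds = list(loso_folds(subjects))
for held_out, train_idx, test_idx in folds:
    print(held_out, "train:", train_idx, "test:", test_idx)
```

With 16 held-out participants, this scheme would give 16 folds, each producing one per-subject score.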

Core claim

CIPHER achieves near-ceiling performance on binary articulatory tasks from high-density EEG but substantially lower performance on the primary 11-class CVC phoneme task under full Study 2 LOSO validation, indicating limited fine-grained neural discriminability and positioning the work as a benchmark study whose claims are constrained to confound-controlled evidence.

What carries the argument

Dual-pathway Conformer model that combines ERP features with broadband DDA coefficients for phoneme inference from scalp EEG.
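As a rough illustration of what "DDA coefficients" can mean, delay differential analysis fits a low-order model that relates a signal's time derivative to delayed copies of the signal and uses the fitted coefficients as features. The sketch below assumes an illustrative three-term model, dx/dt ≈ a1·x(t−τ1) + a2·x(t−τ2) + a3·x(t−τ1)², fitted by ordinary least squares. The model form, the delays, and the function names are assumptions made for this example; the paper's actual DDA model and broadband filtering are not specified in this summary.

```python
import math

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def dda_coefficients(x, tau1, tau2, dt=1.0):
    """Least-squares fit of dx/dt ~ a1*x(t-tau1) + a2*x(t-tau2) + a3*x(t-tau1)**2.

    The three-term model is an illustrative assumption, not the paper's.
    """
    start = max(tau1, tau2, 1)
    rows, targets = [], []
    for t in range(start, len(x) - 1):
        deriv = (x[t + 1] - x[t - 1]) / (2.0 * dt)  # central-difference derivative
        x1, x2 = x[t - tau1], x[t - tau2]
        rows.append([x1, x2, x1 * x1])
        targets.append(deriv)
    # Normal equations (A^T A) a = A^T y for the three coefficients.
    k = 3
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Aty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear(AtA, Aty)

# Toy single-channel signal; real use would loop over channels and time windows.
signal = [math.sin(0.3 * t) for t in range(200)]
a1, a2, a3 = dda_coefficients(signal, tau1=1, tau2=2)
print(a1, a2, a3)
```

In a feature pipeline of this kind, the coefficient triple per channel and window would be stacked into the input sequence for the downstream model.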

If this is right

Binary articulatory features can be decoded at high accuracy from EEG when confounds are tightly controlled.
Specific CVC phoneme distinctions remain difficult to resolve at scale in current scalp recordings.
ERP and DDA pathways yield comparable but still limited results, supporting their use mainly for coarse feature comparison.
Any future claims about neural speech representations must be restricted to evidence obtained under similar confound controls.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The results suggest that practical EEG-based speech interfaces may need to target broader articulatory categories rather than individual phonemes.
Combining scalp EEG with higher-resolution modalities could test whether the current limits are due to spatial blurring or signal quality.
Larger multi-site datasets might reveal whether the performance ceiling rises once subject variability is better sampled.

Load-bearing premise

The observed performance gap between binary and 11-class tasks, together with the noted confound vulnerabilities in binary tasks, is taken to demonstrate limited fine-grained discriminability without further controls or larger validation.
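One point worth keeping in mind when reading this gap is that the two task types have different chance levels: 1/2 for a balanced binary task and 1/11 (about 0.091) for a balanced 11-class task. The sketch below computes the theoretical chance level and a binomial significance threshold for a given trial count. The trial count of 200 and the assumption of balanced classes are hypothetical and do not come from the paper.

```python
from math import comb

def chance_threshold(n_trials, n_classes, alpha=0.05):
    """Smallest accuracy k/n whose binomial tail P(X >= k) under chance is below alpha.

    Assumes balanced classes and independent trials. Returns None if no
    number of correct trials reaches significance.
    """
    p = 1.0 / n_classes
    tail = 1.0  # P(X >= 0)
    for k in range(n_trials + 1):
        if tail < alpha:
            return k / n_trials
        # Move from P(X >= k) to P(X >= k + 1).
        tail -= comb(n_trials, k) * p**k * (1 - p) ** (n_trials - k)
    return None

n = 200  # hypothetical number of test trials
for n_classes in (2, 11):
    print(n_classes, "classes: chance =", round(1.0 / n_classes, 4),
          "threshold =", chance_threshold(n, n_classes))
```

Comparing each task's score to its own threshold, rather than comparing raw accuracies across tasks, is one way to put the binary and 11-class results on a common footing.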

What would settle it

A replication showing substantially lower word error rates on the same 11-class CVC task under identical LOSO validation and confound controls would falsify the claim of limited fine-grained discriminability.
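For readers comparing replications, the following sketch shows word error rate as it is commonly defined in speech recognition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. How the paper tokenizes CVC items into "words" is not stated in this summary, so whitespace tokenization and the example strings here are assumptions.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,               # deletion
                           cur[j - 1] + 1,            # insertion
                           prev[j - 1] + (r != h)))   # substitution or match
        prev = cur
    return prev[-1] / len(ref)

# Hypothetical example: one substitution in a three-word reference.
print(word_error_rate("bat cat dog", "bat hat dog"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.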

Figures

Figures reproduced from arXiv: 2604.02362 by Varshith Madishetty.

**Figure 2.** WER comparison across ERP/DDA and real/pseudoword conditions.

**Figure 3.** Matched-split baseline accuracy heatmap.

**Figure 4.** Real vs. pseudoword lexicality effect (non-significant).

**Figure 5.** TMS ANOVA p-values by feature and place stratum. Primary-Supporting Evidence: Completed Must-Have Controls. We ran the pre-registered must-have controls on Study 2 NULL-condition LOSO folds (16 held-out subjects): (i) EEG-only NULL control, (ii) acoustic-only baseline, and (iii) wideband ERP control (0.5–100 Hz; …

**Figure 6.** NULL-only EEG controls versus acoustic-only controls.

**Figure 7.** Time-window masking and block-aware permutation controls.

**Figure 8.** Multi-seed ablation matrix (mean validation accuracy).

**Figure 9.** Ablation deltas relative to full CIPHER across feature-task settings.

Original abstract

Decoding speech information from scalp EEG remains difficult due to low SNR and spatial blurring. We present CIPHER (Conformer-based Inference of Phonemes from High-density EEG Representations), a dual-pathway model using (i) ERP features and (ii) broadband DDA coefficients. On OpenNeuro ds006104 (24 participants, two studies with concurrent TMS), binary articulatory tasks reach near-ceiling performance but are highly confound-vulnerable (acoustic onset separability and TMS-target blocking). On the primary 11-class CVC phoneme task under full Study 2 LOSO (16 held-out subjects), performance is substantially lower (real-word WER: ERP 0.671 ± 0.080, DDA 0.688 ± 0.096), indicating limited fine-grained discriminability. We therefore position this work as a benchmark and feature-comparison study rather than an EEG-to-text system, and we constrain neural-representation claims to confound-controlled evidence.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CIPHER, a dual-pathway Conformer model that extracts phoneme information from high-density EEG using ERP features in one pathway and broadband DDA coefficients in the other. On binary articulatory tasks from OpenNeuro ds006104 it reports near-ceiling performance, while on the primary 11-class CVC phoneme task under full Study-2 LOSO (16 held-out subjects) it obtains real-word WERs of 0.671 ± 0.080 (ERP) and 0.688 ± 0.096 (DDA). The authors interpret the performance gap as evidence of limited fine-grained neural discriminability, position the work as a benchmark and feature-comparison study rather than an EEG-to-text system, and restrict neural-representation claims to confound-controlled evidence.

Significance. If the empirical results and confound controls hold, the paper supplies a reproducible, public-dataset benchmark that usefully quantifies the gap between binary and multi-class phoneme decoding from scalp EEG. The explicit LOSO protocol, standard-deviation reporting, and cautious framing around confounds are strengths that could help calibrate expectations in the field.
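As a small illustration of the mean ± standard deviation reporting mentioned here, the sketch below summarizes per-fold WER values into that format. The per-fold numbers are made up for the example, and whether the paper reports sample or population standard deviation is not stated in this summary; the sketch assumes the sample standard deviation.

```python
from statistics import mean, stdev

def summarize_folds(per_fold_wer):
    """Return (mean, sample standard deviation) across LOSO folds."""
    return mean(per_fold_wer), stdev(per_fold_wer)

def format_summary(per_fold_wer):
    """Format fold scores as 'mean ± sd' with three decimals."""
    m, s = summarize_folds(per_fold_wer)
    return f"{m:.3f} ± {s:.3f}"

# Hypothetical per-subject WER values, one per held-out fold.
toy_wer = [0.6, 0.7, 0.8]
print(format_summary(toy_wer))
```

With one score per held-out subject, the standard deviation mainly reflects between-subject variability rather than training noise.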

major comments (2)

[Abstract and Results (11-class CVC task)] The claim that the observed WERs demonstrate 'limited fine-grained discriminability' assumes the dual-pathway Conformer is sufficiently expressive. No capacity ablations, training-curve diagnostics, or comparisons against stronger baselines (deeper Conformer, raw-waveform encoder, or non-linear SVM on identical features) are reported; therefore the performance numbers could equally reflect model or feature limitations rather than an absence of neural information.
[Methods (Study 2 LOSO protocol)] While binary tasks are flagged as confound-vulnerable (acoustic onset, TMS blocking), the manuscript does not detail the specific confound controls applied to the 11-class CVC task or quantify residual confound leakage. This weakens the assertion that the 11-class results constitute 'confound-controlled evidence' of limited discriminability.

minor comments (2)

[Methods] Notation for DDA coefficients is introduced without an explicit equation or reference to the precise broadband filter bank; a short methods subsection or appendix equation would improve reproducibility.
[Figures] Figure captions for the LOSO performance plots should state the exact number of subjects, folds, and whether any post-hoc subject exclusions occurred.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. We address each major comment below and indicate where revisions will be incorporated to improve clarity and rigor.

Referee: [Abstract and Results (11-class CVC task)] The claim that the observed WERs demonstrate 'limited fine-grained discriminability' assumes the dual-pathway Conformer is sufficiently expressive. No capacity ablations, training-curve diagnostics, or comparisons against stronger baselines (deeper Conformer, raw-waveform encoder, or non-linear SVM on identical features) are reported; therefore the performance numbers could equally reflect model or feature limitations rather than an absence of neural information.

Authors: We acknowledge the validity of this observation. The dual-pathway Conformer was selected for its established capacity in sequence modeling of time-series data, and its near-ceiling performance on the binary articulatory tasks provides evidence that the model can extract available information when present. Nevertheless, without explicit capacity ablations or comparisons to stronger baselines, we cannot definitively separate model limitations from neural information limits. In the revised manuscript we will add a dedicated limitations paragraph noting this caveat and framing the reported WERs as what the current architecture and features achieve, that is, an upper bound on the error attainable from these recordings rather than a measure of the information they contain. We will also recommend that future work include such ablations. Revision: partial.

Referee: [Methods (Study 2 LOSO protocol)] While binary tasks are flagged as confound-vulnerable (acoustic onset, TMS blocking), the manuscript does not detail the specific confound controls applied to the 11-class CVC task or quantify residual confound leakage. This weakens the assertion that the 11-class results constitute 'confound-controlled evidence' of limited discriminability.

Authors: We agree that explicit documentation of confound controls for the 11-class task is required to support the 'confound-controlled evidence' framing. The full LOSO protocol (16 held-out subjects) removes subject-specific confounds, and the choice of ERP and broadband DDA features was intended to reduce acoustic-onset leakage relative to raw waveforms. In the revised Methods section we will expand the description of these controls, including how TMS blocking was handled via the study design and feature extraction. We will also add any available post-hoc estimates of residual leakage derived from our existing analyses. Revision: yes.

Circularity Check

0 steps flagged

No significant circularity: empirical held-out results

The paper reports standard machine-learning performance numbers (WER on 11-class CVC phoneme classification) obtained via leave-one-subject-out cross-validation on 16 held-out subjects from OpenNeuro ds006104. These are direct empirical measurements on unseen data rather than any derivation, equation, or fitted parameter that reduces to its own inputs by construction. No self-citations, uniqueness theorems, ansatzes, or renamings of known results are used to support the central claim; the interpretation of lower 11-class performance as evidence of limited discriminability is an empirical conclusion open to falsification by stronger models or larger cohorts. The work explicitly frames itself as a benchmark study with confound-controlled evidence, keeping the derivation chain self-contained and non-circular.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The work relies on standard assumptions of EEG signal processing and deep learning generalization; no new entities are postulated. Free parameters include typical neural network hyperparameters not detailed in the abstract.

free parameters (1)

Conformer hyperparameters
Model architecture details such as layer counts and attention heads are not specified in the abstract. They are design choices selected or tuned by the authors rather than weights learned during training.

axioms (1)

Domain assumption: LOSO cross-validation on 16 held-out subjects provides an unbiased estimate of generalization.
Invoked when reporting primary task performance.
