A comprehensive evaluation of pretraining strategies for channel-agnostic contrastive self-supervision of biosignals
Pith reviewed 2026-05-25 09:03 UTC · model grok-4.3
The pith
Random channel subsets as positive pairs enable effective channel-agnostic contrastive pretraining for biosignals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
CRLC, which treats random subsets of input channels from the same recording as positive pairs, outperforms augmentation and temporal-neighbor strategies for channel-agnostic pretraining. On EEG downstream tasks it surpasses the current state-of-the-art reference model. On the ECG task the reference model remains superior, but incorporating CRLC yields comparable results.
What carries the argument
Contrastive random lead coding (CRLC), the mechanism that constructs positive pairs by randomly selecting subsets of channels from a single recording.
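The pairing step described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `random_channel_views`, the `min_channels` parameter, and the choice to make the two subsets disjoint are all assumptions, since the summary states only that random channel subsets of one recording form a positive pair.

```python
import random

def random_channel_views(recording, rng, min_channels=1):
    """Return two random, disjoint channel subsets of one recording.

    recording: list of channels, each a list of samples.
    rng: a random.Random instance (seed it for reproducibility).
    Disjoint subsets are an assumption of this sketch.
    """
    n_channels = len(recording)
    if n_channels < 2 * min_channels:
        raise ValueError("recording has too few channels for two views")
    order = list(range(n_channels))
    rng.shuffle(order)
    # Draw the size of each view; both keep at least min_channels channels.
    size_a = rng.randint(min_channels, n_channels - min_channels)
    size_b = rng.randint(min_channels, n_channels - size_a)
    idx_a = sorted(order[:size_a])
    idx_b = sorted(order[size_a:size_a + size_b])
    view_a = [recording[i] for i in idx_a]
    view_b = [recording[i] for i in idx_b]
    return view_a, view_b, idx_a, idx_b

# Toy 12-channel recording with 4 samples per channel (placeholder values).
rng = random.Random(0)
recording = [[float(c * 10 + t) for t in range(4)] for c in range(12)]
view_a, view_b, idx_a, idx_b = random_channel_views(recording, rng)
```

The two views keep the full time axis and differ only in which channels they contain, so an encoder trained to align them has to rely on information shared across channels.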
If this is right
- Pretrained models generalize across variable channel counts without retraining or architecture changes.
- EEG downstream tasks reach higher performance than the current state-of-the-art reference model under channel-agnostic conditions.
- ECG downstream performance becomes comparable to the reference model once CRLC is incorporated, although the reference model remains superior.
- Channel-agnostic models become practical for biosignal datasets collected with differing electrode placements.
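One way a model can accept any number of channels is to apply the same feature extractor to every channel and then pool across channels. The sketch below illustrates that idea only. The filter bank, the function names, and the mean pooling are assumptions made for illustration and are not taken from the paper's architecture.

```python
import math

def channel_features(signal, kernels):
    """Apply the same filter bank to one channel and average over time."""
    features = []
    for kernel in kernels:
        width = len(kernel)
        # Valid cross-correlation followed by a ReLU.
        responses = [
            max(0.0, sum(w * signal[t + j] for j, w in enumerate(kernel)))
            for t in range(len(signal) - width + 1)
        ]
        features.append(sum(responses) / len(responses))
    return features

def encode_recording(recording, kernels):
    """Mean-pool per-channel features so the output size ignores channel count."""
    per_channel = [channel_features(channel, kernels) for channel in recording]
    return [
        sum(feats[d] for feats in per_channel) / len(per_channel)
        for d in range(len(kernels))
    ]

kernels = [[1.0, -1.0], [0.5, 0.5], [1.0, 0.0, -1.0]]  # hypothetical filters
two_lead = [[math.sin(0.3 * t + c) for t in range(50)] for c in range(2)]
twelve_lead = [[math.sin(0.3 * t + c) for t in range(50)] for c in range(12)]
emb_two = encode_recording(two_lead, kernels)
emb_twelve = encode_recording(twelve_lead, kernels)
```

Both embeddings have one entry per filter, whether the input has 2 or 12 channels, and the result does not depend on channel order.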
Where Pith is reading between the lines
- The same random-subset pairing could be tested on other multivariate time-series domains where sensor counts differ.
- Pairing CRLC with existing time- or frequency-domain augmentations may produce additive gains.
- Clinical pipelines could adopt a single pretrained backbone usable across multiple recording montages.
Load-bearing premise
That randomly chosen channel subsets from the same recording form positive pairs that reliably share the same underlying physiological information without introducing channel-specific biases or losing task-relevant features.
What would settle it
Fine-tuning accuracy on standard EEG classification benchmarks falls below the prior state-of-the-art reference model when the same fixed-channel inputs are used for both methods.
Original abstract
Contrastive learning yields impressive results for self-supervision in computer vision. The approach relies on the creation of positive pairs, something which is often achieved through augmentations. However, for multivariate time series effective augmentations can be difficult to design. Additionally, the number of input channels for biosignal datasets often varies from application to application, limiting the usefulness of large self-supervised models trained with specific channel configurations. Motivated by these challenges, we set out to investigate strategies for creation of positive pairs for channel-agnostic self-supervision of biosignals. We introduce contrastive random lead coding (CRLC), where random subsets of the input channels are used to create positive pairs and compare with using augmentations and neighboring segments in time as positive pairs. We validate our approach by pre-training models on EEG and ECG data, and then fine-tuning them for downstream tasks. CRLC outperforms competing strategies in both scenarios in the channel-agnostic setting. Notably, for EEG tasks CRLC surpasses the current state-of-the-art reference model. While, the state-of-the-art reference model is superior in the ECG task, incorporating CRLC allows us to obtain comparable results. In conclusion, CRLC helps generalization across variable channel setups when training our channel-agnostic model. The code is available at https://github.com/theabrusch/Multiview_TS_SSL.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Contrastive Random Lead Coding (CRLC) for channel-agnostic contrastive self-supervised pretraining of biosignals. Positive pairs are formed by randomly selecting subsets of channels from the same recording (instead of data augmentations or temporally neighboring segments). Models are pretrained on EEG and ECG corpora, then fine-tuned on downstream tasks; CRLC is reported to outperform the two alternative positive-pair strategies in the channel-agnostic regime and to surpass a published SOTA reference on EEG tasks while remaining competitive on ECG when CRLC is incorporated.
Significance. If the empirical results are robust, the work supplies a practical route to pretrain a single model that can be applied to recordings with arbitrary channel counts, a frequent practical constraint in biosignal research. The public release of code is a clear strength that supports reproducibility. The significance is tempered by the fact that the central modeling assumption (random channel subsets reliably share task-relevant physiological content) receives no direct validation in the reported experiments.
major comments (2)
- [Methods / Experiments] The central empirical claim (CRLC superiority in the channel-agnostic setting) rests on the untested premise that randomly chosen channel subsets constitute valid positive pairs. No ablation on subset size, spatial coverage statistics, or information-loss metrics (e.g., mutual information between subset and full recording for task-relevant features) is presented; this is especially consequential for EEG, where electrodes are spatially localized and focal phenomena can be missed by a random draw.
- [Results] Table or figure reporting downstream performance (EEG and ECG fine-tuning) does not include statistical significance tests across random seeds or cross-validation folds, nor does it report variance when the number of available channels at test time is varied; without these, the reported outperformance of CRLC over augmentation and temporal baselines cannot be assessed for robustness.
minor comments (1)
- [Abstract] Abstract contains a stray comma: 'While, the state-of-the-art reference model is superior…'
Simulated Author's Rebuttal
We are grateful to the referee for their insightful comments, which highlight important aspects for strengthening our work. We provide point-by-point responses to the major comments and indicate where revisions will be made to the manuscript.
- Referee: [Methods / Experiments] The central empirical claim (CRLC superiority in the channel-agnostic setting) rests on the untested premise that randomly chosen channel subsets constitute valid positive pairs. No ablation on subset size, spatial coverage statistics, or information-loss metrics (e.g., mutual information between subset and full recording for task-relevant features) is presented; this is especially consequential for EEG, where electrodes are spatially localized and focal phenomena can be missed by a random draw.
  Authors: The referee correctly identifies that direct validation of the random channel subsets as positive pairs is not provided through ablations or information metrics. We maintain that the empirical results on downstream tasks serve as validation, as CRLC's outperformance indicates effective learning of shared representations. Nevertheless, we will revise the manuscript to include an ablation study varying the subset size and report its impact on performance for both EEG and ECG. We will also discuss the implications for spatial coverage in EEG, acknowledging the potential for missing focal events but emphasizing the benefit for channel-agnostic applications.
  revision: yes
- Referee: [Results] Table or figure reporting downstream performance (EEG and ECG fine-tuning) does not include statistical significance tests across random seeds or cross-validation folds, nor does it report variance when the number of available channels at test time is varied; without these, the reported outperformance of CRLC over augmentation and temporal baselines cannot be assessed for robustness.
  Authors: We concur that adding statistical tests and variance measures will improve the robustness assessment. In the revision, we will include results from multiple random seeds with standard deviations and conduct significance testing for the performance differences. Additionally, we will extend the evaluation to report performance variance across different numbers of channels at test time, providing a more comprehensive view of the channel-agnostic capabilities.
  revision: yes
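The seed-level reporting promised in this response could take roughly the following form. It is a sketch under stated assumptions: the function names are hypothetical, the exact sign-flip permutation test is one reasonable choice among several, and the scores are placeholder values that do not come from the paper.

```python
import itertools
import statistics

def summarize(scores):
    """Mean and sample standard deviation of per-seed scores."""
    return statistics.mean(scores), statistics.stdev(scores)

def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired per-seed scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Under the null hypothesis each paired difference is equally likely
    # to have either sign, so enumerate every sign assignment.
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        total += 1
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)))
        if flipped >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Placeholder per-seed scores for two pretraining strategies (NOT paper results).
crlc_scores = [0.71, 0.73, 0.70, 0.74, 0.72]
baseline_scores = [0.69, 0.70, 0.69, 0.71, 0.70]
crlc_mean, crlc_std = summarize(crlc_scores)
p_value = paired_sign_flip_p_value(crlc_scores, baseline_scores)
```

With only five seeds the smallest achievable two-sided p-value is 2/32, which shows why the number of seeds matters for the robustness claim the referee asks about.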
Circularity Check
Empirical comparison of pretraining strategies; no derivation chain present
full rationale
The paper conducts an empirical evaluation of contrastive pretraining methods (CRLC using random channel subsets, augmentations, and temporal neighbors) on EEG/ECG data, followed by fine-tuning on downstream tasks. No equations, first-principles derivations, or predictions are claimed; performance differences are reported from experiments with released code. The central claim (CRLC superiority in channel-agnostic setting) rests on measured accuracies rather than any reduction to fitted inputs or self-citations. This is a standard self-contained empirical study with no load-bearing circular steps.
Axiom & Free-Parameter Ledger
axioms (2)
- standard math: Contrastive loss pulls embeddings of positive pairs closer than negative pairs.
- domain assumption: Random channel subsets from the same recording share the same underlying signal semantics.
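The first axiom can be made concrete with a small example. The sketch below implements NT-Xent, a widely used contrastive loss, in plain Python. Whether the paper uses this exact loss is not stated in this summary, so the loss, the temperature value, and the function names should be read as assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def nt_xent(views_a, views_b, temperature=0.5):
    """Mean NT-Xent loss; views_a[i] and views_b[i] embed the same recording."""
    embeddings = views_a + views_b
    n = len(views_a)
    total = 0.0
    for i in range(2 * n):
        positive = (i + n) % (2 * n)
        # Similarities to every other embedding; the anchor itself is excluded.
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(2 * n) if j != i
        }
        log_denominator = math.log(sum(math.exp(v) for v in logits.values()))
        total += log_denominator - logits[positive]
    return total / (2 * n)

# Two recordings with 2-D toy embeddings (placeholder values).
aligned = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = nt_xent([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss is lower when each positive pair shares an embedding (`aligned`) than when the pairs are mismatched (`shuffled`), which is the behavior the axiom describes.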