arxiv: 2410.19842 · v2 · pith:IPBE7VLTnew · submitted 2024-10-21 · 📡 eess.SP · cs.LG

A comprehensive evaluation of pretraining strategies for channel-agnostic contrastive self-supervision of biosignals

Pith reviewed 2026-05-25 09:03 UTC · model grok-4.3

classification 📡 eess.SP cs.LG
keywords: contrastive learning, self-supervised learning, biosignals, EEG, ECG, channel-agnostic, pretraining

The pith

Random channel subsets as positive pairs enable effective channel-agnostic contrastive pretraining for biosignals.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper evaluates methods for forming positive pairs in contrastive self-supervised learning on multivariate biosignals whose channel counts vary across applications. It introduces contrastive random lead coding (CRLC), which treats random subsets of channels drawn from the same recording as positive pairs, and benchmarks this against augmentation-based pairs and time-neighboring segments. After pretraining on EEG and ECG data and fine-tuning on downstream tasks, CRLC yields stronger downstream performance than both alternatives in the channel-agnostic regime. For EEG tasks the approach exceeds the prior state-of-the-art reference model; for ECG tasks the reference model remains stronger, but incorporating CRLC yields comparable results. The result indicates that random channel subsets support generalization across differing sensor configurations without requiring channel-specific model architectures.

Core claim

CRLC, which treats random subsets of input channels from the same recording as positive pairs, outperforms augmentation and temporal-neighbor strategies for channel-agnostic pretraining. On EEG downstream tasks it surpasses the current state-of-the-art reference model; on ECG tasks the reference model remains superior, but incorporating CRLC yields comparable results.

What carries the argument

Contrastive random lead coding (CRLC), the mechanism that constructs positive pairs by randomly selecting subsets of channels from a single recording.
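
The pairing step can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in the linked repository); it only follows the Figure 3 caption, where one view is a random subset of leads, the other view is the remaining leads, and each view keeps at least two channels. The function names `split_leads` and `make_positive_pair`, and the choice to draw the subset size uniformly, are assumptions made for illustration. Plain Python lists are used to keep the example dependency-free.

```python
import random


def split_leads(num_channels, rng, min_per_view=2):
    """Partition channel indices into two disjoint views.

    One view is a random subset of the channels; the other view is
    whatever remains. Both views keep at least `min_per_view` channels.
    Drawing the subset size uniformly is an assumption of this sketch.
    """
    if num_channels < 2 * min_per_view:
        raise ValueError("need at least 2 * min_per_view channels")
    indices = list(range(num_channels))
    rng.shuffle(indices)
    # randint is inclusive on both ends, so both views stay large enough.
    size_first = rng.randint(min_per_view, num_channels - min_per_view)
    return sorted(indices[:size_first]), sorted(indices[size_first:])


def make_positive_pair(window, rng):
    """Build two views from one window given as C channels of T samples."""
    idx1, idx2 = split_leads(len(window), rng)
    view1 = [window[i] for i in idx1]
    view2 = [window[i] for i in idx2]
    return view1, view2
```

Passing a seeded `random.Random` instance as `rng` makes the split reproducible. The two views share a time window but no channels, so any agreement between their embeddings has to come from signal content that is present across leads.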

If this is right

  • Pretrained models generalize across variable channel counts without retraining or architecture changes.
  • EEG downstream tasks reach higher accuracy than the previous reference model under channel-agnostic conditions.
  • ECG downstream performance becomes comparable to the reference model once CRLC is incorporated.
  • Channel-agnostic models become practical for biosignal datasets collected with differing electrode placements.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same random-subset pairing could be tested on other multivariate time-series domains where sensor counts differ.
  • Pairing CRLC with existing time- or frequency-domain augmentations may produce additive gains.
  • Clinical pipelines could adopt a single pretrained backbone usable across multiple recording montages.

Load-bearing premise

That randomly chosen channel subsets from the same recording form positive pairs that reliably share the same underlying physiological information without introducing channel-specific biases or losing task-relevant features.

What would settle it

Fine-tuning accuracy on standard EEG classification benchmarks falls below the prior state-of-the-art reference model when the same fixed-channel inputs are used for both methods.

Figures

Figures reproduced from arXiv: 2410.19842 by Mikkel N. Schmidt, Thea Brüsch, Tommy S. Alstrøm.

Figure 3: Contrastive random lead coding (CRLC) uses different leads from within the same time window to create positive pairs. Given an input window $X_t \in \mathbb{R}^{C \times T_{\text{in}}}$, we create two views by sampling one subset of the input leads and using the remaining input leads to create the other view, i.e. $X_t^1 \in \mathbb{R}^{C_1 \times T_{\text{in}}}$ and $X_t^2 \in \mathbb{R}^{C_2 \times T_{\text{in}}}$, where $C_1, C_2 \geq 2$. This strategy assumes that groups of channels recorded at the same ti…

Figure 4: Example of simulated data designed to fit the contrastive random lead coding setting, but not the contrastive segment coding setting. (Panels: Data, Mixing matrix, Sources.)

Original abstract

Contrastive learning yields impressive results for self-supervision in computer vision. The approach relies on the creation of positive pairs, something which is often achieved through augmentations. However, for multivariate time series effective augmentations can be difficult to design. Additionally, the number of input channels for biosignal datasets often varies from application to application, limiting the usefulness of large self-supervised models trained with specific channel configurations. Motivated by these challenges, we set out to investigate strategies for creation of positive pairs for channel-agnostic self-supervision of biosignals. We introduce contrastive random lead coding (CRLC), where random subsets of the input channels are used to create positive pairs and compare with using augmentations and neighboring segments in time as positive pairs. We validate our approach by pre-training models on EEG and ECG data, and then fine-tuning them for downstream tasks. CRLC outperforms competing strategies in both scenarios in the channel-agnostic setting. Notably, for EEG tasks CRLC surpasses the current state-of-the-art reference model. While, the state-of-the-art reference model is superior in the ECG task, incorporating CRLC allows us to obtain comparable results. In conclusion, CRLC helps generalization across variable channel setups when training our channel-agnostic model. The code is available at https://github.com/theabrusch/Multiview_TS_SSL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes Contrastive Random Lead Coding (CRLC) for channel-agnostic contrastive self-supervised pretraining of biosignals. Positive pairs are formed by randomly selecting subsets of channels from the same recording (instead of data augmentations or temporally neighboring segments). Models are pretrained on EEG and ECG corpora, then fine-tuned on downstream tasks; CRLC is reported to outperform the two alternative positive-pair strategies in the channel-agnostic regime and to surpass a published SOTA reference on EEG tasks while remaining competitive on ECG when CRLC is incorporated.

Significance. If the empirical results are robust, the work supplies a practical route to pretrain a single model that can be applied to recordings with arbitrary channel counts, a frequent practical constraint in biosignal research. The public release of code is a clear strength that supports reproducibility. The significance is tempered by the fact that the central modeling assumption (random channel subsets reliably share task-relevant physiological content) receives no direct validation in the reported experiments.

major comments (2)
  1. [Methods / Experiments] The central empirical claim (CRLC superiority in the channel-agnostic setting) rests on the untested premise that randomly chosen channel subsets constitute valid positive pairs. No ablation on subset size, spatial coverage statistics, or information-loss metrics (e.g., mutual information between subset and full recording for task-relevant features) is presented; this is especially consequential for EEG, where electrodes are spatially localized and focal phenomena can be missed by a random draw.
  2. [Results] Table or figure reporting downstream performance (EEG and ECG fine-tuning) does not include statistical significance tests across random seeds or cross-validation folds, nor does it report variance when the number of available channels at test time is varied; without these, the reported outperformance of CRLC over augmentation and temporal baselines cannot be assessed for robustness.
minor comments (1)
  1. [Abstract] Abstract contains a stray comma: 'While, the state-of-the-art reference model is superior…'

Simulated Author's Rebuttal

2 responses · 0 unresolved

We are grateful to the referee for their insightful comments, which highlight important aspects for strengthening our work. We provide point-by-point responses to the major comments and indicate where revisions will be made to the manuscript.

  1. Referee: [Methods / Experiments] The central empirical claim (CRLC superiority in the channel-agnostic setting) rests on the untested premise that randomly chosen channel subsets constitute valid positive pairs. No ablation on subset size, spatial coverage statistics, or information-loss metrics (e.g., mutual information between subset and full recording for task-relevant features) is presented; this is especially consequential for EEG, where electrodes are spatially localized and focal phenomena can be missed by a random draw.

    Authors: The referee correctly identifies that direct validation of the random channel subsets as positive pairs is not provided through ablations or information metrics. We maintain that the empirical results on downstream tasks serve as validation, as CRLC's outperformance indicates effective learning of shared representations. Nevertheless, we will revise the manuscript to include an ablation study varying the subset size and report its impact on performance for both EEG and ECG. We will also discuss the implications for spatial coverage in EEG, acknowledging the potential for missing focal events but emphasizing the benefit for channel-agnostic applications. revision: yes

  2. Referee: [Results] Table or figure reporting downstream performance (EEG and ECG fine-tuning) does not include statistical significance tests across random seeds or cross-validation folds, nor does it report variance when the number of available channels at test time is varied; without these, the reported outperformance of CRLC over augmentation and temporal baselines cannot be assessed for robustness.

    Authors: We concur that adding statistical tests and variance measures will improve the robustness assessment. In the revision, we will include results from multiple random seeds with standard deviations and conduct significance testing for the performance differences. Additionally, we will extend the evaluation to report performance variance across different numbers of channels at test time, providing a more comprehensive view of the channel-agnostic capabilities. revision: yes
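
The seed-level reporting promised in this response could look roughly like the sketch below. It is an illustration, not the authors' analysis: the helper names `summarize` and `paired_sign_flip_p_value` are hypothetical, and the exact sign-flip permutation test is one reasonable choice for paired per-seed scores, not a method named in the paper or the rebuttal.

```python
import itertools
import statistics


def summarize(scores):
    """Return the mean and sample standard deviation across seeds."""
    return statistics.mean(scores), statistics.stdev(scores)


def paired_sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis that both methods perform equally, each
    per-seed difference is equally likely to be positive or negative,
    so every sign pattern is enumerated and compared to the observed sum.
    """
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs))
    patterns = list(itertools.product((1, -1), repeat=len(diffs)))
    at_least_as_extreme = sum(
        1
        for signs in patterns
        # Small tolerance guards against floating-point rounding.
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12
    )
    return at_least_as_extreme / len(patterns)
```

With n paired seeds the smallest attainable two-sided p-value is 2 / 2^n, so five seeds cannot go below 0.0625. That bound is worth keeping in mind when deciding how many seeds to run.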

Circularity Check

0 steps flagged

Empirical comparison of pretraining strategies; no derivation chain present

The paper conducts an empirical evaluation of contrastive pretraining methods (CRLC using random channel subsets, augmentations, and temporal neighbors) on EEG/ECG data, followed by fine-tuning on downstream tasks. No equations, first-principles derivations, or predictions are claimed; performance differences are reported from experiments with released code. The central claim (CRLC superiority in channel-agnostic setting) rests on measured accuracies rather than any reduction to fitted inputs or self-citations. This is a standard self-contained empirical study with no load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard contrastive learning objective and the empirical assumption that random channel subsets preserve semantic content. No free parameters are fitted inside the reported claim itself.

axioms (2)
  • standard math: Contrastive loss pulls embeddings of positive pairs closer together than those of negative pairs.
    Invoked in the description of contrastive self-supervision.
  • domain assumption: Random channel subsets from the same recording share the same underlying signal semantics.
    Core premise for treating them as positive pairs in CRLC.
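
The first axiom can be made concrete with a small sketch of a one-directional InfoNCE-style loss. This page does not state which contrastive loss the paper uses, so the cosine similarity, the temperature value, and the function names `cosine` and `info_nce` are assumptions for illustration only. Embeddings are plain lists of floats, and zero-norm vectors are not handled.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(z1, z2, temperature=0.1):
    """Mean InfoNCE loss over a batch of paired embeddings.

    (z1[i], z2[i]) is treated as the positive pair; every z2[j] with
    j != i serves as a negative for z1[i].
    """
    n = len(z1)
    total = 0.0
    for i in range(n):
        logits = [cosine(z1[i], z2[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n
```

The loss is small when each embedding in `z1` is most similar to its own partner in `z2`, and large when it is more similar to another sample's partner. Under CRLC, `z1` and `z2` would be the encoder outputs for the two channel-subset views.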

pith-pipeline@v0.9.0 · 5795 in / 1209 out tokens · 36282 ms · 2026-05-25T09:03:13.032572+00:00 · methodology
