Spatial Prediction pretext task learns spatial structure in self-supervised learning by regressing relative position and scale between image views, yielding more structured representations and better generalization.
Learning by reconstruction produces uninformative features for perception
6 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
polarities
background 2representative citing papers
Rhamba uses region-aware masking strategies and hybrid Attention-Mamba models pretrained on ABIDE fMRI data to achieve top AUROC on schizophrenia and ADHD classification tasks while outperforming prior methods.
Alethia is a pretrained audio encoder using continuous embedding prediction and generative flow-matching reconstruction that outperforms existing speech foundation models on voice deepfake tasks with better robustness and zero-shot generalization.
LeJEPA derives an optimal isotropic Gaussian target for embeddings and enforces it via sketched regularization to deliver scalable, heuristics-free self-supervised pretraining with 79% ImageNet linear accuracy on ViT-H/14.
ECG-JEPA applies a joint-embedding predictive architecture with Cross-Pattern Attention to learn semantic representations from unlabeled 12-lead ECG data and reports state-of-the-art results on diagnostic classification, feature extraction, and segmentation.
Authors introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic geometric framework that unifies explanations for representation, computation, and generalization in shallow and deep neural networks.
citing papers explorer
-
Learning to Perceive "Where": Spatial Pretext Tasks for Robust Self-Supervised Learning
Spatial Prediction pretext task learns spatial structure in self-supervised learning by regressing relative position and scale between image views, yielding more structured representations and better generalization.
-
Rhamba: Region-Aware Hybrid Attention-Mamba Framework for Self-Supervised Learning in Resting-State fMRI
Rhamba uses region-aware masking strategies and hybrid Attention-Mamba models pretrained on ABIDE fMRI data to achieve top AUROC on schizophrenia and ADHD classification tasks while outperforming prior methods.
-
Alethia: A Foundational Encoder for Voice Deepfakes
Alethia is a pretrained audio encoder using continuous embedding prediction and generative flow-matching reconstruction that outperforms existing speech foundation models on voice deepfake tasks with better robustness and zero-shot generalization.
-
LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics
LeJEPA derives an optimal isotropic Gaussian target for embeddings and enforces it via sketched regularization to deliver scalable, heuristics-free self-supervised pretraining with 79% ImageNet linear accuracy on ViT-H/14.
-
Learning General Representation of 12-Lead Electrocardiogram with a Joint-Embedding Predictive Architecture
ECG-JEPA applies a joint-embedding predictive architecture with Cross-Pattern Attention to learn semantic representations from unlabeled 12-lead ECG data and reports state-of-the-art results on diagnostic classification, feature extraction, and segmentation.
-
Axiomatizing Neural Networks via Pursuit of Subspaces
Authors introduce the Pursuit of Subspaces (PoS) hypothesis, an axiomatic geometric framework that unifies explanations for representation, computation, and generalization in shallow and deep neural networks.