hub Mixed citations

A simple framework for contrastive learning of visual representations

Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton · 2020

Mixed citation behavior. The most common role is method, at 50% of classified citations (5 of 10).

23 Pith papers citing it

Method 50% of classified citations

citation-role summary

method 5 · background 4 · baseline 1
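The 50% figure for the method role can be reproduced from the counts in this summary. The snippet below is a minimal sketch, assuming the percentage is taken over the 10 citations that carry a role label here rather than over all 23 citing papers; the variable names are illustrative and not part of the hub's JSON.

```python
# Role counts as listed in the citation-role summary.
role_counts = {"method": 5, "background": 4, "baseline": 1}
total_citing = 23  # citing papers reported by the hub

# Assumption: shares are computed over classified citations only.
classified = sum(role_counts.values())
shares = {role: 100 * n / classified for role, n in role_counts.items()}
top_role = max(role_counts, key=role_counts.get)

print(f"classified: {classified} of {total_citing} citing papers")
for role, pct in shares.items():
    print(f"{role}: {pct:.0f}%")
print(f"most common role: {top_role}")
```

Under that assumption, 13 of the 23 citing papers have no classified role, which is why the share is stated relative to classified citations.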

citation-polarity summary

use method 5 · background 4 · baseline 1

representative citing papers

Embracing Biased Transition Matrices for Complementary-Label Learning with Many Classes

cs.LG · 2026-05-15 · unverdicted · novelty 7.0

BICL uses biased non-uniform transition matrices to generate constrained complementary labels, enabling effective learning and over sevenfold accuracy gains on many-class image datasets.

MaxSketch: Robust Distinct Counting in Streams via Random Projections

stat.ML · 2026-05-15 · unverdicted · novelty 7.0

MaxSketch achieves O~(log n / ε²) memory for (1+ε)-approximate distinct counting in streams with geometric structure via max-linear random projections.

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

cs.CV · 2026-05-13 · unverdicted · novelty 7.0

SpurAudio benchmark shows state-of-the-art few-shot audio classifiers suffer large performance drops when background correlations are disrupted, even in large pretrained models.

Disentangled Sparse Representations for Concept-Separated Diffusion Unlearning

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

SAEParate disentangles sparse representations in diffusion models via contrastive clustering and nonlinear encoding to enable more precise concept unlearning with reduced side effects.

LatentUMM: Dual Latent Alignment for Unified Multimodal Models

cs.CV · 2026-05-18 · unverdicted · novelty 6.0

LatentUMM proposes dual latent alignment at modality and capacity levels plus latent dynamics stabilization to reduce semantic drift and improve consistency in unified multimodal models.

Latent Video Prediction Learns Better World Models

cs.CV · 2026-05-15 · unverdicted · novelty 6.0

Latent prediction video models exhibit a distinct robustness profile across corruption, occlusion, fine-grained discrimination, and temporal sensitivity compared to other self-supervised video models when used as world models.

Revisiting Reinforcement Learning with Verifiable Rewards from a Contrastive Perspective

cs.LG · 2026-05-13 · unverdicted · novelty 6.0 · 2 refs

ConSPO introduces a contrastive sequence-level policy optimization that aligns rollout scores with generation likelihoods via length-normalized log-probabilities and an InfoNCE-style group contrast with curriculum margin to outperform GRPO on LLM math reasoning benchmarks.

FuTCR: Future-Targeted Contrast and Repulsion for Continual Panoptic Segmentation

cs.CV · 2026-05-12 · unverdicted · novelty 6.0 · 2 refs

FuTCR improves new-class panoptic quality by up to 28% in continual panoptic segmentation by discovering future-like regions in background areas and applying targeted contrast and repulsion to restructure representations.

WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms

cs.LG · 2026-05-09 · unverdicted · novelty 6.0

WavesFM uses hierarchical SSL to pretrain a segment encoder on short waveforms followed by a temporal encoder on multi-day sequences, outperforming prior methods on 58 tasks after training on over 12 million hours of data from hundreds of thousands of people.

Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval

cs.CV · 2026-05-08 · unverdicted · novelty 6.0 · 2 refs

DeCIR improves projection-based zero-shot composed image retrieval by decoupling endpoint and semantic transition alignment with separate low-rank adapters merged by LRDM, showing gains on CIRR, CIRCO, FashionIQ, and GeneCIS.

Rapidly deploying on-device eye tracking by distilling visual foundation models

cs.CV · 2026-04-02 · unverdicted · novelty 6.0

DistillGaze reduces median gaze error by 58.62% on a 2000+ participant dataset by distilling foundation models into a 256K-parameter on-device model using synthetic labeled data and unlabeled real data.

MRI-to-CT synthesis using drifting models

eess.IV · 2026-03-30 · unverdicted · novelty 6.0

Drifting models outperform diffusion, CNN, VAE, and GAN baselines in MRI-to-CT synthesis on two pelvis datasets with higher SSIM/PSNR, lower RMSE, and millisecond one-step inference.

CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model

cs.LG · 2025-06-10 · unverdicted · novelty 6.0

CodeBrain introduces a decoupled TFDual-Tokenizer and multi-scale EEGSSM architecture for an EEG foundation model pretrained on a large corpus, claiming strong generalization across eight downstream tasks and ten datasets.

Extending Pretrained 10-Second ECG Foundation Models to Longer Horizons

cs.LG · 2026-05-16 · unverdicted · novelty 5.0

A parameter-efficient plug-in framework adds structurally compatible long-sequence processing and semantically informed temporal modeling to extend pretrained 10-second ECG foundation models to longer variable-length inputs.

Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback

cs.LG · 2026-05-08 · unverdicted · novelty 5.0

SPEAR enables online federated LLM fine-tuning by using feedback-guided self-play to create contrastive pairs trained with maximum likelihood on correct completions and confidence-weighted unlikelihood on incorrect ones, outperforming baselines without ground-truth contexts.

BenchHAR: Benchmarking Self-Supervised Learning for Generalizable Sensor-based Activity Recognition

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

BenchHAR finds that hybrid reconstruction-plus-contrastive SSL with CNN encoders generalizes best for sensor HAR but overall performance on unseen distributions remains unsatisfactory.

ShellfishNet: A Domain-Specific Benchmark for Visual Recognition of Marine Molluscs

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

ShellfishNet is a new benchmark of 8,691 images across 32 mollusc taxa for evaluating vision models on real-world underwater ecological monitoring tasks including robustness to degradation.

Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness

cs.CV · 2026-05-08 · unverdicted · novelty 5.0

Pan-FM learns balanced representations across seven organs by adaptively masking dominant organs during pre-training, yielding stronger disease prediction and missing-organ robustness than single-organ or naive multimodal baselines on UK Biobank.

ConvFormer3D-TAP: Phase/Uncertainty-Aware Front-End Fusion for Cine CMR View Classification Pipelines

cs.CV · 2026-04-13 · unverdicted · novelty 5.0

ConvFormer3D-TAP classifies six cine CMR views at 96% accuracy using 3D conv tokenization, multiscale attention, and uncertainty-aware multi-clip fusion on 150k sequences.

GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search

cs.IR · 2026-02-17 · unverdicted · novelty 5.0

GaiaFlow combines semantic-guided diffusion tuning with early-exit and quantization methods to lower carbon emissions in neural information retrieval while maintaining competitive effectiveness.

HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval

cs.IR · 2026-05-22 · unverdicted · novelty 4.0

HARNESS-LM uses teacher fine-tuning, L2 query alignment, and contrastive refinement to distill large SLM retrievers into compact models that recover 98% precision with up to 27x lower latency on Bing Ads benchmarks.

A Gesture-Based Visual Learning Model for Acoustophoretic Interactions using a Swarm of AcoustoBots

cs.RO · 2026-04-21 · unverdicted · novelty 4.0

OpenCLIP-based gesture classification with linear probing controls AcoustoBot swarms at 87.8% accuracy and 3.95 s latency in controlled tests.

Representation learning from OCT images

cs.CV · 2026-05-04 · unverdicted · novelty 3.0

A structured survey of representation learning methods for retinal OCT image analysis, covering supervised, self-supervised, generative, multimodal, and foundation model approaches along with datasets and open problems.

citing papers explorer

Showing 23 of 23 citing papers.

Embracing Biased Transition Matrices for Complementary-Label Learning with Many Classes
cs.LG · 2026-05-15 · unverdicted · none · ref 41
BICL uses biased non-uniform transition matrices to generate constrained complementary labels, enabling effective learning and over sevenfold accuracy gains on many-class image datasets.

MaxSketch: Robust Distinct Counting in Streams via Random Projections
stat.ML · 2026-05-15 · unverdicted · none · ref 4
MaxSketch achieves O~(log n / ε²) memory for (1+ε)-approximate distinct counting in streams with geometric structure via max-linear random projections.

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification
cs.CV · 2026-05-13 · unverdicted · none · ref 8
SpurAudio benchmark shows state-of-the-art few-shot audio classifiers suffer large performance drops when background correlations are disrupted, even in large pretrained models.

Disentangled Sparse Representations for Concept-Separated Diffusion Unlearning
cs.LG · 2026-05-12 · unverdicted · none · ref 6
SAEParate disentangles sparse representations in diffusion models via contrastive clustering and nonlinear encoding to enable more precise concept unlearning with reduced side effects.

LatentUMM: Dual Latent Alignment for Unified Multimodal Models
cs.CV · 2026-05-18 · unverdicted · none · ref 3
LatentUMM proposes dual latent alignment at modality and capacity levels plus latent dynamics stabilization to reduce semantic drift and improve consistency in unified multimodal models.

Latent Video Prediction Learns Better World Models
cs.CV · 2026-05-15 · unverdicted · none · ref 7
Latent prediction video models exhibit a distinct robustness profile across corruption, occlusion, fine-grained discrimination, and temporal sensitivity compared to other self-supervised video models when used as world models.

Revisiting Reinforcement Learning with Verifiable Rewards from a Contrastive Perspective
cs.LG · 2026-05-13 · unverdicted · none · ref 30 · 2 links
ConSPO introduces a contrastive sequence-level policy optimization that aligns rollout scores with generation likelihoods via length-normalized log-probabilities and an InfoNCE-style group contrast with curriculum margin to outperform GRPO on LLM math reasoning benchmarks.

FuTCR: Future-Targeted Contrast and Repulsion for Continual Panoptic Segmentation
cs.CV · 2026-05-12 · unverdicted · none · ref 33 · 2 links
FuTCR improves new-class panoptic quality by up to 28% in continual panoptic segmentation by discovering future-like regions in background areas and applying targeted contrast and repulsion to restructure representations.

WavesFM: Hierarchical Representation Learning for Longitudinal Wearable Sensor Waveforms
cs.LG · 2026-05-09 · unverdicted · none · ref 36
WavesFM uses hierarchical SSL to pretrain a segment encoder on short waveforms followed by a temporal encoder on multi-day sequences, outperforming prior methods on 58 tasks after training on over 12 million hours of data from hundreds of thousands of people.

Decoupling Endpoint and Semantic Transition Learning for Zero-Shot Composed Image Retrieval
cs.CV · 2026-05-08 · unverdicted · none · ref 4 · 2 links
DeCIR improves projection-based zero-shot composed image retrieval by decoupling endpoint and semantic transition alignment with separate low-rank adapters merged by LRDM, showing gains on CIRR, CIRCO, FashionIQ, and GeneCIS.

Rapidly deploying on-device eye tracking by distilling visual foundation models
cs.CV · 2026-04-02 · unverdicted · none · ref 31
DistillGaze reduces median gaze error by 58.62% on a 2000+ participant dataset by distilling foundation models into a 256K-parameter on-device model using synthetic labeled data and unlabeled real data.

MRI-to-CT synthesis using drifting models
eess.IV · 2026-03-30 · unverdicted · none · ref 34
Drifting models outperform diffusion, CNN, VAE, and GAN baselines in MRI-to-CT synthesis on two pelvis datasets with higher SSIM/PSNR, lower RMSE, and millisecond one-step inference.

CodeBrain: Bridging Decoupled Tokenizer and Multi-Scale Architecture for EEG Foundation Model
cs.LG · 2025-06-10 · unverdicted · none · ref 64
CodeBrain introduces a decoupled TFDual-Tokenizer and multi-scale EEGSSM architecture for an EEG foundation model pretrained on a large corpus, claiming strong generalization across eight downstream tasks and ten datasets.

Extending Pretrained 10-Second ECG Foundation Models to Longer Horizons
cs.LG · 2026-05-16 · unverdicted · none · ref 29
A parameter-efficient plug-in framework adds structurally compatible long-sequence processing and semantically informed temporal modeling to extend pretrained 10-second ECG foundation models to longer variable-length inputs.

Self-Play Enhancement via Advantage-Weighted Refinement in Online Federated LLM Fine-Tuning with Real-Time Feedback
cs.LG · 2026-05-08 · unverdicted · none · ref 6
SPEAR enables online federated LLM fine-tuning by using feedback-guided self-play to create contrastive pairs trained with maximum likelihood on correct completions and confidence-weighted unlikelihood on incorrect ones, outperforming baselines without ground-truth contexts.

BenchHAR: Benchmarking Self-Supervised Learning for Generalizable Sensor-based Activity Recognition
cs.CV · 2026-05-08 · unverdicted · none · ref 10
BenchHAR finds that hybrid reconstruction-plus-contrastive SSL with CNN encoders generalizes best for sensor HAR but overall performance on unseen distributions remains unsatisfactory.

ShellfishNet: A Domain-Specific Benchmark for Visual Recognition of Marine Molluscs
cs.CV · 2026-05-08 · unverdicted · none · ref 13
ShellfishNet is a new benchmark of 8,691 images across 32 mollusc taxa for evaluating vision models on real-world underwater ecological monitoring tasks including robustness to degradation.

Pan-FM: A Pan-Organ Foundation Model with Saliency-Guided Masking for Missing Robustness
cs.CV · 2026-05-08 · unverdicted · none · ref 13
Pan-FM learns balanced representations across seven organs by adaptively masking dominant organs during pre-training, yielding stronger disease prediction and missing-organ robustness than single-organ or naive multimodal baselines on UK Biobank.

ConvFormer3D-TAP: Phase/Uncertainty-Aware Front-End Fusion for Cine CMR View Classification Pipelines
cs.CV · 2026-04-13 · unverdicted · none · ref 42
ConvFormer3D-TAP classifies six cine CMR views at 96% accuracy using 3D conv tokenization, multiscale attention, and uncertainty-aware multi-clip fusion on 150k sequences.

GaiaFlow: Semantic-Guided Diffusion Tuning for Carbon-Frugal Search
cs.IR · 2026-02-17 · unverdicted · none · ref 56
GaiaFlow combines semantic-guided diffusion tuning with early-exit and quantization methods to lower carbon emissions in neural information retrieval while maintaining competitive effectiveness.

HARNESS-LM: A Three-Phase Training Recipe for Harnessing SLMs in Sponsored Search Retrieval
cs.IR · 2026-05-22 · unverdicted · none · ref 22
HARNESS-LM uses teacher fine-tuning, L2 query alignment, and contrastive refinement to distill large SLM retrievers into compact models that recover 98% precision with up to 27x lower latency on Bing Ads benchmarks.

A Gesture-Based Visual Learning Model for Acoustophoretic Interactions using a Swarm of AcoustoBots
cs.RO · 2026-04-21 · unverdicted · none · ref 19
OpenCLIP-based gesture classification with linear probing controls AcoustoBot swarms at 87.8% accuracy and 3.95 s latency in controlled tests.

Representation learning from OCT images
cs.CV · 2026-05-04 · unverdicted · none · ref 48
A structured survey of representation learning methods for retinal OCT image analysis, covering supervised, self-supervised, generative, multimodal, and foundation model approaches along with datasets and open problems.
