Learning deep representations by mutual information estimation and maximization

Hjelm, R · 2018 · stat.ML · arXiv 1808.06670

21 Pith papers cite this work. Polarity classification is still indexing.

abstract

In this work, we perform unsupervised learning of representations by maximizing mutual information between an input and the output of a deep neural network encoder. Importantly, we show that structure matters: incorporating knowledge about locality of the input to the objective can greatly influence a representation's suitability for downstream tasks. We further control characteristics of the representation by matching to a prior distribution adversarially. Our method, which we call Deep InfoMax (DIM), outperforms a number of popular unsupervised learning methods and competes with fully-supervised learning on several classification tasks. DIM opens new avenues for unsupervised learning of representations and is an important step towards flexible formulations of representation-learning objectives for specific end-goals.
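The abstract does not say which mutual information estimator is maximized, so the following is only a minimal sketch under an assumption: it uses an InfoNCE-style lower bound, one common choice for this kind of objective. The function name `infonce_lower_bound` and the toy score matrices are hypothetical, introduced here for illustration. Entry `scores[i, j]` stands for a critic's score for input `i` paired with representation `j`, with matched pairs on the diagonal.

```python
import numpy as np


def infonce_lower_bound(scores: np.ndarray) -> float:
    """InfoNCE-style lower bound on mutual information (in nats).

    scores[i, j] is a critic score for input i paired with representation j.
    Diagonal entries are the matched (positive) pairs; off-diagonal entries
    act as negatives. The bound can never exceed log(n).
    """
    n = scores.shape[0]
    # Numerically stable log-sum-exp over each row.
    row_max = scores.max(axis=1, keepdims=True)
    log_norm = row_max[:, 0] + np.log(np.exp(scores - row_max).sum(axis=1))
    # Average log-softmax of the positive pair, shifted by log(n).
    return float(np.mean(np.diag(scores) - log_norm) + np.log(n))


n = 64
# A critic that cannot tell matched from mismatched pairs.
uninformative = np.zeros((n, n))
# A critic that scores matched pairs far above all mismatched pairs.
informative = 50.0 * np.eye(n)

bound_uninformative = infonce_lower_bound(uninformative)
bound_informative = infonce_lower_bound(informative)
print(bound_uninformative, bound_informative, np.log(n))
```

With constant scores the bound is zero, and with a sharply peaked diagonal it approaches its ceiling of log(n). In training, an encoder and critic would be updated to push this quantity up; that loop is omitted here.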

citation-role summary

background: 3 · method: 1

citation-polarity summary

background: 2 · unclear: 1 · use method: 1

representative citing papers

A Unified Geometric Framework for Weighted Contrastive Learning

cs.LG · 2026-05-13 · unverdicted · novelty 8.0

Weighted InfoNCE objectives realize specific target geometries in embedding space, with SupCon producing size-dependent inter-class similarities under imbalance while Soft SupCon and certain continuous variants preserve regular simplex or unique optima.

Harnessing Linguistic Dissimilarity for Language Generalization on Unseen Low-Resource Varieties

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

A framework with TOPPing source selection and VACAI-Bowl dual-branch model yields 54.62% average improvement in dependency parsing across 10 low-resource varieties.

Towards Multi-Source Domain Generalization for Sleep Staging with Noisy Labels

cs.LG · 2026-04-11 · unverdicted · novelty 7.0

FF-TRUST delivers state-of-the-art sleep staging performance across domain shifts and both symmetric and asymmetric label noise by jointly regularizing temporal and spectral consistency on five public datasets.

A Simple Framework for Contrastive Learning of Visual Representations

cs.LG · 2020-02-13 · accept · novelty 7.0

SimCLR learns visual representations by contrasting augmented views of the same image and reaches 76.5% ImageNet top-1 accuracy with a linear classifier, matching a supervised ResNet-50.

Information as Maximum-Caliber Deviation: A bridge between Integrated Information Theory and the Free Energy Principle

q-bio.NC · 2026-05-03 · unverdicted · novelty 6.0

Information defined as maximum-caliber deviation derives IIT 3.0 cause-effect repertoires from constrained entropy maximization and equates to prediction error under CLT and LDT.

LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics

cs.LG · 2025-11-11 · conditional · novelty 6.0

LeJEPA derives an optimal isotropic Gaussian target for embeddings and enforces it via sketched regularization to deliver scalable, heuristics-free self-supervised pretraining with 79% ImageNet linear accuracy on ViT-H/14.

Relative Contrastive Learning for Sequential Recommendation with Similarity-based Positive Pair Selection

cs.IR · 2025-04-27 · unverdicted · novelty 6.0

RCL adds similarity-based weak positive samples to supervised contrastive learning in sequential recommendation and reports an average 4.88% improvement over state-of-the-art methods across six datasets.

Multi-Scale Contrastive Learning for Video Temporal Grounding

cs.CV · 2024-12-10 · unverdicted · novelty 6.0

A multi-scale and cross-scale contrastive learning framework uses intra-encoder stage features and a new sampling process to link short-range and long-range video moments for temporal grounding.

Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings

cs.CL · 2023-05-23 · unverdicted · novelty 6.0

TaDSE learns dialogue sentence embeddings via template-guided self-supervised contrastive learning plus synthetic slot-filling augmentation and reports gains on five downstream benchmarks.

Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations

cs.IR · 2026-04-20 · unverdicted · novelty 6.0

LLMs exhibit mid-layer representation advantage for recommendations; MARC compresses representations modularly to reduce costs while improving performance, as shown in a large-scale online advertising deployment.

Revisiting Feature Prediction for Learning Visual Representations from Video

cs.CV · 2024-02-15 · conditional · novelty 6.0

V-JEPA models trained only on feature prediction from 2 million public videos achieve 81.9% on Kinetics-400, 72.2% on Something-Something-v2, and 77.9% on ImageNet-1K using frozen ViT-H/16 backbones.

HuggingFace's Transformers: State-of-the-art Natural Language Processing

cs.CL · 2019-10-09 · accept · novelty 6.0

Hugging Face releases an open-source Python library that supplies a unified API and pretrained weights for major Transformer architectures used in natural language processing.

Information theoretic underpinning of self-supervised learning by clustering

cs.LG · 2026-05-12 · unverdicted · novelty 5.0

SSL clustering is derived as KL-divergence optimization where a teacher-distribution constraint normalizes via inverse cluster priors and simplifies to batch centering by Jensen's inequality.

M-IDoL: Information Decomposition for Modality-Specific and Diverse Representation Learning in Medical Foundation Model

cs.CV · 2026-04-10 · unverdicted · novelty 5.0

M-IDoL learns modality-specific and diverse representations by maximizing inter-modality entropy and minimizing intra-modality uncertainty through information decomposition in MoE subspaces.

ID-Sim: An Identity-Focused Similarity Metric

cs.CV · 2026-04-06 · unverdicted · novelty 5.0

ID-Sim is a new similarity metric that aims to capture human selective sensitivity to identities by training on curated real and generative synthetic data and validating against human annotations on recognition, retrieval, and generative tasks.

Learning Disentangled Representations for Generalized Multi-view Clustering

cs.CV · 2026-05-15 · unverdicted · novelty 4.0

GMAE learns disentangled view-specific and view-common embeddings via dual-path autoencoders and cross-view adversarial training to boost performance on complete and incomplete multi-view clustering tasks.

Learning to Find Correlated Features by Maximizing Information Flow in Convolutional Neural Networks

cs.CV · 2019-06-30 · unverdicted · novelty 4.0

Introduces IFM loss regularization for CNNs to learn correlated discriminative features, tested on shiftedMNIST dataset.

Dynamic Visual-semantic Alignment for Zero-shot Learning with Ambiguous Labels

cs.CV · 2026-04-20 · unverdicted · novelty 4.0

DVSA improves zero-shot learning under ambiguous labels by mutually calibrating visual features and attributes with attention and dynamic disambiguation.

Information-Theoretic Measures in AI: A Practical Decision Guide

cs.AI · 2026-04-26 · unverdicted · novelty 3.0

A practical guide that organizes seven IT measures around three questions each—what it answers in AI, suitable estimators, and dangerous misuses—complete with flowchart, table, and worked examples.

InfoGeo: Information-Theoretic Object-Centric Learning for Cross-View Generalizable UAV Geo-Localization

cs.CV · 2026-05-08

DETR-ViP: Detection Transformer with Robust Discriminative Visual Prompts

cs.CV · 2026-04-16

citing papers explorer

Showing 21 of 21 citing papers.

A Unified Geometric Framework for Weighted Contrastive Learning · cs.LG · 2026-05-13 · unverdicted · none · ref 29 · internal anchor
Harnessing Linguistic Dissimilarity for Language Generalization on Unseen Low-Resource Varieties · cs.CL · 2026-05-06 · unverdicted · none · ref 3
Towards Multi-Source Domain Generalization for Sleep Staging with Noisy Labels · cs.LG · 2026-04-11 · unverdicted · none · ref 10
A Simple Framework for Contrastive Learning of Visual Representations · cs.LG · 2020-02-13 · accept · none · ref 27
Information as Maximum-Caliber Deviation: A bridge between Integrated Information Theory and the Free Energy Principle · q-bio.NC · 2026-05-03 · unverdicted · none · ref 154 · internal anchor
LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics · cs.LG · 2025-11-11 · conditional · none · ref 122 · internal anchor
Relative Contrastive Learning for Sequential Recommendation with Similarity-based Positive Pair Selection · cs.IR · 2025-04-27 · unverdicted · none · ref 14 · internal anchor
Multi-Scale Contrastive Learning for Video Temporal Grounding · cs.CV · 2024-12-10 · unverdicted · none · ref 19 · internal anchor
Template-assisted Contrastive Learning of Task-oriented Dialogue Sentence Embeddings · cs.CL · 2023-05-23 · unverdicted · none · ref 16 · internal anchor
Modular Representation Compression: Adapting LLMs for Efficient and Effective Recommendations · cs.IR · 2026-04-20 · unverdicted · none · ref 11
Revisiting Feature Prediction for Learning Visual Representations from Video · cs.CV · 2024-02-15 · conditional · none · ref 98
HuggingFace's Transformers: State-of-the-art Natural Language Processing · cs.CL · 2019-10-09 · accept · none · ref 116
Information theoretic underpinning of self-supervised learning by clustering · cs.LG · 2026-05-12 · unverdicted · none · ref 148
M-IDoL: Information Decomposition for Modality-Specific and Diverse Representation Learning in Medical Foundation Model · cs.CV · 2026-04-10 · unverdicted · none · ref 8
ID-Sim: An Identity-Focused Similarity Metric · cs.CV · 2026-04-06 · unverdicted · none · ref 26
Learning Disentangled Representations for Generalized Multi-view Clustering · cs.CV · 2026-05-15 · unverdicted · none · ref 51 · internal anchor
Learning to Find Correlated Features by Maximizing Information Flow in Convolutional Neural Networks · cs.CV · 2019-06-30 · unverdicted · none · ref 6 · internal anchor
Dynamic Visual-semantic Alignment for Zero-shot Learning with Ambiguous Labels · cs.CV · 2026-04-20 · unverdicted · none · ref 20
Information-Theoretic Measures in AI: A Practical Decision Guide · cs.AI · 2026-04-26 · unverdicted · none · ref 17
InfoGeo: Information-Theoretic Object-Centric Learning for Cross-View Generalizable UAV Geo-Localization · cs.CV · 2026-05-08 · unreviewed · ref 65
DETR-ViP: Detection Transformer with Robust Discriminative Visual Prompts · cs.CV · 2026-04-16 · unreviewed · ref 5
