ConvMAE: Masked Convolution Meets Masked Autoencoders

2022 · arXiv 2205.03892

4 Pith papers cite this work. Polarity classification is still in progress.

representative citing papers

Towards All-Day Perception for Off-Road Driving: A Large-Scale Multispectral Dataset and Comprehensive Benchmark

cs.CV · 2026-04-30 · unverdicted · novelty 7.0

Presents the first large-scale infrared off-road dataset and a flow-free temporal model achieving state-of-the-art freespace detection performance with real-time inference.

UNIV: Unified Foundation Model for Infrared and Visible Modalities

cs.CV · 2025-09-19 · unverdicted · novelty 6.0

UNIV introduces Patch Cross-modal Contrastive Learning (PCCL) to build a unified semantic feature space for infrared and visible modalities, supported by the new MVIP dataset of 98,992 aligned pairs, with reported gains on infrared segmentation and detection tasks.

Sapiens2

cs.CV · 2026-04-23 · unverdicted · novelty 5.0

Sapiens2 improves pretraining, data scale, and architecture over its predecessor to set new state-of-the-art results on human pose estimation, body-part segmentation, normal estimation, and new tasks like pointmap and albedo estimation.

Self-Supervised Learning for Real-World Object Detection: a Survey

cs.CV · 2024-10-09 · unverdicted · novelty 5.0

Survey benchmarks SSL instance discrimination and masked image modeling for object detection, finding instance discrimination suits CNN encoders while MIM suits ViT encoders and custom pre-training, especially for small objects.

citing papers explorer

Showing 4 of 4 citing papers.

Towards All-Day Perception for Off-Road Driving: A Large-Scale Multispectral Dataset and Comprehensive Benchmark
cs.CV · 2026-04-30 · unverdicted · none · ref 56
UNIV: Unified Foundation Model for Infrared and Visible Modalities
cs.CV · 2025-09-19 · unverdicted · none · ref 15
Sapiens2
cs.CV · 2026-04-23 · unverdicted · none · ref 12
Self-Supervised Learning for Real-World Object Detection: a Survey
cs.CV · 2024-10-09 · unverdicted · none · ref 67
