Masked Autoencoders in Computer Vision: A Comprehensive Survey
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Gaussian Process Prior Variational Autoencoder for Endoscopic Videos
  GPVAE replaces the standard VAE latent prior with a temporal Gaussian process prior, combined with endoscopy-specific encoders and specular masking, to achieve up to 26.1% lower image reconstruction RMSE on the C3VDv2 colonoscopy dataset.
- Clustering Guided Domain-Specific Pretrained Foundation Model for Very High-Resolution Arctic Remote Sensing
  Affinity-propagation clustering of Arctic VHSR imagery enables MAE pretraining of a ViT-Large encoder that outperforms ImageNet and Prithvi-EO-2.0 baselines by 5-15 percentage points in mean F1 on four downstream Arctic detection and segmentation tasks.