Reproducible scaling laws for contrastive language-image learning
3 Pith papers cite this work.
Years: 2026 (3)
Verdicts: UNVERDICTED (3)
citing papers explorer
- Vision Harnessing Agent for Open Ad-hoc Segmentation
  VASA is a vision-guided agent for open ad-hoc segmentation that creates and validates masks through planning, tool use, and error recovery, outperforming baselines on the new PARS benchmark and RefCOCOm.
- MIRAGE: Robust multi-modal architectures translate fMRI-to-image models from vision to mental imagery
  MIRAGE achieves state-of-the-art mental image reconstruction from fMRI on the NSD-Imagery benchmark by using a linear backbone with multi-modal text and image features fed to a diffusion model.
- Memory-Efficient Continual Learning with CLIP Models
  A per-class loss reweighting scheme based on distributional robustness allows CLIP models to perform class-incremental and domain-incremental learning with minimal memory while limiting forgetting on CIFAR-100, ImageNet1K, and DomainNet.