Image fusion via vision-language model. arXiv preprint arXiv:2402.02235
3 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (3)
years: 2026 (3)
verdicts: UNVERDICTED (3)
roles: background (1)
polarities: background (1)
representative citing papers
FusionProxy is a distilled diffusion-based fusion module that adds thermal awareness to RGB vision systems in real time as an independent plug-and-play component.
Hybrid CNN-ViT with adaptive attention gate achieves 97.6% accuracy on brain tumor MRI classification, outperforming baselines.
citing papers explorer
- MPerS: Dynamic MLLM MixExperts Perception-Guided Remote Sensing Scene Segmentation
  MPerS dynamically mixes semantic guidance from MLLM-generated RS captions with DINOv3 features via MixExperts and Linguistic Query Guided Attention to achieve superior semantic segmentation on three public remote sensing datasets.
- Adding Thermal Awareness to Visual Systems in Real-Time via Distilled Diffusion Models
  FusionProxy is a distilled diffusion-based fusion module that adds thermal awareness to RGB vision systems in real time as an independent plug-and-play component.
- CNN-ViT Fusion with Adaptive Attention Gate for Brain Tumor MRI Classification: A Hybrid Deep Learning Model
  Hybrid CNN-ViT with adaptive attention gate achieves 97.6% accuracy on brain tumor MRI classification, outperforming baselines.