Analysing the robustness of vision-language-models to common corruptions
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation
  SpaceDG introduces the first large-scale degradation-aware spatial reasoning dataset using 3D Gaussian Splatting synthesis, showing that visual degradations impair MLLM performance but finetuning on the data improves robustness and can exceed human levels under degradation.
- Exposing Vulnerabilities in Visible-Infrared VLMs: A Unified Geometric Adversarial Framework with Cross-Task Transferability
  CFGPatch combines curved fractal geometry with modality-specific spiral textures to create adversarial patches that fool VIS-IR VLMs and transfer across classification, captioning, and VQA tasks.