EulerMormer: Robust Eulerian Motion Magnification via Dynamic Filtering within Transformer
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Years: 2026 (2)
Verdicts: UNVERDICTED (2)
Representative citing papers
SQI uses axiomatic constraints, hierarchical decomposition, and counterfactual verification to align linguistic reasoning with visual perception in frozen VLMs, achieving second place on the DataCV 2026 illusion challenge.
citing papers explorer
- 3D Smoke Scene Reconstruction Guided by Vision Priors from Multimodal Large Language Models
  A framework that combines MLLM-based image enhancement with a medium-aware 3D Gaussian Splatting model to reconstruct and render smoke scenes.
- Beyond Shortcuts: Mitigating Visual Illusions in Frozen VLMs via Qualitative Reasoning
  SQI uses axiomatic constraints, hierarchical decomposition, and counterfactual verification to align linguistic reasoning with visual perception in frozen VLMs, achieving second place on the DataCV 2026 illusion challenge.