EVP: Enhanced Visual Perception Using Inverse Multi-Attentive Feature Refinement and Regularized Image-Text Alignment
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
roles: baseline (1)
polarities: baseline (1)
citing papers explorer
- Monocular Depth Estimation From the Perspective of Feature Restoration: A Diffusion Enhanced Depth Restoration Approach
Recasts monocular depth estimation as indirect feature restoration, using an invertible diffusion module plus auxiliary viewpoint enhancement, and improves RMSE on KITTI by 4-38% over baselines.
- Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction
A multilevel perceptual CRF model that combines a Swin Transformer, hybrid pyramid feature (HPF) fusion, hierarchical awareness (HA) adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and Matterport3D, with reduced error and fast inference.