arXiv preprint arXiv:2203.14211
2 Pith papers cite this work. Polarity classification is still indexing.
Fields: cs.CV (2)
Verdicts: UNVERDICTED (2)
Citing papers:
- Efficient Hybrid CNN-GNN Architecture for Monocular Depth Estimation
  GraphDepth integrates multi-scale GraphSAGE layers into a ResNet-101 U-Net with attention-gated skips and uncertainty estimation to deliver competitive monocular depth accuracy at lower computational cost than transformers.
- Vision Transformers Need Registers
  Adding register tokens to Vision Transformers eliminates high-norm background artifacts and raises state-of-the-art performance on dense visual prediction tasks.