
V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation

Fausto Milletari, Nassir Navab, Seyed-Ahmad Ahmadi · 2016 · DOI 10.1109/3dv

10 Pith papers cite this work. Polarity classification is still being indexed.



citation-role summary

background 3 baseline 1

citation-polarity summary

background 3 baseline 1

representative citing papers

ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue

cs.RO · 2026-05-02 · unverdicted · novelty 7.0

ESARBench is the first unified benchmark for MLLM-driven UAV agents that must explore, locate clues, and decide on victim positions in photorealistic simulated SAR environments.

A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features

cs.CV · 2025-10-01 · unverdicted · novelty 7.0

FastForward represents scenes as collections of 3D-anchored image features and performs camera pose estimation via feed-forward correspondence prediction, achieving competitive accuracy with minimal mapping time.

DINO-MVR: Multi-View Readout of Frozen DINOv3 for Annotation-Efficient Medical Segmentation

cs.CV · 2026-05-08 · conditional · novelty 6.0

Frozen DINOv3 features with multi-view MLP probes, entropy-weighted fusion, and spatial regularization achieve 0.895 Dice on Kvasir-SEG, 0.897 on ISIC 2018, and 0.908 on BraTS FLAIR, recovering 98.4% of full-data performance with only five annotated patients.

Moondream Segmentation: From Words to Masks

cs.CV · 2026-04-03 · unverdicted · novelty 6.0

Moondream Segmentation achieves 80.2% cIoU on RefCOCO by autoregressively decoding paths from referring expressions and using RL to refine masks, plus releases a cleaned RefCOCO-M dataset.

DSER: Spectral Epipolar Representation for Efficient Light Field Depth Estimation

cs.CV · 2025-08-12 · unverdicted · novelty 6.0

DSER combines spectral epipolar regularization with a hybrid pipeline of gradient initialization, plane-sweeping, multiscale refinement, and occlusion-aware random walk to produce structurally consistent depth maps from light fields.

Ada2MS: A Hybrid Optimization Algorithm Based on Exponential Mixing of Elementwise and Global Second-Moment Estimates

cs.LG · 2026-05-19 · unverdicted · novelty 5.0

Ada2MS is a new optimizer that exponentially mixes elementwise and global second-moment estimates to interpolate between AdamW and momentum-SGD behaviors and reports competitive results on visual tasks under a unified protocol.

Efficient 3D Content Reconstruction and Generation

cs.CV · 2026-05-18 · unverdicted · novelty 5.0

Presents Instant3D for rapid text/image-to-3D generation via multi-view diffusion plus feed-forward reconstruction, and FastMap for 10x faster structure-from-motion with comparable accuracy.

Dual-stream Spatio-Temporal GCN-Transformer Network for 3D Human Pose Estimation

cs.CV · 2026-04-20 · unverdicted · novelty 5.0

MixTGFormer reports state-of-the-art 3D pose estimation errors of 37.6 mm on Human3.6M and 15.7 mm on MPI-INF-3DHP by using parallel GCN-Transformer streams with SE layers for local-global feature fusion.

Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction

cs.CV · 2026-04-03 · unverdicted · novelty 5.0

A multilevel perceptual CRF model using Swin Transformer, HPF fusion, HA adapters, and dynamic scaling attention achieves state-of-the-art monocular depth estimation on NYU Depth v2, KITTI, and MatterPort3D with reduced error and fast inference.

4D Radar Semantic Segmentation of People in Field Conditions Using Temporal Multi-View Networks

cs.CV · 2024-04-08 · unverdicted · novelty 5.0

TMVA4D uses CNN and ConvLSTM encoders on multi-view 2D projections of 4D radar point clouds for semantic segmentation of people, reporting Dice 75.9% and IoU 61.2% in field tests.

citing papers explorer

Showing 10 of 10 citing papers.

ESARBench: A Benchmark for Agentic UAV Embodied Search and Rescue
cs.RO · 2026-05-02 · unverdicted · none · ref 6
A Scene is Worth a Thousand Features: Feed-Forward Camera Localization from a Collection of Image Features
cs.CV · 2025-10-01 · unverdicted · none · ref 2
DINO-MVR: Multi-View Readout of Frozen DINOv3 for Annotation-Efficient Medical Segmentation
cs.CV · 2026-05-08 · conditional · none · ref 6
Moondream Segmentation: From Words to Masks
cs.CV · 2026-04-03 · unverdicted · none · ref 14
DSER: Spectral Epipolar Representation for Efficient Light Field Depth Estimation
cs.CV · 2025-08-12 · unverdicted · none · ref 21
Ada2MS: A Hybrid Optimization Algorithm Based on Exponential Mixing of Elementwise and Global Second-Moment Estimates
cs.LG · 2026-05-19 · unverdicted · none · ref 35
Efficient 3D Content Reconstruction and Generation
cs.CV · 2026-05-18 · unverdicted · none · ref 276
Dual-stream Spatio-Temporal GCN-Transformer Network for 3D Human Pose Estimation
cs.CV · 2026-04-20 · unverdicted · none · ref 47
Hierarchical Awareness Adapters with Hybrid Pyramid Feature Fusion for Dense Depth Prediction
cs.CV · 2026-04-03 · unverdicted · none · ref 22
4D Radar Semantic Segmentation of People in Field Conditions Using Temporal Multi-View Networks
cs.CV · 2024-04-08 · unverdicted · none · ref 11
