Attention U-Net: Learning Where to Look for the Pancreas

Ozan Oktay, Jo Schlemper, Loic Le Folgoc, Matthew Lee, Mattias Heinrich, Kazunari Misawa · 2018 · cs.CV · arXiv 1804.03999

Mixed citation behavior. Most common role is background (56%).

72 Pith papers citing it

Background 56% of classified citations

abstract

We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed Attention U-Net architecture is evaluated on two large CT abdominal datasets for multi-class image segmentation. Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency. The code for the proposed architecture is publicly available.
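
The gating idea in the abstract can be sketched as follows. This is a minimal illustration and not the authors' released code. It assumes the additive form of attention gate (ReLU over the summed projections of skip and gating features, then a sigmoid), operates on a single pixel's feature vectors in pure Python, and assumes the gating signal has already been resampled to the skip connection's resolution. All function and parameter names are hypothetical.

```python
import math


def attention_coefficient(x, g, W_x, W_g, b_g, psi, b_psi):
    """Additive attention coefficient in (0, 1) for one pixel.

    x     : skip-connection feature vector, length C_x
    g     : gating feature vector from the coarser scale, length C_g
    W_x   : C_int rows of length C_x (plays the role of a 1x1 convolution)
    W_g   : C_int rows of length C_g
    b_g   : bias vector, length C_int
    psi   : projection vector, length C_int
    b_psi : scalar bias
    """
    hidden = []
    for wx_row, wg_row, bias in zip(W_x, W_g, b_g):
        pre = (
            sum(w * v for w, v in zip(wx_row, x))
            + sum(w * v for w, v in zip(wg_row, g))
            + bias
        )
        hidden.append(max(pre, 0.0))  # ReLU
    q = sum(p * h for p, h in zip(psi, hidden)) + b_psi
    return 1.0 / (1.0 + math.exp(-q))  # sigmoid


def gate_features(x, alpha):
    """Scale the skip features by the attention coefficient."""
    return [alpha * v for v in x]


# Hypothetical toy values: C_x = 3, C_g = 2, C_int = 2.
x = [0.5, -1.0, 2.0]
g = [1.0, 0.0]
W_x = [[0.2, -0.1, 0.4], [0.0, 0.3, -0.2]]
W_g = [[0.5, -0.5], [0.1, 0.2]]
b_g = [0.0, 0.1]
psi = [1.0, -1.0]
b_psi = 0.0

alpha = attention_coefficient(x, g, W_x, W_g, b_g, psi, b_psi)
gated = gate_features(x, alpha)
print(alpha, gated)
```

A coefficient near 0 suppresses the skip features at that pixel and a coefficient near 1 passes them through, which is the sense in which the model "learns to suppress irrelevant regions". A practical implementation would apply the same computation to whole feature maps with a tensor library.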

citation-role summary

background: 7 · baseline: 1 · method: 1

citation-polarity summary

background: 5 · unclear: 2 · baseline: 1 · use method: 1

representative citing papers

Human and AI collaboration for pulmonary nodule segmentation

cs.CV · 2026-06-21 · unverdicted · novelty 7.0

Hi-Seg achieves a mean Dice score of nearly 85% for pulmonary nodule segmentation by having humans iteratively refine prompts for the Segment Anything Model, outperforming standalone deep learning and SAM models on a large multi-center dataset.

AuraMask: An Extensible Pipeline for Developing Aesthetic Anti-Facial Recognition Image Filters

cs.CV · 2026-05-13 · conditional · novelty 7.0

AuraMask produces 40 aesthetic anti-facial recognition filters that match or exceed prior adversarial effectiveness and achieve significantly higher user acceptance in a 630-person study.

TopoU-Net: a U-Net architecture for topological domains

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.

XAttnRes: Cross-Stage Attention Residuals for Medical Image Segmentation

cs.CV · 2026-03-28 · unverdicted · novelty 7.0

XAttnRes introduces cross-stage attention residuals that maintain a global feature history and selectively aggregate prior representations, improving medical image segmentation and performing on par with baselines even without skip connections.

Gated Differential Linear Attention: A Linear-Time Decoder for High-Fidelity Medical Segmentation

cs.CV · 2026-03-03 · unverdicted · novelty 7.0

GDLA delivers state-of-the-art accuracy on CT, MRI, ultrasound and dermoscopy segmentation benchmarks while keeping linear O(N) complexity in a PVT encoder-decoder.

Information Filtering via Variational Regularization for Robot Manipulation

cs.RO · 2026-01-29 · unverdicted · novelty 7.0

Variational Regularization imposes an adaptive information bottleneck on noisy intermediate features in DP3-UNet and DP3-DiT policies, consistently raising task success rates on RoboTwin2.0, Adroit, and MetaWorld while achieving new state-of-the-art results.

S2M-Net: Spectral-Spatial Mixing for Medical Image Segmentation with Morphology-Aware Adaptive Loss

cs.CV · 2026-01-03 · unverdicted · novelty 7.0

S2M-Net achieves state-of-the-art Dice scores on 16 medical datasets across 8 modalities using a 4.7M-parameter spectral-spatial mixer and morphology-aware adaptive loss, outperforming transformers with 3.5-6x fewer parameters.

CurvSegFlow: Time-Conditioned Flow Matching for Robust Segmentation of Curvilinear Structures in Noisy Biomedical Images

cs.CV · 2026-06-19 · unverdicted · novelty 6.0

CurvSegFlow applies time-conditioned flow matching with a U-Net backbone and triple-term loss to progressively refine segmentations of thin structures in noisy images, reporting competitive performance on microtubule, vessel, and nerve datasets.

PU-UNet: Stable Multiplicative Interactions for Medical Image Segmentation

cs.CV · 2026-06-18 · unverdicted · novelty 6.0

PU-UNet integrates stabilized product units into low-resolution residual blocks of a U-Net, reporting higher Dice scores than a matched residual U-Net baseline on ISIC 2018, Kvasir-SEG, and BUSI datasets with nearly identical parameters and latency.

EyeMVP: OCT-Informed Fundus Representation Learning via Paired CFP-OCT Pretraining

cs.CV · 2026-06-13 · unverdicted · novelty 6.0

EyeMVP learns OCT-informed CFP representations via cross-modal masked reconstruction on 674k paired triples and reports competitive or superior performance on 15 retinal classification and segmentation tasks.

Learning Dynamic Aperture from One-turn Maps

physics.acc-ph · 2026-06-05 · unverdicted · novelty 6.0

A deep surrogate model learns coarse-grained dynamic aperture directly from suitably encoded one-turn maps by treating stability prediction as image segmentation and transfers to realistic EIC tracking.

MS-DKC: A Dataset Knowledge Card Framework for Designing and Adapting Medical Image Segmentation Models

cs.CV · 2026-06-04 · unverdicted · novelty 6.0

MS-DKC is a dataset knowledge card framework that maps image, morphology, supervision, context, and risk descriptors to design priors and failure modes, shown to produce dataset-specific model adaptations with improved metrics on DRIVE, ISIC2018, and ACDC.

XSSR: Cross-Domain Self-Supervised Representative Selection for Efficient Annotation in Medical Image Segmentation

cs.CV · 2026-06-03 · conditional · novelty 6.0

XSSR selects 5% of target samples via source-trained MAE embeddings and auto-calibrated greedy scoring to reach 99.3% of full-data Dice on chest X-ray and outperform random/CoreSet baselines on retinal and prostate MRI benchmarks.

BiSegMamba: Efficient Bidirectional Tri-Oriented Mamba for 3D Medical Image Segmentation

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

BiSegMamba is a bidirectional tri-oriented Mamba architecture that improves performance and reduces FLOPs in 3D medical image segmentation across brain, cardiac, abdominal, and vascular tasks.

LegSegNet: A Public Deep Learning System for Lower Extremity CT Tissue Segmentation and Quantification

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

LegSegNet is the first public end-to-end deep learning system for lower extremity CT tissue segmentation and body composition quantification, reporting an average Dice score of 89.31 on held-out test slices.

K-U-KAN: Koopman-Enhanced U-KAN for 3D Dental Reconstruction from a Single Panoramic X-ray Radiograph

cs.CV · 2026-05-24 · unverdicted · novelty 6.0

K-U-KAN combines KAN feature lifting, Koopman linear dynamics, and U-KAN refinement with physical and geometric priors to reconstruct 3D dental anatomy from single panoramic radiographs, matching baselines on metrics while improving perceptual quality and halving training time.

StruMPL: Multi-task Dense Regression under Disjoint Partial Supervision and MNAR Labels

cs.CV · 2026-05-19 · unverdicted · novelty 6.0

StruMPL is a multi-task dense regression model that jointly addresses disjoint partial supervision, MNAR labels, and inter-task physical constraints for improved forest biomass estimation from Earth observation.

Spectral Vision Transformer for Efficient Tokenization with Limited Data

cs.CV · 2026-05-12 · unverdicted · novelty 6.0

A spectral vision transformer achieves comparable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.

FEFormer: Frequency-enhanced Vision Transformer for Generic Knowledge Extraction and Adaptive Feature Fusion in Volumetric Medical Image Segmentation

eess.IV · 2026-05-12 · unverdicted · novelty 6.0

A frequency-enhanced Vision Transformer with FDSA, FGMLP, WAFF, and FCSB modules delivers superior volumetric medical image segmentation performance and efficiency over prior state-of-the-art methods.

ESICA: A Scalable Framework for Text-Guided 3D Medical Image Segmentation

cs.CV · 2026-04-27 · unverdicted · novelty 6.0

ESICA delivers state-of-the-art accuracy on a five-modality 3D medical segmentation benchmark while offering a compact variant with far fewer parameters.

Mapping License Plate Recoverability Under Extreme Viewing Angles for Opportunistic Urban Sensing

cs.CV · 2026-04-26 · unverdicted · novelty 6.0 · 2 refs

Defines recoverability maps via dense synthetic degradation sweeps and two summary metrics to show AI restoration recovers license plates from ~93% of extreme angle parameter space, with geometry rather than model architecture as the binding limit.

Learning from Noisy Prompts: Saliency-Guided Prompt Distillation for Robust Segmentation with SAM

cs.CV · 2026-04-25 · unverdicted · novelty 6.0

SPD improves SAM segmentation robustness to noisy prompts by learning anatomical saliency priors, distilling consensus prompts from adjacent slices, and enforcing pairwise slice consistency.

Toward Polymorphic Backdoor against Semantic Communication via Intensity-Based Poisoning

cs.CR · 2026-04-25 · unverdicted · novelty 6.0

SemBugger achieves polymorphic backdoors in semantic communication via graded-intensity trigger poisoning and hierarchical loss, plus a noise-based defense with a theoretical efficacy bound.

CDSA-Net: Collaborative Decoupling of Vascular Structure and Background for High-Fidelity Coronary Digital Subtraction Angiography

cs.CV · 2026-04-19 · unverdicted · novelty 6.0

CDSA-Net decouples vascular structure extraction and background restoration in coronary DSA via hierarchical geometric priors and adaptive noise modeling to eliminate artifacts while preserving tissue fidelity.

citing papers explorer

Showing 10 of 10 citing papers after filters.

CHEM: Estimating and Understanding Hallucinations in Deep Learning for Image Processing

cs.CV · 2025-12-10 · unverdicted · none · ref 47 · internal anchor

The paper defines the Conformal Hallucination Estimation Metric (CHEM) that localizes hallucination-prone regions in image reconstruction models via multiscale representations and distribution-free conformal regression.

Can We Go Beyond Visual Features? Neural Tissue Relation Modeling for Relational Graph Analysis in Non-Melanoma Skin Histology

cs.CV · 2025-12-07 · unverdicted · none · ref 26 · internal anchor

NTRM combines CNNs with tissue-level graph neural networks to model inter-tissue relationships, delivering 4.9% to 31.25% higher Dice scores than prior methods on a non-melanoma skin cancer histology segmentation benchmark.

SAMRI: Segment Any MRI

eess.IV · 2025-10-30 · conditional · none · ref 21 · internal anchor

SAMRI fine-tunes only the mask decoder of SAM on 1.1 million MRI slices from 30 datasets to reach mean DSC 0.87 on 47 targets and strong zero-shot performance.

Category-based Galaxy Image Generation via Diffusion Models

astro-ph.IM · 2025-06-19 · unverdicted · none · ref 62 · internal anchor

GalCatDiff applies category embeddings and a novel Astro-RAB block inside diffusion models to produce galaxy images whose color and size distributions match observations more closely than prior generative approaches.

GroupKAN: Efficient Kolmogorov-Arnold Networks via Grouped Spline Modeling

cs.CV · 2025-11-07 · conditional · none · ref 17 · internal anchor

GroupKAN reduces KAN parameter scaling via intra-group spline mappings, delivering 79.80% average IoU (+1.11% over U-KAN) at 47.6% of the parameters on BUSI, GlaS, and CVC datasets.

BGRem: A background noise remover for astronomical images based on a diffusion model

astro-ph.IM · 2025-10-06 · unverdicted · none · ref 20 · internal anchor

BGRem applies a supervised diffusion model to denoise MeerLICHT and Fermi-LAT images, raising true-positive source detections by roughly 7% when used before SExtractor.

A novel attention mechanism for noise-adaptive and robust segmentation of microtubules in microscopy images

q-bio.QM · 2025-07-10 · conditional · none · ref 11 · internal anchor

ASE_Res_UNet with a novel noise-adaptive attention mechanism outperforms ablated variants and alternative architectures in segmenting microtubules from noisy synthetic and real microscopy images while using fewer parameters and transfers to other curvilinear structures.

MSLAU-Net: A Hybrid CNN-Transformer Network for Medical Image Segmentation

cs.CV · 2025-05-24 · conditional · none · ref 34 · internal anchor

MSLAU-Net proposes a hybrid CNN-Transformer architecture using multi-scale linear attention and lightweight top-down aggregation that outperforms prior methods on medical segmentation benchmarks across three modalities.

Focal Modulation and Bidirectional Feature Fusion Network for Medical Image Segmentation

cs.CV · 2025-10-23 · unverdicted · none · ref 53 · internal anchor

FM-BFF-Net combines focal modulation attention with bidirectional encoder-decoder fusion in a CNN-transformer architecture and reports higher Dice and Jaccard scores than recent methods across eight medical image datasets.

Clinical utility of foundation models in musculoskeletal MRI for biomarker fidelity and predictive outcomes

eess.IV · 2025-01-23 · unverdicted · none · ref 6 · internal anchor

Fine-tuned foundation models produce reliable MSK MRI biomarkers that support workload-reducing triage and calibrated 48-month prediction of knee replacement and incident OA.
