Attention U-Net: Learning Where to Look for the Pancreas
Recognition: 3 Lean theorem links
Pith reviewed 2026-05-12 21:13 UTC · model grok-4.3
The pith
Attention gates added to U-Net let the model learn to focus on target structures in CT images, raising segmentation accuracy while removing the need for separate organ localization steps.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce attention gates (AGs) that automatically learn to focus on target structures of varying shapes and sizes in medical images. When integrated into U-Net, the gates suppress irrelevant regions in the input while emphasizing salient features for the segmentation task. This removes the requirement for explicit external tissue or organ localisation modules in cascaded CNNs. Experiments on two large CT abdominal datasets for multi-class segmentation demonstrate that AGs improve U-Net prediction performance consistently across datasets and training sizes while preserving computational efficiency.
What carries the argument
Attention gates (AGs), modules inserted into the skip connections of U-Net that learn to filter feature maps by suppressing irrelevant spatial regions and amplifying task-relevant ones.
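The gating mechanism described above can be sketched in a few lines. This is a minimal numpy illustration of an additive attention gate, not the authors' implementation: dense matrices stand in for the paper's 1×1 convolutions and resampling steps, and all array shapes and weight names here are illustrative.

```python
import numpy as np

def relu(a):
    return np.maximum(a, 0.0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def attention_gate(x, g, W_x, W_g, psi):
    """Additive attention gate over a U-Net skip connection.

    x    : skip-connection features, shape (n_pixels, c_x)
    g    : gating features from the coarser decoder stage, shape (n_pixels, c_g)
    W_x, W_g : projections of x and g into a shared intermediate dimension
    psi  : projection of the joint feature to one attention coefficient per pixel
    Returns x scaled per pixel by an attention coefficient alpha in (0, 1),
    so irrelevant regions are suppressed and salient ones pass through.
    """
    q = relu(x @ W_x + g @ W_g)   # joint feature, shape (n_pixels, c_int)
    alpha = sigmoid(q @ psi)      # attention coefficients, shape (n_pixels, 1)
    return x * alpha              # gated skip features

# toy example with random features and weights
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8))      # 16 pixels, 8 skip channels
g = rng.normal(size=(16, 4))      # 4 gating channels
W_x = rng.normal(size=(8, 6))
W_g = rng.normal(size=(4, 6))
psi = rng.normal(size=(6, 1))
out = attention_gate(x, g, W_x, W_g, psi)
assert out.shape == x.shape       # gating preserves the feature-map shape
```

Because alpha is a sigmoid output strictly between 0 and 1, the gate can only attenuate skip features, never amplify them beyond their input magnitude; "highlighting" salient regions means attenuating them less than the rest.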
If this is right
- AGs integrate into standard CNNs such as U-Net with only minor added computation.
- Prediction accuracy and sensitivity increase consistently on abdominal CT segmentation tasks.
- The model works across different dataset sizes and multiple training conditions without retraining the base architecture.
- Cascaded localisation-plus-segmentation pipelines become unnecessary.
- Computational efficiency remains comparable to the unmodified U-Net.
Where Pith is reading between the lines
- The same gate design could be tested on MRI or ultrasound volumes where organ boundaries vary even more than in CT.
- Because the gates operate on feature maps, they might reduce the amount of manual annotation needed for training by guiding the network to salient areas automatically.
- Replacing explicit localisation stages with learned attention could shorten overall inference pipelines in clinical workflows.
- Combining AGs with other forms of attention, such as channel-wise, remains an open extension not explored in the reported experiments.
Load-bearing premise
Attention gates will reliably learn to suppress irrelevant regions and highlight salient features for target structures of varying shapes and sizes without requiring explicit external tissue or organ localisation modules.
What would settle it
Train both standard U-Net and Attention U-Net on the same small subset of one CT dataset and measure Dice scores plus inference time on a fixed test set; if Dice does not improve, or the added runtime is more than marginal, the central claim does not hold.
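The metric this test would report can be made concrete. A minimal Dice implementation, run on hypothetical toy masks (the mask layout below is invented for illustration):

```python
import numpy as np

def dice_score(pred, target, eps=1e-7):
    """Dice coefficient between two binary segmentation masks:
    2*|pred ∩ target| / (|pred| + |target|), with eps guarding empty masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

# toy 4x4 masks: the prediction covers 2 pixels, both inside a 4-pixel target
target = np.zeros((4, 4), dtype=int)
target[1:3, 1:3] = 1              # 4 positive pixels
pred = np.zeros((4, 4), dtype=int)
pred[1:3, 1:2] = 1                # 2 positive pixels, all true positives
print(round(dice_score(pred, target), 3))  # → 0.667
```

Running the same function over per-organ masks from both models on a fixed test split, together with wall-clock inference timing, is all the proposed experiment requires.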
Original abstract
We propose a novel attention gate (AG) model for medical imaging that automatically learns to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs). AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy. The proposed Attention U-Net architecture is evaluated on two large CT abdominal datasets for multi-class image segmentation. Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency. The code for the proposed architecture is publicly available.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes Attention Gates (AGs) as an architectural component to integrate into U-Net for medical image segmentation. AGs automatically learn to suppress irrelevant regions and highlight salient features for target structures of varying shapes and sizes, with the goal of eliminating explicit external tissue/organ localization modules required in cascaded CNN pipelines. The Attention U-Net is evaluated on two large CT abdominal datasets for multi-class segmentation, claiming consistent performance gains over standard U-Net across datasets and training sizes with minimal computational overhead. The code is made publicly available.
Significance. If the central claims hold, the work provides a lightweight attention mechanism that can improve segmentation sensitivity and accuracy in standard CNNs without separate localization stages, which would simplify pipelines in medical imaging. The public code supports reproducibility, a clear strength. However, the significance is limited by the absence of direct comparisons to cascaded baselines, leaving open whether observed gains truly substitute for explicit localization or merely reflect added model capacity.
Major comments (2)
- Abstract: The load-bearing claim that AGs 'enable us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks' is not supported by the experiments, which compare Attention U-Net only to plain U-Net on two CT datasets. No cascaded baseline (coarse localization network followed by fine segmentation) is evaluated for either Dice accuracy or total inference cost, so it remains possible that gains arise from multi-scale attention adding capacity rather than substituting for localization.
- Experimental Results section: The abstract asserts 'consistent gains' and 'improved prediction performance' but the reported evaluation lacks specific quantitative metrics (e.g., Dice scores per class or dataset), error bars, number of training runs, or implementation details such as training sizes and hyper-parameters, which undermines verification of the performance claims and cross-dataset consistency.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below, acknowledging where the manuscript claims require qualification or additional clarification. We will revise the manuscript accordingly to strengthen the presentation of results and temper unsupported assertions.
Point-by-point responses
- Referee: Abstract: The load-bearing claim that AGs 'enable us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks' is not supported by the experiments, which compare Attention U-Net only to plain U-Net on two CT datasets. No cascaded baseline (coarse localization network followed by fine segmentation) is evaluated for either Dice accuracy or total inference cost, so it remains possible that gains arise from multi-scale attention adding capacity rather than substituting for localization.
  Authors: We agree that the abstract claim is not directly supported by the experiments, as no cascaded baseline is evaluated. The attention gates are intended to provide implicit localization by suppressing irrelevant regions, which is supported by the observed improvements over standard U-Net. However, without a head-to-head comparison on accuracy and inference cost, we cannot claim that AGs fully substitute for explicit localization modules. We will revise the abstract to qualify the statement (e.g., 'can reduce the need for explicit external localization modules') and add a limitations paragraph discussing this point. A cascaded baseline comparison is not feasible to add at this stage due to time and scope constraints. Revision: partial.
- Referee: Experimental Results section: The abstract asserts 'consistent gains' and 'improved prediction performance' but the reported evaluation lacks specific quantitative metrics (e.g., Dice scores per class or dataset), error bars, number of training runs, or implementation details such as training sizes and hyper-parameters, which undermines verification of the performance claims and cross-dataset consistency.
  Authors: The full manuscript reports per-class Dice scores, Hausdorff distances, and other metrics for both datasets in Tables 1–3, with results broken down by training set size (25%, 50%, 100%). Hyperparameters and training protocols are detailed in Section 3.2. We acknowledge that standard deviations across multiple runs and the precise number of independent training runs were not reported. We will add these (from 3 runs per configuration) and ensure all quantitative results are more explicitly cross-referenced in the text to better substantiate the claims of consistent gains. Revision: yes.
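Reporting variability across repeated runs, as the rebuttal promises, is a one-liner worth being explicit about. A sketch with hypothetical per-run Dice scores (the numbers below are invented, not from the paper):

```python
import numpy as np

# hypothetical Dice scores for one configuration over 3 independent runs
runs = np.array([0.815, 0.822, 0.809])

mean = runs.mean()
std = runs.std(ddof=1)  # sample standard deviation, since runs are a sample
print(f"Dice: {mean:.3f} +/- {std:.3f}")  # → Dice: 0.815 +/- 0.007
```

Using `ddof=1` (sample rather than population standard deviation) is the conventional choice when summarizing a small number of training runs.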
Circularity Check
No significant circularity; novel architectural component with independent empirical tests
Full rationale
The paper proposes attention gates as an independent architectural addition to U-Net, with the central claim (elimination of explicit cascaded localization modules) supported by direct experiments on two external CT datasets showing Dice improvements. No equations, self-definitional reductions, fitted parameters renamed as predictions, or load-bearing self-citations appear in the provided text. The derivation chain consists of a new mechanism plus standard training and evaluation, remaining self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: convolutional neural networks trained end-to-end on medical CT images can perform multi-class segmentation tasks.
Invented entities (1)
- Attention Gate (AG): no independent evidence.
Lean theorems connected to this paper
- IndisputableMonolith.Foundation.DAlembert.Inevitability.bilinear_family_forced (echoes)
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Passage: "AGs automatically learn to focus on target structures of varying shapes and sizes. Models trained with AGs implicitly learn to suppress irrelevant regions in an input image while highlighting salient features useful for a specific task. This enables us to eliminate the necessity of using explicit external tissue/organ localisation modules of cascaded convolutional neural networks (CNNs)."
- IndisputableMonolith.Foundation.LedgerCanonicality.no_free_knobs (echoes)
  ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Passage: "AGs can be easily integrated into standard CNN architectures such as the U-Net model with minimal computational overhead while increasing the model sensitivity and prediction accuracy."
- IndisputableMonolith.Foundation.DimensionForcing.dimension_forced (unclear)
  UNCLEAR: the relation between this paper passage and the cited Recognition theorem is ambiguous.
  Passage: "Experimental results show that AGs consistently improve the prediction performance of U-Net across different datasets and training sizes while preserving computational efficiency."
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Forward citations
Cited by 29 Pith papers
- AuraMask: An Extensible Pipeline for Developing Aesthetic Anti-Facial Recognition Image Filters
  AuraMask produces 40 aesthetic anti-facial recognition filters that match or exceed prior adversarial effectiveness and achieve significantly higher user acceptance in a 630-person study.
- TopoU-Net: a U-Net architecture for topological domains
  TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.
- XAttnRes: Cross-Stage Attention Residuals for Medical Image Segmentation
  XAttnRes introduces cross-stage attention residuals that maintain a global feature history and selectively aggregate prior representations, improving medical image segmentation and performing on par with baselines eve...
- Spectral Vision Transformer for Efficient Tokenization with Limited Data
  A spectral vision transformer achieves equitable or superior performance with fewer parameters than standard ViTs, CNNs, and other models by using spectral projections for tokenization in limited-data medical imaging.
- FEFormer: Frequency-enhanced Vision Transformer for Generic Knowledge Extraction and Adaptive Feature Fusion in Volumetric Medical Image Segmentation
  A frequency-enhanced Vision Transformer with FDSA, FGMLP, WAFF, and FCSB modules delivers superior volumetric medical image segmentation performance and efficiency over prior state-of-the-art methods.
- Polygon-mamba: Retinal vessel segmentation using polygon scanning mamba and space-frequency collaborative attention
  Polygon-Mamba achieves F1 scores of 0.8283, 0.8282, and 0.8251 on DRIVE, STARE, and CHASE_DB1 by combining polygon scanning Mamba with space-frequency collaborative attention to better detect small retinal vessels.
- ESICA: A Scalable Framework for Text-Guided 3D Medical Image Segmentation
  ESICA delivers state-of-the-art accuracy on a five-modality 3D medical segmentation benchmark while offering a compact variant with far fewer parameters.
- Mapping License Plate Recoverability Under Extreme Viewing Angles for Opportunistic Urban Sensing
  Recoverability maps use synthetic sweeps of viewing angles and artifacts to quantify the recoverable fraction of parameter space for license plate restoration, with the best model succeeding on 93% and geometry settin...
- Learning from Noisy Prompts: Saliency-Guided Prompt Distillation for Robust Segmentation with SAM
  SPD improves SAM segmentation robustness to noisy prompts by learning anatomical saliency priors, distilling consensus prompts from adjacent slices, and enforcing pairwise slice consistency.
- Toward Polymorphic Backdoor against Semantic Communication via Intensity-Based Poisoning
  SemBugger achieves polymorphic backdoors in semantic communication via graded-intensity trigger poisoning and hierarchical loss, plus a noise-based defense with a theoretical efficacy bound.
- CDSA-Net: Collaborative Decoupling of Vascular Structure and Background for High-Fidelity Coronary Digital Subtraction Angiography
  CDSA-Net decouples vascular structure extraction and background restoration in coronary DSA via hierarchical geometric priors and adaptive noise modeling to eliminate artifacts while preserving tissue fidelity.
- Geometrical Cross-Attention and Nonvoid Voxelization for Efficient 3D Medical Image Segmentation
  GCNV-Net achieves state-of-the-art accuracy on multiple 3D medical segmentation benchmarks while cutting FLOPs by 56% and inference latency by 68% through dynamic nonvoid voxelization and geometric attention.
- Geometric Flood Depth Estimation: Fusing Transformer-Based Segmentation with Digital Elevation Models
  A pipeline uses Mask2Former flood masks and DEMs to compute a single water surface elevation, then derives local depths under hydrostatic equilibrium.
- Beyond ViT Tokens: Masked-Diffusion Pretrained Convolutional Pathology Foundation Model for Cell-Level Dense Prediction
  A masked-diffusion pretrained convolutional model outperforms ViT pathology foundation models on cell-level dense prediction tasks in histology.
- MambaLiteUNet: Cross-Gated Adaptive Feature Fusion for Robust Skin Lesion Segmentation
  MambaLiteUNet integrates Mamba into U-Net with adaptive fusion, local-global mixing, and cross-gated attention modules to reach 87.12% IoU and 93.09% Dice on skin lesion datasets while cutting parameters by 93.6%.
- EDU-Net: Retinal Pathological Fluid Segmentation in OCT Images with Multiscale Feature Fusion and Boundary Optimization
  EDU-Net fuses multiscale local and global features with boundary optimization to achieve state-of-the-art segmentation of intraretinal and subretinal fluid in OCT images.
- Align then Refine: Text-Guided 3D Prostate Lesion Segmentation
  A text-guided multi-encoder U-Net with alignment loss, heatmap calibration, and confidence-gated cross-attention refiner sets new state-of-the-art 3D prostate lesion segmentation performance on the PI-CAI dataset.
- HQF-Net: A Hybrid Quantum-Classical Multi-Scale Fusion Network for Remote Sensing Image Segmentation
  HQF-Net reports mIoU gains on three remote-sensing benchmarks by adding quantum circuits to skip connections and a mixture-of-experts bottleneck inside a classical U-Net fused with a DINOv3 backbone.
- Attention-Guided Flow-Matching for Sparse 3D Geological Generation
  3D-GeoFlow reformulates discrete categorical 3D geological generation as simulation-free continuous vector field regression with 3D attention gates, claiming to outperform heuristics and diffusion models on a 2,200-ca...
- Med-DisSeg: Dispersion-Driven Representation Learning for Fine-Grained Medical Image Segmentation
  Med-DisSeg uses a dispersive loss on batch representations plus adaptive multi-scale decoding to achieve state-of-the-art fine-grained segmentation on five medical imaging datasets.
- Edge-Cloud Collaborative Pothole Detection via Onboard Event Screening and Federated Temporal Segmentation
  An edge-cloud framework screens vibration events onboard with a GMM and uses a federated 1D Attention U-Net for temporal segmentation to detect potholes while reducing data transmission.
- Multi-Dataset Cross-Domain Knowledge Distillation for Unified Medical Image Segmentation, Classification, and Detection
  A multi-dataset cross-domain knowledge distillation approach improves unified performance on medical image segmentation, classification, and detection by transferring domain-invariant features from a joint teacher mod...
- MAE-Based Self-Supervised Pretraining for Data-Efficient Medical Image Segmentation Using nnFormer
  MAE self-supervised pretraining of nnFormer yields higher Dice scores, faster convergence, and better generalization when labeled medical segmentation data is scarce.
- PBE-UNet: A light weight Progressive Boundary-Enhanced U-Net with Scale-Aware Aggregation for Ultrasound Image Segmentation
  PBE-UNet adds scale-aware aggregation and progressive boundary expansion modules to U-Net and reports better segmentation performance than prior methods on four ultrasound datasets.
- SwinTextUNet: Integrating CLIP-Based Text Guidance into Swin Transformer U-Nets for Medical Image Segmentation
  SwinTextUNet integrates CLIP text guidance into Swin U-Net via cross-attention and convolutional fusion, achieving 86.47% Dice and 78.2% IoU on QaTaCOV19 medical image segmentation.
- SAGE-GAN: Towards Realistic and Robust Segmentation of Spatially Ordered Nanoparticles via Attention-Guided GANs
  SAGE-GAN integrates a self-attention U-Net into a CycleGAN framework to generate realistic synthetic electron microscopy image-mask pairs that augment training data for nanoparticle segmentation without human labeling.
- Attention-ResUNet for Automated Fetal Head Segmentation
  Attention-ResUNet reaches 99.30% mean Dice score on the HC18 fetal head ultrasound dataset, outperforming ResUNet, Attention U-Net, Swin U-Net, U-Net, and U-Net++ with statistical significance.
- Adaptive Dual Residual U-Net with Attention Gate and Multiscale Spatial Attention Mechanisms (ADRUwAMS)
  ADRUwAMS reports Dice scores of 0.9229 (whole tumor), 0.8432 (tumor core), and 0.8004 (enhancing tumor) on BraTS 2020 after training on BraTS 2019/2020 datasets.
- Benchmarking CNN- and Transformer-Based Models for Surgical Instrument Segmentation in Robotic-Assisted Surgery
  DeepLabV3 matches SegFormer performance in multi-class surgical instrument segmentation while convolutional baselines like UNet remain competitive on the SAR-RARP50 dataset.