ASGNet: Adaptive Spectrum Guidance Network for Automatic Polyp Segmentation
Pith reviewed 2026-05-10 11:19 UTC · model grok-4.3
The pith
Integrating frequency-domain spectral features into a neural network overcomes local spatial bias to segment polyps more completely.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that a spectrum-guided non-local perception module, combined with multi-source semantic extraction and dense cross-layer interaction decoding, integrates spectral features carrying global attributes to reduce spatial-domain bias, enhance polyp discriminability, and produce more accurate segmentations than purely spatial approaches.
What carries the argument
Spectrum-guided non-local perception module that jointly aggregates local spatial details with global frequency-domain information to refine polyp structures and boundaries.
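The global-mixing intuition can be illustrated with a minimal numpy sketch (an illustration of frequency-domain filtering in the spirit of global filter networks, not the authors' module): an element-wise filter applied to the 2-D FFT of a feature map reweights each frequency, and because every frequency coefficient depends on every spatial location, the output at each pixel aggregates information from the entire map.

```python
import numpy as np

def spectral_global_filter(feature_map, complex_filter):
    """Mix a 2-D feature map globally via an element-wise filter in the
    frequency domain: FFT -> multiply -> inverse FFT (real part).

    A single frequency coefficient depends on every spatial location, so
    each output pixel aggregates information from the whole map, unlike a
    local convolution with a small receptive field.
    """
    spectrum = np.fft.fft2(feature_map)   # to frequency domain
    filtered = spectrum * complex_filter  # global, per-frequency reweighting
    return np.fft.ifft2(filtered).real    # back to spatial domain

# Sanity check: an all-ones (identity) filter leaves the map unchanged.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
identity = np.ones((8, 8), dtype=complex)
out = spectral_global_filter(x, identity)
```

A low-pass choice of `complex_filter` would smooth the map globally; in trained models a learned filter plays that role.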
If this is right
- Polyp boundaries become sharper because global context corrects incomplete local detections.
- Preliminary localization improves when high-level semantic cues from multiple sources guide the process.
- Cross-layer feature fusion produces representations that maintain both fine detail and overall structure.
- The same architecture can be applied directly to the five common polyp benchmarks without retraining from scratch.
Where Pith is reading between the lines
- Similar spectrum guidance could address local bias in other medical imaging tasks such as tumor or lesion segmentation.
- If the frequency integration proves stable, it might reduce reliance on heavy spatial data augmentation during training.
- The approach opens a route to test whether pure frequency-domain models can replace hybrid designs for global shape recovery.
Load-bearing premise
Spectral features from the frequency domain will reliably overcome local spatial bias and yield more complete polyp structures without introducing new artifacts or needing dataset-specific adjustments.
What would settle it
On a held-out colonoscopy dataset containing polyps with unusual shapes or heavy occlusion, ASGNet produces lower boundary accuracy or more fragmented masks than a spatial-only baseline network.
original abstract
Early identification and removal of polyps can reduce the risk of developing colorectal cancer. However, the diverse morphologies, complex backgrounds and often concealed nature of polyps make polyp segmentation in colonoscopy images highly challenging. Despite the promising performance of existing deep learning-based polyp segmentation methods, their perceptual capabilities remain biased toward local regions, mainly because of the strong spatial correlations between neighboring pixels in the spatial domain. This limitation makes it difficult to capture the complete polyp structures, ultimately leading to sub-optimal segmentation results. In this paper, we propose a novel adaptive spectrum guidance network, called ASGNet, which addresses the limitations of spatial perception by integrating spectral features with global attributes. Specifically, we first design a spectrum-guided non-local perception module that jointly aggregates local and global information, therefore enhancing the discriminability of polyp structures, and refining their boundaries. Moreover, we introduce a multi-source semantic extractor that integrates rich high-level semantic information to assist in the preliminary localization of polyps. Furthermore, we construct a dense cross-layer interaction decoder that effectively integrates diverse information from different layers and strengthens it to generate high-quality representations for accurate polyp segmentation. Extensive quantitative and qualitative results demonstrate the superiority of our ASGNet approach over 21 state-of-the-art methods across five widely-used polyp segmentation benchmarks. The code will be publicly available at: https://github.com/CSYSI/ASGNet.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ASGNet for polyp segmentation in colonoscopy images to address local spatial bias in existing deep learning methods. It introduces a spectrum-guided non-local perception module that aggregates local and global information via spectral features, a multi-source semantic extractor for high-level semantics, and a dense cross-layer interaction decoder. The central claim is that ASGNet outperforms 21 state-of-the-art methods across five standard polyp segmentation benchmarks, with code to be released publicly.
Significance. If the empirical superiority holds after verification, the work could advance medical image segmentation by showing that frequency-domain spectral guidance can mitigate local biases and improve capture of complete polyp structures. This has direct relevance to colorectal cancer screening. The public code commitment supports reproducibility.
major comments (3)
- [Methods (spectrum-guided non-local perception module)] Methods section (spectrum-guided non-local perception module): The description of the adaptive guidance and FFT-based spectral feature extraction does not address how the module avoids introducing ringing artifacts or spurious high-frequency components from common colonoscopy issues like illumination gradients and specular highlights. This is load-bearing for the central claim that spectral features reliably overcome local bias to yield higher Dice/IoU without new errors.
- [Experiments] Experiments section: The superiority over 21 SOTA methods on five benchmarks is asserted without visible quantitative tables, ablation breakdowns, error bars, or statistical tests in the provided text. This makes it impossible to assess whether gains are statistically meaningful or driven by the novel spectral component versus the multi-source extractor and dense decoder.
- [Ablation studies] Ablation studies: The experiments must isolate the spectrum-guided module's contribution (e.g., via controlled removal or replacement with standard non-local blocks) to confirm it is responsible for the reported performance rather than the other standard components.
minor comments (2)
- [Abstract] Abstract: Asserts 'extensive quantitative and qualitative results' but supplies none; ensure all tables, figures, and metrics are clearly presented and referenced in the full manuscript.
- [Throughout] Notation and terminology: Ensure consistent definitions for 'spectral features', 'frequency attributes', and 'adaptive guidance' across sections to avoid ambiguity.
Simulated Author's Rebuttal
We are grateful to the referee for the detailed and constructive feedback on our paper. We address each major comment below and outline the revisions we will make to improve the manuscript.
point-by-point responses
Referee: Methods section (spectrum-guided non-local perception module): The description of the adaptive guidance and FFT-based spectral feature extraction does not address how the module avoids introducing ringing artifacts or spurious high-frequency components from common colonoscopy issues like illumination gradients and specular highlights. This is load-bearing for the central claim that spectral features reliably overcome local bias to yield higher Dice/IoU without new errors.
Authors: We thank the referee for this important observation. The spectrum-guided non-local perception module employs adaptive spectral filtering that learns to emphasize polyp-relevant frequencies while suppressing high-frequency noise. To make this explicit, we will revise the Methods section to include a new subsection detailing the artifact mitigation: specifically, the use of adaptive soft-thresholding on spectral coefficients and Hann windowing prior to FFT to prevent ringing from illumination gradients and specular highlights. This will directly support the claim that spectral guidance improves boundary precision without introducing spurious errors. revision: yes
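The mitigation the response describes can be sketched in a hedged form (the authors' actual module is not specified here; the function names are illustrative): taper the input with a Hann window before the FFT to suppress ringing from hard edges, then soft-threshold small spectral coefficients to remove spurious high-frequency components.

```python
import numpy as np

def hann_window_2d(h, w):
    # Outer product of two 1-D Hann windows tapers the patch borders to
    # zero, suppressing the ringing that sharp discontinuities (e.g.
    # specular highlights at a patch edge) cause in the FFT.
    return np.outer(np.hanning(h), np.hanning(w))

def soft_threshold(spectrum, tau):
    # Shrink each coefficient's magnitude toward zero by tau; coefficients
    # with magnitude below tau (likely noise) are zeroed entirely.
    mag = np.abs(spectrum)
    scale = np.maximum(mag - tau, 0.0) / np.maximum(mag, 1e-12)
    return spectrum * scale

def denoised_spectrum(patch, tau):
    windowed = patch * hann_window_2d(*patch.shape)
    return soft_threshold(np.fft.fft2(windowed), tau)

rng = np.random.default_rng(1)
patch = rng.standard_normal((16, 16))
spec = denoised_spectrum(patch, tau=2.0)
```

In the adaptive setting the response alludes to, the threshold would be learned per frequency rather than fixed as here.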
Referee: Experiments section: The superiority over 21 SOTA methods on five benchmarks is asserted without visible quantitative tables, ablation breakdowns, error bars, or statistical tests in the provided text. This makes it impossible to assess whether gains are statistically meaningful or driven by the novel spectral component versus the multi-source extractor and dense decoder.
Authors: We apologize if the tables were not immediately apparent in the excerpt. The full manuscript contains quantitative comparison tables in Section 4 reporting Dice, IoU, and other metrics for ASGNet versus 21 SOTA methods across the five benchmarks (Kvasir-SEG, CVC-ClinicDB, CVC-ColonDB, ETIS, CVC-T). To address the concern and strengthen the evidence, we will add error bars from five independent runs, paired statistical tests (Wilcoxon signed-rank), and explicit discussion of how the spectral module drives the gains beyond the other components. revision: yes
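The overlap metrics named in the response follow a standard formulation, sketched below for binary masks (this is the textbook definition, not the paper's exact evaluation code):

```python
import numpy as np

def dice_and_iou(pred, gt):
    """Dice coefficient and IoU for two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    inter = np.logical_and(pred, gt).sum()   # |P ∩ G|
    union = np.logical_or(pred, gt).sum()    # |P ∪ G|
    total = pred.sum() + gt.sum()            # |P| + |G|
    dice = 2.0 * inter / total if total else 1.0
    iou = inter / union if union else 1.0
    return float(dice), float(iou)

# Toy example: masks of 2 pixels each, overlapping on 1 pixel.
pred = np.array([[1, 1, 0], [0, 0, 0]])
gt = np.array([[0, 1, 1], [0, 0, 0]])
dice, iou = dice_and_iou(pred, gt)  # Dice = 2/4, IoU = 1/3
```

A paired Wilcoxon signed-rank test, as promised in the revision, would then compare per-image Dice scores between ASGNet and each baseline.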
Referee: Ablation studies: The experiments must isolate the spectrum-guided module's contribution (e.g., via controlled removal or replacement with standard non-local blocks) to confirm it is responsible for the reported performance rather than the other standard components.
Authors: We appreciate this suggestion for rigor. Our existing ablations already compare the full model against variants without the spectrum-guided module. In the revision, we will add controlled experiments replacing the spectrum-guided non-local perception module with a standard non-local block (as in NLNet) while keeping the multi-source extractor and decoder fixed. Performance drops on all five benchmarks will be reported to isolate the spectral guidance contribution. revision: yes
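For reference, the "standard non-local block" the rebuttal proposes as a replacement baseline can be sketched as plain dot-product attention over spatial positions (a simplified version of the non-local networks formulation, with the learned theta/phi/g projections omitted for brevity):

```python
import numpy as np

def nonlocal_block(x):
    """Simplified non-local (self-attention) operation on a feature map.

    x: array of shape (C, H, W). Each output position is a softmax-weighted
    sum over ALL positions, so the receptive field is global - the spatial
    analogue of what the spectral module achieves in the frequency domain.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)               # (C, N) with N = H*W
    sim = flat.T @ flat                      # (N, N) pairwise similarities
    sim -= sim.max(axis=1, keepdims=True)    # numerical stability
    attn = np.exp(sim)
    attn /= attn.sum(axis=1, keepdims=True)  # softmax over all positions
    out = flat @ attn.T                      # globally aggregate features
    return x + out.reshape(c, h, w)          # residual connection

rng = np.random.default_rng(2)
feat = rng.standard_normal((4, 5, 5))
out = nonlocal_block(feat)
```

Swapping the spectral module for a block like this, with the rest of the network fixed, is what would isolate the frequency-domain contribution.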
Circularity Check
No circularity: empirical architecture proposal with independent benchmark validation
full rationale
The paper presents an empirical deep-learning architecture (ASGNet) for polyp segmentation, consisting of a spectrum-guided non-local module, a multi-source extractor, and a dense decoder. Its central claim is comparative superiority on five public benchmarks against 21 baselines, supported by quantitative Dice/IoU metrics and qualitative results. No circular derivation chains, fitted parameters renamed as predictions, or load-bearing self-citations appear; the design choices are motivated by stated limitations of spatial-domain methods and are tested directly against external data. The work is therefore self-contained against falsifiable benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
free parameters (1)
- network hyperparameters and training schedule
axioms (2)
- domain assumption: Spectral features capture global polyp attributes more effectively than purely spatial convolutions.
- standard math: Standard back-propagation and data-augmentation pipelines suffice to train the proposed architecture.
Reference graph
Works this paper leans on
- [1] R. L. Siegel, K. D. Miller, A. Goding Sauer, S. A. Fedewa, L. F. Butterly, J. C. Anderson, A. Cercek, R. A. Smith, and A. Jemal, "Colorectal cancer statistics, 2020," CA: A Cancer Journal for Clinicians, vol. 70, no. 3, pp. 145–164, 2020.
- [2] Y. Xi and P. Xu, "Global colorectal cancer burden in 2020 and projections to 2040," Translational Oncology, vol. 14, no. 10, p. 101174, 2021.
- [3] A. Leufkens, M. Van Oijen, F. Vleggaar, and P. Siersema, "Factors influencing the miss rate of polyps in a back-to-back colonoscopy study," Endoscopy, pp. 470–475, 2012.
- [4] L. Lu, A. Barbu, M. Wolf, J. Liang, M. Salganicoff, and D. Comaniciu, "Accurate polyp segmentation for 3D CT colonography using multi-staged probabilistic binary learning and compositional model," in Computer Vision and Pattern Recognition (CVPR), 2008, pp. 1–8.
- [5] A. Sánchez-González, B. Garcia-Zapirain, D. Sierra-Sosa, and A. Elmaghraby, "Colon polyp segmentation using texture analysis," in International Symposium on Signal Processing and Information Technology (ISSPIT), 2018, pp. 579–588.
- [6] K. Krishnan, Y. Soniwal, A. Madrosiya, and N. Desai, "Colorectal polyp segmentation using front propagation on surfaces guided by shape," in Engineering in Medicine and Biology Society (EMBS), 2015, pp. 3093–3096.
- [7] O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2015, pp. 234–241.
- [8] Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang, "UNet++: Redesigning skip connections to exploit multiscale features in image segmentation," IEEE Transactions on Medical Imaging, vol. 39, no. 6, pp. 1856–1867, 2019.
- [9] J. Wei, Y. Hu, R. Zhang, Z. Li, S. K. Zhou, and S. Cui, "Shallow attention network for polyp segmentation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 699–708.
- [10] T. Zhou, Y. Zhou, K. He, C. Gong, J. Yang, H. Fu, and D. Shen, "Cross-level feature aggregation network for polyp segmentation," Pattern Recognition, vol. 140, p. 109555, 2023.
- [11] Y. Fang, C. Chen, Y. Yuan, and K.-y. Tong, "Selective feature aggregation network with area-boundary constraints for polyp segmentation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2019, pp. 302–310.
- [12] Z. Liu, S. Zheng, X. Sun, Z. Zhu, Y. Zhao, X. Yang, and Y. Zhao, "The devil is in the boundary: Boundary-enhanced polyp segmentation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 7, pp. 5414–5423, 2024.
- [13] H. Shao, Y. Zhang, and Q. Hou, "Polyper: Boundary sensitive polyp segmentation," in AAAI Conference on Artificial Intelligence (AAAI), vol. 38, no. 5, 2024, pp. 4731–4739.
- [14] T. Wang, X. Qi, and G. Yang, "Polyp segmentation via semantic enhanced perceptual network," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 12, pp. 12594–12607, 2024.
- [15] N.-T. Bui, D.-H. Hoang, Q.-T. Nguyen, M.-T. Tran, and N. Le, "MEGANet: Multi-scale edge-guided attention network for weak boundary polyp segmentation," in Winter Conference on Applications of Computer Vision (WACV), 2024, pp. 7985–7994.
- [16] J.-H. Shi, Q. Zhang, Y.-H. Tang, and Z.-Q. Zhang, "Polyp-Mixer: An efficient context-aware MLP-based paradigm for polyp segmentation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 33, no. 1, pp. 30–42, 2022.
- [17] Y. Sun, H. Xuan, J. Yang, and L. Luo, "GLCONet: Learning multi-source perception representation for camouflaged object detection," IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 7, pp. 13262–13275, 2025.
- [18] C. Xia, Y. Sun, X. Gao, B. Ge, and S. Duan, "DMINet: Dense multi-scale inference network for salient object detection," The Visual Computer, vol. 38, no. 9, pp. 3059–3072, 2022.
- [19] B. Dong, W. Wang, D.-P. Fan, J. Li, H. Fu, and L. Shao, "Polyp-PVT: Polyp segmentation with pyramid vision transformers," arXiv preprint arXiv:2108.06932, 2021.
- [20] K. Hu, W. Chen, Y. Sun, X. Hu, Q. Zhou, and Z. Zheng, "PPNet: Pyramid pooling based network for polyp segmentation," Computers in Biology and Medicine, vol. 160, p. 107028, 2023.
- [21] W. Wang, H. Sun, and X. Wang, "LSSNet: A method for colon polyp segmentation based on local feature supplementation and shallow feature supplementation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2024, pp. 446–456.
- [22] Y. Sun, J. Yan, J. Qian, C. Xu, J. Yang, and L. Luo, "Dual-perspective united transformer for object segmentation in optical remote sensing images," in International Joint Conference on Artificial Intelligence (IJCAI), 2025, pp. 1909–1917.
- [23] W. Wang, E. Xie, X. Li, D.-P. Fan, K. Song, D. Liang, T. Lu, P. Luo, and L. Shao, "PVT v2: Improved baselines with pyramid vision transformer," Computational Visual Media, vol. 8, no. 3, pp. 415–424, 2022.
- [24] X.-L. Pan, J.-R. Ding, X. Li, S. Liu, J. Wang, B. Hua, G.-Z. Tang, and C.-H. Zhong, "MSBP-Net: A multi-scale boundary prediction network for automated polyp segmentation," Pattern Recognition, vol. 170, p. 112101, 2026.
- [25] Y. Yang and S. Soatto, "FDA: Fourier domain adaptation for semantic segmentation," in Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4085–4095.
- [26] B. Yun, B. Lei, J. Chen, H. Wang, S. Qiu, W. Shen, Q. Li, and Y. Wang, "SpecTr: Spectral transformer for microscopic hyperspectral pathology image segmentation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 6, pp. 4610–4624, 2024.
- [27] Y. Sun, C. Xu, J. Yang, H. Xuan, and L. Luo, "Frequency-spatial entanglement learning for camouflaged object detection," arXiv preprint arXiv:2409.01686, 2024.
- [28] Y. Rao, W. Zhao, Z. Zhu, J. Lu, and J. Zhou, "Global filter networks for image classification," in Advances in Neural Information Processing Systems (NIPS), vol. 34, 2021, pp. 980–993.
- [29] Y. Sun, J. Yang, and L. Luo, "United domain cognition network for salient object detection in optical remote sensing images," IEEE Transactions on Geoscience and Remote Sensing, vol. 62, p. 3497579, 2024.
- [30] G. Liu, Z. Chen, D. Liu, B. Chang, and Z. Dou, "FTMF-Net: A Fourier transform-multiscale feature fusion network for segmentation of small polyp objects," IEEE Transactions on Instrumentation and Measurement, vol. 72, pp. 1–15, 2023.
- [31] X. Zhou and T. Chen, "FreqFormer: Efficient polyp segmentation via wavelet transform," in International Conference on Multimedia and Expo (ICME), 2024, pp. 1–6.
- [32] W. Xu, R. Xu, C. Wang, X. Li, S. Xu, and L. Guo, "PSTNet: Enhanced polyp segmentation with multi-scale alignment and frequency domain integration," IEEE Journal of Biomedical and Health Informatics, vol. 28, no. 10, pp. 6042–6053, 2024.
- [33] P. Brandao, E. Mazomenos, G. Ciuti, R. Caliò, F. Bianchi, A. Menciassi, P. Dario, A. Koulaouzidis, A. Arezzo, and D. Stoyanov, "Fully convolutional neural networks for polyp segmentation in colonoscopy," in Medical Image Computing and Computer Assisted Intervention (MICCAI), vol. 10134, SPIE, 2017, pp. 101–107.
- [34] T. Zhou, Y. Zhou, G. Li, G. Chen, and J. Shen, "Uncertainty-aware hierarchical aggregation network for medical image segmentation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 8, pp. 7440–7453, 2024.
- [35] C. He, R. Zhang, F. Xiao, C. Fang, L. Tang, Y. Zhang, L. Kong, D.-P. Fan, K. Li, and S. Farsiu, "RUN: Reversible unfolding network for concealed object segmentation," in International Conference on Machine Learning (ICML), 2025.
- [36] A. A. Kamara, S. He, A. Joseph Fofanah, R. Xu, and Y. Chen, "MDPNet: Multiscale dynamic polyp-focus network for enhancing medical image polyp segmentation," IEEE Transactions on Medical Imaging, vol. 44, no. 12, pp. 5208–5220, 2025.
- [37] H. Wu and Z. Zhao, "EPSegNet: Lightweight semantic recalibration and assembly for efficient polyp segmentation," IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 8, pp. 13805–13817, 2025.
- [38] G. Yue, S. Wu, G. Li, C. Zhao, Y. Hao, T. Zhou, and B. Zhao, "Boundary-guided feature-aligned network for colorectal polyp segmentation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 7, pp. 6993–7004, 2025.
- [39] C. Luo, Y. Wang, Z. Deng, Q. Lou, Z. Zhao, Y. Ge, and S. Hu, "Colonic polyp segmentation based on transformer-convolutional neural networks fusion," Pattern Recognition, vol. 170, p. 112116, 2026.
- [40] L. Yuan, Y. Chen, T. Wang, W. Yu, Y. Shi, Z.-H. Jiang, F. E. Tay, J. Feng, and S. Yan, "Tokens-to-token ViT: Training vision transformers from scratch on ImageNet," in International Conference on Computer Vision (ICCV), 2021, pp. 558–567.
- [41] Z. Yu, J. Dai, Y. Zhang, J. Yang, and L. Luo, "SSAIM: Not all self-attentions contain effective spatial structure in diffusion models for text-to-image editing," in ACM International Conference on Multimedia, 2025, pp. 9472–9480.
- [42] Z. Yu, J. Jin, J. Zhao, Z. Fu, and J. Yang, "TTFDiffusion: Training-free and text-free image editing in diffusion models with structural and semantic disentanglement," Neurocomputing, vol. 619, p. 129159, 2025.
- [43] Y. Sun, C. Xia, X. Gao, H. Yan, B. Ge, and K.-C. Li, "Aggregating dense and attentional multi-scale feature network for salient object detection," Digital Signal Processing, vol. 130, p. 103747, 2022.
- [44] C. Wang, W. Lu, X. Li, J. Yang, and L. Luo, "M4-SAR: A multi-resolution, multi-polarization, multi-scene, multi-source dataset and benchmark for optical-SAR fusion object detection," arXiv preprint arXiv:2505.10931, 2025.
- [45] K. Li, Y. Wang, P. Gao, G. Song, Y. Liu, H. Li, and Y. Qiao, "UniFormer: Unified transformer for efficient spatiotemporal representation learning," arXiv preprint arXiv:2201.04676, 2022.
- [46] Y. Sun, C. Wang, J. Yang, and L. Luo, "Small but mighty: Dynamic wavelet expert-guided fine-tuning of large-scale models for optical remote sensing object segmentation," in AAAI Conference on Artificial Intelligence (AAAI), 2025.
- [47] Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, "Swin Transformer: Hierarchical vision transformer using shifted windows," in International Conference on Computer Vision (ICCV), 2021, pp. 10012–10022.
- [48] Y. Sun, J. Lian, J. Yang, and L. Luo, "Controllable-LPMoE: Adapting to challenging object segmentation via dynamic local priors from mixture-of-experts," in International Conference on Computer Vision (ICCV), 2025, pp. 22327–22337.
- [49] C. Xia, Y. Sun, K.-C. Li, B. Ge, H. Zhang, B. Jiang, and J. Zhang, "RCNet: Related context-driven network with hierarchical attention for salient object detection," Expert Systems with Applications, vol. 237, p. 121441, 2024.
- [50] C. Wang, Y. Sun, J. Yang, and L. Luo, "Localized background-aware generative distillation for enhanced remote sensing object detection," IEEE Transactions on Circuits and Systems for Video Technology, 2026.
- [51] C. Wang, W. Fang, X. Li, J. Yang, and L. Luo, "MSOD: A large-scale multi-scene dataset and a novel diagonal-geometry loss for SAR object detection," IEEE Transactions on Geoscience and Remote Sensing, 2025.
- [52] X. Fan, Y. Zhang, Y. Lu, and H. Wang, "PARFormer: Transformer-based multi-task network for pedestrian attribute recognition," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 1, pp. 411–423, 2024.
- [53] X. Hu, B. Zhong, Q. Liang, S. Zhang, N. Li, X. Li, and R. Ji, "Transformer tracking via frequency fusion," IEEE Transactions on Circuits and Systems for Video Technology, vol. 34, no. 2, pp. 1020–1031, 2024.
- [54] C. Liu, X. Wang, S. Li, Y. Wang, and X. Qian, "FSI: Frequency and spatial interactive learning for image restoration in under-display cameras," in International Conference on Computer Vision (ICCV), 2023, pp. 12537–12546.
- [55] K.-N. Wang, Y. He, S. Zhuang, J. Miao, X. He, P. Zhou, G. Yang, G.-Q. Zhou, and S. Li, "FFCNet: Fourier transform-based frequency learning and complex convolutional network for colon disease classification," in Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2022, pp. 78–87.
- [56] Z. Zhu, Y. Zhu, H. Wang, N. Wang, J. Ye, and X. Ling, "FDTNet: Enhancing frequency-aware representation for prohibited object detection from X-ray images via dual-stream transformers," Engineering Applications of Artificial Intelligence, vol. 133, p. 108076, 2024.
- [57] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
- [58] R. Zhang, G. Li, Z. Li, S. Cui, D. Qian, and Y. Yu, "Adaptive context selection for polyp segmentation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2020, pp. 253–262.
- [59] K. Patel, A. M. Bur, and G. Wang, "Enhanced U-Net: A feature enhancement network for polyp segmentation," in Conference on Robots and Vision, 2021, pp. 181–188.
- [60] X. Zhao, L. Zhang, and H. Lu, "Automatic polyp segmentation via multi-scale subtraction network," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 120–130.
- [61] J. Song, X. Chen, Q. Zhu, F. Shi, D. Xiang, Z. Chen, Y. Fan, L. Pan, and W. Zhu, "Global and local feature reconstruction for medical image segmentation," IEEE Transactions on Medical Imaging, vol. 41, no. 9, pp. 2273–2284, 2022.
- [62] Z. Yin, K. Liang, Z. Ma, and J. Guo, "Duplex contextual relation network for polyp segmentation," in International Symposium on Biomedical Imaging (ISBI), 2022, pp. 1–5.
- [63] H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, "Swin-Unet: Unet-like pure transformer for medical image segmentation," in European Conference on Computer Vision (ECCV), 2022, pp. 205–218.
- [64] Y. Zhang, H. Liu, and Q. Hu, "TransFuse: Fusing transformers and CNNs for medical image segmentation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 14–24.
- [65] Y. Gao, M. Zhou, and D. N. Metaxas, "UTNet: A hybrid transformer architecture for medical image segmentation," in Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 61–71.
- [66] J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, "TransUNet: Transformers make strong encoders for medical image segmentation," arXiv preprint arXiv:2102.04306, 2021.
- [67] C. He, K. Li, Y. Zhang, L. Tang, Y. Zhang, Z. Guo, and X. Li, "Camouflaged object detection with feature decomposition and edge reconstruction," in Computer Vision and Pattern Recognition (CVPR), 2023, pp. 22046–22055.
- [68] D. Vázquez, J. Bernal, F. J. Sánchez, G. Fernández-Esparrach, A. M. López, A. Romero, M. Drozdzal, and A. Courville, "A benchmark for endoluminal scene segmentation of colonoscopy images," Journal of Healthcare Engineering, vol. 2017, no. 1, p. 4037190, 2017.
- [69] N. Tajbakhsh, S. R. Gurudu, and J. Liang, "Automated polyp detection in colonoscopy videos using shape and context information," IEEE Transactions on Medical Imaging, vol. 35, no. 2, pp. 630–644, 2015.
- [70] J. Silva, A. Histace, O. Romain, X. Dray, and B. Granado, "Toward embedded detection of polyps in WCE images for early diagnosis of colorectal cancer," International Journal of Computer Assisted Radiology and Surgery, vol. 9, pp. 283–293, 2014.
- [71] D. Jha, P. H. Smedsrud, M. A. Riegler, P. Halvorsen, T. De Lange, D. Johansen, and H. D. Johansen, "Kvasir-SEG: A segmented polyp dataset," in MultiMedia Modeling, 2020, pp. 451–462.
- [72] J. Bernal, F. J. Sánchez, G. Fernández-Esparrach, D. Gil, C. Rodríguez, and F. Vilariño, "WM-DOVA maps for accurate polyp highlighting in colonoscopy: Validation vs. saliency maps from physicians," Computerized Medical Imaging and Graphics, vol. 43, pp. 99–111, 2015.
- [73] R. Margolin, L. Zelnik-Manor, and A. Tal, "How to evaluate foreground maps?" in Computer Vision and Pattern Recognition (CVPR), 2014, pp. 248–255.
- [74] D.-P. Fan, M.-M. Cheng, Y. Liu, T. Li, and A. Borji, "Structure-measure: A new way to evaluate foreground maps," in International Conference on Computer Vision (ICCV), 2017, pp. 4548–4557.
- [75] D.-P. Fan, C. Gong, Y. Cao, B. Ren, M.-M. Cheng, and A. Borji, "Enhanced-alignment measure for binary foreground map evaluation," in International Joint Conference on Artificial Intelligence (IJCAI), 2018, pp. 698–704.
- [76] S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, and M.-H. Yang, "Restormer: Efficient transformer for high-resolution image restoration," in Computer Vision and Pattern Recognition (CVPR), 2022, pp. 5728–5739.
- [77] L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, "DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 4, pp. 834–848, 2017.
- [78] S. Liu, D. Huang et al., "Receptive field block net for accurate and fast object detection," in European Conference on Computer Vision (ECCV), 2018, pp. 385–400.
- [79] T.-Y. Lin, P. Dollár, R. Girshick, K. He, B. Hariharan, and S. Belongie, "Feature pyramid networks for object detection," in Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2117–2125.