Focal Modulation and Bidirectional Feature Fusion Network for Medical Image Segmentation

Hamid Alinejad-Rokny; Imran Razzak; Moin Safdar; Mubeen Ghafoor; Shahzaib Iqbal; Tariq M.Khan; Thantrira Porntaveetus

arxiv: 2510.20933 · v2 · submitted 2025-10-23 · 💻 cs.CV · cs.AI


Moin Safdar, Shahzaib Iqbal, Mubeen Ghafoor, Tariq M. Khan, Imran Razzak, Thantrira Porntaveetus, Hamid Alinejad-Rokny

Pith reviewed 2026-05-18 04:10 UTC · model grok-4.3


keywords: medical image segmentation · focal modulation attention · bidirectional feature fusion · hybrid CNN-transformer · polyp segmentation · skin lesion segmentation · ultrasound imaging


The pith

FM-BFF-Net uses focal modulation attention and bidirectional feature fusion to achieve better accuracy than recent methods in medical image segmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FM-BFF-Net to overcome the limitations of standard convolutional networks in capturing global context when segmenting medical images. It combines CNNs with transformer elements and adds two modules: a focal modulation attention mechanism that captures surrounding context, and a bidirectional module that lets encoder and decoder features interact across scales. The authors claim this yields sharper boundaries and better handling of structures that vary in size and appearance. Experiments on eight public datasets, covering tasks such as polyp and skin lesion segmentation, report consistent gains over recent state-of-the-art methods in Jaccard index and Dice coefficient.

Core claim

The network combines convolutional and transformer components, employs a focal modulation attention mechanism to refine context awareness, and introduces a bidirectional feature fusion module that enables efficient interaction between encoder and decoder representations across scales. Through this design, FM-BFF-Net enhances boundary precision and robustness to variations in lesion size, shape, and contrast.

What carries the argument

Focal modulation attention mechanism that refines context awareness combined with bidirectional feature fusion module for encoder-decoder interaction across scales.
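For orientation, the general focal modulation operator from the focal modulation networks literature (Yang et al., "Focal Modulation Networks") can be written as below. This is a sketch under the assumption that the paper's attention block builds on that formulation; the symbols $q$, $h$, $f_z$, $f_a^{\ell}$, $f_g$ and the number of focal levels $L$ come from that general formulation, not from the summary above, and the paper's own block may differ.

```latex
% Output at location i: a query projection modulated by aggregated context.
y_i = q(x_i) \odot h\!\left( \sum_{\ell=1}^{L+1} g_i^{\ell} \cdot z_i^{\ell} \right)

% Hierarchical contextualization: stacked depth-wise convolutions with
% growing receptive field, plus one globally pooled level.
z^{0} = f_z(X), \qquad
z^{\ell} = f_a^{\ell}\!\left(z^{\ell-1}\right)
         = \mathrm{GeLU}\!\left(\mathrm{DWConv}\!\left(z^{\ell-1}\right)\right),
\quad \ell = 1, \dots, L, \qquad
z^{L+1} = \mathrm{AvgPool}\!\left(z^{L}\right)

% Spatially varying gates, one per focal level.
G = f_g(X), \qquad g_i^{\ell} = G_i^{\ell}
```

Here $\odot$ is element-wise multiplication: each location's query is modulated by a gated sum of context features gathered at increasing receptive fields, which is how the operator obtains wider context without pairwise self-attention.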

If this is right

The design improves boundary precision for structures with complicated borders and varied sizes.
It shows adaptability across polyp detection, skin lesion segmentation, and ultrasound imaging.
Consistent outperformance on eight public datasets supports its use in diverse clinical imaging scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The bidirectional fusion idea could be tested on video or 3D volume segmentation to track changes over time or depth.
If the modules reduce sensitivity to size and contrast variations, they might improve detection of small or rare lesions in imbalanced datasets.
Similar attention and fusion patterns might apply to non-medical tasks like satellite or industrial defect segmentation.

Load-bearing premise

The performance gains come from the focal modulation attention and bidirectional feature fusion modules rather than from differences in training protocol, model size, or dataset tuning.

What would settle it

Retraining the compared state-of-the-art methods with identical training protocol, data splits, and model capacity as FM-BFF-Net and observing no difference in Jaccard or Dice scores would falsify the claim.
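Since the falsification test above turns on Jaccard and Dice scores, a minimal sketch of both metrics for binary masks may help. It assumes masks are given as nested lists of 0/1 values; the function name `overlap_scores` and the convention of returning 1.0 for two empty masks are illustrative choices, not taken from the paper.

```python
from typing import Sequence, Tuple


def overlap_scores(
    pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]
) -> Tuple[float, float]:
    """Return (dice, jaccard) for two binary masks of identical shape."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p & t          # pixel counted only if both masks are 1
            pred_sum += p
            target_sum += t
    union = pred_sum + target_sum - inter
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0
    dice = 2 * inter / (pred_sum + target_sum)
    jaccard = inter / union
    return dice, jaccard


if __name__ == "__main__":
    pred = [[0, 1, 1], [0, 1, 0]]
    target = [[0, 1, 0], [0, 1, 1]]
    dice, jaccard = overlap_scores(pred, target)
    print(f"Dice={dice:.4f}  Jaccard={jaccard:.4f}")
```

The two metrics are monotonically related (J = D / (2 - D)), so a method that wins on one generally wins on the other for a single image; they can diverge only after averaging across images.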

Figures

Figures reproduced from arXiv: 2510.20933 by Hamid Alinejad-Rokny, Imran Razzak, Moin Safdar, Mubeen Ghafoor, Shahzaib Iqbal, Tariq M.Khan, Thantrira Porntaveetus.

**Figure 1.** Overview of the proposed M-BFF-Net architecture for medical image segmentation. The model integrates convolutional and transformer-based

**Figure 2.** (a) Architecture of the proposed Focal Modulation-based ConvFormer Attention Block (FMCAB), which combines convolutional and attention

**Figure 3.** Detailed schematic of the proposed Bidirectional Feature Fusion

**Figure 4.** Architectural schematic of the proposed Vision Transformer Module

**Figure 5.** Visual performance comparison of the proposed M-BFF-Net on Kvasir-SEG dataset.

**Figure 6.** Visual performance comparison of the proposed M-BFF-Net on CVC-Clinic dataset.

**Figure 7.** Visual performance comparison of the proposed M-BFF-Net on CVC-ColonDB dataset.

**Figure 8.** Visual performance comparison of the proposed M-BFF-Net on CVC-ColonDB dataset.

**Figure 9.** Visual performance comparison of the proposed M-BFF-Net on BUSI [82] dataset.

**Figure 10.** Visual performance comparison of the proposed M-BFF-Net on DDTI [83] dataset.

**Figure 11.** Failure cases comparison of the proposed M-BFF-Net on CVC-ColonDB dataset.

**Figure 12.** Failure cases comparison of the proposed M-BFF-Net on ISIC2017 dataset.

Original abstract

Medical image segmentation is essential for clinical applications such as disease diagnosis, treatment planning, and disease development monitoring because it provides precise morphological and spatial information on anatomical structures that directly influence treatment decisions. Convolutional neural networks significantly impact image segmentation; however, since convolution operations are local, capturing global contextual information and long-range dependencies is still challenging. Their capacity to precisely segment structures with complicated borders and a variety of sizes is impacted by this restriction. Since transformers use self-attention methods to capture global context and long-range dependencies efficiently, integrating transformer-based architecture with CNNs is a feasible approach to overcoming these challenges. To address these challenges, we propose the Focal Modulation and Bidirectional Feature Fusion Network for Medical Image Segmentation, referred to as FM-BFF-Net in the remainder of this paper. The network combines convolutional and transformer components, employs a focal modulation attention mechanism to refine context awareness, and introduces a bidirectional feature fusion module that enables efficient interaction between encoder and decoder representations across scales. Through this design, FM-BFF-Net enhances boundary precision and robustness to variations in lesion size, shape, and contrast. Extensive experiments on eight publicly available datasets, including polyp detection, skin lesion segmentation, and ultrasound imaging, show that FM-BFF-Net consistently surpasses recent state-of-the-art methods in Jaccard index and Dice coefficient, confirming its effectiveness and adaptability for diverse medical imaging scenarios.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

FM-BFF-Net is a CNN-transformer hybrid with focal modulation and bidirectional fusion that claims gains on eight medical segmentation datasets, but the improvements are not isolated from training or capacity differences.


The paper puts forward FM-BFF-Net, which mixes convolutional layers with transformer elements, adds a focal modulation attention block to sharpen context, and uses a bidirectional feature fusion module to pass information both ways between encoder and decoder stages. The central result is that this combination beats recent methods on Jaccard and Dice across polyp, skin lesion, and ultrasound datasets. That multi-dataset scope is the part worth noting first, since many segmentation papers stick to one or two benchmarks.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes FM-BFF-Net, a hybrid CNN-Transformer architecture for medical image segmentation. It introduces a focal modulation attention mechanism to refine context awareness and a bidirectional feature fusion module to enable efficient interaction between encoder and decoder representations across scales. The paper claims that this design enhances boundary precision and robustness to variations in lesion size, shape, and contrast, and reports that extensive experiments on eight publicly available datasets for polyp detection, skin lesion segmentation, and ultrasound imaging show consistent outperformance over recent state-of-the-art methods in Jaccard index and Dice coefficient.

Significance. Should the performance improvements hold under rigorous controlled experiments and be attributable to the proposed modules, the work would offer a practical advancement in hybrid architectures for medical image segmentation by better capturing global contextual information and long-range dependencies while maintaining the strengths of convolutional networks. This could have implications for improving diagnostic accuracy in clinical settings involving variable lesion characteristics.

major comments (2)

The abstract asserts consistent outperformance on eight datasets but supplies no quantitative tables, statistical tests, ablation results, or error bars; without these the central performance claim cannot be verified.
No ablation experiments are presented that isolate the focal modulation attention mechanism or the bidirectional feature fusion module (e.g., by removing each component and re-training a capacity-matched baseline under identical optimizer, scheduler, and augmentation settings). This leaves open the possibility that reported Jaccard/Dice gains arise from training-protocol differences or model capacity rather than the claimed modules.
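The ablation protocol requested in the second major comment can be sketched as a small experiment grid. This is an illustration only: `TrainConfig`, `Variant`, `ablation_grid`, and the placeholder optimizer, scheduler, and augmentation values are hypothetical names and settings, not the paper's. The point is that every variant shares one frozen training configuration and the same seeds, so only the module switches differ.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence


@dataclass(frozen=True)
class TrainConfig:
    # Hypothetical shared settings; placeholder values, not the paper's.
    optimizer: str = "adam"
    scheduler: str = "cosine"
    augmentation: str = "flip+rotate"


@dataclass(frozen=True)
class Variant:
    use_focal_modulation: bool
    use_bidirectional_fusion: bool
    seed: int
    train: TrainConfig


def ablation_grid(
    seeds: Sequence[int], train: TrainConfig = TrainConfig()
) -> List[Variant]:
    """Enumerate every on/off combination of the two modules for each seed."""
    return [
        Variant(fm, bff, seed, train)
        for fm, bff in product([True, False], repeat=2)
        for seed in seeds
    ]


if __name__ == "__main__":
    for variant in ablation_grid(seeds=[0, 1, 2]):
        print(variant)
```

The full 2×2 grid includes the both-removed baseline in addition to the one-at-a-time removals the referee asks for. Capacity matching (for example, widening the ablated variants to a similar parameter count) would still need to be handled inside the model builder, which this sketch does not cover.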

minor comments (1)

The eight datasets are referenced generically in the abstract and experiments description but are not enumerated with their names, sizes, or modalities, which would improve clarity for readers assessing generalizability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We have carefully reviewed the major comments and provide detailed point-by-point responses below, including planned revisions to address the concerns raised.


Referee: The abstract asserts consistent outperformance on eight datasets but supplies no quantitative tables, statistical tests, ablation results, or error bars; without these the central performance claim cannot be verified.

Authors: We appreciate this point. The abstract is intended as a concise summary of the key findings, while the full quantitative results (comparative tables for Dice and Jaccard scores across all eight datasets, statistical significance tests such as paired t-tests, and error bars from multiple training runs) are presented in detail in Section 4 (Experiments) and the associated tables. To improve verifiability directly from the abstract, we will revise it to incorporate specific average performance gains and reference the main results section.

Revision: partial

Referee: No ablation experiments are presented that isolate the focal modulation attention mechanism or the bidirectional feature fusion module (e.g., by removing each component and re-training a capacity-matched baseline under identical optimizer, scheduler, and augmentation settings). This leaves open the possibility that reported Jaccard/Dice gains arise from training-protocol differences or model capacity rather than the claimed modules.

Authors: We agree that ablation studies are essential to isolate the contribution of each proposed component. We will add a dedicated ablation section that systematically removes the focal modulation attention mechanism and the bidirectional feature fusion module one at a time. Each variant will be compared against capacity-matched baselines trained under identical conditions (same optimizer, learning rate scheduler, data augmentations, and random seeds) to ensure fair attribution of performance improvements.

Revision: yes
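The rebuttal mentions paired t-tests over multiple training runs. As a rough sketch of what such a test computes, the function below returns the paired t statistic and its degrees of freedom for two lists of per-run (or per-image) scores. The name `paired_t_statistic` and the example numbers are illustrative only and are not results from the paper.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t_statistic(
    scores_a: Sequence[float], scores_b: Sequence[float]
) -> Tuple[float, int]:
    """Return (t, degrees_of_freedom) for paired samples a[i] vs b[i]."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("differences have zero variance; t is undefined")
    n = len(diffs)
    t = mean(diffs) / (sd / math.sqrt(n))
    return t, n - 1


if __name__ == "__main__":
    # Made-up Dice scores for two methods evaluated on the same four seeds.
    method_a = [0.90, 0.91, 0.89, 0.92]
    method_b = [0.88, 0.90, 0.88, 0.89]
    t, dof = paired_t_statistic(method_a, method_b)
    print(f"t={t:.3f}, dof={dof}")
```

Turning the statistic into a p-value requires the Student t distribution, which the Python standard library does not provide; `scipy.stats.ttest_rel` returns both values directly if SciPy is available.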

Circularity Check

0 steps flagged

Empirical architecture proposal with no circular derivation chain


The paper proposes FM-BFF-Net as a hybrid CNN-transformer architecture incorporating focal modulation attention and bidirectional feature fusion modules. Its central claims rest on comparative experiments across eight public datasets rather than any mathematical derivation, prediction, or first-principles result. No equations are shown that reduce outputs to fitted inputs by construction, no self-citations are invoked as load-bearing uniqueness theorems, and no ansatz or renaming of known results is presented as a derivation. The work is therefore self-contained against external benchmarks and receives a score of 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The claim rests on the untested premise that the two new modules drive the measured gains and on standard assumptions about supervised training of segmentation networks.

axioms (1)

Domain assumption: Convolution operations are local and therefore insufficient for global context and long-range dependencies in medical images.
Explicitly stated in the abstract as the core limitation motivating the hybrid design.

invented entities (2)

Focal modulation attention mechanism (no independent evidence)
Purpose: Refine context awareness inside the hybrid encoder-decoder.
Introduced as a core component of FM-BFF-Net; no independent evidence outside the paper is supplied.

Bidirectional feature fusion module (no independent evidence)
Purpose: Enable efficient interaction between encoder and decoder representations across scales.
Introduced as a core component of FM-BFF-Net; no independent evidence outside the paper is supplied.


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: The network combines convolutional and transformer components, employs a focal modulation attention mechanism to refine context awareness, and introduces a bidirectional feature fusion module that enables efficient interaction between encoder and decoder representations across scales.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: Extensive experiments on eight publicly available datasets... show that FM-BFF-Net consistently surpasses recent state-of-the-art methods in Jaccard index and Dice coefficient.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

83 extracted references · 83 canonical work pages · 2 internal anchors

[1] T. A. Soomro, M. A. Khan, J. Gao, T. M. Khan, M. Paul, and N. Mir, “Automatic retinal vessel extraction algorithm,” in 2016 International Conference on Digital Image Computing: Techniques and Applications (DICTA). IEEE, 2016, pp. 1–8.

[2] S. S. Naqvi, N. Fatima, T. M. Khan, Z. U. Rehman, and M. A. Khan, “Automatic optic disk detection and segmentation by variational active contour estimation in retinal fundus images,” Signal, Image and Video Processing, vol. 13, no. 6, pp. 1191–1198, 2019.

[3] T. M. Khan, F. Abdullah, S. S. Naqvi, M. Arsalan, and M. A. Khan, “Shallow vessel segmentation network for automatic retinal vessel segmentation,” in 2020 International Joint Conference on Neural Networks (IJCNN). IEEE, 2020, pp. 1–7.

[4] T. M. Khan, A. Robles-Kelly, and S. S. Naqvi, “A semantically flexible feature fusion network for retinal vessel segmentation,” in International Conference on Neural Information Processing. Springer, Cham, 2020, pp. 159–167.

[5] F. Abdullah, R. Imtiaz, H. A. Madni, H. A. Khan, T. M. Khan, M. A. Khan, and S. S. Naqvi, “A review on glaucoma disease detection using computerized techniques,” IEEE Access, vol. 9, pp. 37 311–37 333, 2021.

[6] T. M. Khan, A. Robles-Kelly, S. S. Naqvi, and A. Muhammad, “Residual multiscale full convolutional network (rm-fcn) for high resolution semantic segmentation of retinal vasculature,” in Structural, Syntactic, and Statistical Pattern Recognition: Joint IAPR International Workshops, S+SSPR 2020, Padua, Italy, January 21–22, 2021, Proceedings. Springer Nat...

[7] T. M. Khan, A. Robles-Kelly, and S. S. Naqvi, “Rc-net: A convolutional neural network for retinal vessel segmentation,” in 2021 Digital Image Computing: Techniques and Applications (DICTA). IEEE, 2021, pp. 01–07.

[8] S. Iqbal, S. Naqvi, H. Ahmed, A. Saadat, and T. M. Khan, “G-net light: A lightweight modified google net for retinal vessel segmentation,” in Photonics, vol. 9, no. 12. MDPI, 2022, pp. 923–936.

[9] M. Arsalan, T. M. Khan, S. S. Naqvi, M. Nawaz, and I. Razzak, “Prompt deep light-weight vessel segmentation network (plvs-net),” IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol. 20, no. 2, pp. 1363–1371, 2022.

[10] S. Iqbal, T. M. Khan, K. Naveed, S. S. Naqvi, and S. J. Nawaz, “Recent trends and advances in fundus image analysis: A review,” Computers in Biology and Medicine, vol. 151, p. 106277, 2022.

[11] A. Qayyum, M. Mazher, T. Khan, and I. Razzak, “Semi-supervised 3d-inceptionnet for segmentation and survival prediction of head and neck primary cancers,” Engineering Applications of Artificial Intelligence, vol. 117, p. 105590, 2023.

[12] T. M. Khan, M. Arsalan, I. Razzak, and E. Meijering, “Simple and robust depth-wise cascaded network for polyp segmentation,” Engineering Applications of Artificial Intelligence, vol. 121, p. 106023, 2023.

[13] T. M. Khan, S. S. Naqvi, A. Robles-Kelly, and I. Razzak, “Retinal vessel segmentation via a multi-resolution contextual network and adversarial learning,” Neural Networks, vol. 165, pp. 310–320, 2023.

[14] S. Iqbal, K. Naveed, S. S. Naqvi, A. Naveed, and T. M. Khan, “Robust retinal blood vessel segmentation using a patch-based statistical adaptive multi-scale line detector,” Digital Signal Processing, vol. 139, p. 104075, 2023.

[15] S. Iqbal, T. M. Khan, S. S. Naqvi, and G. Holmes, “Mlr-net: A multi-layer residual convolutional neural network for leather defect segmentation,” Engineering Applications of Artificial Intelligence, vol. 126, p. 107007, 2023.

[16] S. Iqbal, A. N. Qureshi, M. Alhussein, I. A. Choudhry, K. Aurangzeb, and T. M. Khan, “Fusion of textural and visual information for medical image modality retrieval using deep learning-based feature engineering,” IEEE Access, vol. 11, pp. 93 238–93 253, 2023.

[17] T. M. Khan, M. Arsalan, S. Iqbal, I. Razzak, and E. Meijering, “Feature enhancer segmentation network (fes-net) for vessel segmentation,” in 2023 International Conference on Digital Image Computing: Techniques and Applications (DICTA). IEEE, 2023, pp. 160–167.

[18] A. Naveed, S. S. Naqvi, T. M. Khan, and I. Razzak, “Pca: Progressive class-wise attention for skin lesions diagnosis,” Engineering Applications of Artificial Intelligence, vol. 127, p. 107417, 2024.

[19] S. Iqbal, T. M. Khan, S. S. Naqvi, A. Naveed, M. Usman, H. A. Khan, and I. Razzak, “Ldmres-net: A lightweight neural network for efficient medical image segmentation on iot and edge devices,” IEEE Journal of Biomedical and Health Informatics, 2023.

[20] M. Mazher, I. Razzak, A. Qayyum, M. Tanveer, S. Beier, T. Khan, and S. A. Niederer, “Self-supervised spatial–temporal transformer fusion based federated framework for 4d cardiovascular image segmentation,” Information Fusion, vol. 106, p. 102256, 2024.

[21] A. Naveed, S. S. Naqvi, S. Iqbal, I. Razzak, H. A. Khan, and T. M. Khan, “Ra-net: Region-aware attention network for skin lesion segmentation,” Cognitive Computation, vol. 16, no. 5, pp. 2279–2296, 2024.

[22] S. Javed, T. M. Khan, A. Qayyum, H. Alinejad-Rokny, A. Sowmya, and I. Razzak, “Advancing medical image segmentation with mini-net: A lightweight solution tailored for efficient segmentation of medical images,” arXiv preprint arXiv:2405.17520, 2024.

[23] M. Matloob Abbasi, S. Iqbal, K. Aurangzeb, M. Alhussein, and T. M. Khan, “Lmbis-net: A lightweight bidirectional skip connection based multipath cnn for retinal blood vessel segmentation,” Scientific Reports, vol. 14, no. 1, p. 15219, 2024.

[24] T. M. Khan, S. Iqbal, S. S. Naqvi, I. Razzak, and E. Meijering, “Lmbf-net: A lightweight multipath bidirectional focal attention network for multifeatures segmentation,” in 2024 IEEE International Conference on Image Processing (ICIP). IEEE, 2024, pp. 2807–2813.

[25] S. Javed, T. M. Khan, A. Qayyum, A. Sowmya, and I. Razzak, “Region guided attention network for retinal vessel segmentation,” arXiv preprint arXiv:2407.18970, 2024.

[26] S. Iqbal, M. Zeeshan, M. Mehmood, T. Khan, and I. Razzak, “Tesl-net: a transformer-enhanced cnn for accurate skin lesion segmentation,” in 2024 International Conference on Digital Image Computing: Techniques and Applications (DICTA). IEEE, 2024, pp. 313–320.

[27] S. Iqbal, H. Ahmed, M. Sharif, M. Hena, T. M. Khan, and I. Razzak, “Euis-net: A convolutional neural network for efficient ultrasound image segmentation,” in International Conference on Neural Information Processing. Springer Nature Singapore, 2024, pp. 388–401.

[28] S. Iqbal, T. M. Khan, S. S. Naqvi, A. Naveed, and E. Meijering, “Tbconvl-net: A hybrid deep learning architecture for robust medical image segmentation,” Pattern Recognition, vol. 158, p. 111028, 2025.

[29] A. Naveed, S. S. Naqvi, T. M. Khan, S. Iqbal, M. Y. Wani, and H. A. Khan, “Ad-net: Attention-based dilated convolutional residual network with guided decoder for robust skin lesion segmentation,” Neural Computing and Applications, vol. 36, no. 35, pp. 22 277–22 299, 2024.

[30] H. Farooq, Z. Zafar, A. Saadat, T. M. Khan, S. Iqbal, and I. Razzak, “Lssf-net: Lightweight segmentation with self-awareness, spatial attention, and focal modulation,” Artificial Intelligence in Medicine, vol. 158, 2024.

[31] T. M. Khan, S. S. Naqvi, and E. Meijering, “Esdmr-net: A lightweight network with expand-squeeze and dual multiscale residual connections for medical image segmentation,” Engineering Applications of Artificial Intelligence, vol. 133, p. 107995, 2024.

[32] M. Mehmood, S. Iqbal, T. M. Khan, I. Spence, and M. Fahim, “Lvs-net: A lightweight vessels segmentation network for retinal image analysis,” arXiv preprint arXiv:2412.05968, 2024.

[33] Y. Xu, T. M. Khan, Y. Song, and E. Meijering, “Edge deep learning in computer vision and medical diagnostics: a comprehensive survey,” Artificial Intelligence Review, vol. 58, no. 3, p. 93, 2025.

[34] A. Naveed, S. S. Naqvi, T. M. Khan, Z. H. Janjua, S. A. M. Kirmani, and B. Qasim, “Fm-net: Focal modulation-based network for accurate skin lesion segmentation,” 2025.

[35] T. M. Khan, T. A. Soomro, and I. Razzak, “The role of ai in early detection of life-threatening diseases: A retinal imaging perspective,” arXiv preprint arXiv:2505.20810, 2025.

[36] M. Mehmood, S. Iqbal, T. M. Khan, I. Spence, and M. Fahim, “Lfra-net: A lightweight focal and region-aware attention network for retinal vessel segmentation,” arXiv preprint arXiv:2509.11811, 2025.

[37] Y. Xu, T. M. Khan, Y. Zhu, Y. Song, and E. Meijering, “Entropy-driven adaptive neural architecture search for cell segmentation on edge devices,” Available at SSRN 5490340, 2025.

[38] T. M. Khan, D. Lin, S. Iqbal, and E. Meijering, “A novel approach to skin lesion segmentation using transformer attention and focal modulation,” Engineering Applications of Artificial Intelligence, vol. 162, p. 112603, 2025.

[39] N. Tajbakhsh, L. Jeyaseelan, Q. Li, J. N. Chiang, Z. Wu, and X. Ding, “Embracing imperfect datasets: A review of deep learning solutions for medical image segmentation,” Medical Image Analysis, vol. 63, p. 101693, 2020.

[40] G. Doolub, M. Mamalakis, S. Alabed, R. J. Van der Geest, A. J. Swift, J. C. Rodrigues, P. Garg, N. V. Joshi, and A. Dastidar, “Artificial intelligence as a diagnostic tool in non-invasive imaging in the assessment of coronary artery disease,” Medical Sciences, vol. 11, no. 1, p. 20, 2023.

[41]

A survey on instance segmentation: state of the art,

A. M. Hafiz and G. M. Bhat, “A survey on instance segmentation: state of the art,”International journal of multimedia information retrieval, vol. 9, no. 3, pp. 171–189, 2020

work page 2020
[42]

Improving the accuracy of lane detection by enhancing the long-range dependence,

B. Liu, L. Feng, Q. Zhao, G. Li, and Y . Chen, “Improving the accuracy of lane detection by enhancing the long-range dependence,”Electronics, vol. 12, no. 11, p. 2518, 2023

work page 2023
[43]

Multi-scale image recognition strategy based on convolutional neural network,

H. Zhang, S. Diao, Y . Yang, J. Zhong, and Y . Yan, “Multi-scale image recognition strategy based on convolutional neural network,”Journal of Computing and Electronic Information Management, vol. 12, no. 3, pp. 107–113, 2024

work page 2024
[44]

Fbsm: Foveabox- based boundary-aware segmentation method for green apples in natural orchards,

W. Jia, Z. Wang, R. Zhao, Z. Ji, X. Yin, and G. Liu, “Fbsm: Foveabox- based boundary-aware segmentation method for green apples in natural orchards,”Expert Systems with Applications, vol. 260, p. 125426, 2025

work page 2025
[45]

Retinalitenet: A lightweight transformer based cnn for retinal feature segmentation,

M. Mehmood, M. Alsharari, S. Iqbal, I. Spence, and M. Fahim, “Retinalitenet: A lightweight transformer based cnn for retinal feature segmentation,” inProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2024, pp. 2454–2463

work page 2024
[46]

T. D. Manda and J. Herstad, “Implementing mobile phone solutions for health in resource constrained areas: Understanding the opportunities and challenges,” in E-Infrastructures and E-Services on Developing Countries: First International ICST Conference, AFRICOM 2009, Maputo, Mozambique, December 3-4, 2009. Proceedings 1. Springer, 2010, pp. 95–104
[47]

M. E. Rayed, S. S. Islam, S. I. Niha, J. R. Jim, M. M. Kabir, and M. Mridha, “Deep learning for medical image segmentation: State-of-the-art advancements and challenges,” Informatics in Medicine Unlocked, p. 101504, 2024
[48]

O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234–241
[49]

X. Li, H. Chen, X. Qi, Q. Dou, C.-W. Fu, and P.-A. Heng, “H-DenseUNet: Hybrid densely connected UNet for liver and tumor segmentation from CT volumes,” IEEE Transactions on Medical Imaging, vol. 37, no. 12, pp. 2663–2674, 2018
[50]

Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang, “UNet++: Redesigning skip connections to exploit multiscale features in image segmentation,” IEEE Transactions on Medical Imaging, vol. 39, no. 6, pp. 1856–1867, 2019
[51]

H. Huang, L. Lin, R. Tong, H. Hu, Q. Zhang, Y. Iwamoto, X. Han, Y.-W. Chen, and J. Wu, “UNet3+: A full-scale connected UNet for medical image segmentation,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 1055–1059
[52]

F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein, “nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation,” Nature Methods, vol. 18, no. 2, pp. 203–211, 2021
[53]

O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y. Hammerla, B. Kainz et al., “Attention U-Net: Learning where to look for the pancreas,” arXiv:1804.03999, 2018
[54]

H. Fu, J. Cheng, Y. Xu, D. W. K. Wong, J. Liu, and X. Cao, “Joint optic disc and cup segmentation based on multi-label deep network and polar transformation,” IEEE Transactions on Medical Imaging, vol. 37, no. 7, pp. 1597–1605, 2018
[55]

D.-P. Fan, T. Zhou, G.-P. Ji, Y. Zhou, G. Chen, H. Fu, J. Shen, and L. Shao, “Inf-Net: Automatic COVID-19 lung infection segmentation from CT images,” IEEE Transactions on Medical Imaging, vol. 39, no. 8, pp. 2626–2637, 2020
[56]

S. Zheng, J. Lu, H. Zhao, X. Zhu, Z. Luo, Y. Wang, Y. Fu, J. Feng, T. Xiang, P. H. S. Torr, and L. Zhang, “Rethinking semantic segmentation from a sequence-to-sequence perspective with transformers,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 6881–6890
[57]

Q. Zhang and Y.-B. Yang, “ResT: An efficient transformer for visual recognition,” Advances in Neural Information Processing Systems (NeurIPS), pp. 15 475–15 485, 2021
[58]

W. Wang, L. Yao, L. Chen, B. Lin, D. Cai, X. He, and W. Liu, “CrossFormer: A versatile vision transformer hinging on cross-scale attention,” arXiv:2108.00154, 2021
[59]

Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical vision transformer using shifted windows,” in IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 10 012–10 022
[60]

R. Strudel, R. Garcia, I. Laptev, and C. Schmid, “Segmenter: Transformer for semantic segmentation,” in IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 7262–7272
[61]

S. He, H. Luo, P. Wang, F. Wang, H. Li, and W. Jiang, “TransReID: Transformer-based object re-identification,” in IEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 15 013–15 022
[62]

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems (NeurIPS), 2017
[63]

H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” in International Conference on Machine Learning (ICML), 2021, pp. 10 347–10 357
[64]

Z. Dai, H. Liu, Q. V. Le, and M. Tan, “CoAtNet: Marrying convolution and attention for all data sizes,” Advances in Neural Information Processing Systems (NeurIPS), pp. 3965–3977, 2021
[65]

A. Srinivas, T.-Y. Lin, N. Parmar, J. Shlens, P. Abbeel, and A. Vaswani, “Bottleneck transformers for visual recognition,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 16 519–16 529
[66]

A. He, K. Wang, T. Li, C. Du, S. Xia, and H. Fu, “H2Former: An efficient hierarchical hybrid transformer for medical image segmentation,” IEEE Transactions on Medical Imaging, vol. 42, no. 9, pp. 2763–2775, 2023
[67]

Y. Zhang, H. Liu, and Q. Hu, “TransFuse: Fusing transformers and CNNs for medical image segmentation,” in International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 14–24
[68]

Y. Xie, J. Zhang, C. Shen, and Y. Xia, “CoTr: Efficiently bridging CNN and Transformer for 3D medical image segmentation,” arXiv:2103.03024, 2021
[69]

X. Yan, H. Tang, S. Sun, H. Ma, D. Kong, and X. Xie, “After-Unet: Axial fusion transformer U-Net for medical image segmentation,” in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 3971–3981
[70]

H.-Y. Zhou, J. Guo, Y. Zhang, L. Yu, L. Wang, and Y. Yu, “nnFormer: Interleaved transformer for volumetric segmentation,” arXiv:2109.03201, 2021
[71]

J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, “TransUNet: Transformers make strong encoders for medical image segmentation,” arXiv:2102.04306, 2021
[72]

H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, “Swin-Unet: Unet-like pure transformer for medical image segmentation,” in European Conference on Computer Vision (ECCV) Workshops, 2023, pp. 205–218
[73]

Y. Gao, M. Zhou, and D. N. Metaxas, “UTNet: A hybrid transformer architecture for medical image segmentation,” in International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 61–71
[74]

Q.-H. Trinh, “Meta-Polyp: A baseline for efficient polyp segmentation,” in IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), 2023, pp. 742–747
[75]

D. Maji, P. Sigedar, and M. Singh, “Attention Res-UNet with Guided Decoder for semantic segmentation of brain tumors,” Biomedical Signal Processing and Control, vol. 71, p. 103077, 2022
[76]

R. Azad, M. Asadi-Aghbolaghi, M. Fathy, and S. Escalera, “Bi-directional ConvLSTM U-Net with densley connected convolutions,” in IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019
[77]

R.-G. Dumitru, D. Peteleaza, and C. Craciun, “Using DUCK-Net for polyp image segmentation,” Scientific Reports, vol. 13, no. 1, p. 9803, 2023
[78]

S. Iqbal, T. M. Khan, S. S. Naqvi, A. Naveed, and E. Meijering, “TBConvL-Net: A hybrid deep learning architecture for robust medical image segmentation,” Pattern Recognition, p. 111028, 2024
[79]

Z. Zhou, M. M. Rahman Siddiquee, N. Tajbakhsh, and J. Liang, “Unet++: A nested U-Net architecture for medical image segmentation,” in Deep Learning in Medical Image Analysis (DLMIA) & Multimodal Learning for Clinical Decision Support (ML-CDS) Held in Conjunction with MICCAI, 2018, pp. 3–11
[80]

H. Wu, S. Chen, G. Chen, W. Wang, B. Lei, and Z. Wen, “FAT-Net: Feature adaptive transformers for automated skin lesion segmentation,” Medical Image Analysis, vol. 76, p. 102327, 2022

[41] [41]

A survey on instance segmentation: state of the art,

A. M. Hafiz and G. M. Bhat, “A survey on instance segmentation: state of the art,”International journal of multimedia information retrieval, vol. 9, no. 3, pp. 171–189, 2020

work page 2020

[42] [42]

Improving the accuracy of lane detection by enhancing the long-range dependence,

B. Liu, L. Feng, Q. Zhao, G. Li, and Y . Chen, “Improving the accuracy of lane detection by enhancing the long-range dependence,”Electronics, vol. 12, no. 11, p. 2518, 2023

work page 2023

[43] [43]

Multi-scale image recognition strategy based on convolutional neural network,

H. Zhang, S. Diao, Y . Yang, J. Zhong, and Y . Yan, “Multi-scale image recognition strategy based on convolutional neural network,”Journal of Computing and Electronic Information Management, vol. 12, no. 3, pp. 107–113, 2024

work page 2024

[44] [44]

Fbsm: Foveabox- based boundary-aware segmentation method for green apples in natural orchards,

W. Jia, Z. Wang, R. Zhao, Z. Ji, X. Yin, and G. Liu, “Fbsm: Foveabox- based boundary-aware segmentation method for green apples in natural orchards,”Expert Systems with Applications, vol. 260, p. 125426, 2025

work page 2025

[45] [45]

Retinalitenet: A lightweight transformer based cnn for retinal feature segmentation,

M. Mehmood, M. Alsharari, S. Iqbal, I. Spence, and M. Fahim, “Retinalitenet: A lightweight transformer based cnn for retinal feature segmentation,” inProceedings of the IEEE/CVF Conference on Com- puter Vision and Pattern Recognition, 2024, pp. 2454–2463

work page 2024

[46] [46]

Implementing mobile phone solutions for health in resource constrained areas: Understanding the opportunities and challenges,

T. D. Manda and J. Herstad, “Implementing mobile phone solutions for health in resource constrained areas: Understanding the opportunities and challenges,” inE-Infrastructures and E-Services on Developing Countries: First International ICST Conference, AFRICOM 2009, Ma- puto, Mozambique, December 3-4, 2009. Proceedings 1. Springer, 2010, pp. 95–104

work page 2009

[47] [47]

Deep learning for medical image segmentation: State- of-the-art advancements and challenges,

M. E. Rayed, S. S. Islam, S. I. Niha, J. R. Jim, M. M. Kabir, and M. Mridha, “Deep learning for medical image segmentation: State- of-the-art advancements and challenges,”Informatics in Medicine Un- locked, p. 101504, 2024

work page 2024

[48] [48]

U-Net: Convolutional net- works for biomedical image segmentation,

O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional net- works for biomedical image segmentation,” inInternational Confer- ence on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234–241

work page 2015

[49] [49]

H- DenseUNet: Hybrid densely connected UNet for liver and tumor seg- mentation from CT volumes,

X. Li, H. Chen, X. Qi, Q. Dou, C.-W. Fu, and P.-A. Heng, “H- DenseUNet: Hybrid densely connected UNet for liver and tumor seg- mentation from CT volumes,”IEEE Transactions on Medical Imaging, vol. 37, no. 12, pp. 2663–2674, 2018

work page 2018

[50] [50]

UNet++: Redesigning skip connections to exploit multiscale features in image 13 segmentation,

Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang, “UNet++: Redesigning skip connections to exploit multiscale features in image 13 segmentation,”IEEE Transactions on Medical Imaging, vol. 39, no. 6, pp. 1856–1867, 2019

work page 2019

[51] [51]

UNet3+: A full-scale connected UNet for medical image segmentation,

H. Huang, L. Lin, R. Tong, H. Hu, Q. Zhang, Y . Iwamoto, X. Han, Y .-W. Chen, and J. Wu, “UNet3+: A full-scale connected UNet for medical image segmentation,” inIEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020, pp. 1055–1059

work page 2020

[52] [52]

nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation,

F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier- Hein, “nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation,”Nature Methods, vol. 18, no. 2, pp. 203–211, 2021

work page 2021

[53] [53]

Attention U-Net: Learning Where to Look for the Pancreas

O. Oktay, J. Schlemper, L. L. Folgoc, M. Lee, M. Heinrich, K. Misawa, K. Mori, S. McDonagh, N. Y . Hammerla, B. Kainzet al., “Attention U-Net: Learning where to look for the pancreas,”arXiv:1804.03999, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[54] [54]

Joint optic disc and cup segmentation based on multi-label deep network and polar transformation,

H. Fu, J. Cheng, Y . Xu, D. W. K. Wong, J. Liu, and X. Cao, “Joint optic disc and cup segmentation based on multi-label deep network and polar transformation,”IEEE Transactions on Medical Imaging, vol. 37, no. 7, pp. 1597–1605, 2018

work page 2018

[55] [55]

Inf-Net: Automatic COVID-19 lung infection segmentation from CT images,

D.-P. Fan, T. Zhou, G.-P. Ji, Y . Zhou, G. Chen, H. Fu, J. Shen, and L. Shao, “Inf-Net: Automatic COVID-19 lung infection segmentation from CT images,”IEEE Transactions on Medical Imaging, vol. 39, no. 8, pp. 2626–2637, 2020

work page 2020

[56] [56]

Rethinking semantic segmen- tation from a sequence-to-sequence perspective with transformers,

S. Zheng, J. Lu, H. Zhao, X. Zhu, Z. Luo, Y . Wang, Y . Fu, J. Feng, T. Xiang, P. H. S. Torr, and L. Zhang, “Rethinking semantic segmen- tation from a sequence-to-sequence perspective with transformers,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 6881–6890

work page 2021

[57] [57]

ResT: An efficient transformer for vi- sual recognition,

Q. Zhang and Y .-B. Yang, “ResT: An efficient transformer for vi- sual recognition,”Advances in Neural Information Processing Systems (NeurIPS), pp. 15 475–15 485, 2021

work page 2021

[58] [58]

CrossFormer: A versatile vision transformer hinging on cross-scale attention,

W. Wang, L. Yao, L. Chen, B. Lin, D. Cai, X. He, and W. Liu, “CrossFormer: A versatile vision transformer hinging on cross-scale attention,”arXiv:2108.00154, 2021

work page arXiv 2021

[59] [59]

Swin Transformer: Hierarchical vision transformer using shifted win- dows,

Z. Liu, Y . Lin, Y . Cao, H. Hu, Y . Wei, Z. Zhang, S. Lin, and B. Guo, “Swin Transformer: Hierarchical vision transformer using shifted win- dows,” inIEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 10 012–10 022

work page 2021

[60] [60]

Segmenter: Trans- former for semantic segmentation,

R. Strudel, R. Garcia, I. Laptev, and C. Schmid, “Segmenter: Trans- former for semantic segmentation,” inIEEE/CVF International Confer- ence on Computer Vision (ICCV), 2021, pp. 7262–7272

work page 2021

[61] [61]

TransReID: Transformer-based object re-identification,

S. He, H. Luo, P. Wang, F. Wang, H. Li, and W. Jiang, “TransReID: Transformer-based object re-identification,” inIEEE/CVF International Conference on Computer Vision (ICCV), 2021, pp. 15 013–15 022

work page 2021

[62] [62]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” inAdvances in Neural Information Processing Systems (NeurIPS), 2017

work page 2017

[63] H. Touvron, M. Cord, M. Douze, F. Massa, A. Sablayrolles, and H. Jégou, “Training data-efficient image transformers & distillation through attention,” in International Conference on Machine Learning (ICML), 2021, pp. 10 347–10 357

[64] Z. Dai, H. Liu, Q. V. Le, and M. Tan, “CoAtNet: Marrying convolution and attention for all data sizes,” Advances in Neural Information Processing Systems (NeurIPS), pp. 3965–3977, 2021

[65] A. Srinivas, T.-Y. Lin, N. Parmar, J. Shlens, P. Abbeel, and A. Vaswani, “Bottleneck transformers for visual recognition,” in IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021, pp. 16 519–16 529

[66] A. He, K. Wang, T. Li, C. Du, S. Xia, and H. Fu, “H2Former: An efficient hierarchical hybrid transformer for medical image segmentation,” IEEE Transactions on Medical Imaging, vol. 42, no. 9, pp. 2763–2775, 2023

[67] Y. Zhang, H. Liu, and Q. Hu, “TransFuse: Fusing transformers and CNNs for medical image segmentation,” in International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 14–24

[68] Y. Xie, J. Zhang, C. Shen, and Y. Xia, “CoTr: Efficiently bridging CNN and Transformer for 3D medical image segmentation,” arXiv:2103.03024, 2021

[69] X. Yan, H. Tang, S. Sun, H. Ma, D. Kong, and X. Xie, “After-Unet: Axial fusion transformer U-Net for medical image segmentation,” in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 3971–3981

[70] H.-Y. Zhou, J. Guo, Y. Zhang, L. Yu, L. Wang, and Y. Yu, “nnFormer: Interleaved transformer for volumetric segmentation,” arXiv:2109.03201, 2021

[71] J. Chen, Y. Lu, Q. Yu, X. Luo, E. Adeli, Y. Wang, L. Lu, A. L. Yuille, and Y. Zhou, “TransUNet: Transformers make strong encoders for medical image segmentation,” arXiv:2102.04306, 2021

[72] H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, “Swin-Unet: Unet-like pure transformer for medical image segmentation,” in European Conference on Computer Vision (ECCV) Workshops, 2023, pp. 205–218

[73] Y. Gao, M. Zhou, and D. N. Metaxas, “UTNet: A hybrid transformer architecture for medical image segmentation,” in International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021, pp. 61–71

[74] Q.-H. Trinh, “Meta-Polyp: A baseline for efficient polyp segmentation,” in IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), 2023, pp. 742–747

[75] D. Maji, P. Sigedar, and M. Singh, “Attention Res-UNet with Guided Decoder for semantic segmentation of brain tumors,” Biomedical Signal Processing and Control, vol. 71, p. 103077, 2022

[76] R. Azad, M. Asadi-Aghbolaghi, M. Fathy, and S. Escalera, “Bi-directional ConvLSTM U-Net with densley connected convolutions,” in IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2019

[77] R.-G. Dumitru, D. Peteleaza, and C. Craciun, “Using DUCK-Net for polyp image segmentation,” Scientific Reports, vol. 13, no. 1, p. 9803, 2023

[78] S. Iqbal, T. M. Khan, S. S. Naqvi, A. Naveed, and E. Meijering, “TBConvL-Net: A hybrid deep learning architecture for robust medical image segmentation,” Pattern Recognition, p. 111028, 2024

[79] Z. Zhou, M. M. Rahman Siddiquee, N. Tajbakhsh, and J. Liang, “Unet++: A nested U-Net architecture for medical image segmentation,” in Deep Learning in Medical Image Analysis (DLMIA) & Multimodal Learning for Clinical Decision Support (ML-CDS) Held in Conjunction with MICCAI, 2018, pp. 3–11

[80] H. Wu, S. Chen, G. Chen, W. Wang, B. Lei, and Z. Wen, “FAT-Net: Feature adaptive transformers for automated skin lesion segmentation,” Medical Image Analysis, vol. 76, p. 102327, 2022