TopoMamba: Topology-Aware Scanning and Fusion for Segmenting Heterogeneous Medical Visual Media
Pith reviewed 2026-05-07 16:42 UTC · model grok-4.3
The pith
TopoMamba adds diagonal and anti-diagonal scans to state-space models and fuses them with a dependence-aware gate to improve segmentation of curved structures in medical images.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TopoMamba augments state-space models with a TopoA-Scan branch that traverses diagonal and anti-diagonal paths to capture complementary structural priors. It merges the resulting features with the standard Cross-Scan branch through an HSIC Gate, which uses a scalar derived from the Hilbert-Schmidt independence criterion (HSIC) to control branch interaction, and it employs ScanCache to amortize scan-index construction across recurring input resolutions. Experiments on Synapse CT, ISIC 2017 dermoscopy, and CVC-ClinicDB endoscopy show consistent gains over CNN, Transformer, and baseline SSM methods, with the largest benefits on thin or curved anatomical targets. A 3D instantiation supports practical volumetric segmentation.
What carries the argument
The TopoA-Scan branch (diagonal and anti-diagonal ordering), paired with the Cross-Scan branch and regulated by the HSIC Gate. The pairing supplies complementary priors for oblique structures, and the gate limits redundant fusion.
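To make the scan ordering concrete, here is a minimal Python sketch of a zigzag diagonal traversal and its mirrored counterpart on a small grid. It assumes that "diagonal" means lines of constant row + column with alternating direction (as in the JPEG zigzag order the paper cites) and that "anti-diagonal" is the column-mirrored version. The function names are hypothetical and the paper's actual TopoA-Scan may differ in detail.

```python
from typing import List, Tuple

Coord = Tuple[int, int]


def diagonal_scan(h: int, w: int) -> List[Coord]:
    """Zigzag over lines of constant row + col (assumed 'diagonal' order)."""
    order: List[Coord] = []
    for s in range(h + w - 1):
        rows = range(max(0, s - w + 1), min(s, h - 1) + 1)
        segment = [(i, s - i) for i in rows]
        if s % 2 == 0:
            segment.reverse()  # alternate direction on every other segment
        order.extend(segment)
    return order


def anti_diagonal_scan(h: int, w: int) -> List[Coord]:
    """Column-mirrored counterpart (assumed 'anti-diagonal' order)."""
    return [(i, w - 1 - j) for i, j in diagonal_scan(h, w)]


def to_flat_indices(order: List[Coord], w: int) -> List[int]:
    """Convert (row, col) pairs to row-major flat indices."""
    return [i * w + j for i, j in order]


diag = to_flat_indices(diagonal_scan(3, 3), 3)
anti = to_flat_indices(anti_diagonal_scan(3, 3), 3)
print(diag)  # [0, 1, 3, 6, 4, 2, 5, 7, 8]
print(anti)  # [2, 1, 5, 8, 4, 0, 3, 7, 6]
```

Each order is a permutation of the grid positions, so it can be used to gather a flattened feature map into a sequence and to scatter the processed sequence back.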
If this is right
- Segmentation accuracy rises on thin or curved anatomical structures across CT, dermoscopy, and endoscopy.
- The method retains favorable runtime and memory use compared with Transformer and standard SSM baselines under variable input resolutions.
- A single 3D instantiation extends the same scan-and-gate design to volumetric clinical data.
- The caching mechanism reduces repeated computation when input sizes recur in clinical workflows.
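The caching idea in the last point can be sketched as a lookup table keyed by resolution and device. This is an illustration only: the class name `ScanIndexCache`, the string device label, and the placeholder column-major index builder are assumptions, not the paper's ScanCache implementation.

```python
from typing import Dict, List, Tuple


class ScanIndexCache:
    """Hypothetical cache of scan indices keyed by (height, width, device)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[int, int, str], List[int]] = {}
        self.builds = 0  # counts how often indices were actually constructed

    def get(self, h: int, w: int, device: str = "cpu") -> List[int]:
        key = (h, w, device)
        if key not in self._store:
            self._store[key] = self._build(h, w)
            self.builds += 1
        return self._store[key]

    @staticmethod
    def _build(h: int, w: int) -> List[int]:
        # Placeholder scan order (column-major); a real builder would
        # produce the diagonal or anti-diagonal permutation instead.
        return [i * w + j for j in range(w) for i in range(h)]


cache = ScanIndexCache()
first = cache.get(4, 4)
second = cache.get(4, 4)   # same resolution: served from the cache
other = cache.get(8, 8)    # new resolution: built once
print(cache.builds)  # 2
```

Keying on the device as well as the resolution reflects the "device-aware" wording in the abstract; in a tensor framework the cached value would live on that device so that repeated lookups avoid host-to-device copies.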
Where Pith is reading between the lines
- The same diagonal-scan ordering might help other vision tasks that involve non-grid-aligned features, such as vessel tracing or road segmentation.
- An HSIC-style dependence gate could be tested in multi-modal fusion settings where branch redundancy is a known issue.
- If the topology priors prove stable across modalities, training regimes might require less aggressive augmentation focused on orientation changes.
Load-bearing premise
The added diagonal and anti-diagonal scans supply genuinely new structural information that the standard scans miss, and the HSIC scalar gate can balance the branches without discarding useful detail or introducing fitting artifacts.
What would settle it
Rerun the same evaluation sets with the TopoA-Scan branch and HSIC Gate removed, or with the gate replaced by simple average fusion, then measure whether accuracy on curved targets such as the pancreas drops back to baseline levels.
Original abstract
Visual state-space models (SSMs) have shown strong potential for medical image segmentation, yet their effectiveness is often limited by two practical issues: axis-biased scan ordering weakens the modeling of oblique and curved structures, and naive multi-branch fusion tends to amplify redundant responses. We present TopoMamba, a topology-aware scan-and-fuse framework for segmenting heterogeneous medical visual media. The method combines a diagonal/anti-diagonal TopoA-Scan branch with the standard Cross-Scan branch to provide complementary structural priors, and introduces ScanCache, a device-aware caching mechanism that amortizes explicit scan-index construction across recurring resolutions. To fuse heterogeneous scan features efficiently, we further propose a lightweight HSIC Gate that regulates branch interaction using a dependence-aware scalar gating rule. We also instantiate a volumetric TopoMamba-3D for practical 3D clinical segmentation. Experiments on Synapse CT, ISIC 2017 dermoscopy, and CVC-ClinicDB endoscopy show that TopoMamba consistently improves segmentation quality over strong CNN, Transformer, and SSM baselines, with particularly clear gains on thin or curved targets such as the pancreas and gallbladder, while maintaining favorable deployment efficiency under dynamic input resolutions. These results suggest that topology-aware scan ordering and lightweight dependence-aware fusion form an effective and practical design for medical multimedia segmentation. The code will be made publicly available.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces TopoMamba, a topology-aware scan-and-fuse framework for medical image segmentation based on visual state-space models. It augments the standard Cross-Scan with a diagonal/anti-diagonal TopoA-Scan branch to capture oblique and curved structures, adds ScanCache for amortizing scan-index construction across resolutions, and proposes a lightweight HSIC Gate that fuses multi-branch features via a dependence-aware scalar. A 3D volumetric extension is also instantiated. Experiments on Synapse CT, ISIC 2017 dermoscopy, and CVC-ClinicDB endoscopy are reported to show consistent gains over CNN, Transformer, and SSM baselines, especially on thin/curved targets such as the pancreas and gallbladder, while preserving deployment efficiency under varying input resolutions.
Significance. If the empirical claims hold after proper validation, the work would offer a practical advance in applying SSMs to heterogeneous medical imaging by mitigating axis-biased scanning and redundant fusion. The emphasis on efficiency under dynamic resolutions and the 3D extension address real clinical constraints. The promised public code release is a positive factor for reproducibility.
major comments (3)
- [Abstract / Experiments] Abstract and Experiments section: The central claim of consistent outperformance and particular gains on thin/curved targets is stated without any quantitative metrics (e.g., Dice scores, IoU, Hausdorff distance), error bars, statistical tests, or detailed baseline configurations. This absence prevents evaluation of the magnitude and reliability of the reported improvements.
- [Method (TopoA-Scan / HSIC Gate)] Method (TopoA-Scan and HSIC Gate): The assumption that the diagonal/anti-diagonal TopoA-Scan supplies genuinely complementary structural priors beyond axis-aligned Cross-Scan, and that the HSIC Gate regulates interaction via a single dependence-aware scalar without introducing fitting artifacts or information loss, is load-bearing but unsupported by any ablation isolating each component or analysis of the gate's information-preservation properties.
- [Experiments] Experiments section: No ablation tables or controlled studies are referenced that would demonstrate the incremental benefit of TopoA-Scan over simply adding extra scan branches/parameters, or that observed gains on pancreas/gallbladder exceed what would be expected from increased model capacity alone.
minor comments (2)
- [Abstract] The abstract is concise but would be strengthened by including one or two key quantitative results (e.g., average Dice improvement) to ground the performance claims.
- [Method] Notation for the HSIC Gate scalar and the exact dependence measure should be defined more explicitly in the method section to aid reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment point by point below, indicating the revisions we will incorporate.
-
Referee: [Abstract / Experiments] Abstract and Experiments section: The central claim of consistent outperformance and particular gains on thin/curved targets is stated without any quantitative metrics (e.g., Dice scores, IoU, Hausdorff distance), error bars, statistical tests, or detailed baseline configurations. This absence prevents evaluation of the magnitude and reliability of the reported improvements.
Authors: We acknowledge that the abstract states the improvements qualitatively. The Experiments section presents quantitative results via tables reporting Dice, IoU, and related metrics on Synapse, ISIC 2017, and CVC-ClinicDB, with comparisons to CNN, Transformer, and SSM baselines. To address the concern directly, we will revise the abstract to include specific key numbers (e.g., mean Dice on Synapse and the improvement on pancreas). We will also add standard deviation error bars to the tables, include statistical significance tests (e.g., paired t-tests or Wilcoxon) for the main comparisons, and expand the description of baseline configurations and training protocols in the main text. revision: yes
-
Referee: [Method (TopoA-Scan / HSIC Gate)] Method (TopoA-Scan and HSIC Gate): The assumption that the diagonal/anti-diagonal TopoA-Scan supplies genuinely complementary structural priors beyond axis-aligned Cross-Scan, and that the HSIC Gate regulates interaction via a single dependence-aware scalar without introducing fitting artifacts or information loss, is load-bearing but unsupported by any ablation isolating each component or analysis of the gate's information-preservation properties.
Authors: We agree that explicit isolation of each component would strengthen the claims. We will add ablation experiments that disable the TopoA-Scan branch (replacing it with a capacity-matched extra axis-aligned branch) and report the resulting drop in performance on curved structures. For the HSIC Gate, we will add an analysis comparing HSIC dependence scores before and after gating, together with a direct comparison against simple addition and concatenation baselines to quantify information preservation and rule out fitting artifacts. These results will appear in a dedicated ablation subsection and table. revision: yes
-
Referee: [Experiments] Experiments section: No ablation tables or controlled studies are referenced that would demonstrate the incremental benefit of TopoA-Scan over simply adding extra scan branches/parameters, or that observed gains on pancreas/gallbladder exceed what would be expected from increased model capacity alone.
Authors: We will introduce new controlled ablations that match parameter count exactly: one variant adds redundant scan branches without topology awareness, and another scales the baseline SSM capacity to match TopoMamba. We will report per-organ Dice scores on pancreas and gallbladder for these capacity-controlled variants alongside the full model, demonstrating that the observed gains exceed those attributable to capacity alone. The new table and discussion will be placed in the Experiments section. revision: yes
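For readers unfamiliar with the overlap metrics named in the responses above, the following minimal sketch computes Dice and IoU for one pair of binary masks. The function name and the toy masks are illustrative assumptions; they are not data or code from the paper, and the convention for two empty masks is a choice made here.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Dice and IoU for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_sum = sum(1 for p in pred if p)
    target_sum = sum(1 for t in target if t)
    union = pred_sum + target_sum - inter
    if union == 0:
        # Both masks empty: treated as a perfect match (a convention).
        return 1.0, 1.0
    dice = 2.0 * inter / (pred_sum + target_sum)
    iou = inter / union
    return dice, iou


# Toy masks for illustration only.
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 0, 1, 1]
dice, iou = dice_and_iou(pred_mask, true_mask)
print(round(dice, 4), iou)  # 0.6667 0.5
```

Per-organ scores such as those promised for pancreas and gallbladder would apply the same computation to one label at a time, which is what makes capacity-matched comparisons on individual structures possible.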
Circularity Check
No significant circularity; method is algorithmic construction validated by experiments.
The paper presents TopoMamba as a practical algorithmic framework (TopoA-Scan + Cross-Scan + HSIC Gate + ScanCache) for medical segmentation. No equations, derivations, or self-referential reductions appear in the abstract or described claims that would make any 'prediction' equivalent to its inputs by construction. Improvements are shown via empirical results on Synapse CT, ISIC 2017, and CVC-ClinicDB rather than fitted parameters renamed as outputs. No load-bearing self-citations, uniqueness theorems, or smuggled ansatzes are referenced in the provided text. The derivation chain is self-contained against external benchmarks.
Reference graph
Works this paper leans on
-
[1]
Segment anything in medical images,
J. Ma, Y. He, F. Li, L. Han, C. You, and B. Wang, “Segment anything in medical images,” Nature Communications, vol. 15, no. 1, p. 654, 2024
2024
-
[2]
A generalist foundation model and database for open-world medical image segmentation,
S. Zhang, Q. Zhang, S. Zhang, X. Liu, J. Yue, M. Lu, H. Xu, J. Yao, X. Wei, J. Cao et al., “A generalist foundation model and database for open-world medical image segmentation,” Nature Biomedical Engineering, pp. 1–16, 2025
2025
-
[3]
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
L. Zhu, B. Liao, Q. Zhang, X. Wang, W. Liu, and X. Wang, “Vision mamba: Efficient visual representation learning with bidirectional state space model,”arXiv preprint arXiv:2401.09417, 2024
2024
-
[4]
VMamba: Visual State Space Model
Y . Liu, Y . Tian, Y . Zhao, H. Yu, L. Xie, Y . Wang, Q. Ye, and Y . Liu, “Vmamba: Visual state space model 2024,”arXiv preprint arXiv:2401.10166, 2024
2024
-
[5]
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
J. Ma, F. Li, and B. Wang, “U-mamba: Enhancing long-range dependency for biomedical image segmentation,”arXiv preprint arXiv:2401.04722, 2024
2024
-
[6]
Swin-umamba: Mamba-based unet with imagenet-based pretraining,
J. Liu, H. Yang, H.-Y. Zhou, Y. Xi, L. Yu, C. Li, Y. Liang, G. Shi, Y. Yu, S. Zhang et al., “Swin-umamba: Mamba-based unet with imagenet-based pretraining,” in International conference on medical image computing and computer-assisted intervention. Springer, 2024, pp. 615–625
2024
-
[7]
Vm-unet: Vision mamba unet for medical image segmentation,
J. Ruan, J. Li, and S. Xiang, “Vm-unet: Vision mamba unet for medical image segmentation,”ACM Transactions on Multimedia Computing, Communications and Applications, 2024
2024
-
[8]
Zigma: A dit-style zigzag mamba diffusion model,
V . T. Hu, S. A. Baumann, M. Gui, O. Grebenkova, P. Ma, J. Fischer, and B. Ommer, “Zigma: A dit-style zigzag mamba diffusion model,” in European conference on computer vision. Springer, 2024, pp. 148–166
2024
-
[9]
Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation,
Y. Qi, Y. He, X. Qi, Y. Zhang, and G. Yang, “Dynamic snake convolution based on topological geometric constraints for tubular structure segmentation,” in Proceedings of the IEEE/CVF international conference on computer vision, 2023, pp. 6070–6079
2023
-
[10]
Temporal ensembling for semi-supervised learning,
S. Laine and T. Aila, “Temporal ensembling for semi-supervised learning,” in ICLR, 2017
2017
-
[11]
Rsmamba: Remote sensing image classification with state space model,
K. Chen, B. Chen, C. Liu, W. Li, Z. Zou, and Z. Shi, “Rsmamba: Remote sensing image classification with state space model,”IEEE Geoscience and Remote Sensing Letters, vol. 21, pp. 1–5, 2024
2024
-
[12]
Plainmamba: Improving non-hierarchical mamba in visual recognition
C. Yang, Z. Chen, M. Espinosa, L. Ericsson, Z. Wang, J. Liu, and E. J. Crowley, “Plainmamba: Improving non-hierarchical mamba in visual recognition,” arXiv preprint arXiv:2403.17695, 2024
-
[13]
Measuring statistical dependence with hilbert-schmidt norms,
A. Gretton, O. Bousquet, A. Smola, and B. Schölkopf, “Measuring statistical dependence with hilbert-schmidt norms,” in International conference on algorithmic learning theory. Springer, 2005, pp. 63–77
2005
-
[14]
A kernel statistical test of independence,
A. Gretton, K. Fukumizu, C. Teo, L. Song, B. Schölkopf, and A. Smola, “A kernel statistical test of independence,” Advances in neural information processing systems, vol. 20, 2007
2007
-
[15]
Feature selection via dependence maximization,
L. Song, A. Smola, A. Gretton, J. Bedo, and K. Borgwardt, “Feature selection via dependence maximization,” The Journal of Machine Learning Research, vol. 13, no. 1, pp. 1393–1434, 2012
2012
-
[16]
U-net: Convolutional networks for biomedical image segmentation,
O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” inMedical image computing and computer-assisted intervention, 2015, pp. 234–241
2015
-
[17]
Unet++: Redesigning skip connections to exploit multiscale features in image segmentation,
Z. Zhou, M. M. R. Siddiquee, N. Tajbakhsh, and J. Liang, “Unet++: Redesigning skip connections to exploit multiscale features in image segmentation,”IEEE Transactions on Medical Imaging, 2019
2019
-
[18]
An image is worth 16x16 words: Transformers for image recognition at scale,
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in International Conference on Learning Representations, 2021
2021
-
[19]
Swin transformer: Hierarchical vision transformer using shifted windows,
Z. Liu, Y. Lin, Y. Cao, H. Hu, Y. Wei, Z. Zhang, S. Lin, and B. Guo, “Swin transformer: Hierarchical vision transformer using shifted windows,” in Proceedings of the IEEE/CVF international conference on computer vision, 2021, pp. 9992–10002
2021
-
[20]
Neural memory state space models for medical image segmentation,
Z. Wang, J. Gu, W. Zhou, Q. He, T. Zhao, J. Guo, L. Lu, T. He, and J. Bu, “Neural memory state space models for medical image segmentation,” International Journal of Neural Systems, vol. 35, no. 1, p. 2450068, 2025
2025
-
[21]
An enhanced visual state space model for myocardial pathology segmentation in multi-sequence cardiac mri,
S. Li, X. Li, P. Wang, K. Liu, B. Wei, and J. Cong, “An enhanced visual state space model for myocardial pathology segmentation in multi-sequence cardiac mri,” Medical Physics, vol. 52, no. 6, pp. 4355–4370, 2025
2025
-
[22]
Dcss-unet: Unet based on state space model for polyp segmentation,
X. Wang and B. Li, “Dcss-unet: Unet based on state space model for polyp segmentation,”Frontiers in Computing and Intelligent Systems, vol. 9, no. 3, pp. 32–39, 2024
2024
-
[23]
Axial-deeplab: Stand-alone axial-attention for panoptic segmentation,
H. Wang, Y. Zhu, B. Green, H. Adam, A. Yuille, and L.-C. Chen, “Axial-deeplab: Stand-alone axial-attention for panoptic segmentation,” in European conference on computer vision. Springer, 2020, pp. 108–126
2020
-
[24]
Ccnet: Criss-cross attention for semantic segmentation,
Z. Huang, X. Wang, L. Huang, C. Huang, Y. Wei, and W. Liu, “Ccnet: Criss-cross attention for semantic segmentation,” in Proceedings of the IEEE/CVF international conference on computer vision, 2019, pp. 603–612
2019
-
[25]
Rotate to scan: Unet-like mamba with triplet ssm module for medical image segmentation,
H. Tang, L. Cheng, G. Huang, Z. Tan, J. Lu, and K. Wu, “Rotate to scan: Unet-like mamba with triplet ssm module for medical image segmentation,”arXiv preprint arXiv:2403.17701, 2024
-
[26]
H. Huang, P. Liang, N. Lin, L. Wang, B. Pu, J. Chen, Q. Chang, X. Shen, and G. Ran, “Topology-aware wavelet mamba for airway structure segmentation in postoperative recurrent nasopharyngeal carcinoma ct scans,”CoRR, vol. abs/2502.14363, 2025. [Online]. Available: https://arxiv.org/abs/2502.14363
-
[27]
Jpeg2000: Standard for interactive imaging,
D. S. Taubman and M. W. Marcellin, “Jpeg2000: Standard for interactive imaging,”Proceedings of the IEEE, vol. 90, no. 8, pp. 1336–1357, 2002
2002
-
[28]
W. B. Pennebaker and J. L. Mitchell,JPEG: Still image data compres- sion standard. Springer Science & Business Media, 1992
1992
-
[29]
Cuda c++ programming guide,
D. Guide, “Cuda c++ programming guide,”NVIDIA, July, 2020
2020
-
[30]
Nvidia tensor core programmability, performance & precision,
S. Markidis, S. W. Der Chien, E. Laure, I. B. Peng, and J. S. Vetter, “Nvidia tensor core programmability, performance & precision,” in 2018 IEEE international parallel and distributed processing symposium workshops (IPDPSW). IEEE, 2018, pp. 522–531
2018
-
[31]
Scfmunet: A fusion architecture based on multi-scale state space model and channel attention for medical image segmentation,
Z. Huang, Z. Zhao, Z. Yu, M. Hou, S. Zhou, J. Wang, Y . Yan, Y . Liu, and H. Gregersen, “Scfmunet: A fusion architecture based on multi-scale state space model and channel attention for medical image segmentation,”Neural Networks, vol. 192, p. 107919, 2025
2025
-
[32]
A dual-branch network for lesion segmentation in medical images using state space models,
H. Chen, B.-W. Min, and H. Zhang, “A dual-branch network for lesion segmentation in medical images using state space models,”Quantitative Imaging in Medicine and Surgery, vol. 15, no. 12, pp. 11 977–11 991, 2025
2025
-
[33]
Toposegnet: Scalable topology preser- vation in image segmentation via critical points,
M. Ahmadkhani and E. Shook, “Toposegnet: Scalable topology preser- vation in image segmentation via critical points,”Computer Vision and Image Understanding, vol. 262, p. 104564, 2025
2025
-
[34]
ARC: A self-tuning, low overhead replacement cache,
N. Megiddo and D. S. Modha, “ARC: A self-tuning, low overhead replacement cache,” in 2nd USENIX Conference on File and Storage Technologies (FAST 03), 2003
2003
-
[35]
Extensions of lipschitz mappings into a hilbert space,
W. B. Johnson, J. Lindenstrausset al., “Extensions of lipschitz mappings into a hilbert space,”Contemporary mathematics, vol. 26, no. 189-206, p. 1, 1984
1984
-
[36]
Space-filling curves: an introduction with applications in scientific computing
M. Bader, Space-filling curves: an introduction with applications in scientific computing. Springer Science & Business Media, 2012, vol. 9
2012
-
[37]
The jpeg still picture compression standard,
G. K. Wallace, “The jpeg still picture compression standard,” IEEE transactions on consumer electronics, vol. 38, no. 1, pp. xviii–xxxiv, 1992
1992
-
[38]
Understanding the effective receptive field in deep convolutional neural networks,
W. Luo, Y . Li, R. Urtasun, and R. Zemel, “Understanding the effective receptive field in deep convolutional neural networks,”Advances in neural information processing systems, vol. 29, 2016
2016
-
[39]
Database-friendly random projections: Johnson-lindenstrauss with binary coins,
D. Achlioptas, “Database-friendly random projections: Johnson-lindenstrauss with binary coins,” Journal of computer and System Sciences, vol. 66, no. 4, pp. 671–687, 2003
2003
-
[40]
Learning with kernels: support vector machines, regularization, optimization, and beyond
B. Schölkopf and A. J. Smola, Learning with kernels: support vector machines, regularization, optimization, and beyond. MIT press, 2002
2002
-
[41]
Kernel methods for pattern analysis
J. Shawe-Taylor and N. Cristianini, Kernel methods for pattern analysis. Cambridge university press, 2004
2004
-
[42]
Kernel methods for remote sensing data analysis
G. Camps-Valls and L. Bruzzone, Kernel methods for remote sensing data analysis. John Wiley & Sons, 2009
2009
-
[43]
nnu-net: a self-configuring method for deep learning-based biomedical image segmentation,
F. Isensee, P. F. Jaeger, S. A. Kohl, J. Petersen, and K. H. Maier-Hein, “nnu-net: a self-configuring method for deep learning-based biomedical image segmentation,”Nature methods, vol. 18, no. 2, pp. 203–211, 2021
2021
-
[44]
Medsegdiff: Medical image segmentation with diffusion probabilistic model,
J. Wu, R. Fu, H. Fang, Y . Zhang, Y . Yang, H. Xiong, H. Liu, and Y . Xu, “Medsegdiff: Medical image segmentation with diffusion probabilistic model,” inMedical Imaging with Deep Learning. PMLR, 2024, pp. 1623–1639
2024
-
[45]
Self-supervised pre-training of swin transformers for 3d medical image analysis,
Y . Tang, D. Yang, W. Li, H. R. Roth, B. Landman, D. Xu, V . Nath, and A. Hatamizadeh, “Self-supervised pre-training of swin transformers for 3d medical image analysis,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2022, pp. 20 730–20 740
2022
-
[46]
Segment anything,
A. Kirillov, E. Mintun, N. Ravi, H. Mao, C. Rolland, L. Gustafson, T. Xiao, S. Whitehead, A. C. Berg, W.-Y . Loet al., “Segment anything,” inProceedings of the IEEE/CVF international conference on computer vision, 2023, pp. 4015–4026
2023
-
[47]
Unleashing the potential of sam for medical adaptation via hierarchical decoding,
Z. Cheng, Q. Wei, H. Zhu, Y . Wang, L. Qu, W. Shao, and Y . Zhou, “Unleashing the potential of sam for medical adaptation via hierarchical decoding,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2024, pp. 3511–3522
2024
-
[48]
Swin-unet: Unet-like pure transformer for medical image segmentation,
H. Cao, Y. Wang, J. Chen, D. Jiang, X. Zhang, Q. Tian, and M. Wang, “Swin-unet: Unet-like pure transformer for medical image segmentation,” in ECCV, 2022, pp. 205–218
2022
-
[49]
Mamba-unet: Unet-like pure visual mamba for medical image segmentation,
Z. Wang, J.-Q. Zheng, Y. Zhang, G. Cui, and L. Li, “Mamba-unet: Unet-like pure visual mamba for medical image segmentation,” arXiv preprint arXiv:2402.05079, 2024
-
[50]
Medical image computing and computer-assisted intervention multi-atlas labeling beyond the cranial vault–workshop and challenge,
B. Landman, Z. Xu, J. Igelsias, M. Styner, T. Langerak, and A. Klein, “Medical image computing and computer-assisted intervention multi-atlas labeling beyond the cranial vault–workshop and challenge,” in Medical image computing and computer-assisted intervention, vol. 5, 2015, p. 12
2015
-
[51]
N. C. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittleret al., “Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (isbi), hosted by the international skin imaging collaboration (isic),” in2018 IEEE 15th international sy...
2017
-
[52]
Cvc-clinicdb,
J. Bernal, F. J. Sánchez, G. Fernández-Esparrach, D. Gil, C. Rodríguez, and F. Vilariño, “Cvc-clinicdb,” 2015. [Online]. Available: https://polyp.grand-challenge.org/CVCClinicDB/
2015
-
[53]
Topology-aware focal loss for 3d image segmentation,
A. Demir, E. Massaad, and B. Kiziltan, “Topology-aware focal loss for 3d image segmentation,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 580–589
2023
-
[54]
Squeeze-and-excitation networks,
J. Hu, L. Shen, and G. Sun, “Squeeze-and-excitation networks,” in Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2018, pp. 7132–7141
2018
-
[55]
Backpropagation-free network for 3d test-time adaptation,
Y. Wang, A. Cheraghian, Z. Hayder, J. Hong, S. Ramasinghe, S. Rahman, D. Ahmedt-Aristizabal, X. Li, L. Petersson, and M. Harandi, “Backpropagation-free network for 3d test-time adaptation,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024, pp. 23231–23241
2024
-
[56]
Learning 3d representations from 2d pre-trained models via image-to-point masked autoencoders,
R. Zhang, L. Wang, Y . Qiao, P. Gao, and H. Li, “Learning 3d representations from 2d pre-trained models via image-to-point masked autoencoders,” inProceedings of the IEEE/CVF conference on computer vision and pattern recognition, 2023, pp. 21 769–21 780
2023
-
[57]
Convolutional fine-grained classification with self-supervised target relation regularization,
K. Liu, K. Chen, and K. Jia, “Convolutional fine-grained classification with self-supervised target relation regularization,” IEEE Transactions on Image Processing, vol. 31, pp. 5570–5584, 2022.
2022
APPENDIX
A. Preliminaries of SSM
State-space models (SSMs) describe sequential processing through a hidden-state evolution:
$$\frac{dh(t)}{dt} = A\,h(t) + B\,x(t), \tag{9}$$
$y(t) = C\,h(t)$ ...
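As a worked illustration of the continuous-time state-space form dh(t)/dt = A h(t) + B x(t), y(t) = C h(t), the sketch below discretizes a scalar system and runs it as a recurrence. It assumes a zero-order-hold discretization with step size `dt`, which is a common choice for Mamba-style models; the truncated appendix does not show which rule the paper uses, so the rule, the function name, and the parameter values are all assumptions.

```python
import math
from typing import List, Sequence


def run_scalar_ssm(xs: Sequence[float], a: float, b: float, c: float,
                   dt: float) -> List[float]:
    """Run a scalar SSM with zero-order-hold discretization (a must be nonzero)."""
    a_bar = math.exp(dt * a)           # discrete state transition
    b_bar = (a_bar - 1.0) / a * b      # discrete input coefficient
    h = 0.0
    ys: List[float] = []
    for x in xs:
        h = a_bar * h + b_bar * x      # h_k = a_bar * h_{k-1} + b_bar * x_k
        ys.append(c * h)               # y_k = c * h_k
    return ys


# Constant unit input into a stable system: the output approaches -b/a = 1.
ys = run_scalar_ssm([1.0] * 200, a=-1.0, b=1.0, c=1.0, dt=0.1)
print(ys[0], ys[-1])
```

With a piecewise-constant input, zero-order hold reproduces the continuous solution exactly at the sample points, so the first output equals 1 - exp(-0.1) here. The order in which inputs are fed to this recurrence is what a scan pattern determines.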
-
[58]
At the boundary between two consecutive diagonal segments, the alternating reversal ensures that the terminal point of one segment and the initial point of the next segment differ by either (1,0) or (0,1); hence they are 4-neighbors and their distance is 1. These are the only two cases. Corollary 1 (Extension to anti-diagonal and reversed TopoA sequences). The ...
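The boundary claim can be checked numerically on small grids. The sketch below assumes the diagonal segments are lines of constant row + column, reversed on every other segment, with offsets written as (row, col). That is one reading of the truncated text, and the helper names are hypothetical.

```python
from typing import List, Set, Tuple

Coord = Tuple[int, int]


def zigzag_segments(h: int, w: int) -> List[List[Coord]]:
    """Diagonal segments (constant row + col) with alternating reversal."""
    segments: List[List[Coord]] = []
    for s in range(h + w - 1):
        seg = [(i, s - i) for i in range(max(0, s - w + 1), min(s, h - 1) + 1)]
        if s % 2 == 0:
            seg.reverse()
        segments.append(seg)
    return segments


def boundary_steps(h: int, w: int) -> Set[Coord]:
    """Offsets between the end of one segment and the start of the next."""
    segs = zigzag_segments(h, w)
    steps: Set[Coord] = set()
    for prev, nxt in zip(segs, segs[1:]):
        (i0, j0), (i1, j1) = prev[-1], nxt[0]
        steps.add((i1 - i0, j1 - j0))
    return steps


# Collect boundary offsets over all grid sizes up to 6 x 6.
all_steps: Set[Coord] = set()
for h in range(1, 7):
    for w in range(1, 7):
        all_steps |= boundary_steps(h, w)
print(sorted(all_steps))  # [(0, 1), (1, 0)]
```

An exhaustive check over small grids is not a proof, but it guards against off-by-one errors in an index builder and agrees with the two cases stated in the text.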
-
[59]
Apply Johnson-Lindenstrauss projection [35], [39] to reduce the sequence dimension before kernel construction
-
[60]
Build RBF kernels [40] with a median bandwidth heuristic on the projected channel descriptors
-
[61]
Center the kernels by row mean, column mean, and global mean before computing the normalized Frobenius inner product
-
[62]
Convert the resulting HSIC score into a sigmoid gate and retain a TopoA-biased residual shortcut for stability. We intentionally avoid stronger claims such as target-variable dependence, mutual-information approximation, or topology guarantees, because the gate is used here purely as a compact dependence-aware fusion rule. In the paper setting, α is initial...
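The four steps above can be sketched in plain Python as follows. This is a rough illustration under several assumptions: a random-sign projection in the style of [39], a bandwidth set from the median squared pairwise distance, one shared projection for both branches, and a hypothetical `scale` parameter in the sigmoid. The residual shortcut is not modeled because its form is not given in the text, and none of the names come from the paper.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def jl_project(x: Matrix, k: int, seed: int = 0) -> Matrix:
    """Step 1: random-sign projection from d to k dimensions."""
    rng = random.Random(seed)
    d = len(x[0])
    proj = [[rng.choice((-1.0, 1.0)) / math.sqrt(k) for _ in range(k)]
            for _ in range(d)]
    return [[sum(row[t] * proj[t][c] for t in range(d)) for c in range(k)]
            for row in x]


def rbf_kernel(x: Matrix) -> Matrix:
    """Step 2: RBF kernel with a median-heuristic bandwidth."""
    n = len(x)
    d2 = [[sum((a - b) ** 2 for a, b in zip(x[i], x[j])) for j in range(n)]
          for i in range(n)]
    off_diag = sorted(d2[i][j] for i in range(n) for j in range(n) if i != j)
    med = off_diag[len(off_diag) // 2] or 1.0  # fall back if median is zero
    return [[math.exp(-v / (2.0 * med)) for v in row] for row in d2]


def center(k: Matrix) -> Matrix:
    """Step 3: subtract row means and column means, add the global mean."""
    n = len(k)
    row = [sum(r) / n for r in k]
    col = [sum(k[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row) / n
    return [[k[i][j] - row[i] - col[j] + total for j in range(n)]
            for i in range(n)]


def frobenius(a: Matrix, b: Matrix) -> float:
    return sum(p * q for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def normalized_hsic(x: Matrix, y: Matrix, k_dim: int = 4) -> float:
    """Normalized Frobenius inner product of the centered kernels."""
    kx = center(rbf_kernel(jl_project(x, k_dim)))
    ky = center(rbf_kernel(jl_project(y, k_dim)))
    denom = math.sqrt(frobenius(kx, kx) * frobenius(ky, ky))
    return frobenius(kx, ky) / denom if denom > 0 else 0.0


def scalar_gate(score: float, scale: float = 1.0) -> float:
    """Step 4: map the score to a gate with a sigmoid (scale is hypothetical)."""
    return 1.0 / (1.0 + math.exp(-scale * score))


# Synthetic descriptors: 8 channels, 16 sequence positions each.
rng = random.Random(42)
x = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
y = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(8)]
score_same = normalized_hsic(x, x)
score_xy = normalized_hsic(x, y)
gate = scalar_gate(score_xy)
```

Because the centered kernels are positive semidefinite, the normalized score lies between 0 and 1, so a plain sigmoid of it stays in the upper half of its range; any rescaling or offset used in the paper is not recoverable from the truncated text.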