arXiv: 2509.22307 · v2 · pith:T4SKF4RGnew · submitted 2025-09-26 · cs.CV

Johnson-Lindenstrauss Lemma Guided Network for Efficient 3D Medical Segmentation

keywords: veloxseg, network, segmentation, architecture, dual-stream, heterogeneous, image, inference

Lightweight 3D medical image segmentation remains constrained by a fundamental "efficiency/robustness conflict", particularly when processing complex anatomical structures and heterogeneous modalities. In this paper, we study how to redesign the framework around the characteristics of high-dimensional 3D images, and explore data synergy to overcome the fragile representations of lightweight methods. Our approach, VeloxSeg, begins with a deployable and extensible dual-stream CNN-Transformer architecture composed of Paired Window Attention (PWA) and Johnson-Lindenstrauss lemma-guided convolution (JLC). For each 3D image, we invoke a "glance-and-focus" principle: PWA rapidly retrieves multi-scale information, and JLC ensures robust local feature extraction with minimal parameters, significantly enhancing the model's ability to operate under a low computational budget. An extension of the dual-stream architecture then incorporates modal interaction into the multi-scale image-retrieval process, allowing VeloxSeg to model heterogeneous modalities efficiently. Finally, Spatially Decoupled Knowledge Transfer (SDKT) via Gram matrices injects the texture prior extracted by a self-supervised network into the segmentation network, yielding stronger representations than baselines at no extra inference cost. Experimental results on multimodal benchmarks show that VeloxSeg achieves a 26% Dice improvement while increasing GPU throughput by 11×, increasing CPU throughput by 48×, and reducing peak GPU memory usage by 1/20 in training and by 1/24 in inference. Code is available at https://github.com/JinPLu/VeloxSeg.
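
The abstract says that SDKT transfers a texture prior "via Gram matrices" but does not spell out the loss. The following is a minimal sketch of the general Gram-matrix idea only, not the authors' implementation: the function names `gram_matrix` and `gram_loss`, the normalisation by voxel count, and the mean-squared-error comparison are all assumptions made for illustration. It is written in plain Python so it runs without dependencies.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def gram_matrix(features: Sequence[Sequence[float]]) -> Matrix:
    """Channel-by-channel Gram matrix of flattened feature maps.

    `features` holds C rows, one per channel, each containing the N voxel
    activations of that channel flattened from (D, H, W). The result is
    C x C and divided by N, so its shape does not depend on spatial size.
    """
    n_voxels = len(features[0])
    if any(len(row) != n_voxels for row in features):
        raise ValueError("all channels must have the same number of voxels")
    return [
        [sum(a * b for a, b in zip(row_i, row_j)) / n_voxels
         for row_j in features]
        for row_i in features
    ]


def gram_loss(student: Sequence[Sequence[float]],
              teacher: Sequence[Sequence[float]]) -> float:
    """Mean squared difference between two Gram matrices (hypothetical loss).

    Assumes both networks expose the same number of channels; their
    spatial resolutions may differ.
    """
    g_student = gram_matrix(student)
    g_teacher = gram_matrix(teacher)
    if len(g_student) != len(g_teacher):
        raise ValueError("student and teacher need the same channel count")
    c = len(g_student)
    return sum(
        (g_student[i][j] - g_teacher[i][j]) ** 2
        for i in range(c)
        for j in range(c)
    ) / (c * c)


# Toy features: 2 channels, 4 voxels each.
teacher_feats = [[1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 1.0]]
# Same activations with the voxel order reversed in every channel.
shuffled_feats = [[4.0, 3.0, 2.0, 1.0], [1.0, 0.0, 1.0, 0.0]]
# A student feature map at a different spatial size (2 voxels).
student_feats = [[1.0, 1.0], [0.0, 2.0]]

same_texture_loss = gram_loss(shuffled_feats, teacher_feats)
cross_size_loss = gram_loss(student_feats, teacher_feats)
```

Because the Gram matrix sums over voxel positions, reordering the voxels identically across channels leaves it unchanged, and feature maps of different spatial sizes can still be compared. This is one plausible reading of "spatially decoupled". In this sketch the teacher features appear only in the loss, which is consistent with the abstract's claim of no extra inference cost.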
