RPG-SAM: Reliability-Weighted Prototypes and Geometric Adaptive Threshold Selection for Training-Free One-Shot Polyp Segmentation
Pith reviewed 2026-05-15 15:15 UTC · model grok-4.3
The pith
RPG-SAM improves one-shot polyp segmentation by weighting reliable support features and adapting thresholds to morphological agreement.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RPG-SAM handles heterogeneity at several levels in three steps. It first mines reliability-weighted prototypes from support features, using background contrast to suppress noise. It then applies geometric adaptive selection to choose the binarization threshold that maximizes morphological consistency in the query output. Finally, an iterative loop refines anatomical edges until convergence.
What carries the argument
Reliability-Weighted Prototype Mining paired with Geometric Adaptive Selection, which together prioritize high-fidelity support regions and recalibrate thresholds based on candidate shape agreement.
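To make the reliability-weighting idea concrete, here is a minimal sketch in plain Python. It assumes the reliability score of a support feature is its cosine similarity to the foreground mean minus its cosine similarity to a background anchor, clipped at zero and normalized. That scoring rule, and the names `reliability_weighted_prototype`, `fg_feats`, and `bg_feats`, are illustrative assumptions and are not taken from the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    """Element-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def reliability_weighted_prototype(fg_feats, bg_feats):
    """Weight each foreground feature by how much closer it is to the
    foreground mean than to the background anchor (assumed scoring rule)."""
    fg_mean = mean_vector(fg_feats)
    bg_anchor = mean_vector(bg_feats)
    scores = [max(0.0, cosine(f, fg_mean) - cosine(f, bg_anchor))
              for f in fg_feats]
    total = sum(scores)
    if total > 0:
        weights = [s / total for s in scores]
    else:
        # Fall back to a plain average when no feature beats the background.
        weights = [1.0 / len(fg_feats)] * len(fg_feats)
    proto = [sum(w * f[d] for w, f in zip(weights, fg_feats))
             for d in range(len(fg_mean))]
    return proto, weights

# Toy features: the third "foreground" vector actually resembles background.
fg = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9]]
bg = [[0.0, 1.0], [0.1, 1.0]]
proto, weights = reliability_weighted_prototype(fg, bg)
```

In this toy input the background-like vector receives zero weight, so it does not pull the prototype toward the background. Whether the paper's weighting behaves this way depends on equations that the abstract does not give.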
If this is right
- Segmentation of polyps becomes feasible with only one annotated example per new imaging condition.
- Boundary errors decrease because thresholds are chosen by shape agreement rather than fixed intensity cutoffs.
- Background noise is reduced by treating support pixels as contrastive anchors instead of uniform references.
- The framework scales to other one-shot medical segmentation tasks where support-query heterogeneity appears.
- Iterative polishing produces smoother anatomical contours without additional model training.
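The idea of choosing a threshold by shape agreement rather than a fixed cutoff can be sketched as follows. This is a hedged illustration only: it assumes "consensus" means the mean IoU between one candidate mask and all other candidate masks, and the function names are hypothetical.

```python
def binarize(response, threshold):
    """Turn a 2D response map (list of rows) into a 0/1 mask."""
    return [[1 if v >= threshold else 0 for v in row] for row in response]

def iou(mask_a, mask_b):
    """Intersection over union of two 0/1 masks of the same shape."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += a & b
            union += a | b
    return inter / union if union else 1.0

def select_threshold_by_consensus(response, thresholds):
    """Pick the threshold whose mask agrees most, on average, with the masks
    from all other candidates (assumed rule; needs at least two thresholds)."""
    masks = [binarize(response, t) for t in thresholds]
    best_t, best_score = None, -1.0
    for i, t in enumerate(thresholds):
        others = [iou(masks[i], m) for j, m in enumerate(masks) if j != i]
        score = sum(others) / len(others)
        if score > best_score:  # strict '>' keeps the first of any tie
            best_t, best_score = t, score
    return best_t, best_score

# Toy response map with a bright 2x2 core and a faint halo.
response = [
    [0.1, 0.2, 0.1, 0.0],
    [0.2, 0.9, 0.8, 0.1],
    [0.1, 0.8, 0.9, 0.2],
    [0.0, 0.1, 0.3, 0.1],
]
best_t, score = select_threshold_by_consensus(response, [0.05, 0.3, 0.5, 0.7])
```

A very low threshold produces a mask that disagrees with every other candidate, so it scores poorly. The selected threshold is one whose shape is stable across neighbouring cutoffs.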
Where Pith is reading between the lines
- The same weighting and consensus steps could be tested on other foundation models to check if gains transfer beyond the current backbone.
- If the reliability scores correlate with expert annotations on held-out data, they might serve as a cheap proxy for active learning sample selection.
- Extending the geometric consensus check to three-dimensional volumes would test whether the method applies to volumetric CT or MRI polyp data.
- Comparing the iterative loop's convergence speed against non-iterative baselines would quantify the added computational cost of refinement.
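Measuring convergence speed only requires a loop that counts how many refinement steps run before the mask stops changing. The sketch below shows such a loop. The refinement step here is a 3x3 majority vote, a deliberately simple stand-in chosen for illustration; it is not the SAM-based refinement the paper describes, and all names are hypothetical.

```python
def majority_smooth(mask):
    """Stand-in refinement step: 3x3 majority vote (NOT the paper's method)."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            votes = total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        votes += mask[ny][nx]
                        total += 1
            out[y][x] = 1 if 2 * votes > total else 0
    return out

def refine_until_converged(mask, step, max_iters=10):
    """Apply `step` until the mask stops changing or max_iters is reached.
    Returns the final mask and the number of step applications."""
    for i in range(1, max_iters + 1):
        new_mask = step(mask)
        if new_mask == mask:
            return new_mask, i
        mask = new_mask
    return mask, max_iters

# Toy mask: a 4x4 blob plus one isolated noise pixel in the top-right corner.
noisy = [
    [0, 0, 0, 0, 0, 1],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
final, iters = refine_until_converged(noisy, majority_smooth)
```

The returned iteration count is the quantity one would compare against a single-pass baseline to estimate the extra cost of refinement.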
Load-bearing premise
That weighting features by reliability and selecting thresholds by morphological consensus will consistently pick accurate regions without creating new selection bias or needing per-dataset tuning of the refinement loop.
What would settle it
Running the method unchanged on a second polyp dataset such as CVC-ClinicDB and finding no mIoU gain or a performance drop would indicate the heterogeneity-handling steps do not generalize as claimed.
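The check described above reduces to computing mIoU for the method and for a baseline on the same images and comparing them. A minimal sketch follows. It assumes mIoU is the mean per-image foreground IoU and that the gain is reported in absolute percentage points; the paper's abstract does not say which convention its 5.56% figure uses. The masks are toy data, not results.

```python
def mask_iou(pred, target):
    """Foreground IoU of two 0/1 masks of the same shape."""
    inter = union = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            inter += p & t
            union += p | t
    return inter / union if union else 1.0

def mean_iou(preds, targets):
    """Mean per-image IoU over a non-empty list of mask pairs."""
    return sum(mask_iou(p, t) for p, t in zip(preds, targets)) / len(preds)

def miou_gain(method_preds, baseline_preds, targets):
    """Gain of the method over the baseline in absolute percentage points."""
    return 100.0 * (mean_iou(method_preds, targets)
                    - mean_iou(baseline_preds, targets))

# Two toy 2x2 images with ground truth, method output, and baseline output.
targets = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
method = [[[1, 1], [0, 0]], [[0, 1], [0, 0]]]
baseline = [[[1, 0], [0, 0]], [[0, 1], [0, 0]]]
gain = miou_gain(method, baseline, targets)
```

A gain near zero or below zero on a second dataset, computed this way with the method left unchanged, is the outcome the paragraph above treats as evidence against generalization.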
Original abstract
Training-free one-shot segmentation offers a scalable alternative to expert annotations where knowledge is often transferred from support images and foundation models. But existing methods often treat all pixels in support images and query response intensities models in a homogeneous way. They ignore the regional heterogeity in support images and response heterogeity in query. To resolve this, we propose RPG-SAM, a framework that systematically tackles these heterogeneity gaps. Specifically, to address regional heterogeneity, we introduce Reliability-Weighted Prototype Mining (RWPM) to prioritize high-fidelity support features while utilizing background anchors as contrastive references for noise suppression. To address response heterogeneity, we develop Geometric Adaptive Selection (GAS) to dynamically recalibrate binarization thresholds by evaluating the morphological consensus of candidates. Finally, an iterative refinement loop method is designed to polishes anatomical boundaries. By accounting for multi-layered information heterogeneity, RPG-SAM achieves a 5.56% mIoU improvement on the Kvasir dataset. Code will be released.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces RPG-SAM, a training-free one-shot polyp segmentation method based on SAM. It proposes Reliability-Weighted Prototype Mining (RWPM) to address regional heterogeneity in support images by prioritizing high-fidelity features with reliability weighting and background anchors for contrast, Geometric Adaptive Selection (GAS) to handle response heterogeneity via dynamic threshold selection based on morphological consensus of candidate masks, and an iterative refinement loop to improve anatomical boundaries. The central claim is a 5.56% mIoU improvement on the Kvasir dataset achieved by accounting for multi-layered information heterogeneity.
Significance. If the reported gains prove robust under proper baselines, component ablations, and statistical controls, the framework could offer a practical advance in training-free medical segmentation by explicitly targeting heterogeneity without retraining. The approach builds on foundation models in a way that could generalize to other domains with scarce annotations, provided the mechanisms are shown to be non-heuristic and free of hidden dataset-specific tuning.
major comments (2)
- [Abstract] Abstract: The 5.56% mIoU improvement on Kvasir is stated without identifying the baseline method or its score, without component ablations for RWPM or GAS, and without variance or statistical tests across support images or runs. This prevents verification that the gain arises from the proposed reliability weighting and geometric consensus rather than prompt choice or favorable data splits.
- [Methods] Methods (RWPM and GAS descriptions): The reliability weighting and background-anchor contrast in RWPM, together with the morphological-consensus rule in GAS, are introduced as new heuristics without explicit equations demonstrating reduction to the target mIoU metric or guarantees against selection bias in the iterative loop. The absence of these derivations leaves open whether the mechanisms are parameter-free or require dataset-specific tuning, directly undermining attribution of the claimed improvement.
minor comments (2)
- [Abstract] Abstract contains repeated spelling errors ('heterogeity' for 'heterogeneity') and a grammatical issue ('polishes' should be 'polish').
- [Abstract] The statement 'Code will be released' should be accompanied by a repository link or DOI at submission time to support reproducibility claims.
Simulated Author's Rebuttal
We thank the referee for the constructive comments. We address each major point below and indicate the corresponding revisions to strengthen clarity, formalization, and validation of the reported gains.
- Referee: [Abstract] Abstract: The 5.56% mIoU improvement on Kvasir is stated without identifying the baseline method or its score, without component ablations for RWPM or GAS, and without variance or statistical tests across support images or runs. This prevents verification that the gain arises from the proposed reliability weighting and geometric consensus rather than prompt choice or favorable data splits.
Authors: We agree that the abstract should provide more context. In the revised manuscript we will explicitly name the baseline (standard one-shot SAM), report its mIoU, reference the component ablations for RWPM and GAS that appear in the experiments, and add variance statistics together with significance tests across multiple support images and random seeds. revision: yes
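One way the promised variance statistics and significance test could be produced is a paired sign-flip permutation test over per-support-image (or per-seed) mIoU scores. The sketch below is an assumption about how such a check might be run, not the authors' procedure, and the scores are synthetic placeholders rather than results from the paper.

```python
import itertools
import statistics

def paired_sign_flip_test(method_scores, baseline_scores):
    """Exact two-sided sign-flip permutation test on paired differences.
    Returns (mean difference, sample std of differences, p-value).
    Enumerates all 2**n sign patterns, so it suits small n only."""
    diffs = [m - b for m, b in zip(method_scores, baseline_scores)]
    observed = abs(statistics.mean(diffs))
    hits = total = 0
    for signs in itertools.product((1, -1), repeat=len(diffs)):
        flipped = abs(statistics.mean([s * d for s, d in zip(signs, diffs)]))
        hits += flipped >= observed - 1e-12  # tolerance for float rounding
        total += 1
    return statistics.mean(diffs), statistics.stdev(diffs), hits / total

# Synthetic mIoU scores for five support images (illustrative values only).
method = [0.80, 0.78, 0.83, 0.79, 0.81]
baseline = [0.75, 0.74, 0.76, 0.77, 0.73]
mean_diff, spread, p_value = paired_sign_flip_test(method, baseline)
```

With only five paired runs the smallest attainable two-sided p-value is 2/32, so a claim of significance at the 0.05 level would need more support images or seeds than this toy setup uses.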
- Referee: [Methods] Methods (RWPM and GAS descriptions): The reliability weighting and background-anchor contrast in RWPM, together with the morphological-consensus rule in GAS, are introduced as new heuristics without explicit equations demonstrating reduction to the target mIoU metric or guarantees against selection bias in the iterative loop. The absence of these derivations leaves open whether the mechanisms are parameter-free or require dataset-specific tuning, directly undermining attribution of the claimed improvement.
Authors: The weighting in RWPM is computed from per-region feature fidelity scores and background anchors are fixed contrast references; the GAS threshold is obtained from the intersection-over-union of morphologically dilated candidate masks. These steps contain no learned parameters or dataset-specific constants. To address the request for formalization we will insert the explicit equations for both modules and a short bias analysis of the consensus rule in the revised Methods section. revision: partial
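The verbal description in this response could be formalized along the following lines. This is one possible reading, written only to show what explicit equations might look like; every symbol is an assumption, and the paper may define the scores, weights, and consensus rule differently.

```latex
% Assumed notation (not from the paper):
%   f_i      foreground support features, \bar{f} their mean
%   b        background anchor feature
%   S        query response map, T the set of candidate thresholds
%   D(.)     a fixed morphological dilation
\begin{align}
r_i &= \cos(f_i, \bar{f}) - \cos(f_i, b), &
w_i &= \frac{\max(r_i, 0)}{\sum_j \max(r_j, 0)}, &
p &= \sum_i w_i f_i, \\
M_t &= \{\, x : S(x) \ge t \,\}, &
t^{\star} &= \arg\max_{t \in T} \frac{1}{|T| - 1}
  \sum_{t' \in T,\; t' \neq t} \mathrm{IoU}\bigl(D(M_t),\, D(M_{t'})\bigr).
\end{align}
```

Under this reading the only free choices are the candidate set $T$ and the structuring element of $D$, which is where a referee would look for hidden dataset-specific tuning.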
Circularity Check
No circularity: heuristic methods with no self-referential derivations or fitted predictions
The paper presents RPG-SAM as a new framework introducing Reliability-Weighted Prototype Mining (RWPM) and Geometric Adaptive Selection (GAS) to handle regional and response heterogeneity in training-free one-shot polyp segmentation. No equations, derivations, or parameter-fitting steps are described in the provided text that would reduce any claimed prediction or result back to the inputs by construction. The 5.56% mIoU improvement is stated as an empirical outcome of the proposed heuristics and iterative refinement loop, without any self-citation load-bearing premises, uniqueness theorems, or renaming of known results. The central claims rest on novel algorithmic choices rather than tautological redefinitions, making the derivation chain self-contained against external benchmarks.