VoxShield: Protecting 3D Medical Datasets from Unauthorized Training via Frequency-Aware Inter-Slice Disruption
Pith reviewed 2026-05-20 13:49 UTC · model grok-4.3
The pith
VoxShield protects 3D medical datasets from unauthorized training by adding small perturbations that break inter-slice frequency consistency.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
VoxShield is an unlearnable-examples framework for 3D volumes that uses Inter-Slice Frequency Consistency Disruption to maximize spectral divergence between adjacent slices along the z-axis and Semantic Prediction Disruption to maximize L1 divergence between clean and perturbed logits, thereby impairing the spatial aggregation process of 3D segmentation networks.
What carries the argument
Inter-Slice Frequency Consistency Disruption mechanism that maximizes spectral divergence between adjacent slices to inject structural incoherence along the z-axis.
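The quantity this mechanism pushes up can be illustrated with a small measurement. The sketch below is not the authors' implementation: it assumes the divergence is the L1 distance between the 2D amplitude spectra of neighbouring slices, and the function names `dft2_magnitude` and `inter_slice_spectral_divergence` are hypothetical. It uses a naive pure-Python DFT, so it is only practical for tiny toy volumes.

```python
import cmath

def dft2_magnitude(slice_2d):
    """Amplitude spectrum of one 2D slice (list of rows) via a naive DFT."""
    h, w = len(slice_2d), len(slice_2d[0])
    mags = []
    for u in range(h):
        row = []
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2 * cmath.pi * (u * y / h + v * x / w)
                    acc += slice_2d[y][x] * cmath.exp(1j * angle)
            row.append(abs(acc))
        mags.append(row)
    return mags

def inter_slice_spectral_divergence(volume):
    """Mean L1 distance between amplitude spectra of adjacent z-slices.

    `volume` is a list of 2D slices ordered along the z-axis.
    The L1-on-amplitude choice is an assumption made for this sketch.
    """
    if len(volume) < 2:
        raise ValueError("need at least two slices")
    spectra = [dft2_magnitude(s) for s in volume]
    total = 0.0
    for a, b in zip(spectra, spectra[1:]):
        total += sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))
    return total / (len(volume) - 1)
```

A volume whose slices are identical scores zero under this measure, while any change between neighbouring slices raises it; a protection of the kind described would search for a bounded perturbation that makes this score large.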
If this is right
- 3D segmentation models trained on protected data drop from 80.0 percent DSC to near 0.0 percent on BraTS19.
- The same protection reduces DSC from 88.6 percent to 6.8 percent on FLARE21.
- All effects occur with a perturbation budget of $\epsilon = 4/255$ while preserving high visual fidelity.
- The framework specifically addresses volumetric inductive biases that prior 2D unlearnable-example methods ignore.
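To make the perturbation budget concrete: with intensities scaled to [0, 1], a budget of 4/255 (about 0.0157) allows each voxel to move by at most roughly four gray levels on an 8-bit scale. The following sketch assumes an L-infinity constraint and a [0, 1] intensity range, neither of which is spelled out in this summary; the helper names are hypothetical.

```python
EPSILON = 4 / 255  # budget quoted above; the [0, 1] intensity range is an assumption

def project_perturbation(clean, perturbed, epsilon=EPSILON):
    """Clip perturbed voxels back into the epsilon-ball around the clean ones.

    Both arguments are flat lists of intensities. The result is also
    clamped to [0, 1] so it remains a valid image.
    """
    projected = []
    for c, p in zip(clean, perturbed):
        delta = max(-epsilon, min(epsilon, p - c))
        projected.append(max(0.0, min(1.0, c + delta)))
    return projected

def max_abs_change(clean, protected):
    """Largest per-voxel deviation, useful for auditing a released volume."""
    return max(abs(p - c) for c, p in zip(clean, protected))
```

A dataset owner could run `max_abs_change` over each released volume to confirm that the stated budget was respected.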
Where Pith is reading between the lines
- Dataset owners could apply this protection before public release to enable open sharing with reduced misuse risk.
- The approach highlights that current 3D networks remain vulnerable to targeted inter-slice attacks even when visual quality is maintained.
- Similar frequency-disruption ideas might be tested on other volumetric modalities such as video sequences.
Load-bearing premise
That 3D segmentation networks depend on cross-slice anatomical consistency and frequency coherence as primary learning priors, and cannot adapt effectively when perturbations are optimized to break those priors.
What would settle it
Training any standard 3D segmentation network on the VoxShield-protected BraTS19 or FLARE21 volumes and obtaining a Dice score above 50 percent would falsify the central claim.
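The falsification test above hinges on the Dice similarity coefficient (DSC), defined as 2|A ∩ B| / (|A| + |B|) for a predicted mask A and a ground-truth mask B. A minimal sketch follows; the 0.5 threshold is taken from the statement above, while the function names and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_score(pred, target):
    """Dice similarity coefficient for two flat lists of 0/1 labels."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement (assumed convention)
    return 2.0 * intersection / total

def claim_falsified(dsc_values, threshold=0.5):
    """True if the mean DSC over test cases exceeds the stated threshold."""
    return sum(dsc_values) / len(dsc_values) > threshold
```

Here `dice_score` would be evaluated per test volume for a model trained on the protected data, and `claim_falsified` applied to the resulting list.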
Original abstract
The release of public 3D medical image segmentation (MIS) datasets accelerates clinical research but simultaneously heightens risks of unauthorized AI model training. While Unlearnable Examples (UE) offer protection by injecting imperceptible perturbations to prevent effective model learning, existing methods primarily target 2D scenarios. They neglect the volumetric spatial correlations and inter-slice anatomical consistency inherent in 3D medical volumes, which serve as critical learning priors for 3D segmentation networks. To bridge this gap, we propose VoxShield, a UE framework that explicitly targets the volumetric inductive biases of 3D networks. Our core insight is that by systematically dismantling the cross-slice continuity that 3D architectures rely on, we can fundamentally impair their spatial aggregation process. Specifically, we introduce an Inter-Slice Frequency Consistency Disruption mechanism that maximizes the spectral divergence between adjacent slices, injecting structural incoherence along the $z$-axis. Complementing this structural attack, a Semantic Prediction Disruption module is incorporated. By maximizing the $\ell_1$ divergence between clean and perturbed logits, it forces the injected noise to penetrate the entire network and corrupt the final semantic mapping. Experiments on BraTS19 and FLARE21 demonstrate that VoxShield successfully degrades 3D segmentation performance, reducing the DSC from 80.0% to near 0.0% and from 88.6% to 6.8%, respectively. All protections are achieved with minimal perturbation ($\epsilon=4/255$) to preserve high visual fidelity. The code is available at https://github.com/KK266299/VoxShield.
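The abstract states both objectives in words only. As a hedged sketch of one possible formalization (the amplitude-spectrum comparison, the surrogate network $f_\theta$, and the weight $\lambda$ are assumptions introduced here, not notation confirmed by the paper), let $x_k$ be the $k$-th slice along the $z$-axis of a volume $x$ with $K$ slices, $\delta$ the perturbation, and $\mathcal{F}$ the 2D discrete Fourier transform:

$$
\mathcal{L}_{\mathrm{freq}}(\delta) = \sum_{k=1}^{K-1} \big\| \, |\mathcal{F}(x_k+\delta_k)| - |\mathcal{F}(x_{k+1}+\delta_{k+1})| \, \big\|_1,
\qquad
\mathcal{L}_{\mathrm{sem}}(\delta) = \big\| f_\theta(x+\delta) - f_\theta(x) \big\|_1
$$

$$
\delta^\star = \arg\max_{\|\delta\|_\infty \le \epsilon} \; \mathcal{L}_{\mathrm{freq}}(\delta) + \lambda \, \mathcal{L}_{\mathrm{sem}}(\delta),
\qquad \epsilon = 4/255 .
$$

Under this reading, the first term rewards spectral disagreement between neighbouring slices and the second rewards a shift in the network's logits, with both subject to the same perturbation budget.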
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes VoxShield, an unlearnable examples framework for 3D medical image segmentation datasets. It injects imperceptible perturbations ($\epsilon=4/255$) via an Inter-Slice Frequency Consistency Disruption module that maximizes spectral divergence between adjacent slices along the $z$-axis, combined with a Semantic Prediction Disruption module that maximizes $\ell_1$ divergence on logits. This is claimed to impair the volumetric inductive biases of 3D networks, reducing DSC from 80.0% to near 0.0% on BraTS19 and from 88.6% to 6.8% on FLARE21 while preserving visual fidelity. Code is released at the cited GitHub repository.
Significance. If the central results hold under broader evaluation, the work would be significant as the first explicit extension of unlearnable examples to 3D volumetric medical data, targeting cross-slice anatomical consistency that 2D methods ignore. The public code release is a clear strength for reproducibility and allows direct verification of the frequency-aware disruption approach.
major comments (3)
- The central claim that the method 'fundamentally impairs' spatial aggregation (Abstract) rests on the assumption that 3D networks cannot fall back on intra-slice cues or training mitigations; however, no experiments evaluate adaptive attackers using random z-axis cropping, slice shuffling, or 2.5D ensembling, which would directly test whether the reported DSC collapse (80% to near 0% on BraTS19) is robust.
- No ablation results are reported to quantify the individual contributions of the Inter-Slice Frequency Consistency Disruption versus the Semantic Prediction Disruption module, making it impossible to determine which component drives the performance drops or whether both are necessary for the claimed effect.
- The experiments section provides no statistical significance tests, standard deviations across multiple runs, or direct comparisons against adapted 2D unlearnable example baselines on the same 3D datasets, weakening support for the claim that the frequency-aware approach is superior for volumetric data.
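Two of the adaptive mitigations named in the first major comment can be sketched as training-time augmentations. This is an illustration of what such an evaluation might apply, not a procedure taken from the paper; the function names are hypothetical, and volumes are represented as plain lists of slices ordered along the z-axis.

```python
import random

def shuffle_slices(volume, labels, rng):
    """Permute slices along z, keeping each image slice paired with its label slice.

    This destroys inter-slice ordering, so a model trained this way cannot
    rely on (or be misled by) adjacent-slice relationships.
    """
    order = list(range(len(volume)))
    rng.shuffle(order)
    return [volume[i] for i in order], [labels[i] for i in order]

def random_z_crop(volume, labels, depth, rng):
    """Take a random contiguous block of `depth` slices along z."""
    start = rng.randint(0, len(volume) - depth)
    return volume[start:start + depth], labels[start:start + depth]
```

Passing a seeded `random.Random` instance as `rng` keeps the augmentation reproducible across runs.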
minor comments (1)
- The abstract and method description would benefit from explicit notation for the spectral divergence objective and the precise definition of adjacent-slice pairs used in the frequency consistency term.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. The comments identify key areas where additional evaluation would strengthen the manuscript. We address each major comment below and will incorporate revisions to improve the robustness and clarity of our claims.
- Referee: The central claim that the method 'fundamentally impairs' spatial aggregation (Abstract) rests on the assumption that 3D networks cannot fall back on intra-slice cues or training mitigations; however, no experiments evaluate adaptive attackers using random z-axis cropping, slice shuffling, or 2.5D ensembling, which would directly test whether the reported DSC collapse (80% to near 0% on BraTS19) is robust.
Authors: We agree that testing against adaptive attackers is necessary to fully support the claim of fundamentally impairing spatial aggregation in 3D networks. VoxShield is designed to inject structural incoherence along the z-axis by maximizing spectral divergence between adjacent slices, which targets a core inductive bias of volumetric architectures. Nevertheless, we acknowledge that experiments with random z-axis cropping, slice shuffling, or 2.5D ensembling are absent from the current manuscript. In the revised version we will add these evaluations to demonstrate whether the observed performance degradation persists under such mitigation strategies. revision: yes
- Referee: No ablation results are reported to quantify the individual contributions of the Inter-Slice Frequency Consistency Disruption versus the Semantic Prediction Disruption module, making it impossible to determine which component drives the performance drops or whether both are necessary for the claimed effect.
Authors: We concur that ablation studies are required to isolate the contribution of each module. The Inter-Slice Frequency Consistency Disruption specifically addresses cross-slice anatomical continuity, while the Semantic Prediction Disruption ensures the perturbation propagates to the final logits. To address this gap, the revised manuscript will include ablation experiments that separately disable each module and report the resulting DSC on BraTS19 and FLARE21, thereby clarifying their individual and joint roles. revision: yes
- Referee: The experiments section provides no statistical significance tests, standard deviations across multiple runs, or direct comparisons against adapted 2D unlearnable example baselines on the same 3D datasets, weakening support for the claim that the frequency-aware approach is superior for volumetric data.
Authors: We recognize the need for statistical rigor and explicit baseline comparisons. The current results report point estimates without variance or significance testing, and no adapted 2D unlearnable-example methods are evaluated on the volumetric datasets. In the revision we will report standard deviations over multiple random seeds, include appropriate statistical tests, and provide direct comparisons with 2D baselines adapted to BraTS19 and FLARE21 to better substantiate the advantages of the frequency-aware inter-slice approach. revision: yes
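The reporting promised in the last response (standard deviations over seeds and a significance test against a baseline) could look like the sketch below. The choice of Welch's t statistic is an assumption made for illustration, since the rebuttal does not name a specific test, and the function names are hypothetical.

```python
import math
import statistics

def mean_and_std(dsc_per_seed):
    """Mean and sample standard deviation of DSC across random seeds."""
    mean = statistics.mean(dsc_per_seed)
    std = statistics.stdev(dsc_per_seed) if len(dsc_per_seed) > 1 else 0.0
    return mean, std

def welch_t(runs_a, runs_b):
    """Welch's t statistic for two groups of runs with unequal variances."""
    var_a = statistics.variance(runs_a)
    var_b = statistics.variance(runs_b)
    standard_error = math.sqrt(var_a / len(runs_a) + var_b / len(runs_b))
    if standard_error == 0.0:
        raise ValueError("both groups have zero variance")
    return (statistics.mean(runs_a) - statistics.mean(runs_b)) / standard_error
```

Each group would hold one DSC value per training seed, for example one list for models trained on VoxShield-protected data and one for models trained on data protected by an adapted 2D baseline.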
Circularity Check
VoxShield defines explicit disruption objectives and evaluates them empirically; no derivation reduces to inputs by construction.
The paper introduces Inter-Slice Frequency Consistency Disruption by maximizing spectral divergence between adjacent slices and Semantic Prediction Disruption by maximizing ℓ1 logit divergence. These are optimization objectives chosen to target assumed 3D network priors, then tested via experiments on BraTS19 and FLARE21. No equation or claim reduces a result to a fitted parameter or self-citation chain by definition; the performance drops are reported outcomes of the chosen perturbations under fixed training regimes, not tautological predictions. The central claim rests on the empirical effectiveness of the crafted perturbations rather than any self-referential derivation.
Axiom & Free-Parameter Ledger
free parameters (1)
- perturbation budget epsilon = 4/255
axioms (1)
- domain assumption: 3D segmentation networks rely on volumetric spatial correlations and inter-slice anatomical consistency as critical learning priors