
arxiv: 2605.17345 · v1 · pith:2LDUGXU5new · submitted 2026-05-17 · 💻 cs.CV

VoxShield: Protecting 3D Medical Datasets from Unauthorized Training via Frequency-Aware Inter-Slice Disruption

Pith reviewed 2026-05-20 13:49 UTC · model grok-4.3

classification 💻 cs.CV
keywords 3D medical image segmentation · unlearnable examples · dataset protection · adversarial perturbations · frequency consistency · inter-slice disruption · volumetric networks

The pith

VoxShield protects 3D medical datasets from unauthorized training by adding small perturbations that break inter-slice frequency consistency.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces VoxShield to let owners of public 3D medical image datasets share them while blocking effective training of segmentation models. It works by injecting imperceptible changes that specifically destroy the cross-slice continuity and frequency coherence that 3D networks use as learning priors. The method combines frequency-based disruption to create spectral divergence between adjacent slices with a semantic module that corrupts final predictions. If the approach holds, dataset releases can continue without enabling unauthorized or commercial model development on the same volumes. Results on BraTS19 and FLARE21 show performance collapsing from high accuracy to near zero under a small perturbation budget.

Core claim

VoxShield is an unlearnable-examples framework for 3D volumes that uses Inter-Slice Frequency Consistency Disruption to maximize spectral divergence between adjacent slices along the z-axis and Semantic Prediction Disruption to maximize L1 divergence between clean and perturbed logits, thereby impairing the spatial aggregation process of 3D segmentation networks.

What carries the argument

Inter-Slice Frequency Consistency Disruption mechanism that maximizes spectral divergence between adjacent slices to inject structural incoherence along the z-axis.
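
To make the mechanism concrete, here is a minimal NumPy sketch of one way to measure spectral divergence between adjacent slices. It illustrates the idea and is not the paper's implementation: the function name `inter_slice_spectral_divergence`, the log-amplitude spectrum, the L1 gap, and the use of independent random noise in place of a learned perturbation are all assumptions made here.

```python
import numpy as np

def inter_slice_spectral_divergence(volume: np.ndarray) -> float:
    """Mean L1 gap between log-amplitude spectra of adjacent z-slices.

    volume: array of shape (Z, H, W); the z-axis is assumed to be axis 0.
    """
    # 2D FFT of every slice, taken over the in-plane axes (H, W).
    spectra = np.fft.fft2(volume, axes=(-2, -1))
    # Log-amplitude compresses the large dynamic range of the spectrum.
    log_amp = np.log1p(np.abs(spectra))
    # Gap between slice k+1 and slice k for every adjacent pair.
    gaps = np.abs(log_amp[1:] - log_amp[:-1])
    return float(gaps.mean())

rng = np.random.default_rng(0)
base = rng.random((1, 32, 32))
coherent = np.repeat(base, 8, axis=0)            # toy volume: identical slices

eps = 4 / 255                                     # budget quoted in the summary
noise = rng.uniform(-eps, eps, size=coherent.shape)  # independent per slice
perturbed = np.clip(coherent + noise, 0.0, 1.0)

div_clean = inter_slice_spectral_divergence(coherent)
div_pert = inter_slice_spectral_divergence(perturbed)
print(div_clean, div_pert)
```

In this toy setting the identical-slice volume has zero divergence and any slice-wise independent noise raises it. A method like the one described would presumably optimize the perturbation to maximize such a quantity rather than sample it at random.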

If this is right

  • 3D segmentation models trained on protected data drop from 80.0 percent DSC to near 0.0 percent on BraTS19.
  • The same protection reduces DSC from 88.6 percent to 6.8 percent on FLARE21.
  • All effects occur under a perturbation budget of $\epsilon = 4/255$ while preserving high visual fidelity.
  • The framework specifically addresses volumetric inductive biases that prior 2D unlearnable-example methods ignore.
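
The budget in the list above can be read as an $\ell_\infty$ constraint on the perturbation. The sketch below shows how such a constraint is commonly enforced by projection. It assumes intensities scaled to [0, 1], and the helper name `project_to_budget` is hypothetical rather than taken from the paper's code.

```python
import numpy as np

EPS = 4 / 255  # budget quoted in the summary; assumes intensities in [0, 1]

def project_to_budget(clean: np.ndarray, perturbed: np.ndarray, eps: float = EPS) -> np.ndarray:
    """Clip the perturbation to [-eps, eps], then clip the result to the valid range."""
    delta = np.clip(perturbed - clean, -eps, eps)
    return np.clip(clean + delta, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = rng.random((4, 16, 16))                        # toy (Z, H, W) volume
raw = clean + rng.normal(scale=0.1, size=clean.shape)  # unconstrained perturbation
protected = project_to_budget(clean, raw)

max_change = float(np.abs(protected - clean).max())
print(max_change <= EPS + 1e-9)
```

On an 8-bit scale this bound corresponds to at most 4 grey levels of change per voxel, which is why the summary describes the perturbation as preserving visual fidelity.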

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Dataset owners could apply this protection before public release to enable open sharing with reduced misuse risk.
  • The approach highlights that current 3D networks remain vulnerable to targeted inter-slice attacks even when visual quality is maintained.
  • Similar frequency-disruption ideas might be tested on other volumetric modalities such as video sequences.

Load-bearing premise

That 3D segmentation networks depend on cross-slice anatomical consistency and frequency coherence as primary learning priors, and cannot adapt effectively once perturbations are optimized to break those priors.

What would settle it

Training any standard 3D segmentation network on the VoxShield-protected BraTS19 or FLARE21 volumes and obtaining a Dice score above 50 percent would falsify the central claim.
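
For reference, the Dice score (DSC) used in this test can be computed as below. This is a sketch for binary masks only. The helper names `dice_score` and `would_falsify` are hypothetical, and the handling of two empty masks is a convention assumed here, not something the paper specifies.

```python
import numpy as np

def dice_score(pred, target) -> float:
    """Dice similarity coefficient between two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks count as a perfect match
    return float(2.0 * np.logical_and(pred, target).sum() / denom)

def would_falsify(dsc: float, threshold: float = 0.5) -> bool:
    """True if a model trained on protected data exceeds the 50 percent bar stated above."""
    return dsc > threshold

# Toy example: two 4x4x4 cubes that overlap on half of their voxels.
target = np.zeros((8, 8, 8), dtype=bool)
target[2:6, 2:6, 2:6] = True
pred = np.zeros((8, 8, 8), dtype=bool)
pred[2:6, 2:6, 4:8] = True

toy_dsc = dice_score(pred, target)
print(toy_dsc)
```

Applied to the figures quoted in the summary, `would_falsify(0.068)` is false for the reported protected-data FLARE21 result, while a score at the clean-data level of 0.886 would clear the bar.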

Figures

Figures reproduced from arXiv: 2605.17345 by Haolin Wang, Wenhan Jiang, Xinyao Liu, Xun Lin, Yafei Ou, Yefeng Zheng, Zhipeng Deng.

Figure 1: The defense mechanism of VoxShield. Left: A model trained on clean data learns coherent volumetric representations and segments accurately on clean test volumes. Right: VoxShield disrupts inter-slice consistency by injecting spectrally diverse perturbations across adjacent slices. Consequently, a model trained on such protected data fails to extract coherent cross-slice features, yielding severely degrade…
Figure 2: Overview of the proposed VoxShield framework. A noise generator $G$ produces perturbations that are added to clean 3D medical images to create protected volumes. A surrogate model $F_s$ computes $L_{SPD}$ and $L'_{seg}$ to drive protected images away from the original decision boundaries. An inter-slice frequency consistency disruption module enforces $L_{ISC}$ by maximizing spectral discrepancy between adjacent slices, des…
Figure 3: Visualization of BraTS19 noise patterns across consecutive z-slices.

The release of public 3D medical image segmentation (MIS) datasets accelerates clinical research but simultaneously heightens risks of unauthorized AI model training. While Unlearnable Examples (UE) offer protection by injecting imperceptible perturbations to prevent effective model learning, existing methods primarily target 2D scenarios. They neglect the volumetric spatial correlations and inter-slice anatomical consistency inherent in 3D medical volumes, which serve as critical learning priors for 3D segmentation networks. To bridge this gap, we propose VoxShield, a UE framework that explicitly targets the volumetric inductive biases of 3D networks. Our core insight is that by systematically dismantling the cross-slice continuity that 3D architectures rely on, we can fundamentally impair their spatial aggregation process. Specifically, we introduce an Inter-Slice Frequency Consistency Disruption mechanism that maximizes the spectral divergence between adjacent slices, injecting structural incoherence along the $z$-axis. Complementing this structural attack, a Semantic Prediction Disruption module is incorporated. By maximizing the $\ell_1$ divergence between clean and perturbed logits, it forces the injected noise to penetrate the entire network and corrupt the final semantic mapping. Experiments on BraTS19 and FLARE21 demonstrate that VoxShield successfully degrades 3D segmentation performance, reducing the DSC from 80.0% to near 0.0% and from 88.6% to 6.8%, respectively. All protections are achieved with minimal perturbation ($\epsilon=4/255$) to preserve high visual fidelity. The code is available at https://github.com/KK266299/VoxShield.
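
The Semantic Prediction Disruption term described in the abstract can be illustrated with a toy example. This sketch is not the released code: it replaces the surrogate segmentation network with a random linear map `W`, uses a closed-form subgradient instead of backpropagation, and omits clipping to the valid intensity range. The names `l1_logit_divergence` and `surrogate` and the step size are assumptions.

```python
import numpy as np

def l1_logit_divergence(clean_logits: np.ndarray, perturbed_logits: np.ndarray) -> float:
    """Mean absolute difference between clean and perturbed logits."""
    return float(np.abs(perturbed_logits - clean_logits).mean())

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 64))   # toy linear surrogate: 64 voxels -> 3 class logits
x = rng.random(64)             # toy flattened input in [0, 1]
eps = 4 / 255
delta = rng.uniform(-eps, eps, size=64)  # random start inside the budget

def surrogate(v: np.ndarray) -> np.ndarray:
    return W @ v

before = l1_logit_divergence(surrogate(x), surrogate(x + delta))

# Subgradient of mean|W @ delta| with respect to delta (valid for a linear map).
grad = W.T @ np.sign(W @ delta) / W.shape[0]
# One signed ascent step, projected back into the eps-ball.
delta = np.clip(delta + (eps / 4) * np.sign(grad), -eps, eps)

after = l1_logit_divergence(surrogate(x), surrogate(x + delta))
print(before, after)
```

Because the toy objective is convex in `delta`, a projected signed-subgradient step cannot decrease it. With a real network the objective is non-convex and no such guarantee applies.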

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The paper proposes VoxShield, an unlearnable examples framework for 3D medical image segmentation datasets. It injects imperceptible perturbations ($\epsilon=4/255$) via an Inter-Slice Frequency Consistency Disruption module that maximizes spectral divergence between adjacent slices along the z-axis, combined with a Semantic Prediction Disruption module that maximizes $\ell_1$ divergence on logits. This is claimed to impair the volumetric inductive biases of 3D networks, reducing DSC from 80.0% to near 0.0% on BraTS19 and from 88.6% to 6.8% on FLARE21 while preserving visual fidelity. Code is released at the cited GitHub repository.

Significance. If the central results hold under broader evaluation, the work would be significant as the first explicit extension of unlearnable examples to 3D volumetric medical data, targeting cross-slice anatomical consistency that 2D methods ignore. The public code release is a clear strength for reproducibility and allows direct verification of the frequency-aware disruption approach.

major comments (3)
  1. The central claim that the method 'fundamentally impairs' spatial aggregation (Abstract) rests on the assumption that 3D networks cannot fall back on intra-slice cues or training mitigations; however, no experiments evaluate adaptive attackers using random z-axis cropping, slice shuffling, or 2.5D ensembling, which would directly test whether the reported DSC collapses (80% to 0% on BraTS19) are robust.
  2. No ablation results are reported to quantify the individual contributions of the Inter-Slice Frequency Consistency Disruption versus the Semantic Prediction Disruption module, making it impossible to determine which component drives the performance drops or whether both are necessary for the claimed effect.
  3. The experiments section provides no statistical significance tests, standard deviations across multiple runs, or direct comparisons against adapted 2D unlearnable example baselines on the same 3D datasets, weakening support for the claim that the frequency-aware approach is superior for volumetric data.
minor comments (1)
  1. The abstract and method description would benefit from explicit notation for the spectral divergence objective and the precise definition of adjacent-slice pairs used in the frequency consistency term.
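
The adaptive-training mitigations named in major comment 1 are simple to state in code. The sketch below shows one plausible form of each as a NumPy preprocessing step. The function names, the (Z, H, W) layout, and the edge-replication padding are assumptions made for illustration, and nothing here reflects experiments from the paper.

```python
import numpy as np

def random_z_crop(volume: np.ndarray, depth: int, rng: np.random.Generator) -> np.ndarray:
    """Take a random contiguous block of `depth` slices along the z-axis (axis 0)."""
    start = rng.integers(0, volume.shape[0] - depth + 1)
    return volume[start:start + depth]

def shuffle_slices(volume: np.ndarray, label: np.ndarray, rng: np.random.Generator):
    """Apply the same random slice permutation to image and label, removing z-order cues."""
    perm = rng.permutation(volume.shape[0])
    return volume[perm], label[perm]

def to_25d_stacks(volume: np.ndarray, context: int = 1) -> np.ndarray:
    """Stack each slice with `context` neighbours on both sides as channels.

    Edge slices are padded by replication. Output shape: (Z, 2*context+1, H, W).
    """
    padded = np.pad(volume, ((context, context), (0, 0), (0, 0)), mode="edge")
    windows = [padded[k:k + volume.shape[0]] for k in range(2 * context + 1)]
    return np.stack(windows, axis=1)

rng = np.random.default_rng(0)
vol = np.arange(6 * 4 * 4, dtype=float).reshape(6, 4, 4)  # toy volume
lab = vol > 40                                            # toy label mask

crop = random_z_crop(vol, depth=4, rng=rng)
shuf_vol, shuf_lab = shuffle_slices(vol, lab, rng)
stacks = to_25d_stacks(vol, context=1)
print(crop.shape, shuf_vol.shape, stacks.shape)
```

Whether any of these transformations restores learnability on protected volumes is the open question the referee raises. The sketch only fixes what each mitigation would mean operationally.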

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments identify key areas where additional evaluation would strengthen the manuscript. We address each major comment below and will incorporate revisions to improve the robustness and clarity of our claims.

  1. Referee: The central claim that the method 'fundamentally impairs' spatial aggregation (Abstract) rests on the assumption that 3D networks cannot fall back on intra-slice cues or training mitigations; however, no experiments evaluate adaptive attackers using random z-axis cropping, slice shuffling, or 2.5D ensembling, which would directly test whether the reported DSC collapses (80% to 0% on BraTS19) are robust.

    Authors: We agree that testing against adaptive attackers is necessary to fully support the claim of fundamentally impairing spatial aggregation in 3D networks. VoxShield is designed to inject structural incoherence along the z-axis by maximizing spectral divergence between adjacent slices, which targets a core inductive bias of volumetric architectures. Nevertheless, we acknowledge that experiments with random z-axis cropping, slice shuffling, or 2.5D ensembling are absent from the current manuscript. In the revised version we will add these evaluations to demonstrate whether the observed performance degradation persists under such mitigation strategies. revision: yes

  2. Referee: No ablation results are reported to quantify the individual contributions of the Inter-Slice Frequency Consistency Disruption versus the Semantic Prediction Disruption module, making it impossible to determine which component drives the performance drops or whether both are necessary for the claimed effect.

    Authors: We concur that ablation studies are required to isolate the contribution of each module. The Inter-Slice Frequency Consistency Disruption specifically addresses cross-slice anatomical continuity, while the Semantic Prediction Disruption ensures the perturbation propagates to the final logits. To address this gap, the revised manuscript will include ablation experiments that separately disable each module and report the resulting DSC on BraTS19 and FLARE21, thereby clarifying their individual and joint roles. revision: yes

  3. Referee: The experiments section provides no statistical significance tests, standard deviations across multiple runs, or direct comparisons against adapted 2D unlearnable example baselines on the same 3D datasets, weakening support for the claim that the frequency-aware approach is superior for volumetric data.

    Authors: We recognize the need for statistical rigor and explicit baseline comparisons. The current results report point estimates without variance or significance testing, and no adapted 2D unlearnable-example methods are evaluated on the volumetric datasets. In the revision we will report standard deviations over multiple random seeds, include appropriate statistical tests, and provide direct comparisons with 2D baselines adapted to BraTS19 and FLARE21 to better substantiate the advantages of the frequency-aware inter-slice approach. revision: yes

Circularity Check

0 steps flagged

VoxShield defines explicit disruption objectives and evaluates them empirically; no derivation reduces to inputs by construction.

full rationale

The paper introduces Inter-Slice Frequency Consistency Disruption by maximizing spectral divergence between adjacent slices and Semantic Prediction Disruption by maximizing ℓ1 logit divergence. These are optimization objectives chosen to target assumed 3D network priors, then tested via experiments on BraTS19 and FLARE21. No equation or claim reduces a result to a fitted parameter or self-citation chain by definition; the performance drops are reported outcomes of the chosen perturbations under fixed training regimes, not tautological predictions. The central claim rests on the empirical effectiveness of the crafted perturbations rather than any self-referential derivation.
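
One possible way to write the two objectives explicitly is given below. This is an assumed formalization for illustration, not notation from the paper. The amplitude operator $A$, the averaging over adjacent pairs, and the weight $\lambda$ are all hypothetical, and the $L'_{seg}$ term mentioned in the Figure 2 caption is left out because its definition is not available here.

```latex
% Assumed notation: x_k is slice k of a clean volume with Z slices,
% \delta_k the perturbation on that slice, F_s the surrogate model,
% and A(\cdot) = |\mathcal{F}_{2D}(\cdot)| the 2D Fourier amplitude.
\begin{align}
  L_{ISC}(\delta) &= \frac{1}{Z-1} \sum_{k=1}^{Z-1}
      \bigl\| A(x_{k+1} + \delta_{k+1}) - A(x_k + \delta_k) \bigr\|_1 \\
  L_{SPD}(\delta) &= \bigl\| F_s(x + \delta) - F_s(x) \bigr\|_1 \\
  \delta^{\star} &= \arg\max_{\|\delta\|_\infty \le \epsilon}
      \; L_{ISC}(\delta) + \lambda \, L_{SPD}(\delta),
      \qquad \epsilon = 4/255
\end{align}
```

Written this way, both terms are maximized over the perturbation alone and neither is defined in terms of the downstream DSC, which is consistent with the audit's finding that the reported drops are empirical outcomes rather than results by construction.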

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The framework rests on the domain assumption that 3D networks exploit inter-slice continuity as a key prior, plus the choice of frequency-domain divergence and logit divergence as attack objectives. No new physical entities are postulated.

free parameters (1)
  • perturbation budget epsilon = 4/255
    Fixed at 4/255 to balance protection strength against visual fidelity; value chosen to keep changes imperceptible.
axioms (1)
  • domain assumption: 3D segmentation networks rely on volumetric spatial correlations and inter-slice anatomical consistency as critical learning priors
    Explicitly stated in the abstract as the gap targeted by the method.

pith-pipeline@v0.9.0 · 5842 in / 1267 out tokens · 57600 ms · 2026-05-20T13:49:55.447365+00:00 · methodology

