
arxiv: 2412.04880 · v3 · submitted 2024-12-06 · 💻 cs.CV · eess.IV

MozzaVID: Mozzarella Volumetric Image Dataset

Pith reviewed 2026-05-23 07:59 UTC · model grok-4.3

classification 💻 cs.CV eess.IV
keywords volumetric dataset · CT imaging · mozzarella microstructure · cheese classification · 3D deep learning · food structure analysis · X-ray tomography

The pith

MozzaVID supplies X-ray CT volumes of 25 mozzarella types to train and benchmark volumetric deep-learning models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces MozzaVID, a dataset of computed tomography scans that capture the internal microstructure of mozzarella cheese samples. It covers 149 samples from 25 distinct cheese types and supplies the volumes at three different resolutions, yielding between 591 and 37,824 images per version. The work responds to the scarcity of clean, labeled volumetric datasets that researchers can use to compare and improve three-dimensional deep-learning architectures. Beyond model development, the scans also let users examine how cheese microstructure varies across types and imaging scales.

Core claim

The authors acquire and label X-ray CT volumes of mozzarella, organize them into a classification task over 25 cheese types and 149 samples, and release the data at three resolutions so that volumetric networks can be trained and evaluated on food microstructures that are both complex and disordered.

What carries the argument

The MozzaVID dataset of labeled 3D CT volumes, provided at multiple resolutions, that serves as both a classification benchmark and a microstructural reference.

If this is right

  • Volumetric deep-learning models gain a standardized benchmark for comparing architectures on three-dimensional data.
  • Researchers can measure how spatial resolution affects classification performance on disordered food structures.
  • The dataset supports studies that link visible microstructure features to cheese type without destructive sampling.
  • Algorithms developed on MozzaVID can be tested for robustness on other complex, non-periodic materials.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same acquisition and labeling approach could be repeated for other food products to create comparable volumetric benchmarks.
  • Success on MozzaVID may indicate whether a model will handle similar tasks in medical or materials CT imaging.
  • The multi-resolution design allows direct experiments on the trade-off between detail and computational cost in 3D classification.
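
The cost side of that trade-off can be made concrete with a small sketch. It assumes, hypothetically, that finer-grained dataset instances are produced by cutting each scanned volume into equal, non-overlapping cubic blocks; the text above does not describe how the three instances are built, and `block_origins` is an illustrative helper, not part of any MozzaVID tooling. Under that assumption the number of volumes grows with the cube of the number of splits per axis, which is at least consistent with the reported range, since 591 × 4³ = 37,824.

```python
from itertools import product


def block_origins(side: int, splits: int) -> list[tuple[int, int, int]]:
    """Return (z, y, x) origins of non-overlapping cubic blocks.

    Hypothetical helper: assumes a cubic volume with `side` voxels per axis
    that is cut into `splits` equal parts along every axis.
    """
    if splits < 1 or side % splits != 0:
        raise ValueError("side must be a positive multiple of splits")
    step = side // splits
    return [(z * step, y * step, x * step)
            for z, y, x in product(range(splits), repeat=3)]


# Blocks obtained from one source volume for 1, 2 and 4 splits per axis.
# The side length of 64 voxels is an arbitrary illustrative value.
blocks_per_volume = {s: len(block_origins(64, s)) for s in (1, 2, 4)}

# 591 and 37,824 are the smallest and largest instance sizes quoted in the abstract.
ratio = 37_824 // 591
```

If the instances are in fact built differently (for example by resampling rather than splitting), the cubic scaling of voxel or volume counts with linear resolution still applies, but the exact counts would not.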

Load-bearing premise

The CT volumes and their cheese-type labels are clean enough and representative enough to support reliable distinction among the 25 varieties.

What would settle it

Any classifier trained on the training split fails to exceed random-guess accuracy on a held-out test set of the same cheese types.
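
As a point of reference for that test, the random-guess level is easy to state. The sketch below assumes balanced classes and uniform guessing (class balance and test-set sizes are not given here), and the helper names are illustrative. It computes the chance accuracy for the 25-type and 149-sample tasks, plus an exact one-sided binomial tail probability for an observed number of correct predictions.

```python
from math import comb


def chance_accuracy(num_classes: int) -> float:
    """Expected accuracy of uniform random guessing over balanced classes."""
    return 1.0 / num_classes


def p_value_above_chance(correct: int, total: int, num_classes: int) -> float:
    """Exact one-sided binomial tail P(X >= correct) under uniform guessing.

    A small value suggests the classifier does better than chance;
    the balanced-class assumption is ours, not the paper's.
    """
    if not 0 <= correct <= total:
        raise ValueError("correct must lie between 0 and total")
    p = chance_accuracy(num_classes)
    return sum(comb(total, k) * p**k * (1 - p) ** (total - k)
               for k in range(correct, total + 1))


coarse_chance = chance_accuracy(25)   # 25 cheese types
fine_chance = chance_accuracy(149)    # 149 cheese samples
```

Under these assumptions chance accuracy is 4% for the 25-type task and roughly 0.67% for the 149-sample task, so a result would need to sit clearly above those levels on held-out data to count against the falsification condition.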

Figures

Figures reproduced from arXiv: 2412.04880 by Anders Bjorholm Dahl, Anders Nymark Christensen, Carsten Gundlach, Jeppe Revall Frisvad, Pawel Tomasz Pieta, Peter Winkel Rasmussen, Siavash Arjomand Bigdeli.

Figure 1: Comparison of typical volumetric and 2D dataset sizes.
Figure 2: Mozzarella samples wrapped in parafilm and mounted ...
Figure 3: Sketch of the three proposed dataset configurations. The ...
Figure 4: UMAP generated from second-to-last layer feature representations of the best-performing model in the coarse-grained classifi...
Figure 5: Overview of the variation in the normalized experimental design parameters in the first 24 cheese types (coarse-grained classes).
Figure 6: PCA of the experimental design parameters used to de...
Figure 7: UMAP generated from second-to-last layer feature representations of the best-performing model in the fine-grained classification ...
Figure 8: Overview of slices from each cheese type, forming the 25 coarse-grained classes.
Figure 9: Example slices from the fine-grained classes. Each row represents a set of six samples from one cheese type (coarse-grained ...
Original abstract

Influenced by the complexity of volumetric imaging, there is a shortage of established datasets useful for benchmarking volumetric deep-learning models. As a consequence, new and existing models are not easily comparable, limiting the development of architectures optimized specifically for volumetric data. To counteract this trend, we introduce MozzaVID -- a large, clean, and versatile volumetric classification dataset. Our dataset contains X-ray computed tomography (CT) images of mozzarella microstructure and enables the classification of 25 cheese types and 149 cheese samples. We provide data in three different resolutions, resulting in three dataset instances containing from 591 to 37,824 images. While targeted for developing general-purpose volumetric algorithms, the dataset also facilitates investigating the properties of mozzarella microstructure. The complex and disordered nature of food structures brings a unique challenge, where a choice of appropriate imaging method, scale, and sample size is not trivial. With this dataset, we aim to address these complexities, contributing to more robust structural analysis models and a deeper understanding of food structure. The dataset can be explored through: https://papieta.github.io/MozzaVID/

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 0 minor

Summary. The paper introduces MozzaVID, a volumetric classification dataset of X-ray CT images of mozzarella cheese microstructures. It enables classification of 25 cheese types across 149 samples and supplies the data in three resolutions, yielding dataset instances with 591 to 37,824 images. The work targets benchmarking of volumetric deep-learning models while also supporting investigation of food microstructure properties.

Significance. If the released volumes and labels prove clean and representative, the dataset would address the noted shortage of volumetric imaging benchmarks and provide a testbed for algorithms handling complex, disordered structures. The multi-resolution releases and public exploration link constitute concrete strengths for reproducibility and downstream use in both general volumetric methods and food-science applications.

major comments (2)
  1. [Abstract] The assertion that the dataset is 'large, clean, and versatile' supplies no quantitative evidence (label accuracy, inter-sample consistency, exclusion criteria, or diversity statistics) to support the 25-class classification claim.
  2. [Abstract, paragraph 2] The central assumption that the CT volumes and labels are 'sufficiently clean and representative' for reliable classification is load-bearing yet unsupported by any description of imaging parameters, sample preparation, labeling protocol, or quality-control steps.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed feedback on the abstract. The comments correctly identify that the abstract makes claims without direct supporting evidence or references. We address each point below and will revise the abstract in the next version.

Point-by-point responses
  1. Referee: [Abstract] The assertion that the dataset is 'large, clean, and versatile' supplies no quantitative evidence (label accuracy, inter-sample consistency, exclusion criteria, or diversity statistics) to support the 25-class classification claim.

    Authors: We agree that the abstract does not supply quantitative evidence for these descriptors. The main text provides sample counts, resolution variants, and class distribution, but the abstract itself does not reference specific metrics such as label accuracy or exclusion criteria. We will revise the abstract to remove or qualify the phrase 'large, clean, and versatile' and add a brief pointer to the relevant statistics and methods sections. revision: yes

  2. Referee: [Abstract, paragraph 2] The central assumption that the CT volumes and labels are 'sufficiently clean and representative' for reliable classification is load-bearing yet unsupported by any description of imaging parameters, sample preparation, labeling protocol, or quality-control steps.

    Authors: The manuscript body contains dedicated sections on imaging parameters, sample preparation, and labeling. However, the abstract does not cite or summarize these elements. We will revise the abstract to include a short reference to the methods and quality-control procedures so that the assumption is no longer unsupported within the abstract itself. revision: yes

Circularity Check

0 steps flagged

No derivations, predictions or fitted quantities; dataset release paper

full rationale

The manuscript is a pure data-release contribution. It describes the acquisition, labeling and formatting of CT volumes of mozzarella samples for 25-class classification but contains no equations, no model derivations, no parameter fitting, no predictions, and no load-bearing self-citations. The central claim is simply the existence and utility of the released dataset files, which are evaluated externally rather than by internal logical reduction. No step reduces to its own inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Dataset release paper; contains no mathematical model, fitted parameters, axioms, or postulated entities.

pith-pipeline@v0.9.0 · 5754 in / 946 out tokens · 29861 ms · 2026-05-23T07:59:02.421167+00:00 · methodology


Reference graph

Works this paper leans on

100 extracted references · 100 canonical work pages · 1 internal anchor

  1. [1]

    Mariam Andersson, Hans Martin Kjer, Jonathan Rafael- Patino, Alexandra Pacureanu, Bente Pakkenberg, Jean- Philippe Thiran, Maurice Ptito, Martin Bech, Anders Bjorholm Dahl, Vedrana Andersen Dahl, and Tim B. Dyrby. Axon morphology is modulated by the local environment and impacts the noninvasive investigation of its struc- ture–function relationship. Proce...

  2. [2]

    Armato, III, Geoffrey McLennan, Luc Bidaut, Michael F

    Samuel G. Armato, III, Geoffrey McLennan, Luc Bidaut, Michael F. McNitt-Gray, Charles R. Meyer, Anthony P. Reeves, Binsheng Zhao, Denise R. Aberle, Claudia I. Hen- schke, Eric A. Hoffman, Ella A. Kazerooni, Heber MacMa- hon, Edwin J. R. Van Beek, David Yankelevitz, Alberto M. Biancardi, Peyton H. Bland, Matthew S. Brown, Roger M Engelmann, Gary E. Laderac...

  3. [3]

    Freymann, Justin S

    Ujjwal Baid, Satyam Ghodasara, Suyash Mohan, Michel Bilello, Evan Calabrese, Errol Colak, Keyvan Farahani, Jayashree Kalpathy-Cramer, Felipe Campos Kitamura, Sarthak Pati, Luciano Prevedello, Jeffrey Rudie, Chiharu Sako, Russell Shinohara, Timothy Bergquist, Rong Chai, James Eddy, Julia Elliott, Walter Reade, Thomas Schaffter, Thomas Yu, Jiaxin Zheng, Chr...

  4. [4]

    Kirby, John B

    Spyridon Bakas, Hamed Akbari, Aristeidis Sotiras, Michel Bilello, Martin Rozycki, Justin S. Kirby, John B. Freymann, Keyvan Farahani, and Christos Davatzikos. Advancing the cancer genome atlas glioma MRI collections with expert seg- mentation labels and radiomic features.Scientific Data, 4(1): 170117, 2017. 3

  5. [5]

    Ramona Bast, Prateek Sharma, Hannah K. B. Easton, Tzvetelin T. Dessev, Mita Lad, and Peter A. Munro. Ten- sile testing to quantitate the anisotropy and strain hardening of mozzarella cheese. International Dairy Journal, 44:6–14,

  6. [6]

    Prospectus of cultured meat - advancing meat alternatives

    Zuhaib Fayaz Bhat and Hina Fayaz. Prospectus of cultured meat - advancing meat alternatives. Journal of Food Science and Technology, 48(2):125–140, 2011. 2

  7. [7]

    Patrick Bilic, Patrick Christ, Hongwei Bran Li, Eugene V orontsov, Avi Ben-Cohen, Georgios Kaissis, Adi Szeskin, Colin Jacobs, Gabriel Efrain Humpire Mamani, Gabriel Chartrand, Fabian Loh ¨ofer, Julian Walter Holch, Wieland Sommer, Felix Hofmann, Alexandre Hostettler, Naama Lev- Cohain, Michal Drozdzal, Michal Marianne Amitai, Re- fael Vivanti, Jacob Sosn...

  8. [8]

    ViT-V-Net: Vision transformer for unsupervised volumetric medical image registration

    Junyu Chen, Yufan He, Eric Frey, Ye Li, and Yong Du. ViT-V-Net: Vision transformer for unsupervised volumetric medical image registration. In Medical Imaging with Deep Learning, 2021. 8

  9. [9]

    Lungren, Shaoting Zhang, Lei Xing, Le Lu, Alan Yuille, and Yuyin Zhou

    Jieneng Chen, Jieru Mei, Xianhang Li, Yongyi Lu, Qihang Yu, Qingyue Wei, Xiangde Luo, Yutong Xie, Ehsan Adeli, Yan Wang, Matthew P. Lungren, Shaoting Zhang, Lei Xing, Le Lu, Alan Yuille, and Yuyin Zhou. TransUNet: rethinking the u-net architecture design for medical image segmentation through the lens of transformers. Medical Image Analysis , 97:103280, 2024. 8

  10. [10]

    Schwing, Alexan- der Kirillov, and Rohit Girdhar

    Bowen Cheng, Ishan Misra, Alexander G. Schwing, Alexan- der Kirillov, and Rohit Girdhar. Masked-attention mask transformer for universal image segmentation. Proceedings of the Ieee Computer Society Conference on Computer Vi- sion and Pattern Recognition, 2022-:1280–1289, 2022. 8

  11. [11]

    Determination of hip-joint loading patterns of living and extinct mammals using an inverse Wolff’s law approach

    Patrik Christen, Keita Ito, Frietson Galis, and Bert van Riet- bergen. Determination of hip-joint loading patterns of living and extinct mammals using an inverse Wolff’s law approach. Biomechanics and Modeling in Mechanobiology, 14(2):427– 432, 2015. 2

  12. [12]

    Lienkamp, Thomas Brox, and Olaf Ronneberger

    ¨Ozg¨un C ¸ ic ¸ek, Ahmed Abdulkadir, Soeren S. Lienkamp, Thomas Brox, and Olaf Ronneberger. 3D U-Net: Learn- ing dense volumetric segmentation from sparse annotation. In Medical Image Computing and Computer-Assisted Inter- 9 vention (MICCAI 2016) , pages 424–432. Lecture Notes in Computer Science, V ol. 9901. Springer, 2016. 2

  13. [13]

    Crippa, E

    M. Crippa, E. Solazzo, D. Guizzardi, F. Monforti-Ferrario, F. N. Tubiello, and A. Leip. Food systems are responsible for a third of global anthropogenic GHG emissions. Nature Food, 2(3):198–209, 2021. 2

  14. [14]

    Cunningham, Imran A

    John A. Cunningham, Imran A. Rahman, Stephan Lauten- schlager, Emily J. Rayfield, and Philip C. J. Donoghue. A virtual world of paleontology. Trends in Ecology and Evolu- tion, 29(6):347–357, 2014. 1

  15. [15]

    Ching, K

    Francesco De Carlo, Do ˘ga G¨ursoy, Daniel J. Ching, K. Joost Batenburg, Wolfgang Ludwig, Lucia Mancini, Federica Marone, Rajmund Mokso, Dani ¨el M. Pelt, Jan Sijbers, and Mark Rivers. TomoBank: A tomographic data repository for computational x-ray science. Measurement Science and Technology, 29(3):034004, 2018. 2

  16. [16]

    ImageNet: A large-scale hierarchical image database

    Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In Proceedings of Computer Vision and Pattern Recognition (CVPR). IEEE, 2009. 3

  17. [17]

    The MNIST database of handwritten digit images for machine learning research [best of the web].IEEE Signal Processing Magazine, 29(6):141–142, 2012

    Li Deng. The MNIST database of handwritten digit images for machine learning research [best of the web].IEEE Signal Processing Magazine, 29(6):141–142, 2012. 1, 3

  18. [18]

    Kevin Zhou

    Yang Deng, Ce Wang, Yuan Hui, Qian Li, Jun Li, Shiwei Luo, Mengke Sun, Quan Quan, Shuxin Yang, You Hao, Pengbo Liu, Honghu Xiao, Chunpeng Zhao, Xinbao Wu, and S. Kevin Zhou. CTSpine1K: A large-scale dataset for spinal vertebrae segmentation in computed tomography. arXiv:2105.14711 [eess.IV], 2024. 2, 3

  19. [19]

    Dobson and A

    S. Dobson and A. G. Marangoni. Methodology and develop- ment of a high-protein plant-based cheese alternative. Cur- rent Research in Food Science, 7:100632, 2023. 2

  20. [20]

    Computer-aided diagnosis in medical imaging: Historical review, current status and future potential

    Kunio Doi. Computer-aided diagnosis in medical imaging: Historical review, current status and future potential. Com- puterized Medical Imaging and Graphics, 31(4-5):198–211,

  21. [21]

    An image is worth 16x16 words: Transformers for image recognition at scale

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, Jakob Uszkoreit, and Neil Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. Iclr 2021 - 9th International Conference on Learning Representations, ...

  22. [22]

    X-ray computed tomography for quality inspection of agricultural products: A review

    Zhe Du, Yongguang Hu, Noman Ali Buttar, and Ashraf Mah- mood. X-ray computed tomography for quality inspection of agricultural products: A review. Food Science & Nutrition, 7(10):3146–3160, 2019. 1

  23. [23]

    Anton du Plessis and William P. Boshoff. A review of X-ray computed tomography of concrete and asphalt construction materials. Construction and Building Materials , 199:637– 651, 2019. 1

  24. [24]

    Ran Feng, Sylvain Barjon, Frans W. J. van den Berg, Søren Kristian Lillevang, and Lilia Ahrn ´e. Effect of res- idence time in the cooker-stretcher on mozzarella cheese composition, structure and functionality. Journal of Food Engineering, 309:110690, 2021. 4

  25. [25]

    van der Berg, Rajmund Mokso, Søren Kristian Lillevang, and Lilia Ahrn ´e

    Ran Feng, Franciscus Winfried J. van der Berg, Rajmund Mokso, Søren Kristian Lillevang, and Lilia Ahrn ´e. Struc- tural, rheological and functional properties of extruded moz- zarella cheese influenced by the properties of the renneted casein gels. Food Hydrocolloids, 137:108322, 2023. 2, 4

  26. [26]

    Frisullo, J

    P. Frisullo, J. Laverse, R. Marino, and M.A. Del Nobile. X-ray computed tomography to study processed meat mi- crostructure. Journal of Food Engineering , 94(3–4):283– 289, 2009. 2

  27. [27]

    A re- view of applications of CT imaging on fiber reinforced com- posites

    Yantao Gao, Wenfeng Hu, Sanfa Xin, and Lijuan Sun. A re- view of applications of CT imaging on fiber reinforced com- posites. Journal of Composite Materials , 56(1):133–164,

  28. [28]

    S. C. Garcea, Y . Wang, and P. J. Withers. X-ray computed to- mography of polymer composites. Composites Science and Technology, 156:305–319, 2018. 2

  29. [29]

    Godoi, Sangeeta Prakash, and Bhesh R

    Fernanda C. Godoi, Sangeeta Prakash, and Bhesh R. Bhan- dari. 3D printing technologies applied for food design: Sta- tus and prospects. Journal of Food Engineering, 179:44–54,

  30. [30]

    3D semantic segmentation with submanifold sparse convolutional networks

    Benjamin Graham, Martin Engelcke, and Laurens Van Der Maaten. 3D semantic segmentation with submanifold sparse convolutional networks. Proceedings of Computer Vision and Pattern Recognition (CVPR) , pages 9224–9232,

  31. [31]

    Granlund

    Goesta H. Granlund. In search of a general picture process- ing operator. Computer Graphics and Image Processing , 8 (2):155–173, 1978. 8

  32. [32]

    Haralick, Karthikeyan Shanmugam, and Its’Hak Dinstein

    Robert M. Haralick, Karthikeyan Shanmugam, and Its’Hak Dinstein. Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics , SMC3(6): 610–621, 1973. 8

  33. [33]

    Roth, and Daguang Xu

    Ali Hatamizadeh, Vishwesh Nath, Yucheng Tang, Dong Yang, Holger R. Roth, and Daguang Xu. Swin UNETR: Swin transformers for semantic segmentation of brain tu- mors in MRI images. In Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries (BrainLes 2021), pages 272–284. Lecture Notes in Computer Science, V ol. 12962. Springer, 2022. 2, 5, 8

  34. [34]

    Deep residual learning for image recognition

    Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. InProceedings of Computer Vision and Pattern Recognition (CVPR), pages 770–778. IEEE, 2016. 2, 5

  35. [35]

    Statistical shape models for 3D medical image segmentation: A review.Med- ical Image Analysis, 13(4):543–563, 2009

    Tobias Heimann and Hans Peter Meinzer. Statistical shape models for 3D medical image segmentation: A review.Med- ical Image Analysis, 13(4):543–563, 2009. 1

  36. [36]

    The KiTS19 challenge data: 300 kidney tumor cases with clinical context, CT semantic segmenta- tions, and surgical outcomes

    Nicholas Heller, Niranjan Sathianathen, Arveen Kalapara, Edward Walczak, Keenan Moore, Heather Kaluzniak, Joel Rosenberg, Paul Blake, Zachary Rengel, Makinna Oestre- ich, Joshua Dean, Michael Tradewell, Aneri Shah, Resha Tejpaul, Zachary Edgerton, Matthew Peterson, Shaneab- bas Raza, Subodh Regmi, Nikolaos Papanikolopoulos, and Christopher Weight. The KiT...

  37. [37]

    Nicholas Heller, Fabian Isensee, Klaus H. Maier-Hein, Xi- aoshuai Hou, Chunmei Xie, Fengyi Li, Yang Nan, Guangrui 10 Mu, Zhiyong Lin, Miofei Han, Guang Yao, Yaozong Gao, Yao Zhang, Yixin Wang, Feng Hou, Jiawei Yang, Guang- wei Xiong, Jiang Tian, Cheng Zhong, Jun Ma, Jack Rick- man, Joshua Dean, Bethany Stai, Resha Tejpaul, Makinna Oestreich, Paul Blake, H...

  38. [38]

    BugNIST a large volumetric dataset for object detection under domain shift

    Patrick Møller Jensen, Vedrana Andersen Dahl, Rebecca Engberg, Carsten Gundlach, Hans Marin Kjer, and An- ders Bjorholm Dahl. BugNIST a large volumetric dataset for object detection under domain shift. Proceedings of the 18th European Conference on Computer Vision – Eccv 2024, 15090:18–36, 2025. 3

  39. [39]

    Christensen, Vedrana A

    Niels Jeppesen, Anders N. Christensen, Vedrana A. Dahl, and Anders B. Dahl. Sparse layered graphs for multi-object segmentation. In Proceedings of Computer Vision and Pat- tern Recognition (CVPR), pages 12774–12782. IEEE, 2020. 2

  40. [40]

    Jeppesen, L

    N. Jeppesen, L. P. Mikkelsen, A. B. Dahl, A. N. Christensen, and V . A. Dahl. Quantifying effects of manufacturing meth- ods on fiber orientation in unidirectional composites using structure tensor analysis. Composites Part A: Applied Sci- ence and Manufacturing, 149:106541, 2021. 8

  41. [41]

    Martulli, Martin Kerschbaum, Ivan Sergeichev, Yentl Swolfs, and Stepan V

    Radmir Karamov, Luca M. Martulli, Martin Kerschbaum, Ivan Sergeichev, Yentl Swolfs, and Stepan V . Lomov. Micro- CT based structure tensor analysis of fibre orientation in ran- dom fibre composites versus high-fidelity fibre identification methods. Composite Structures, 235:111818, 2020. 2

  42. [42]

    Anderson, Jimmy Huynh, Jeff Gelb, Jouni Freund, and Alp Karakoc ¸

    ¨Ozg¨ur Keles ¸, Eric H. Anderson, Jimmy Huynh, Jeff Gelb, Jouni Freund, and Alp Karakoc ¸. Stochastic fracture of addi- tively manufactured porous composites. Scientific Reports, 8 (1):15437, 2018. 2

  43. [43]

    NeuralVDB: High-resolution sparse volume representation using hierar- chical neural networks

    Doyub Kim, Minjae Lee, and Ken Museth. NeuralVDB: High-resolution sparse volume representation using hierar- chical neural networks. ACM Transactions on Graphics, 43 (2):20, 2024. 2

  44. [44]

    Muscle struc- ture assessment using synchrotron radiation x-ray micro- computed tomography in murine with cerebral ischemia.Sci- entific Reports, 14(1):26825, 2024

    Subok Kim, Sanghun Jang, and Onseok Lee. Muscle struc- ture assessment using synchrotron radiation x-ray micro- computed tomography in murine with cerebral ischemia.Sci- entific Reports, 14(1):26825, 2024. 2

  45. [45]

    The open images dataset V4: Unified image classification, object detection, and visual relationship detection at scale

    Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Ui- jlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, Tom Duerig, and Vittorio Ferrari. The open images dataset V4: Unified image classification, object detection, and visual relationship detection at scale. International Journal of Computer Vision, ...

  46. [46]

    Vision transformer for small-size datasets, 2021

    Seung Hoon Lee, Seunghyun Lee, and Byung Cheol Song. Vision transformer for small-size datasets, 2021. 8

  47. [47]

    Lo, Miranda R

    Sook-Lei Liew, Bethany P. Lo, Miranda R. Donnelly, Artemis Zavaliangos-Petropulu, Jessica N. Jeong, Giuseppe Barisano, Alexandre Hutton, Julia P. Simon, Julia M. Ju- liano, Anisha Suri, Zhizhuo Wang, Aisha Abdullah, Jun Kim, Tyler Ard, Nerisa Banaj, Michael R. Borich, Lara A. Boyd, Amy Brodtmann, Cathrin M. Buetefisch, Lei Cao, Jessica M. Cassidy, Valenti...

  48. [48]

    Lawrence Zitnick

    Tsung Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Doll´ar, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. Lec- ture Notes in Computer Science (including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinfor- matics), 8693(5):740–755, 2014. 1, 3

  49. [49]

    Geert Litjens, Thijs Kooi, Babak Ehteshami Bejnordi, Ar- naud Arindra Adiyoso Setio, Francesco Ciompi, Mohsen Ghafoorian, Jeroen A. W. M. van der Laak, Bram van Gin- neken, and Clara I. S ´anchez. A survey on deep learning in medical image analysis. Medical Image Analysis, 42:60–88,

  50. [50]

    Kevin Zhou

    Pengbo Liu, Hu Han, Yuanqi Du, Heqin Zhu, Yinhao Li, Feng Gu, Honghu Xiao, Jun Li, Chunpeng Zhao, Li Xiao, Xinbao Wu, and S. Kevin Zhou. Deep Learning to Segment Pelvic Bones: Large-scale CT Datasets and Baseline Mod- els, 2021. 3

  51. [51]

    Deep learning face attributes in the wild

    Ziwei Liu, Ping Luo, Xiaogang Wang, and Xiaoou Tang. Deep learning face attributes in the wild. In Proceedings of International Conference on Computer Vision (ICCV), pages 3730–3738. IEEE, 2015. 1, 3

  52. [52]

    Swin transformer: Hierarchical vision transformer using shifted windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. Pro- ceedings of International Conference on Computer Vision (ICCV), pages 9992–10002, 2021. 2, 5, 8

  53. [53]

    A ConvNet for the 2020s

    Zhuang Liu, Hanzi Mao, Chao Yuan Wu, Christoph Feicht- enhofer, Trevor Darrell, and Saining Xie. A ConvNet for the 2020s. In Proceedings of Computer Vision and Pattern Recognition (CVPR), pages 11966–11976. IEEE, 2022. 2, 5

  54. [54]

    Decoupled weight de- cay regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight de- cay regularization. 7th International Conference on Learn- ing Representations, Iclr 2019, 2019. 5

  55. [55]

    3D-CNN- PyTorch: PyTorch implementation for 3dCNNs for medical images

    3D-CNN-PyTorch maintainers and contributors. 3D-CNN- PyTorch: PyTorch implementation for 3dCNNs for medical images. https://github.com/xmuyzz/3D-CNN- PyTorch, 2022. 5 11

  56. [56]

    TorchVision: Py- Torch’s computer vision library

    TorchVision maintainers and contributors. TorchVision: Py- Torch’s computer vision library. https://github. com/pytorch/vision, 2016. 5

  57. [57]

    Maire and P

    E. Maire and P. J. Withers. Quantitative X-ray tomography. International Materials Reviews, 59(1):1–43, 2014. 1

  58. [58]

    Marcus, Tracy H

    Daniel S. Marcus, Tracy H. Wang, Jamie Parker, John G. Csernansky, John C. Morris, and Randy L. Buckner. Open access series of imaging studies (OASIS): Cross-sectional MRI data in young, middle aged, nondemented, and de- mented older adults. Journal of Cognitive Neuroscience, 19 (9):1498–1507, 2007. 3

  59. [59]

    Recipe1m+: A dataset for learning cross-modal embeddings for cooking recipes and food images

    Javier Marin, Aritro Biswas, Ferda Ofli, Nicholas Hynes, Amaia Salvador, Yusuf Aytar, Ingmar Weber, and Antonio Torralba. Recipe1m+: A dataset for learning cross-modal embeddings for cooking recipes and food images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1):187–203, 2021. 3

  60. [60]

    Comparing vision transformers and convolutional neural net- works for image classification: A literature review

    Jos ´e Maur ´ıcio, In ˆes Domingues, and Jorge Bernardino. Comparing vision transformers and convolutional neural net- works for image classification: A literature review. Applied Sciences (switzerland), 13(9):5521, 2023. 8

  61. [61]

    UMAP: Uniform manifold approximation and projection

    Leland McInnes, John Healy, Nathaniel Saul, and Lukas Großberger. UMAP: Uniform manifold approximation and projection. Journal of Open Source Software , 3(29):861,

  62. [62]

    SANet: A slice-aware network for pulmonary nodule detection

    Jie Mei, Ming-Ming Cheng, Gang Xu, Lan-Ruo Wan, and Huan Zhang. SANet: A slice-aware network for pulmonary nodule detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(8):4374–4387, 2021. 1, 3

  63. [63]

    Bjoern H. Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, Levente Lanczi, Elizabeth Gerstner, Marc-Andre Weber, Tal Arbel, Brian B. Avants, Nicholas Ayache, Patricia Buendia, D. Louis Collins, Nicolas Cordier, Jason J. Corso, Antonio Criminisi, Tilak ...

  64. [64]

    Morozov, Anna E

    Sergey P. Morozov, Anna E. Andreychenko, Ivan A. Blokhin, Pavel B. Gelezhe, Anna P. Gonchar, Alexander E. Nikolaev, Nikolay A. Pavlov, Valeria Yu. Chernina, and Vic- tor A. Gombolevskiy. MosMedData: data set of 1110 chest CT scans performed during the covid-19 epidemic. Digital Diagnostics, 1(1):49–59, 2020. 3

  65. [65]

    Rita Verdelho, Diogo J

    Tiago Mota, M. Rita Verdelho, Diogo J. Ara ´ujo, Alceu Bis- soto, Carlos Santiago, and Catarina Barata. MMIST-ccRCC: A real world medical dataset for the development of multi- modal systems. In Proceedings of Computer Vision and Pat- tern Recognition (CVPR), pages 2395–2403. IEEE, 2024. 1

  66. [66]

    M. F. Mridha, Akibur Rahman Prodeep, A. S.M.Morshedul Hoque, Md Rashedul Islam, Aklima Akter Lima, Muham- mad Mohsin Kabir, Md Abdul Hamid, and Yutaka Watanobe. A comprehensive survey on the progress, process, and chal- lenges of lung cancer detection and classification.Journal of Healthcare Engineering, 2022(1):5905230, 2022. 1

  67. [67]

    Rusch, and R

    National Academies of Sciences, Engineering, and Medicine, Health and Medicine Division, Board on Population Health, Public Health Practice, Roundtable on Environmental Health Sciences, Research, and Medicine, E. Rusch, and R. Pool. Principles and Obstacles for Sharing Data from Environmental Health Research: Workshop Summary. National Academies Press, 2016. 1, 3

  68. [68]

    Applications of x-ray micro-computed tomography and small-angle x-ray scattering techniques in food systems: A concise review

    Sunday Olakanmi, Chithra Karunakaran, and Digvir Jayas. Applications of x-ray micro-computed tomography and small-angle x-ray scattering techniques in food systems: A concise review. Journal of Food Engineering, 342:111355,

  69. [69]

    Feature-centered first order structure tensor scale-space in 2d and 3d

    Pawel Tomasz Pieta, Anders Bjorholm Dahl, Jeppe Revall Frisvad, Siavash Arjomand Bigdeli, and Anders Nymark Christensen. Feature-centered first order structure tensor scale-space in 2d and 3d. IEEE Access, 13:9766–9779, 2025. 8

  70. [70]

    3D imaging in material science: Application of X-ray tomography

    Luc Salvo, Michel Suéry, Ariane Marmottant, Nathalie Limodin, and Dominique Bernard. 3D imaging in material science: Application of X-ray tomography. Comptes Rendus Physique, 11(9-10):641–649, 2010. 1

  71. [71]

    MobileNetV2: Inverted residuals and linear bottlenecks

    Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. Proceedings of Computer Vision and Pattern Recognition (CVPR), pages 4510–4520,

  72. [72]

    Correlative imaging of the murine hind limb vasculature and muscle tissue by microCT and light microscopy

    Laura Schaad, Ruslan Hlushchuk, Sébastien Barré, Roberto Gianni-Barrera, David Haberthür, Andrea Banfi, and Valentin Djonov. Correlative imaging of the murine hind limb vasculature and muscle tissue by microCT and light microscopy. Scientific Reports, 7(1):41842, 2017. 2

  73. [73]

    X-ray micro-computed tomography (µCT) for non-destructive characterisation of food microstructure

    Letitia Schoeman, Paul Williams, Anton du Plessis, and Marena Manley. X-ray micro-computed tomography (µCT) for non-destructive characterisation of food microstructure. Trends in Food Science & Technology, 47:10–24, 2016. 1

  74. [74]

    Arnaud Arindra Adiyoso Setio, Alberto Traverso, Thomas de Bel, Moira S. N. Berens, Cas van den Bogaard, Piergiorgio Cerello, Hao Chen, Qi Dou, Maria Evelina Fantacci, Bram Geurts, Robbert van der Gugten, Pheng Ann Heng, Bart Jansen, Michael M. J. de Kaste, Valentin Kotov, Jack Yu Hung Lin, Jeroen T. M. C. Manders, Alexander Sóñora-Mengana, Juan...

  75. [75]

    Deep learning in medical image analysis

    Dinggang Shen, Guorong Wu, and Heung-Il Suk. Deep learning in medical image analysis. Annual Review of Biomedical Engineering, 19(1):221–248, 2017. 1

  76. [76]

    Direct observation and measurement of fiber architecture in short fiber-polymer composite foam through micro-CT imaging

    Hongbin Shen, Steven Nutt, and David Hull. Direct observation and measurement of fiber architecture in short fiber-polymer composite foam through micro-CT imaging. Composites Science and Technology, 64(13-14):2113–2120,

  77. [77]

    Satya P. Singh, Lipo Wang, Sukrit Gupta, Haveesh Goli, Parasuraman Padmanabhan, and Balázs Gulyás. 3D deep learning on medical images: A review. Sensors, 20(18):1–24, 2020. 1

  78. [78]

    Mark D. Sutton. Tomographic techniques for the study of exceptionally preserved fossils. Proceedings of the Royal Society B: Biological Sciences, 275(1643):1587–1593, 2008. 1

  79. [79]

    EfficientNet: Rethinking model scaling for convolutional neural networks

    Mingxing Tan and Quoc Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In Proceedings of the 36th International Conference on Machine Learning, pages 6105–6114. PMLR, 2019. 2

  80. [80]

    Yucheng Tang, Dong Yang, Wenqi Li, Holger R. Roth, Bennett Landman, Daguang Xu, Vishwesh Nath, and Ali Hatamizadeh. Self-supervised pre-training of swin transformers for 3D medical image analysis. In Proceedings of Computer Vision and Pattern Recognition (CVPR), pages 20698–20708. IEEE, 2022. 2

Showing first 80 references.