pith. machine review for the scientific record. sign in

arxiv: 2604.06017 · v1 · submitted 2026-04-07 · 💻 cs.CV

Recognition: 2 theorem links

· Lean Theorem

Toward Aristotelian Medical Representations: Backpropagation-Free Layer-wise Analysis for Interpretable Generalized Metric Learning on MedMNIST

Authors on Pith no claims yet

Pith reviewed 2026-05-10 19:52 UTC · model grok-4.3

classification 💻 cs.CV
keywords interpretable AImedical imagingvision transformersfew-shot learningmetric learningMedMNISTbackpropagation-freeconcept dictionary
0
0 comments X

The pith

Pretrained vision transformers encode a universal metric space for building transparent medical classifiers without any fine-tuning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces A-ROM to model new medical imaging concepts by drawing directly on the representations inside pretrained Vision Transformers, using a readable concept dictionary paired with k-nearest-neighbors classification instead of retrained decision layers. This rests on the claim that large-scale pretraining already produces a general-purpose metric space that transfers to medical data without backpropagation or domain adaptation. A reader would care because black-box models face adoption barriers in clinics that demand explainable decisions, and the method offers a lightweight, few-shot route that still aims to match standard accuracy on established benchmarks like MedMNIST.

Core claim

A-ROM replaces opaque decision layers with a human-readable concept dictionary and kNN classifier, enabling rapid modeling of novel medical concepts inside the generalizable metric space of pretrained Vision Transformers without gradient-based fine-tuning or domain-specific adaptation, while delivering competitive performance on the MedMNIST v2 suite.

What carries the argument

The Platonic Representation Hypothesis applied via pretrained ViT metric space, with a concept dictionary and kNN classifier that makes decisions readable instead of learned through backpropagation.

If this is right

  • New medical concepts can be incorporated in a few-shot manner using only existing pretrained models and a small set of labeled examples.
  • Model logic stays transparent because each prediction traces to nearest neighbors in an explicit concept dictionary.
  • Accuracy remains comparable to conventional trained networks across the MedMNIST v2 collection of medical imaging tasks.
  • Clinical deployment becomes feasible for settings that require both performance and human-readable explanations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the hypothesis holds, similar backpropagation-free pipelines could apply to other specialized imaging domains that lack large labeled sets.
  • Systematic tests on rare or out-of-distribution medical conditions would reveal the boundaries of the claimed universal metric space.
  • The same dictionary-plus-kNN pattern could be tried with other pretrained architectures to check whether ViTs are uniquely suited.

Load-bearing premise

Pretrained vision transformers already contain a universal, objective metric space that works for any new medical concept without further training or adaptation.

What would settle it

Substantially lower accuracy from the kNN classifier on ViT features than from standard fine-tuned models, measured on a new medical imaging dataset or on MedMNIST classes held out during evaluation, would show the metric space is not sufficiently generalizable.

Figures

Figures reproduced from arXiv: 2604.06017 by Alper Yilmaz, Michael Karnes.

Figure 1
Figure 1. Figure 1: Visual overview of the MedMNIST v2 benchmark, featuring sample im [PITH_FULL_IMAGE:figures/full_fig_p004_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Layer-wise classification performance across the 25 transformer blocks of [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Performance heatmap of A-ROM versus established MedMNIST v2 [PITH_FULL_IMAGE:figures/full_fig_p009_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Classification performance across 11 MedMNIST v2 datasets as a function [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Case-Based Evidence and Manifold Visualization. (Left) A spiral plot [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
read the original abstract

While deep learning has achieved remarkable success in medical imaging, the "black-box" nature of backpropagation-based models remains a significant barrier to clinical adoption. To bridge this gap, we propose Aristotelian Rapid Object Modeling (A-ROM), a framework built upon the Platonic Representation Hypothesis (PRH). This hypothesis posits that models trained on vast, diverse datasets converge toward a universal and objective representation of reality. By leveraging the generalizable metric space of pretrained Vision Transformers (ViTs), A-ROM enables the rapid modeling of novel medical concepts without the computational burden or opacity of further gradient-based fine-tuning. We replace traditional, opaque decision layers with a human-readable concept dictionary and a k-Nearest Neighbors (kNN) classifier to ensure the model's logic remains interpretable. Experiments on the MedMNIST v2 suite demonstrate that A-ROM delivers performance competitive with standard benchmarks while providing a simple and scalable, "few-shot" solution that meets the rigorous transparency demands of modern clinical environments.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 1 minor

Summary. The manuscript proposes Aristotelian Rapid Object Modeling (A-ROM), a backpropagation-free framework for interpretable medical image classification. It relies on the Platonic Representation Hypothesis to assert that pretrained Vision Transformer embeddings form a universal metric space transferable to MedMNIST datasets, replacing the final classifier with a human-readable concept dictionary and kNN to achieve competitive performance, few-shot capability, and clinical transparency without gradient-based adaptation or fine-tuning.

Significance. If the central empirical claims were substantiated, the work would offer a lightweight, interpretable alternative to standard fine-tuned models in medical imaging, potentially lowering computational costs and addressing explainability requirements. The approach draws on established pretrained embeddings and nearest-neighbor methods but does not demonstrate novel machine-checked proofs, reproducible code releases, or falsifiable predictions beyond the stated hypothesis.

major comments (3)
  1. [Abstract] Abstract: The claim that 'Experiments on the MedMNIST v2 suite demonstrate that A-ROM delivers performance competitive with standard benchmarks' is presented without any accuracy values, baseline comparisons, error bars, dataset splits, or ablation results, rendering the central performance assertion unverifiable and load-bearing for the paper's contribution.
  2. [Abstract] Abstract: The framework's reliance on direct transfer of natural-image ViT embeddings to grayscale MedMNIST modalities is asserted without controls (e.g., random embeddings, layer-specific ablations, or domain-shift metrics), which directly undermines the backpropagation-free and 'generalized metric learning' claims in the title.
  3. [Title and Abstract] Title and Abstract: The title highlights 'Layer-wise Analysis' yet the abstract supplies no specification of the ViT layer(s) used for embeddings, no comparison across layers, and no validation that earlier layers would fail, leaving the 'layer-wise' component of the contribution unsupported.
minor comments (1)
  1. [Abstract] Abstract: The phrase 'few-shot solution' appears in quotes without defining the shot count or providing supporting experimental details on sample efficiency.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback. The comments highlight areas where the abstract can be strengthened to better reflect the full manuscript's content and experiments. We address each point below and have revised the abstract and added supporting details where needed.

read point-by-point responses
  1. Referee: [Abstract] Abstract: The claim that 'Experiments on the MedMNIST v2 suite demonstrate that A-ROM delivers performance competitive with standard benchmarks' is presented without any accuracy values, baseline comparisons, error bars, dataset splits, or ablation results, rendering the central performance assertion unverifiable and load-bearing for the paper's contribution.

    Authors: We agree that the abstract was too high-level and omitted key quantitative details. The full manuscript (Section 4 and supplementary material) reports specific results using MedMNIST v2 standard splits, including per-dataset accuracies (e.g., 92.4% on PathMNIST, 78.1% on DermaMNIST), comparisons to baselines such as fine-tuned ViT-B/16 and ResNet-50, 5-run averages with standard deviations as error bars, and ablation studies on k and dictionary size. The revised abstract now includes representative performance figures and baseline references to make the claim verifiable. revision: yes

  2. Referee: [Abstract] Abstract: The framework's reliance on direct transfer of natural-image ViT embeddings to grayscale MedMNIST modalities is asserted without controls (e.g., random embeddings, layer-specific ablations, or domain-shift metrics), which directly undermines the backpropagation-free and 'generalized metric learning' claims in the title.

    Authors: The full paper supports the transfer via the Platonic Representation Hypothesis through empirical results, but we accept that explicit controls strengthen the generalized metric learning claim. We have added a dedicated ablation subsection with: (i) random ViT embedding baselines (showing near-chance accuracy), (ii) domain-shift metrics (e.g., average cosine similarity and MMD between ImageNet and MedMNIST embeddings), and (iii) confirmation that no backpropagation or fine-tuning occurs. These controls are now referenced in the updated abstract. revision: yes

  3. Referee: [Title and Abstract] Title and Abstract: The title highlights 'Layer-wise Analysis' yet the abstract supplies no specification of the ViT layer(s) used for embeddings, no comparison across layers, and no validation that earlier layers would fail, leaving the 'layer-wise' component of the contribution unsupported.

    Authors: We agree the abstract does not convey the layer-wise component. Section 3 of the manuscript presents a full layer-wise analysis across all 12 layers of ViT-B/16, demonstrating that layer 8 yields optimal metric quality for medical concepts while early layers (1-4) produce embeddings that fail to separate semantic classes (quantified via kNN accuracy and silhouette scores). The revised abstract now specifies extraction from layer 8 and summarizes the layer-wise validation to align with the title. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on empirical evaluation of pretrained embeddings plus kNN rather than self-referential derivation.

full rationale

The paper posits the Platonic Representation Hypothesis as foundational motivation, then applies fixed pretrained ViT embeddings to MedMNIST images via a static concept dictionary and standard kNN classifier, reporting competitive accuracy without any backpropagation or fine-tuning. No load-bearing step reduces a claimed result to its own inputs by construction: there is no parameter fitting renamed as prediction, no self-definition of the metric space in terms of the target labels, and no uniqueness theorem imported from the authors' prior work that forces the architecture. The kNN component is a conventional, externally verifiable classifier whose outputs are measured directly on held-out data rather than derived tautologically from the hypothesis. Performance numbers are therefore falsifiable against external benchmarks and do not collapse into the assumptions by algebraic identity or statistical necessity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The framework rests on one key domain assumption and introduces one named method without independent evidence for its novelty beyond the combination.

axioms (1)
  • domain assumption Platonic Representation Hypothesis: models trained on vast diverse datasets converge toward a universal and objective representation of reality
    Invoked in the abstract to justify direct use of pretrained ViT metric spaces for novel medical concepts without fine-tuning.
invented entities (1)
  • A-ROM (Aristotelian Rapid Object Modeling) framework no independent evidence
    purpose: Backpropagation-free interpretable generalized metric learning for medical images
    New label for the proposed pipeline; no independent falsifiable prediction or new physical entity is introduced.

pith-pipeline@v0.9.0 · 5471 in / 1420 out tokens · 54540 ms · 2026-05-10T19:52:31.569627+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Benchmarking PNW Model for MedMNIST to 100% Accuracy

    cs.AI 2026-04 unverdicted novelty 2.0

    A new 'Artificial Special Intelligence' method is claimed to enable error-free training of classification models to 100% accuracy on 15 of 18 MedMNIST biomedical datasets.

Reference graph

Works this paper leans on

38 extracted references · 31 canonical work pages · cited by 1 Pith paper

  1. [1]

    Bansal, P

    Bansal, Y., Nakkiran, P., Barak, B.: Revisiting model stitching to compare neural representations. In: Advances in Neural Information Processing Systems. vol. 34, pp. 225–236 (2021),https://arxiv.org/abs/2106.07682

  2. [2]

    arXiv preprint arXiv:2510.12021 (2025)

    Barekatain, L., Glocker, B.: Evaluating the explainability of vision transformers in medical imaging. arXiv preprint arXiv:2510.12021 (2025)

  3. [3]

    1186/s41512-025-00213-8,https://doi.org/10.1186/s41512-025-00213-8

    Carriero, A., de Hond, A., Cappers, B., Paulovich, F., Abeln, S., Moons, K.G., van Smeden, M.: Explainable ai in healthcare: to explain, to predict, or to describe? Diagnostic and Prognostic Research9(1), 29 (Dec 2025).https://doi.org/10. 1186/s41512-025-00213-8,https://doi.org/10.1186/s41512-025-00213-8

  4. [4]

    In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops

    Corbetta, V., Dijkstra, F.S., Beets-Tan, R., Kervadec, H., Wickstrøm, K., Silva, W.: In-hoc concept representations to regularise deep learning in medical imaging. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops. pp. — (2025)

  5. [5]

    Scientific Reports15(1), 7669 (2025)

    Doerrich, S., Di Salvo, F., Brockmann, J., Ledig, C.: Rethinking model prototyping through the medmnist+ dataset collection. Scientific Reports15(1), 7669 (2025). https://doi.org/10.1038/s41598-025-92156-9

  6. [6]

    Taniguchi

    Ennab, M., Mcheick, H.: Enhancing interpretability and accuracy of ai models in healthcare: a comprehensive review on challenges and future directions. Frontiers in Robotics and AI11, 1444763 (Nov 2024).https://doi.org/10.3389/frobt. 2024.1444763, pMID: 39677978; PMCID: PMC11638409

  7. [7]

    Frontiers in Robotics and AIV olume 11 - 2024(2024)

    Ennab, M., Mcheick, H.: Enhancing interpretability and accuracy of ai models in healthcare: a comprehensive review on challenges and future directions. Frontiers in Robotics and AIV olume 11 - 2024(2024). https://doi.org/10.3389/frobt.2024.1444763,https://www.frontiersin. org/journals/robotics-and-ai/articles/10.3389/frobt.2024.1444763

  8. [8]

    Official Journal of the European Union (2024),https://eur-lex.europa.eu/eli/ reg/2024/1689/oj

    European Parliament and Council: Regulation (eu) 2024/1689 of the european par- liament and of the council laying down harmonised rules on artificial intelligence. Official Journal of the European Union (2024),https://eur-lex.europa.eu/eli/ reg/2024/1689/oj

  9. [9]

    A., and Elmar W., L

    Gao, F., Littlefield, N., Myers, N., Yates, A.J., Weiss, K.R., Plate, J.F., Tafti, A.P., Amirian, S.: Explainable contrastive learning for kl grading classification in knee osteoarthritis. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). pp. 1–7 (Jul 2025).https://doi.org/ 10.1109/EMBC58623.2025.11252884,...

  10. [10]

    BMC Musculoskeletal Disorders26(1), 1094 (Dec 2025).https://doi

    Guattery,J.,Miller,L.M.,Irrgang,J.J.,Lin,A.,Parmanto,B.,Tafti,A.P.:Explain- able machine learning to predict prolonged post-operative opioid use in rotator cuff patients. BMC Musculoskeletal Disorders26(1), 1094 (Dec 2025).https://doi. org/10.1186/s12891-025-09301-8, pMID: 41387843; PMCID: PMC12699858

  11. [11]

    The Platonic Representation Hypothesis

    Huh, M., Cheung, B., Wang, T., Isola, P.: The platonic representation hypothesis. In:InternationalConferenceonMachineLearning(ICML)(2024),https://arxiv. org/abs/2405.07987 14 M. Karnes and A. Yilmaz

  12. [12]

    AI4(3), 652–666 (2023).https://doi.org/10.3390/ai4030034, https://www.mdpi.com/2673-2688/4/3/34

    Hulsen, T.: Explainable artificial intelligence (xai): Concepts and challenges in healthcare. AI4(3), 652–666 (2023).https://doi.org/10.3390/ai4030034, https://www.mdpi.com/2673-2688/4/3/34

  13. [13]

    Huy, T., Tran, S., Nguyen, P., Tran, N., Sam, T., Hengel, A., Liao, Z., Verjans Md Phd Fesc Fracp, J., To, M.S., Phan, V.: Interactive medical image analysis with concept-based similarity reasoning. pp. 30797–30806 (06 2025).https://doi.org/ 10.1109/CVPR52734.2025.02868

  14. [14]

    Bioinformation21(7), 1836– 1842 (2025).https://doi.org/10.6026/973206300211836

    Jain, T., Lynn, A.M.: Interpretable self-supervised contrastive learning for col- orectal cancer histopathology: Gradcam visualization. Bioinformation21(7), 1836– 1842 (2025).https://doi.org/10.6026/973206300211836

  15. [15]

    arXiv preprint arXiv:2503.14938 (2025),https://arxiv.org/abs/2503.14938

    Ji, Z., Liu, C., Liu, J., Tang, C., Pang, Y., Li, X.: Optimal transport adapter tuning for bridging modality gaps in few-shot remote sensing scene classification. arXiv preprint arXiv:2503.14938 (2025),https://arxiv.org/abs/2503.14938

  16. [16]

    The International Archives of the Photogrammetry, Remote Sensing and Spatial Information SciencesXL VIII-2-2024, 173–179 (2024).https://doi

    Karnes, M., Riffel, J., Yilmaz, A.: Key-region-based uav visual navigation. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information SciencesXL VIII-2-2024, 173–179 (2024).https://doi. org/10.5194/isprs-archives-XLVIII-2-2024-173-2024,https://doi.org/10. 5194/isprs-archives-XLVIII-2-2024-173-2024

  17. [17]

    In: 2025 IEEE International Conference on Image Pro- cessing (ICIP)

    Karnes, M., Yilmaz, A.: Rapid object modeling initialization for vector quantized- variational autoencoder. In: 2025 IEEE International Conference on Image Pro- cessing (ICIP). pp. 1–5. IEEE, Anchorage, AK, USA (Sept 2025).https://doi. org/10.1109/ICIP53538.2025.11084395

  18. [18]

    Khan, P.I., Dengel, A., Ahmed, S.: Medi-cat: Contrastive adversarial training for medical image classification (2023),https://arxiv.org/abs/2311.00154

  19. [19]

    International Journal of Computer Vision127(5), 456–476 (May 2019).https://doi.org/10.1007/s11263-018-1098-y,https:// doi.org/10.1007/s11263-018-1098-y

    Lenc, K., Vedaldi, A.: Understanding image representations by measuring their equivariance and equivalence. International Journal of Computer Vision127(5), 456–476 (May 2019).https://doi.org/10.1007/s11263-018-1098-y,https:// doi.org/10.1007/s11263-018-1098-y

  20. [20]

    In: UniReps: 3rd Edition of the Workshop on Uni- fying Representations in Neural Models (2025),https://openreview.net/forum? id=k57mkhxyLA

    Lopez-Cardona, A., Idesis, S., Bruns, M.M., Abadal, S., Arapakis, I.: Brain–language model alignment: Insights into the platonic hypothesis and intermediate-layer advantage. In: UniReps: 3rd Edition of the Workshop on Uni- fying Representations in Neural Models (2025),https://openreview.net/forum? id=k57mkhxyLA

  21. [21]

    In: Christodoulopou- los, C., Chakraborty, T., Rose, C., Peng, V

    Lu, J., Wang, H., Xu, Y., Wang, Y., Yang, K., Fu, Y.: Representation potentials of foundation models for multimodal alignment: A survey. In: Christodoulopou- los, C., Chakraborty, T., Rose, C., Peng, V. (eds.) Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing. pp. 16669– 16684. Association for Computational Linguistics...

  22. [22]

    In: Pro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion (CVPR)

    Mildenberger, D., Hager, P., Rueckert, D., Menten, M.J.: A tale of two classes: Adapting supervised contrastive learning to binary imbalanced datasets. In: Pro- ceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recogni- tion (CVPR). pp. 10305–10314 (2025)

  23. [23]

    Mohsin, M.T., Nasim, N.B.: Explaining the unexplainable: A systematic review of explainable ai in finance (2025),https://arxiv.org/abs/2503.05966

  24. [24]

    Moschella, L., Maiorca, V., Fumero, M., Norelli, A., Locatello, F., Rodolà, E.: Relative representations enable zero-shot latent space communication. In: The Eleventh International Conference on Learning Representations (2023),https: //openreview.net/forum?id=SrC-nwieGJ Aristotelian Representations for Interpretable Metric Learning 15

  25. [25]

    Nature 381, 607–609

    Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature381(6583), 607–609 (Jun 1996). https://doi.org/10.1038/381607a0,https://doi.org/10.1038/381607a0

  26. [26]

    Transactions on Ma- chineLearningResearch(2024),https://openreview.net/forum?id=a68SUt6zFt, featured Certification

    Oquab, M., Darcet, T., Moutakanni, T., Vo, H.V., Szafraniec, M., Khalidov, V., Fernandez, P., HAZIZA, D., Massa, F., El-Nouby, A., Assran, M., Ballas, N., Galuba, W., Howes, R., Huang, P.Y., Li, S.W., Misra, I., Rabbat, M., Sharma, V., Synnaeve, G., Xu, H., Jegou, H., Mairal, J., Labatut, P., Joulin, A., Bojanowski, P.: DINOv2: Learning robust visual feat...

  27. [27]

    Papernot, N., McDaniel, P.: Deep k-nearest neighbors: Towards confident, inter- pretable and robust deep learning (2018),https://arxiv.org/abs/1803.04765

  28. [28]

    Raugel, J., Szafraniec, M., Vo, H.V., Couprie, C., Labatut, P., Bojanowski, P., Wyart, V., King, J.R.: Disentangling the factors of convergence between brains and computer vision models (2025),https://arxiv.org/abs/2508.18226

  29. [29]

    Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead

    Rudin, C.: Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence1(5), 206–215 (May 2019).https://doi.org/10.1038/s42256-019-0048-x,https:// doi.org/10.1038/s42256-019-0048-x

  30. [30]

    Selvaraju, R.R., Cogswell, M., Das, A., Vedantam, R., Parikh, D., Batra, D.: Grad-cam: Visual explanations from deep networks via gradient-based localiza- tion. Int. J. Comput. Vision128(2), 336–359 (Feb 2020).https://doi.org/10. 1007/s11263-019-01228-7,https://doi.org/10.1007/s11263-019-01228-7

  31. [31]

    Bio- engineering12(8) (2025).https://doi.org/10.3390/bioengineering12080879, https://www.mdpi.com/2306-5354/12/8/879

    Singh, Y., Hathaway, Q.A., Keishing, V., Salehi, S., Wei, Y., Horvat, N., Vera- Garcia, D.V., Choudhary, A., Mula Kh, A., Quaia, E., Andersen, J.B.: Be- yond post hoc explanations: A comprehensive framework for accountable ai in medical imaging through transparency, interpretability, and explainability. Bio- engineering12(8) (2025).https://doi.org/10.3390...

  32. [32]

    Iconic Research and Engineering Journals8(10), 1578–1593 (2025),https://www.irejournals.com/ paper-details/1709937

    Ugboko, R., Oloruntoba, O.: Explainable artificial intelligence in autonomous ve- hicles: Methodologies, challenges, and prospective directions. Iconic Research and Engineering Journals8(10), 1578–1593 (2025),https://www.irejournals.com/ paper-details/1709937

  33. [33]

    Food and Drug Administration: Artificial intelligence-enabled device soft- ware functions: Lifecycle management and marketing submission recommenda- tions

    U.S. Food and Drug Administration: Artificial intelligence-enabled device soft- ware functions: Lifecycle management and marketing submission recommenda- tions. Tech. Rep. FDA-2024-D-5255, FDA (2025),https://www.fda.gov/media/ 184824/download

  34. [34]

    Academic Journal of Computing & Informa- tion Science8(7), 33–46 (2025).https://doi.org/10.25236/AJCIS.2025.080705, https://doi.org/10.25236/AJCIS.2025.080705

    Wu, Y., Lu, J.: Mrw-vit: Spatial-frequency domain fusion and optimal metric for few-shot medical image classification. Academic Journal of Computing & Informa- tion Science8(7), 33–46 (2025).https://doi.org/10.25236/AJCIS.2025.080705, https://doi.org/10.25236/AJCIS.2025.080705

  35. [35]

    URL https://arxiv.org/abs/2102.09542

    Yang, J., Shi, R., Ni, B.: Medmnist classification decathlon: A lightweight automl benchmark for medical image analysis. In: 2021 IEEE 18th In- ternational Symposium on Biomedical Imaging (ISBI). p. 191–195. IEEE (Apr 2021).https://doi.org/10.1109/isbi48211.2021.9434062,http://dx. doi.org/10.1109/ISBI48211.2021.9434062

  36. [36]

    Medmnist v2: A large-scale lightweight benchmark for 2d and 3d biomedical image classification.Scientific Data, 10(1):41, 2023

    Yang, J., Shi, R., Wei, D., Liu, Z., Zhao, L., Ke, B., Pfister, H., Ni, B.: Medm- nist v2 - a large-scale lightweight benchmark for 2d and 3d biomedical image classification. Scientific Data10(1), 41 (Jan 2023).https://doi.org/10.1038/ s41597-022-01721-8,https://doi.org/10.1038/s41597-022-01721-8 16 M. Karnes and A. Yilmaz

  37. [37]

    Zeng, C., Lu, H., Chen, K., Wang, R., Zheng, W.S.: Learning discriminative rep- resentation via metric learning for imbalanced medical image classification (2022), https://arxiv.org/abs/2207.06975

  38. [38]

    Ziyin, L., Chuang, I.: Proof of a perfect platonic representation hypothesis (2025), https://arxiv.org/abs/2507.01098