pith. machine review for the scientific record.

arxiv: 2605.09485 · v1 · submitted 2026-05-10 · 💻 cs.LG · stat.ML

Recognition: no theorem link

SEMASIA: A Large-Scale Dataset of Semantically Structured Latent Representations

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 04:18 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords: latent representations · semantic structure · model alignment · vision models · embedding geometry · dataset · interpretability · transfer learning

The pith

SEMASIA supplies latent embeddings from roughly 1,700 vision models across eight benchmarks, together with structured metadata on architectures and training.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents SEMASIA as a large collection of latent representations drawn from about 1,700 pretrained vision models on eight standard image-classification tasks. Each embedding set is accompanied by metadata describing the model's architecture, pretraining source, objective, scale, and other training details. The central goal is to overcome the difficulty of comparing latent spaces that contain similar semantic content yet differ in geometry because of changes in model design or training. A reader would care because the resource makes it possible to study how concepts cluster in embedding space, to test methods that align one model's space to another's, and to examine which training factors shape those geometric properties. Demonstrations in the work include observations of consistent prototype-like clustering, benchmarks of supervised alignment, and regression linking pretraining choices to embedding characteristics.
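To make the shape of the resource concrete, a minimal access sketch is given below. This page does not document the release's actual schema, so every path and column name here (registry.csv, model_id, latent_dim, the embeddings/ layout) is a hypothetical stand-in.

```python
# Hypothetical access pattern for a SEMASIA-style release: a metadata
# registry with one row per model, plus one embedding matrix per
# (model, benchmark) pair. All paths and column names are illustrative
# stand-ins, not the dataset's documented schema.
import numpy as np
import pandas as pd

registry = pd.read_csv("registry.csv")            # architecture, pretraining source, scale, ...
row = registry.iloc[0]

# Embeddings for one model on one benchmark: (n_samples, latent_dim).
Z = np.load(f"embeddings/{row['model_id']}/cifar10.npy")
y = np.load("labels/cifar10.npy")                 # benchmark labels, shared across models

assert Z.shape[0] == y.shape[0]
print(row["model_id"], "embeddings:", Z.shape, "metadata latent_dim:", row["latent_dim"])
```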

Core claim

SEMASIA is a dataset of latent representations extracted from approximately 1,700 pretrained vision models on eight image-classification benchmarks, paired with structured metadata on architectures, pretraining regimes, data sources, and model scale. The resource reveals consistent prototype-like clustering and hierarchical semantic neighborhoods within individual latent spaces. It supports benchmarking of supervised alignment mappings via reconstruction error and downstream task performance, and it enables regression analysis relating pretraining-data complexity, specialization, transfer learning, augmentation, and model scale to geometric and probing properties of the embeddings.
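The alignment benchmark in that claim has a simple skeleton: fit a supervised map between two models' embeddings of the same inputs, then score reconstruction error and downstream probe performance on held-out data. The sketch below uses plain least squares on synthetic data; the paper's actual mapping families and protocol are not specified on this page, so this is only the simplest supervised instance.

```python
# Minimal sketch of the alignment benchmark: fit a linear map from model A's
# latent space to model B's on paired embeddings of the same inputs, then
# score it by held-out reconstruction error and a downstream probe.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, dA, dB = 2000, 512, 768
Za = rng.normal(size=(n, dA))                     # stand-in for model A's embeddings
T = rng.normal(size=(dA, dB)) / np.sqrt(dA)
Zb = Za @ T + 0.05 * rng.normal(size=(n, dB))     # model B: related geometry plus noise
y = Za[:, :10].argmax(axis=1)                     # synthetic 10-class labels

train, test = np.arange(1500), np.arange(1500, n)
W, *_ = np.linalg.lstsq(Za[train], Zb[train], rcond=None)   # supervised alignment map

Zb_hat = Za[test] @ W
mse = np.mean((Zb_hat - Zb[test]) ** 2)                     # reconstruction error

# Downstream check: a probe trained in B's space, applied to aligned A.
probe = LogisticRegression(max_iter=2000).fit(Zb[train], y[train])
print(f"test MSE: {mse:.4f}  probe accuracy on aligned A: {probe.score(Zb_hat, y[test]):.3f}")
```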

What carries the argument

The SEMASIA dataset, which pairs large-scale latent embeddings with standardized metadata on model architecture, training regime, pretraining source, and scale.

If this is right

  • Latent spaces exhibit consistent prototype-like clustering and hierarchical semantic neighborhoods across models and datasets.
  • Supervised alignment mappings between spaces can be evaluated using reconstruction error together with downstream task performance.
  • Pretraining-data complexity, specialization, transfer learning, augmentation, and model scale each influence geometric and probing properties of embeddings (a pooled-regression sketch of this analysis follows the list).
  • The resource supplies a reproducible basis for research on latent geometry, alignment techniques, and interoperable AI systems.
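The third bullet describes a regression design: model-level covariates as predictors, a measured embedding property as the outcome, pooled across the registry. A minimal sketch of that design follows, with synthetic data and illustrative predictor names standing in for the paper's actual variable definitions.

```python
# Sketch of the kind of pooled OLS regression the dataset is said to enable:
# model/training covariates predicting a geometric or probing property.
# Predictor names and the synthetic target are illustrative only.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
n_models = 1700
df = pd.DataFrame({
    "log_params": rng.normal(18, 2, n_models),        # model scale
    "pretrain_complexity": rng.normal(0, 1, n_models),
    "uses_augmentation": rng.integers(0, 2, n_models),
    "is_transfer": rng.integers(0, 2, n_models),
})
# Synthetic target standing in for a measured property of the embeddings,
# e.g. a linear-probe score or a clustering statistic.
df["probe_score"] = (0.02 * df.log_params + 0.1 * df.pretrain_complexity
                     + 0.05 * df.uses_augmentation + rng.normal(0, 0.1, n_models))

X = sm.add_constant(df[["log_params", "pretrain_complexity",
                        "uses_augmentation", "is_transfer"]])
fit = sm.OLS(df["probe_score"], X).fit(cov_type="HC1")    # robust standard errors
print(fit.summary().tables[1])                            # coefficients, forest-plot style
```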

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The dataset could serve as a shared reference point for developing alignment techniques that work across independently trained models without requiring joint retraining.
  • Similar metadata-rich collections in language or multimodal domains might expose whether the observed semantic clustering patterns hold beyond vision.
  • Practitioners could leverage the precomputed embeddings to prototype heterogeneous system components while controlling for scale and training variables.

Load-bearing premise

The chosen set of roughly 1,700 models and eight benchmarks is diverse and representative enough to support general statements about latent-space properties and alignment across vision models.

What would settle it

A new collection of models outside the current selection would undermine the claim that the dataset supports broad conclusions if its latent spaces lacked the reported prototype clustering and hierarchical neighborhoods, or if alignment mappings failed to generalize beyond the included benchmarks.

Figures

Figures reproduced from arXiv: 2605.09485 by Enrico Grimaldi, Leonardo Di Nino, Lorenzo Marinucci, Mario Edoardo Pandolfo, Paolo Di Lorenzo, Sergio Barbarossa, Simone Fiorellino.

Figure 1: Illustration of the semiotic pipeline underlying modern neural models: raw perceptual …

Figure 2: Two-dimensional t-SNE projection of the aimv2_1b_patch14_224.apple_pt 2048-dimensional latent space, populated with samples from seven SEMASIA benchmarks. Each benchmark forms its own cluster, but semantically overlapping concepts collapse onto shared neighborhoods regardless of source: flower images from Oxford Flowers and CIFAR-100 occupy the same region, as do large mammals and vehicles drawn from CIFA…

Figure 3: Concept clustering in the aimv2_1b_patch14_224.apple_pt latent space, projected along six principal components of UMAP. Each axis encodes a latent feature, and examples organize along it according to how the feature is realized. (a) Distribution of latent representations from the full SEMASIA collection, showing how each benchmark forms its own cluster (complementing …

Figure 4: Comparison of three supervised alignment methods on every model pair from Figure …

Figure 5: Forest plot of pooled OLS regression coefficients …

Figure 6: Exploratory analysis of the SEMASIA model registry. Left: joint distribution of the number of trainable parameters and the latent space dimensionality, shown on a log–log scale. Center: marginal distribution of the number of parameters across models. Right: marginal distribution of the latent space dimensionality. The bottom row aggregates models by architectural macro-family, while the top row provides a …

Figure 7: Evolution of the representation space learned by the convolutional classifier across three …

Figure 8: Evolution of the representation space learned by the convolutional autoencoder across …

Figure 9: Parallel coordinates of the t-SNE projection of the latent space of …

Figure 10: Parallel coordinates of the t-SNE projection of the latent space on Fashion-MNIST.

Figure 11: 2D UMAP projection of latent representations for CelebA. Points correspond to images.

Figure 12: Cross-correlation heatmaps between basis vectors extracted from pairs of CIFAR-10 latent …

Figure 13: Jaccard similarity heatmaps between prototypical anchors extracted from two ViT models …

Figure 14: Prototype correspondences between vit_base_patch16_224.augreg_in1k (left) and vit_base_patch16_224.augreg_in21k (right) on CIFAR-10. Each row represents a matched pair of clusters, with green connectors indicating semantically coherent correspondences and red connectors indicating mismatches. The three panels show results for Hungarian matching (top), injected matching (middle), and spectral matching (bot…

Figure 15: Prototype correspondences between vit_base_patch16_224.augreg_in1k (left) and vit_base_patch16_224.augreg_in21k (right) on a multi-dataset benchmark combining CIFAR-10, MNIST, and Fashion-MNIST. Each row represents a matched pair of clusters, with green connectors indicating semantically coherent correspondences and red connectors indicating mismatches. At this scale, individual datasets act as macro-con…

Figure 16: The three-stage alignment pipeline shared by all methods considered in this work. A test …

Figure 17: Violin plots of five latent graph signatures across model macro-families on CIFAR-10.
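Figures 13 and 14 pair cluster prototypes across two models, in one variant by solving an assignment problem (Hungarian matching) over a similarity matrix. Below is a minimal sketch of that step, assuming k-means clusters and Jaccard overlap of member samples as the similarity; neither is confirmed as the paper's exact construction.

```python
# Sketch of Hungarian prototype matching: cluster two latent spaces for the
# same inputs, then pair clusters by Jaccard overlap of member samples,
# solved as an assignment problem. Clustering method and cost are assumed.
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
k, n = 10, 2000
y = rng.integers(0, k, size=n)
means_a, means_b = rng.normal(size=(k, 64)), rng.normal(size=(k, 128))
Za = means_a[y] + rng.normal(scale=0.2, size=(n, 64))    # model A's embeddings
Zb = means_b[y] + rng.normal(scale=0.2, size=(n, 128))   # model B's, same inputs

ca = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Za)
cb = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Zb)

jac = np.zeros((k, k))
for i in range(k):
    for j in range(k):
        a, b = ca == i, cb == j
        jac[i, j] = (a & b).sum() / (a | b).sum()        # Jaccard overlap
rows, cols = linear_sum_assignment(-jac)                 # maximize total overlap
print("matched cluster pairs:", list(zip(rows.tolist(), cols.tolist())))
```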
Original abstract

Latent representations learned by neural networks often exhibit semantic structure, where concept similarity is reflected by geometric proximity in embedding space. However, comparing such spaces across models remains difficult: changes in architecture, pretraining data, objective, or random seed can yield embeddings with similar content but incompatible geometry. This latent space alignment problem is central to interpretability, transfer and multimodal learning, federated systems, and semantic communication; however, progress remains limited by the lack of large-scale, model-diverse, and metadata-rich benchmarks. To address this gap, we introduce SEMASIA, a large-scale collection of latent representations extracted from approximately 1,700 pretrained vision models across eight standard image-classification benchmarks. SEMASIA pairs embeddings with structured metadata describing architectures, training regimes, pretraining sources, and model scale. We demonstrate three applications of the resource. First, we analyze the conceptual organization of individual latent spaces, showing consistent prototype-like clustering and hierarchical semantic neighborhoods across models and datasets. Second, we benchmark supervised alignment mappings between latent spaces using reconstruction error and downstream task performance. Third, we perform a large-scale regression analysis of how pretraining-data complexity, specialization, transfer learning, augmentation, and model scale relate to geometric and probing properties of embeddings. By coupling representational scale with standardized metadata, SEMASIA provides a reproducible foundation for studying latent geometry, evaluating alignment methods, and developing next-generation heterogeneous and interoperable AI systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it: the pith above is the substance; this is the friction.

Referee Report

0 major / 4 minor

Summary. The manuscript introduces SEMASIA, a large-scale dataset of latent representations extracted from approximately 1,700 pretrained vision models across eight standard image-classification benchmarks. Embeddings are paired with structured metadata on architectures, training regimes, pretraining sources, and model scale. Three applications are demonstrated: analysis of conceptual organization via prototype-like clustering and hierarchical semantic neighborhoods; benchmarking of supervised alignment mappings between spaces using reconstruction error and downstream task performance; and regression analysis relating pretraining-data complexity, specialization, transfer learning, augmentation, and model scale to geometric and probing properties of the embeddings. The central claim is that coupling this representational scale with standardized metadata supplies a reproducible foundation for studying latent geometry, evaluating alignment methods, and developing heterogeneous interoperable AI systems.

Significance. If the extraction procedures, metadata schema, and data release are fully documented and accessible, SEMASIA would be a valuable community resource. The combination of ~1,700 models with rich, standardized metadata directly addresses the current scarcity of comparable, large-scale latent-space collections, enabling systematic cross-model studies that are otherwise infeasible. The three empirical demonstrations provide concrete starting points for alignment benchmarking and property regression, and the scale itself constitutes a strength that can support future work even if the sampled models are not exhaustive.

minor comments (4)
  1. [Abstract and Dataset Construction] Abstract and §3 (Dataset Construction): the exact total number of models and embeddings should be stated precisely (rather than 'approximately 1,700') together with a breakdown by architecture family and pretraining corpus so readers can immediately assess coverage.
  2. [Alignment Benchmarking] §4.2 (Alignment Benchmarking): specify the exact form of the supervised mappings (linear, CCA, or learned nonlinear) and report the precise train/validation/test splits and hyper-parameter selection protocol used for the reconstruction-error and downstream-task evaluations (an illustrative CCA sketch follows these comments).
  3. [Regression Analysis] §5 (Regression Analysis): include the full list of predictors, their definitions, and any multicollinearity diagnostics; also state whether the reported relationships are robust to alternative model-selection criteria or to subsampling the 1,700 models.
  4. [Figures] All figures: ensure every panel contains axis labels, units, legends, and (where appropriate) error bars or statistical annotations so that the clustering, alignment, and regression results can be interpreted without reference to the main text.
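For readers weighing comment 2, the evaluation shape it asks for looks roughly like this: a CCA-style mapping fit on a training split and scored on held-out pairs. The sketch is purely illustrative, not the paper's protocol, and all sizes and hyper-parameters are arbitrary.

```python
# Illustration of the CCA-style alignment named in minor comment 2, with an
# explicit train/test split as the comment requests. A sketch of the
# evaluation shape only, not the paper's protocol.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(3)
n, dA, dB, d = 1000, 256, 384, 32
S = rng.normal(size=(n, d))                                # shared semantic factor
Za = S @ rng.normal(size=(d, dA)) + 0.1 * rng.normal(size=(n, dA))
Zb = S @ rng.normal(size=(d, dB)) + 0.1 * rng.normal(size=(n, dB))

train, test = np.arange(800), np.arange(800, n)
cca = CCA(n_components=d, max_iter=1000).fit(Za[train], Zb[train])
Ua, Ub = cca.transform(Za[test], Zb[test])                 # projections into shared space

# Per-component correlation on held-out pairs: high values mean the two
# spaces share recoverable structure; near zero would suggest overfitting.
corrs = [np.corrcoef(Ua[:, i], Ub[:, i])[0, 1] for i in range(d)]
print(f"mean held-out canonical correlation: {np.mean(corrs):.3f}")
```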

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of SEMASIA's potential value to the community, and recommendation for minor revision. We are pleased that the scale, metadata richness, and demonstrated applications are viewed as strengths.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper is a dataset contribution: it collects latent representations from ~1,700 pretrained vision models, attaches structured metadata, and reports three empirical demonstrations (clustering, alignment benchmarks, and regression on model properties). No mathematical derivations, first-principles predictions, or equations are present that could reduce to fitted inputs or self-citations by construction. The central claims rest on the release of the dataset and the reproducibility of the described extraction and analysis procedures, which are independent of any internal circular reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No mathematical derivations, free parameters, or new entities are introduced; the contribution rests on standard practices for extracting and analyzing neural embeddings from pretrained models.

pith-pipeline@v0.9.0 · 5577 in / 1146 out tokens · 61563 ms · 2026-05-12T04:18:27.054302+00:00 · methodology

