HOLE: Homological Observation of Latent Embeddings for Neural Network Interpretability

Paul Rosen; Sudhanva Manjunath Athreya

arxiv: 2512.07988 · v3 · submitted 2025-12-08 · 💻 cs.LG · cs.GR · cs.HC

Sudhanva Manjunath Athreya, Paul Rosen

Pith reviewed 2026-05-16 23:51 UTC · model grok-4.3

classification 💻 cs.LG · cs.GR · cs.HC

keywords persistent homology · neural network interpretability · latent embeddings · topological data analysis · class separation · model robustness · feature disentanglement

The pith

Persistent homology on neural network activations reveals topological patterns tied to class separation and robustness.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Deep learning models succeed at tasks but keep their internal representations hard to inspect. This paper presents HOLE, which computes persistent homology on the activations inside network layers to extract shape-based features such as connected components and holes. These features are shown through visualizations including cluster flow diagrams, blob graphs, and heatmap dendrograms that track how data structure changes from layer to layer. The work finds that the resulting topological signatures align with improved class separation, disentangled features, and greater resistance to input changes or compression. This supplies a geometric perspective that can complement existing ways of probing model behavior.

Core claim

HOLE extracts topological features from intermediate activations using persistent homology and visualizes them with cluster flow diagrams, blob graphs, and heatmap dendrograms. Evaluation on discriminative models shows these features associate with class separation, feature disentanglement, and robustness to perturbations and compression.
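To make the cluster flow idea concrete, here is a minimal sketch. It assumes each layer has already been reduced to one cluster label per sample, for example by cutting the H0 merge hierarchy at a chosen scale. The function name `cluster_flows` and the toy labels are hypothetical and not taken from the paper. The counts it returns are the link widths that a Sankey-style diagram would draw between two consecutive layers.

```python
from collections import Counter


def cluster_flows(labels_a, labels_b):
    """Count samples moving from each cluster in layer A to each cluster in layer B.

    labels_a[i] and labels_b[i] must describe the same sample i.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both layers must label the same samples in the same order")
    return Counter(zip(labels_a, labels_b))


# Toy example: five samples, two clusters in each layer (labels are illustrative).
layer_a = [0, 0, 0, 1, 1]
layer_b = ["x", "x", "y", "y", "y"]
flows = cluster_flows(layer_a, layer_b)

for (src, dst), count in sorted(flows.items(), key=str):
    print(f"cluster {src} -> cluster {dst}: {count} samples")
```

A flow table dominated by a few wide links suggests clusters that persist between layers; many thin links suggest clusters that split or mix.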

What carries the argument

Persistent homology applied directly to the intermediate activations of a neural network to track topological evolution across layers.
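As an illustration of the degree-0 case only, the following sketch computes when connected components merge in a Vietoris-Rips filtration of a small point cloud. It relies on the standard fact that H0 death times equal the edge lengths of a minimum spanning tree, found here with Kruskal's algorithm and a union-find structure. The function name `h0_deaths`, the Euclidean metric, and the convention that an edge enters at scale equal to its length are assumptions of this sketch, not details confirmed by the paper.

```python
import math
from itertools import combinations


def h0_deaths(points):
    """Return the finite H0 death times of a Vietoris-Rips filtration.

    Every point is born at scale 0. A component dies when an edge first
    joins it to another component, so the deaths are the edge lengths of a
    minimum spanning tree. One component never dies and is not listed.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for length, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # this edge merges two components
            parent[root_i] = root_j
            deaths.append(length)
    return deaths


# Two tight pairs of points separated by a wide gap.
activations = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
deaths = h0_deaths(activations)
print(deaths)
```

In this toy input, the one long bar corresponds to the gap between the two groups. The analogous reading for activations would be that a few long H0 bars indicate well-separated clusters, though whether those clusters match classes is exactly the premise questioned below.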

Load-bearing premise

That the topological invariants computed on activations reflect meaningful semantic properties such as class separation instead of unrelated geometric artifacts of the embedding spaces.

What would settle it

A test on matched models known to differ sharply in class separation that finds no corresponding difference in their persistent homology barcodes or persistence diagrams at the same layers.

Figures

Figures reproduced from arXiv: 2512.07988 by Paul Rosen, Sudhanva Manjunath Athreya.

**Figure 1.** HOLE provides global interpretability via multiple visualization techniques: (left) Sankey flows (layer-wise represen…

**Figure 2.** Example (left) persistence diagram and (right) barcode.

**Figure 3.** HOLE overview shows how during inference, neural net…

**Figure 4.** (o) The input dataset was used to generate (a-d,i-k) dis…

**Figure 5.** Examples of the visualizations used to support tasks…

**Figure 6.** Comparison of (a-d) Sankey diagrams for ViT encoder…

**Figure 7.** Input noise robustness evaluation on CIFAR-10. Various…

**Figure 8.** Comparison of (left) Blob graphs and (right) Sankey dia…

**Figure 9.** ResNet-34 persistent dendrogram + heatmap before and…

**Figure 10.** Blob visualizations of ViT encoder layer 11 activations…

Original abstract

Deep learning models have achieved remarkable success across various domains, yet their learned representations and decision-making processes remain largely opaque and hard to interpret. This work introduces HOLE (Homological Observation of Latent Embeddings), a method for analyzing and interpreting discriminative neural networks through persistent homology. HOLE extracts topological features from intermediate activations and presents them using a suite of visualization techniques, including cluster flow diagrams, blob graphs, and heatmap dendrograms. These tools facilitate the examination of representation structure and quality across layers. We evaluate HOLE using a range of discriminative models, focusing on representation quality, interpretability across layers, and robustness to input perturbations and model compression. The results indicate that topological analysis reveals patterns associated with class separation, feature disentanglement, and model robustness, providing a complementary perspective for understanding and improving deep learning systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

HOLE applies persistent homology to layer activations with some visualization tools, but the interpretability claims lack controls and metrics to show the features reflect learned structure rather than input geometry.

The main thing to know about this paper is that it takes persistent homology and runs it on the point clouds of intermediate neural network activations, then displays the results through cluster flow diagrams, blob graphs, and heatmap dendrograms. The goal is to give a topological view of how representations change across layers and under perturbations or compression. If those features really track class separation or robustness, the visualizations could be a practical addition for people who already work with topological data analysis in ML.

What the paper does reasonably well is package the method as an observational diagnostic and show example outputs on standard discriminative models. The framing treats the homology barcodes as a way to inspect representation quality and disentanglement without forcing a predictive model on top of them. That keeps the contribution focused and avoids overclaiming a new algorithm.

The soft spots are clear and central. The evaluations stay qualitative with no reported metrics, baselines, or statistical checks. There are also no controls that would separate learned structure from the geometry already present in the input data or any generic feed-forward map. Comparisons to random initialization, label shuffling, or linear probes on the same inputs are missing, so it remains possible that the observed persistence patterns are incidental rather than tied to training. This weakens the link between the topological features and the claimed interpretability benefits.

The paper is aimed at readers who already know persistent homology and want new ways to plot activations. It shows honest engagement with the literature on representation analysis, so it deserves a serious referee even though the current version would need added experiments and tighter claims before publication.

Referee Report

3 major / 2 minor

Summary. The paper introduces HOLE, a method that applies persistent homology to intermediate activations of discriminative neural networks to extract topological features, which are then visualized via cluster flow diagrams, blob graphs, and heatmap dendrograms. It claims this provides insights into representation quality, layer-wise interpretability, class separation, feature disentanglement, and robustness to perturbations and compression, evaluated across a range of models.

Significance. If validated with appropriate controls, HOLE could provide a useful topological lens for neural network interpretability that complements existing activation-based or attribution methods, potentially helping identify structural changes across layers or under model modifications.

major comments (3)

Abstract and evaluation sections: the claims of revealing patterns associated with class separation, feature disentanglement, and model robustness are supported only by qualitative descriptions; no quantitative metrics (e.g., persistence diagram distances, classification accuracies on topological features), baselines (e.g., random networks or untrained models), or statistical analysis are reported to substantiate the interpretability conclusions.
Method and experiments: the central assumption that persistent homology barcodes on activation point clouds encode learned class structure (rather than incidental geometry of the input manifold or any Lipschitz embedding) is not tested via controls such as randomly initialized networks, label-shuffled training, or linear probes on the same inputs; without these, the interpretability interpretation remains ungrounded.
Evaluation claims: the robustness analysis to input perturbations and model compression lacks specific comparisons (e.g., before/after compression persistence diagrams or correlation with accuracy drops) that would make the robustness findings load-bearing rather than observational.
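One inexpensive way to start on the second comment is a label-permutation null. This is a sketch and a stand-in, not the label-shuffled training the comment asks for. It asks whether the H0 merge structure of a fixed set of activations lines up with the class labels more than it would with randomly reassigned labels. The statistic (fraction of minimum-spanning-tree edges that join points of different classes), the function names, and the toy data are all assumptions made for illustration.

```python
import math
import random
from itertools import combinations


def mst_edges(points):
    """Edges (length, i, j) of a minimum spanning tree, i.e. the H0 merges."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    candidates = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    for length, i, j in candidates:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((length, i, j))
    return tree


def cross_class_fraction(tree, labels):
    """Fraction of merges whose two endpoints carry different labels."""
    if not tree:
        return 0.0
    return sum(labels[i] != labels[j] for _, i, j in tree) / len(tree)


def permutation_null(tree, labels, n_perm=200, seed=0):
    """Recompute the statistic under randomly shuffled labels."""
    rng = random.Random(seed)
    shuffled = list(labels)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        null.append(cross_class_fraction(tree, shuffled))
    return null


points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
labels = [0, 0, 1, 1]
tree = mst_edges(points)
observed = cross_class_fraction(tree, labels)
null = permutation_null(tree, labels)
# Empirical p-value: how often shuffled labels look at least as class-aligned.
p_value = (1 + sum(v <= observed for v in null)) / (1 + len(null))
print(observed, p_value)
```

The same statistic could then be compared across a trained network, a randomly initialized copy, and the raw inputs. A difference that appears only in the trained network would point toward learned structure rather than input geometry.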

minor comments (2)

The visualization techniques (cluster flow diagrams, blob graphs) would benefit from explicit pseudocode or parameter settings (e.g., filtration thresholds, distance metrics used in Vietoris-Rips) to allow reproducibility.
Notation for persistent homology features (e.g., barcodes, persistence diagrams) should be defined more formally with reference to standard definitions to avoid ambiguity for readers unfamiliar with topological data analysis.
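For reference, the standard definitions that the second minor comment has in mind can be stated as follows. This is a sketch of textbook notation, not the paper's own. The symbols $X$, $d$, and $\varepsilon$ are chosen here for illustration, and some authors use $2\varepsilon$ in place of $\varepsilon$ in the first definition.

```latex
% Vietoris-Rips complex of a finite point cloud X with metric d at scale \varepsilon:
% a simplex is included when all of its vertices are pairwise within \varepsilon.
\mathrm{VR}_{\varepsilon}(X) =
  \{\, \sigma \subseteq X,\ \sigma \neq \emptyset \;:\;
       d(x, y) \le \varepsilon \ \text{for all } x, y \in \sigma \,\}

% Degree-k persistence diagram: a multiset of (birth, death) pairs, one per
% homology class that appears at scale b_i and disappears at scale d_i.
\mathrm{Dgm}_k(X) = \{\, (b_i, d_i) \;:\; b_i < d_i \le \infty \,\},
\qquad
\mathrm{pers}(b_i, d_i) = d_i - b_i
```

A barcode shows the same information as a set of intervals $[b_i, d_i)$, one bar per pair.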

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment below and will revise the manuscript to incorporate quantitative controls and comparisons where they strengthen the claims without altering the core contribution.

Referee: Abstract and evaluation sections: the claims of revealing patterns associated with class separation, feature disentanglement, and model robustness are supported only by qualitative descriptions; no quantitative metrics (e.g., persistence diagram distances, classification accuracies on topological features), baselines (e.g., random networks or untrained models), or statistical analysis are reported to substantiate the interpretability conclusions.

Authors: We agree that the current version relies primarily on qualitative visualizations. In the revised manuscript we will add quantitative metrics, including Wasserstein distances between persistence diagrams across layers and models, as well as baseline comparisons against randomly initialized networks. Statistical tests will be included to support the reported patterns.
revision: yes

Referee: Method and experiments: the central assumption that persistent homology barcodes on activation point clouds encode learned class structure (rather than incidental geometry of the input manifold or any Lipschitz embedding) is not tested via controls such as randomly initialized networks, label-shuffled training, or linear probes on the same inputs; without these, the interpretability interpretation remains ungrounded.

Authors: This is a fair criticism. While the original experiments focus on trained models, we will add the suggested controls (randomly initialized networks and label-shuffled training) in the revised version. These experiments will directly test whether the observed topological signatures arise from learned class structure rather than input geometry alone.
revision: yes

Referee: Evaluation claims: the robustness analysis to input perturbations and model compression lacks specific comparisons (e.g., before/after compression persistence diagrams or correlation with accuracy drops) that would make the robustness findings load-bearing rather than observational.

Authors: We accept that more explicit quantitative links are needed. The revised manuscript will include direct before-and-after persistence diagram comparisons under compression and perturbations, together with reported correlations between topological changes and accuracy drops.
revision: yes
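The Wasserstein comparison promised in these responses can be sketched for the degree-0 case without any external library. The sketch assumes every H0 bar is born at scale 0, that the ground metric is L-infinity (so two bars with deaths a and b are |a - b| apart and a bar with death a is a/2 from the diagonal), and that the order is 1. Under those assumptions the optimal matching is monotone in sorted death time, so an alignment-style dynamic program gives the exact distance. The function name `h0_wasserstein_1` is hypothetical, and higher-degree diagrams would need a general assignment solver instead.

```python
def h0_wasserstein_1(deaths_a, deaths_b):
    """1-Wasserstein distance between two H0 diagrams with all births at 0.

    Each bar is either matched to a bar of the other diagram (cost |a - b|)
    or sent to the diagonal (cost death / 2).
    """
    a = sorted(deaths_a)
    b = sorted(deaths_b)
    n, m = len(a), len(b)

    # cost[i][j]: cheapest way to account for the first i bars of a
    # and the first j bars of b.
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = cost[i - 1][0] + a[i - 1] / 2
    for j in range(1, m + 1):
        cost[0][j] = cost[0][j - 1] + b[j - 1] / 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j] + a[i - 1] / 2,                   # a-bar to diagonal
                cost[i][j - 1] + b[j - 1] / 2,                   # b-bar to diagonal
                cost[i - 1][j - 1] + abs(a[i - 1] - b[j - 1]),   # match the two bars
            )
    return cost[n][m]


# Illustrative death times, e.g. one layer before and after compression.
before = [1.0, 1.0, 9.0]
after = [1.0, 1.0, 2.0]
distance = h0_wasserstein_1(before, after)
print(distance)
```

Reporting this distance alongside the accuracy drop for each compression level would give the before-and-after comparison a number that can be correlated and tested.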

Circularity Check

0 steps flagged

No circularity: purely observational application of standard persistent homology

The paper introduces HOLE as a visualization and analysis pipeline that applies off-the-shelf persistent homology (Vietoris-Rips or equivalent) to intermediate activation point clouds and then renders the resulting barcodes via cluster-flow diagrams, blob graphs, and dendrograms. No equations are presented that derive a new quantity from fitted parameters, no predictions are made that are statistically forced by the same data used to demonstrate them, and no uniqueness theorems or ansatzes are smuggled in via self-citation. All reported patterns (class separation, disentanglement, robustness) are empirical observations from the computed topological features; they are not shown to be equivalent to the input activations by construction. The method is therefore self-contained as an observational tool and receives a score of 0.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the domain assumption that topological invariants computed via persistent homology on activation vectors carry semantic meaning for model behavior; no free parameters or new entities are introduced in the abstract.

axioms (1)

Domain assumption: Persistent homology applied to point clouds in activation space yields features that reflect representation quality and robustness. Invoked when the abstract claims the extracted features reveal class separation and disentanglement.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear

Relation between the paper passage and the cited Recognition theorem: unclear.

Paper passage: HOLE extracts topological features from intermediate activations... persistent homology... Vietoris-Rips... H0 components... class separation, feature disentanglement

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

80 extracted references · 80 canonical work pages

[1]

R. Amar, J. Eagan, and J. Stasko. Low-level components of analytic activity in information visualization. InIEEE Symposium on Infor- mation Visualization, pp. 111–117, 2005. doi: 10.1109/INFVIS.2005 .1532136 2, 5

work page doi:10.1109/infvis.2005 2005
[2]

S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. M ¨uller, and W. Samek. On pixel-wise explanations for non-linear classifier deci- sions by layer-wise relevance propagation.PloS one, 10(7), 2015. doi: 10.1371/journal.pone.0130140 2

work page doi:10.1371/journal.pone.0130140 2015
[3]

Software design analysis and technical debt management based on design rule theory,

R. Ballester, X. Arnal, C. Casacuberta, M. Madadi, C. Corneanu, and S. Escalera. Predicting the generalization gap in neural networks us- ing topological data analysis.Neurocomputing, 2024. doi: 10.1016/j. neucom.2024.127787 3

work page doi:10.1016/j 2024
[4]

Banner, Y

R. Banner, Y . Nahshan, E. Hoffer, and D. Soudry. Post training 4- bit quantization of convolutional networks for rapid-deployment. In Advances in Neural Information Processing Systems, vol. 32, 2019. 9

work page 2019
[5]

Barocas and A

S. Barocas and A. D. Selbst.Big data’s disparate impact, vol. 104. HeinOnline, 2016. doi: 10.2139/ssrn.2477899 1

work page doi:10.2139/ssrn.2477899 2016
[6]

D. Bau, B. Zhou, A. Khosla, A. Oliva, and A. Torralba. Network dissection: Quantifying interpretability of deep visual representations. InIEEE Conference on Computer Vision and Pattern Recognition, pp. 6541–6549, 2017. doi: 10.1109/CVPR.2017.354 2

work page doi:10.1109/cvpr.2017.354 2017
[7]

Birdal, A

T. Birdal, A. Lou, L. Guibas, and U. Simsekli. Intrinsic dimension, persistent homology and generalization in neural networks. InAd- vances in Neural Information Processing Systems, 2021. 3

work page 2021
[8]

Blalock, J

D. Blalock, J. J. G. Ortiz, J. Frankle, and J. Guttag. What is the state of neural network pruning?Machine Learning and Systems, 2:129–146,

work page
[9]

P. Bubenik. Statistical topological data analysis using persistence landscapes.Journal of Machine Learning Research, 16:77–102, 2015. 2

work page 2015
[10]

Carlsson

G. Carlsson. Topology and data.Bulletin of the American Mathe- matical Society, 46(2):255–308, 2009. doi: 10.1090/S0273-0979-09 -01249-X 4

work page doi:10.1090/s0273-0979-09 2009
[11]

Carri `ere, M

M. Carri `ere, M. Cuturi, and S. Oudot. Sliced wasserstein kernel for persistence diagrams. InInternational Conference on Machine Learn- ing, pp. 664–673, 2017. 2

work page 2017
[12]

Carri `ere, M

M. Carri `ere, M. Cuturi, S. Oudot, and B. Rieck. Perslay: A neu- ral network layer for persistence diagrams and new graph topological signatures. InAISTATS, 2020. 3

work page 2020
[13]

Cohen-Steiner, H

D. Cohen-Steiner, H. Edelsbrunner, and J. Harer. Stability of persis- tence diagrams.Discrete & Computational Geometry, 37(1):103–120,

work page
[14]

doi: 10.1007/s00454-006-1276-5 2

work page doi:10.1007/s00454-006-1276-5
[15]

Dettmers, M

T. Dettmers, M. Lewis, Y . Belkada, and L. Zettlemoyer. Gpt3.int8(): 8-bit matrix multiplication for transformers at scale.Advances in Neu- ral Information Processing Systems, 35:30318–30332, 2022. 9

work page 2022
[16]

Devlin, M.-W

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding.arXiv preprint, 2018. 1

work page 2018
[17]

Dosovitskiy, L

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby. An image is worth 16x16 words: Trans- formers for image recognition at scale. InInternational Conference on Learning Representations, 2021. 7

work page 2021
[18]

Edelsbrunner and J

H. Edelsbrunner and J. Harer.Computational topology: an introduc- tion. American Mathematical Soc., 2010. 4

work page 2010
[19]

Edelsbrunner, D

H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological per- sistence and simplification.Discrete & Computational Geometry, 28(4):511–533, 2002. doi: 10.1109/SFCS.2000.892133 2, 4

work page doi:10.1109/sfcs.2000.892133 2002
[20]

Erhan, Y

D. Erhan, Y . Bengio, A. Courville, and P. Vincent. Visualizing higher- layer features of a deep network. InInternational Conference on Ma- chine Learning, pp. 341–348, 2009. 2

work page 2009
[21]

S. K. Esser, J. L. McKinstry, D. Bablani, R. Appuswamy, and D. S. Modha. Learned step size quantization. InInternational Conference on Learning Representations, 2020. 9

work page 2020
[22]

Feldman, M

D. Feldman, M. Schmidt, and C. Sohler. Turning big data into tiny data: Constant-size coresets for k-means, pca, and projective clus- tering.SIAM Journal on Computing, 49(3):601–657, 2020. doi: 10. 1137/18M1209854 10 10

work page 2020
[23]

Frankle and M

J. Frankle and M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. InInternational Conference on Learning Representations, 2019. 8

work page 2019
[24]

Gholami, S

A. Gholami, S. Kim, Z. Dong, Z. Yao, M. W. Mahoney, and K. Keutzer. A survey of quantization methods for efficient neural net- work inference.arXiv preprint, 2021. doi: 10.1201/9781003162810 -13 9

work page doi:10.1201/9781003162810 2021
[25]

R. Ghrist. Barcodes: The persistent topology of data.Bulletin of the American Mathematical Society, 45(1):61–75, 2008. doi: 10.1090/ S0273-0979-07-01191-3 2

work page 2008
[26]

Goodfellow, Y

I. Goodfellow, Y . Bengio, and A. Courville.Deep learning. MIT press, 2016. 1

work page 2016
[27]

Goodfellow, J

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y . Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, vol. 27, 2014. 10

work page 2014
[28]

Guti ´errez-Fandi˜no, D

A. Guti ´errez-Fandi˜no, D. P ´erez-Fern´andez, J. Armengol-Estap ´e, and M. Villegas. Persistent homology captures the generalization of neural networks without a validation set.arXiv preprint, 2021. 3

work page 2021
[29]

S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. InInternational Conference on Learning Representations,

work page
[30]

ICLR 2016 (oral). 8, 9

work page 2016
[31]

Hassibi and D

B. Hassibi and D. G. Stork. Second order derivatives for network pruning: Optimal brain surgeon.Advances in Neural Information Pro- cessing Systems, 5, 1993. 8

work page 1993
[32]

K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. InIEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016. doi: 10.1109/CVPR.2016.90 6

work page doi:10.1109/cvpr.2016.90 2016
[33]

Hofer, R

C. Hofer, R. Kwitt, M. Niethammer, and A. Uhl. Deep learning with topological signatures. InAdvances in Neural Information Processing Systems, 2017. 3

work page 2017
[34]

Hohman, H

F. Hohman, H. Park, C. Robinson, and D. H. Chau. Summit: Scaling deep learning interpretability by visualizing activation and attribution summarizations.IEEE Transactions on Visualization and Computer Graphics, 26(1):1–12, 2020. doi: 10.1109/TVCG.2019.2934659 3

work page doi:10.1109/tvcg.2019.2934659 2020
[35]

In: Proceedings of the IEEE conference on computer vi- sion and pattern recognition

B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko. Quantization and training of neural networks for efficient integer-arithmetic-only inference. InIEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713, 2018. doi: 10.1109/CVPR.2018.00286 9

work page doi:10.1109/cvpr.2018.00286 2018
[36]

Kahng, P

M. Kahng, P. Y . Andrews, A. Kalro, and D. H. Chau. Activis: Vi- sual exploration of industry-scale deep neural network models.IEEE Transactions on Visualization and Computer Graphics, 24(1):88–97,

work page
[37]

doi: 10.1109/TVCG.2017.2744718 3

work page doi:10.1109/tvcg.2017.2744718 2017
[38]

A. E. Khandani, A. J. Kim, and A. W. Lo. Consumer credit-risk mod- els via machine-learning algorithms.Journal of Banking & Finance, 34(11):2767–2787, 2010. 1

work page 2010
[39]

B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, and R. Sayres. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (tcav). InInternational Con- ference on Machine Learning, pp. 2668–2677, 2018. 2

work page 2018
[40]

D. P. Kingma and M. Welling. Auto-encoding variational bayes.arXiv preprint, 2013. 10

work page 2013
[41]

Krishnamoorthi

R. Krishnamoorthi. Quantizing deep convolutional networks for effi- cient inference: A whitepaper.arXiv preprint, 2018. 9

work page 2018
[42]

Krizhevsky

A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009. Technical Report. 6

work page 2009
[43]

Krizhevsky, I

A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classifica- tion with deep convolutional neural networks. InAdvances in Neural Information Processing Systems, pp. 1097–1105, 2012. doi: 10.1145/ 3065386 1

work page 2012
[44]

Nature, 521, 436 –444, https://doi.org/10.1038/nature14539

Y . LeCun, Y . Bengio, and G. Hinton. Deep learning.Nature, 521(7553):436–444, 2015. doi: 10.1038/nature14539 1

work page doi:10.1038/nature14539 2015
[45]

LeCun, J

Y . LeCun, J. Denker, and S. Solla. Optimal brain damage.Advances in Neural Information Processing Systems, 2, 1989. 8

work page 1989
[46]

H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient convnets. InInternational Conference on Learning Representations, 2017. 8

work page 2017
[47]

M. Liu, J. Shi, Z. Li, C. Li, J. Zhu, and S. Liu. Towards better anal- ysis of deep convolutional neural networks.IEEE Transactions on Visualization and Computer Graphics, 23(1):831–840, 2017. doi: 10. 1109/TVCG.2016.2598831 3

work page arXiv 2017
[48]

Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang. Learning efficient convolutional networks through network slimming. InIEEE International Conference on Computer Vision, pp. 2736–2744, 2017. doi: 10.1109/ICCV.2017.298 8

work page doi:10.1109/iccv.2017.298 2017
[49]

F. J. L ´opez Iturriaga and I. P. Sanz. Machine learning: Challenges, lessons, and opportunities in credit risk modeling.Moody’s Analytics Risk Perspectives, 2013. 1

work page 2013
[50]

A. Lou, D. Lim, I. Katsman, L. Huang, Q. Jiang, S.-N. Lim, and C. De Sa. Neural manifold ordinary differential equations.Advances in Neural Information Processing Systems, 33:17548–17558, 2020. 10

work page 2020
[51]

Louizos, M

C. Louizos, M. Welling, and D. P. Kingma. Learning sparse neural networks throughl 0 regularization. InInternational Conference on Learning Representations, 2018. 8

work page 2018
[52]

S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. InAdvances in Neural Information Processing Systems, vol. 30, pp. 4765–4774, 2017. 2

work page 2017
[53]

Maria, J.-D

C. Maria, J.-D. Boissonnat, M. Glisse, and M. Yvinec. The gudhi library: Simplicial complexes and persistent homology. InInter- national Congress on Mathematical Software (ICMS), pp. 167–174,

work page
[54]

doi: 10.1007/978-3-662-44199-2 28 2, 6

work page doi:10.1007/978-3-662-44199-2
[55]

ACM Comput

N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan. A survey on bias and fairness in machine learning.ACM Computing Surveys (CSUR), 54(6):1–35, 2021. doi: 10.1145/3457607 1

work page doi:10.1145/3457607 2021
[56]

S. Migacz. 8-bit inference with tensorrt. InGPU Technology Confer- ence, 2017. 9

work page 2017
[57]

Molchanov, S

P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz. Pruning convolutional neural networks for resource efficient inference. InIn- ternational Conference on Learning Representations, 2017. 8

work page 2017
[58]

Molnar.Interpretable machine learning

C. Molnar.Interpretable machine learning. Lulu. com, 2020. 1

work page 2020
[59]

M. Moor, M. Horn, B. Rieck, and K. Borgwardt. Topological autoen- coders. InInternational Conference on Machine Learning, 2020. 3

work page 2020
[60]

Nagel, M

M. Nagel, M. v. Baalen, T. Blankevoort, and M. Welling. Data-free quantization through weight equalization and bias correction. InIEEE International Conference on Computer Vision, pp. 1325–1334, 2019. doi: 10.1109/ICCV.2019.00141 9

work page doi:10.1109/iccv.2019.00141 2019
[61]

C. Olah, A. Mordvintsev, and L. Schubert. Feature visualization.Dis- till, 2017. doi: 10.23915/distill.00007 2

work page doi:10.23915/distill.00007 2017
[63]

Experimental observations of the topology of convolutional neural network activations

E. Purvine et al. Experimental observations of the topology of convo- lutional neural network activations. InIEEE Symposium on Visualiza- tion for Cyber Security, 2022. doi: 10.1609/aaai.v37i8.26134 3

work page doi:10.1609/aaai.v37i8.26134 2022
[64]

Scalable and accurate deep learning with electronic health records

A. Rajkomar, E. Oren, K. Chen, A. M. Dai, N. Hajaj, M. Hardt, P. J. Liu, X. Liu, J. Marcus, M. Sun, et al. Scalable and accurate deep learning with electronic health records.NPJ Digital Medicine, 1(1):1– 10, 2018. doi: 10.1038/s41746-018-0029-1 1

work page doi:10.1038/s41746-018-0029-1 2018
[65]

M. T. Ribeiro, S. Singh, and C. Guestrin. ”why should i trust you?” explaining the predictions of any classifier. InACM SIGKDD Inter- national Conference on Knowledge Discovery and Data Mining, pp. 1135–1144, 2016. doi: 10.18653/v1/N16-3020 2

work page doi:10.18653/v1/n16-3020 2016
[66]

Rieck, M

B. Rieck, M. Togninalli, M. Bianchini, J. M. Buhmann, C. Kenel, D. Lun, A. Radeghieri, C. Ertle, and D. H”ottger. Neural persistence: A complexity measure for deep neural networks using algebraic topol- ogy. InInternational Conference on Learning Representations, 2019. ICLR. 3

work page 2019
[67]

C. Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead.Nature Machine Intelligence, 1(5):206–215, 2019. doi: 10.1038/s42256-019-0048-x 1

work page doi:10.1038/s42256-019-0048-x 2019
[68]

R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. InIEEE International Conference on Computer Vision, pp. 618–626, 2017. doi: 10.1007/s11263-019 -01228-7 2 11

work page doi:10.1007/s11263-019 2017
[69]

B. W. Silverman.Density Estimation for Statistics and Data Analysis. Routledge, 1st ed., 2018. doi: 10.1201/9781315140919 10

work page doi:10.1201/9781315140919 2018
[70]

Singh, F

G. Singh, F. M ´emoli, G. E. Carlsson, et al. Topological methods for the analysis of high dimensional data sets and 3d object recognition. PBG@ Eurographics, 2:091–100, 2007. 2

work page 2007
[71]

Smilkov, N

D. Smilkov, N. Thorat, B. Kim, F. Vi ´egas, and M. Wattenberg. Smoothgrad: removing noise by adding noise.arXiv preprint, 2017. 2

work page 2017
[72]

Sundararajan, A

M. Sundararajan, A. Taly, and Q. Yan. Axiomatic attribution for deep networks.International Conference on Machine Learning, pp. 3319– 3328, 2017. 2

work page 2017
[73]

Szegedy, W

C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfel- low, and R. Fergus. Intriguing properties of neural networks.arXiv preprint, 2013. 10

work page 2013
[74]

E. J. Topol. High-performance medicine: the convergence of human and artificial intelligence.Nature medicine, 25(1):44–56, 2019. doi: 10.1038/s41591-018-0300-7 1

work page doi:10.1038/s41591-018-0300-7 2019
[75]

Z. J. Wang, R. Turko, O. Shaikh, H. Park, N. Das, F. Hohman, M. Kahng, and D. H. Chau. Cnn explainer: Learning convolutional neural networks with interactive visualization.IEEE Transactions on Visualization and Computer Graphics, 27(1):1396–1406, 2021. doi: 10.1109/TVCG.2020.3030418 3

work page doi:10.1109/tvcg.2020.3030418 2021
[76]

Watanabe and H

S. Watanabe and H. Yamana. Topological measurement of deep neural networks using persistent homology.Complexity, 2021. doi: 10.1007/ s10472-021-09761-3 3

work page 2021
[77]

Wheeler, V

B. Wheeler, V . Bouza, and P. Bubenik. Activation landscapes as a topological summary of neural network performance. InInternational Conference on Machine Learning, 2021. doi: 10.1109/BigData52589 .2021.9671368 3

[78]

H. Wu, P. Judd, X. Zhang, M. Isaev, and P. Micikevicius. Integer quantization for deep learning inference: Principles and empirical evaluation. arXiv preprint, 2020. 9

[79]

Z. Yao, Z. Dong, Z. Zheng, A. Gholami, J. Yu, E. Tan, L. Wang, Q. Huang, Y. Wang, M. Mahoney, et al. Hawq-v3: Dyadic neural network quantization. In International Conference on Machine Learning, pp. 11875–11886, 2021. 9

[80]

M. D. Zeiler and R. Fergus. Visualizing and understanding convolutional networks. In European Conference on Computer Vision, pp. 818–833, 2014. doi: 10.1007/978-3-319-10590-1_53 2

[81]

A. Zomorodian and G. Carlsson. Computing persistent homology. Discrete & Computational Geometry, 33(2):249–274, 2005. doi: 10.1145/997817.997870 2 12


[1]

R. Amar, J. Eagan, and J. Stasko. Low-level components of analytic activity in information visualization. In IEEE Symposium on Information Visualization, pp. 111–117, 2005. doi: 10.1109/INFVIS.2005.1532136 2, 5

[2]

S. Bach, A. Binder, G. Montavon, F. Klauschen, K.-R. Müller, and W. Samek. On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PloS one, 10(7), 2015. doi: 10.1371/journal.pone.0130140 2

[3]

R. Ballester, X. Arnal, C. Casacuberta, M. Madadi, C. Corneanu, and S. Escalera. Predicting the generalization gap in neural networks using topological data analysis. Neurocomputing, 2024. doi: 10.1016/j.neucom.2024.127787 3

[4]

R. Banner, Y. Nahshan, E. Hoffer, and D. Soudry. Post training 4-bit quantization of convolutional networks for rapid-deployment. In Advances in Neural Information Processing Systems, vol. 32, 2019. 9

[5]

S. Barocas and A. D. Selbst. Big data’s disparate impact, vol. 104. HeinOnline, 2016. doi: 10.2139/ssrn.2477899 1

[6]

D. Bau, B. Zhou, A. Khosla, A. Oliva, and A. Torralba. Network dissection: Quantifying interpretability of deep visual representations. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 6541–6549, 2017. doi: 10.1109/CVPR.2017.354 2

[7]

T. Birdal, A. Lou, L. Guibas, and U. Simsekli. Intrinsic dimension, persistent homology and generalization in neural networks. In Advances in Neural Information Processing Systems, 2021. 3

[8]

D. Blalock, J. J. G. Ortiz, J. Frankle, and J. Guttag. What is the state of neural network pruning? Machine Learning and Systems, 2:129–146,

[9]

P. Bubenik. Statistical topological data analysis using persistence landscapes. Journal of Machine Learning Research, 16:77–102, 2015. 2

[10]

G. Carlsson. Topology and data. Bulletin of the American Mathematical Society, 46(2):255–308, 2009. doi: 10.1090/S0273-0979-09-01249-X 4

[11]

M. Carrière, M. Cuturi, and S. Oudot. Sliced wasserstein kernel for persistence diagrams. In International Conference on Machine Learning, pp. 664–673, 2017. 2

[12]

M. Carrière, M. Cuturi, S. Oudot, and B. Rieck. Perslay: A neural network layer for persistence diagrams and new graph topological signatures. In AISTATS, 2020. 3

[13]

D. Cohen-Steiner, H. Edelsbrunner, and J. Harer. Stability of persistence diagrams. Discrete & Computational Geometry, 37(1):103–120, doi: 10.1007/s00454-006-1276-5 2

[15]

T. Dettmers, M. Lewis, Y. Belkada, and L. Zettlemoyer. Gpt3.int8(): 8-bit matrix multiplication for transformers at scale. Advances in Neural Information Processing Systems, 35:30318–30332, 2022. 9

[16]

J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint, 2018. 1

[17]

A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2021. 7

[18]

H. Edelsbrunner and J. Harer. Computational topology: an introduction. American Mathematical Soc., 2010. 4

[19]

H. Edelsbrunner, D. Letscher, and A. Zomorodian. Topological persistence and simplification. Discrete & Computational Geometry, 28(4):511–533, 2002. doi: 10.1109/SFCS.2000.892133 2, 4

[20]

D. Erhan, Y. Bengio, A. Courville, and P. Vincent. Visualizing higher-layer features of a deep network. In International Conference on Machine Learning, pp. 341–348, 2009. 2

[21]

S. K. Esser, J. L. McKinstry, D. Bablani, R. Appuswamy, and D. S. Modha. Learned step size quantization. In International Conference on Learning Representations, 2020. 9

[22]

D. Feldman, M. Schmidt, and C. Sohler. Turning big data into tiny data: Constant-size coresets for k-means, pca, and projective clustering. SIAM Journal on Computing, 49(3):601–657, 2020. doi: 10.1137/18M1209854 10 10

[23]

J. Frankle and M. Carbin. The lottery ticket hypothesis: Finding sparse, trainable neural networks. In International Conference on Learning Representations, 2019. 8

[24]

A. Gholami, S. Kim, Z. Dong, Z. Yao, M. W. Mahoney, and K. Keutzer. A survey of quantization methods for efficient neural network inference. arXiv preprint, 2021. doi: 10.1201/9781003162810-13 9

[25]

R. Ghrist. Barcodes: The persistent topology of data. Bulletin of the American Mathematical Society, 45(1):61–75, 2008. doi: 10.1090/S0273-0979-07-01191-3 2

[26]

I. Goodfellow, Y. Bengio, and A. Courville. Deep learning. MIT press, 2016. 1

[27]

I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, vol. 27, 2014. 10

[28]

A. Gutiérrez-Fandiño, D. Pérez-Fernández, J. Armengol-Estapé, and M. Villegas. Persistent homology captures the generalization of neural networks without a validation set. arXiv preprint, 2021. 3

[29]

S. Han, H. Mao, and W. J. Dally. Deep compression: Compressing deep neural networks with pruning, trained quantization and huffman coding. In International Conference on Learning Representations, ICLR 2016 (oral). 8, 9

[31]

B. Hassibi and D. G. Stork. Second order derivatives for network pruning: Optimal brain surgeon. Advances in Neural Information Processing Systems, 5, 1993. 8

[32]

K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778, 2016. doi: 10.1109/CVPR.2016.90 6

[33]

C. Hofer, R. Kwitt, M. Niethammer, and A. Uhl. Deep learning with topological signatures. In Advances in Neural Information Processing Systems, 2017. 3

[34]

F. Hohman, H. Park, C. Robinson, and D. H. Chau. Summit: Scaling deep learning interpretability by visualizing activation and attribution summarizations. IEEE Transactions on Visualization and Computer Graphics, 26(1):1–12, 2020. doi: 10.1109/TVCG.2019.2934659 3

[35]

B. Jacob, S. Kligys, B. Chen, M. Zhu, M. Tang, A. Howard, H. Adam, and D. Kalenichenko. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In IEEE Conference on Computer Vision and Pattern Recognition, pp. 2704–2713, 2018. doi: 10.1109/CVPR.2018.00286 9

[36]

M. Kahng, P. Y. Andrews, A. Kalro, and D. H. Chau. Activis: Visual exploration of industry-scale deep neural network models. IEEE Transactions on Visualization and Computer Graphics, 24(1):88–97, doi: 10.1109/TVCG.2017.2744718 3

[38]

A. E. Khandani, A. J. Kim, and A. W. Lo. Consumer credit-risk models via machine-learning algorithms. Journal of Banking & Finance, 34(11):2767–2787, 2010. 1

[39]

B. Kim, M. Wattenberg, J. Gilmer, C. Cai, J. Wexler, F. Viegas, and R. Sayres. Interpretability beyond feature attribution: Quantitative testing with concept activation vectors (tcav). In International Conference on Machine Learning, pp. 2668–2677, 2018. 2

[40]

D. P. Kingma and M. Welling. Auto-encoding variational bayes. arXiv preprint, 2013. 10

[41]

R. Krishnamoorthi. Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv preprint, 2018. 9

[42]

A. Krizhevsky. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009. Technical Report. 6

[43]

A. Krizhevsky, I. Sutskever, and G. E. Hinton. Imagenet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097–1105, 2012. doi: 10.1145/3065386 1

[44]

Y. LeCun, Y. Bengio, and G. Hinton. Deep learning. Nature, 521(7553):436–444, 2015. doi: 10.1038/nature14539 1

[45]

Y. LeCun, J. Denker, and S. Solla. Optimal brain damage. Advances in Neural Information Processing Systems, 2, 1989. 8

[46]

H. Li, A. Kadav, I. Durdanovic, H. Samet, and H. P. Graf. Pruning filters for efficient convnets. In International Conference on Learning Representations, 2017. 8

[47]

M. Liu, J. Shi, Z. Li, C. Li, J. Zhu, and S. Liu. Towards better analysis of deep convolutional neural networks. IEEE Transactions on Visualization and Computer Graphics, 23(1):831–840, 2017. doi: 10.1109/TVCG.2016.2598831 3

[48]

Z. Liu, J. Li, Z. Shen, G. Huang, S. Yan, and C. Zhang. Learning efficient convolutional networks through network slimming. In IEEE International Conference on Computer Vision, pp. 2736–2744, 2017. doi: 10.1109/ICCV.2017.298 8

[49]

F. J. López Iturriaga and I. P. Sanz. Machine learning: Challenges, lessons, and opportunities in credit risk modeling. Moody’s Analytics Risk Perspectives, 2013. 1

[50]

A. Lou, D. Lim, I. Katsman, L. Huang, Q. Jiang, S.-N. Lim, and C. De Sa. Neural manifold ordinary differential equations. Advances in Neural Information Processing Systems, 33:17548–17558, 2020. 10

[51]

C. Louizos, M. Welling, and D. P. Kingma. Learning sparse neural networks through L0 regularization. In International Conference on Learning Representations, 2018. 8

[52]

S. M. Lundberg and S.-I. Lee. A unified approach to interpreting model predictions. In Advances in Neural Information Processing Systems, vol. 30, pp. 4765–4774, 2017. 2

[53]

C. Maria, J.-D. Boissonnat, M. Glisse, and M. Yvinec. The gudhi library: Simplicial complexes and persistent homology. In International Congress on Mathematical Software (ICMS), pp. 167–174, doi: 10.1007/978-3-662-44199-2_28 2, 6

[55]

N. Mehrabi, F. Morstatter, N. Saxena, K. Lerman, and A. Galstyan. A survey on bias and fairness in machine learning. ACM Computing Surveys (CSUR), 54(6):1–35, 2021. doi: 10.1145/3457607 1

[56]

S. Migacz. 8-bit inference with tensorrt. In GPU Technology Conference, 2017. 9

[57]

P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz. Pruning convolutional neural networks for resource efficient inference. In International Conference on Learning Representations, 2017. 8

[58]

C. Molnar. Interpretable machine learning. Lulu.com, 2020. 1

[59]

M. Moor, M. Horn, B. Rieck, and K. Borgwardt. Topological autoencoders. In International Conference on Machine Learning, 2020. 3

[60]

M. Nagel, M. v. Baalen, T. Blankevoort, and M. Welling. Data-free quantization through weight equalization and bias correction. In IEEE International Conference on Computer Vision, pp. 1325–1334, 2019. doi: 10.1109/ICCV.2019.00141 9

[61]

C. Olah, A. Mordvintsev, and L. Schubert. Feature visualization. Distill, 2017. doi: 10.23915/distill.00007 2

[63]

E. Purvine et al. Experimental observations of the topology of convolutional neural network activations. In IEEE Symposium on Visualization for Cyber Security, 2022. doi: 10.1609/aaai.v37i8.26134 3

[64]

A. Rajkomar, E. Oren, K. Chen, A. M. Dai, N. Hajaj, M. Hardt, P. J. Liu, X. Liu, J. Marcus, M. Sun, et al. Scalable and accurate deep learning with electronic health records. NPJ Digital Medicine, 1(1):1–10, 2018. doi: 10.1038/s41746-018-0029-1 1

[65]

M. T. Ribeiro, S. Singh, and C. Guestrin. “why should i trust you?” explaining the predictions of any classifier. In ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1135–1144, 2016. doi: 10.18653/v1/N16-3020 2

[66]

B. Rieck, M. Togninalli, M. Bianchini, J. M. Buhmann, C. Kenel, D. Lun, A. Radeghieri, C. Ertle, and D. Höttger. Neural persistence: A complexity measure for deep neural networks using algebraic topology. In International Conference on Learning Representations, 2019. ICLR. 3

[67]

C. Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5):206–215, 2019. doi: 10.1038/s42256-019-0048-x 1
