Machine Learning Approaches to Building Quantum Circuits for Sets of Matrices

Andrei Morozov; Matvei Fedin

arxiv: 2605.06633 · v2 · pith:KRFMONNWnew · submitted 2026-05-07 · 🪐 quant-ph · hep-th


Pith reviewed 2026-05-21 08:29 UTC · model grok-4.3

classification 🪐 quant-ph hep-th

keywords quantum circuits · machine learning · diagonal matrices · quantum algorithms · analytic circuits · interpretable ML


The pith

Machine learning parameters yield a universal shortest quantum circuit for diagonal matrices of any size.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper applies interpretable machine learning to the task of constructing quantum circuits for sets of matrices. By inspecting the parameters learned by the algorithm, the authors extract an analytic expression that gives the shortest quantum circuit for any diagonal matrix, independent of its dimension. This removes the need to run separate numerical optimizations for each matrix size or instance. A reader would care because it points to a method for turning black-box learning into explicit, reusable quantum algorithms for structured linear operations. If the extraction is valid, it supplies a concrete recipe that can be implemented directly on quantum hardware for diagonal unitaries of arbitrary size.

Core claim

By studying the parameters of the machine learning algorithm the authors construct a universal shortest analytic quantum algorithm for an arbitrary diagonal matrix of any size.

What carries the argument

Interpretable machine learning applied to quantum-circuit parameters, from which an explicit size-independent analytic circuit is read off.

If this is right

Yields a single closed-form circuit that works for every diagonal matrix rather than a family of circuits that must be re-derived.
Reduces the classical preprocessing cost for applying diagonal operations inside larger quantum algorithms.
Supplies an explicit gate decomposition whose depth is independent of matrix size once the pattern is recognized.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same parameter-inspection technique might be applied to other structured matrix families such as circulant or Toeplitz matrices.
If the extracted circuit is minimal, it could serve as a benchmark for automated quantum-circuit compilers targeting diagonal operations.

Load-bearing premise

The parameters discovered by the machine learning procedure directly correspond to an optimal, size-independent analytic quantum circuit without requiring further numerical optimization or case-by-case adjustments.

What would settle it

Implement the extracted analytic circuit on a quantum simulator or device for a diagonal matrix of dimension larger than any matrix used during training and check whether it exactly reproduces the target unitary without additional gate tuning.
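
A numerical version of this check can be sketched as follows. The sketch does not use the paper's extracted circuit, which is not reproduced on this page. It assumes instead the standard textbook construction in which a diagonal unitary is written as a product of factors exp(i a_S Z_S), one per subset S of qubits, with the angles a_S obtained from a Walsh-Hadamard transform of the target phases. On hardware each such factor is commonly realized as an RZ rotation conjugated by CNOT gates. The function names and the choice n = 6 are illustrative assumptions.

```python
import numpy as np

def parity_signs(n):
    """Sign table s[S, x] = (-1)^popcount(S & x) for n qubits (a Walsh-Hadamard matrix)."""
    dim = 1 << n
    idx = np.arange(dim)
    overlap = idx[:, None] & idx[None, :]
    parity = np.zeros_like(overlap)
    for b in range(n):
        parity ^= (overlap >> b) & 1      # accumulate the parity of the shared bits
    return 1 - 2 * parity

def walsh_angles(phases):
    """Angles a_S with phases[x] = sum_S a_S * s[S, x] (inverse Walsh-Hadamard transform)."""
    dim = len(phases)
    n = dim.bit_length() - 1
    return parity_signs(n) @ phases / dim

def circuit_diagonal(angles):
    """Diagonal of the product over S of exp(i * a_S * Z_S), applied factor by factor."""
    dim = len(angles)
    n = dim.bit_length() - 1
    signs = parity_signs(n)
    diag = np.ones(dim, dtype=complex)
    for S in range(dim):
        diag *= np.exp(1j * angles[S] * signs[S])
    return diag

# Hypothetical test case: a size assumed to be larger than anything used for fitting.
rng = np.random.default_rng(0)
n = 6
phases = rng.uniform(-np.pi, np.pi, size=1 << n)
target = np.exp(1j * phases)
built = circuit_diagonal(walsh_angles(phases))
max_err = np.max(np.abs(built - target))
print(f"n = {n}, max deviation from target diagonal: {max_err:.2e}")
```

A check of the paper's own circuit would replace `walsh_angles` and `circuit_diagonal` with the extracted closed form and keep the final comparison unchanged.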

Figures

Figures reproduced from arXiv: 2605.06633 by Andrei Morozov, Matvei Fedin.

**Figure 1.** Log-scale plot. Surrounding text: … data, but loses the ability to predict new data, the initial sample is divided into three disjoint subsets. Training set: used for direct optimization of model parameters. Validation set: used to select hyperparameters (architectural parameters that cannot be adjusted during training, for example, the degree of a regression polynomial or the learning rate). Test set: used f…

**Figure 2.** Comparison of two schemes having approximately similar final operators, but…

**Figure 3.** CNOT and some equivalent QCs

**Figure 4.** We represent the qubit parameters as a vector in…

**Figure 5.** Plots of decomposition parameters $\bar{T}_2(\varphi)$, which show a set of points corresponding to a huge number of decompositions obtained numerically using the qiskit library. These plots clearly show the linearity of the circuit parameters with respect to $\varphi$ and the jumps explained using…

**Figure 6.** Final simplified two-qubit scheme. Surrounding text: As a result of numerical experiments on the decomposition of the matrix $\bar{T}_2(\varphi)$ for various $\varphi$, we obtained graphs of various parameters of the quantum circuit depending on $\varphi$ in…

**Figure 7.** Workflow graph. Surrounding text: … the scheme and in the group may be different. This means that numerical methods cannot obtain the minimum and optimal expansion, as we have already noted in the introduction. However, we assume that these data can still be adequately described by the model. On the other hand, now the mapping from the parameters of quantum circuits to the parameters of a group element is surjective: we have s…

**Figure 8.** Raw Data generating. Surrounding text: … due to limitations, this is only possible on a processor with a high latency per object. The most optimal and widely used alternative in machine learning is the PCA method, described in more detail in Appendix 7.2, followed by clustering. Geometrically, it is possible to represent the distribution of parameters in the parameter space, specifically each point from the set $\{\vec{x}_1, \vec{x}$…

**Figure 9.** Pretty Data generating. Surrounding text: 4. Now we have three vectors $\vec{y}_1$, $\vec{y}_2$, $\vec{y}_3$ (their component values are not recoverable from the extracted text). 5. Filtering this data for different types of QC requires investigating the qiskit decomposition of a set of unitary operators. This set consists of $\{\vec{x}_1, \vec{x}_2, \vec{x}_3\}$, where $\vec{x}_1 = (\tfrac{\pi}{2}, \tfrac{\pi}{2}, -\tfrac{\pi}{2}, -\tfrac{\pi}{2}, \tfrac{\pi}{2}, -\tfrac{\pi}{2}, \tfrac{\pi}{2}, -\tfrac{\pi}{2}, 0.0146, 0$…

**Figure 10.** One possible design for a two-qubit diagonal operator.

**Figure 12.** Distribution of Circuits by Clusters

**Figure 13.** Dependence of Largest Cluster Share on Number of Qubits

**Figure 14.** Plots of acceptable parameters α and β and their optimal choice. Surrounding text: Thus, if our model converges, then a linear mapping exists, and if the model does not converge, this means that a linear mapping does not exist; the only question is the speed of its convergence. To implement this sequence of gradient descent steps in pytorch, we use the code specified in Listing 2. Listing 2: Appropriate PyTorch scheduler c…

**Figure 15.** Normalized metrics for test set. Surrounding text: … [31]. 5.1.1.3 Weights and QC comparison. Looking at the weights of the model built using the non-optimized scheme generated by qiskit, we see its block structure (the matrix is truncated at source after the start of its seventh row):

$$W_{\text{raw}} = \begin{pmatrix} 1 & 0 & 0 & -\tfrac12 & -\tfrac12 & -\tfrac12 & -\tfrac12 \\ 1 & 0 & 0 & -\tfrac12 & -\tfrac12 & -\tfrac12 & -\tfrac12 \\ 0 & 1 & 0 & -\tfrac12 & \tfrac12 & \tfrac12 & -\tfrac12 \\ 0 & 1 & 0 & -\tfrac12 & \tfrac12 & \tfrac12 & -\tfrac12 \\ 0 & 0 & 1 & -\tfrac12 & -\tfrac12 & \tfrac12 & \tfrac12 \\ 0 & 0 & 1 & -\tfrac12 & -\tfrac12 & \tfrac12 & \tfrac12 \\ -1 & -1 & -1 & -\tfrac12 & \cdots & & \\ \vdots & & & & & & \end{pmatrix}$$

**Figure 16.** A three-qubit scheme generated by qiskit (circuit on qubits q0, q1, q2 with a Diag[1, 2, 3] block and RZ gates).

**Figure 17.** A three-qubit scheme simplified by analyzing the resulting matrix obtained…

**Figure 18.** A variant of splitting three-qubit and two-qubit circuits.

**Figure 19.** Perfect Binary Tree for steps ∈ {1,2,3,4,5} and using examples

**Figure 20.** Another variant of splitting three-qubit and two-qubit circuits.

**Figure 21.** Comparison of the time spent on decomposition with the generation time of a…

**Figure 23.** R² Score for different number of qubits. Surrounding text: This perfect convergence of the algorithm shows that in all schemes constructed through a binary tree, a linear relationship is observed between the parameters of the quantum circuit and the parameters of the operator. When analyzing the weights of the model for n = 10, we obtain the distribution of…

**Figure 24.** Weights distribution with n = 10. Surrounding text:

$$r_2 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad r_3 = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 \end{pmatrix}, \quad r_4 = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & -1 & -1 & 1 & 1 & -1 & -1 & 1 \\ 1 & 1 & -1 & -1 & -1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 & -1 & 1 & -1 & 1 \\ 1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\ 1 & -1 & -1 & 1 & -1 & 1 & 1 & -1 \\ 1 & 1 & -1 & -1 & 1 & 1 & -1 & -1 \\ 1 & -1 & 1 & -1 & 1 & -1 & 1 & -1 \end{pmatrix} \tag{38}$$

However, in all rows and columns, except for the bottom row and the left column, th…
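
The sign pattern listed for r3 in Figure 24 can be compared with a Sylvester-Hadamard matrix. The sketch below is an editorial illustration, not a statement from the paper: it assumes the sixteen extracted entries of r3 are read row by row (the orientation is lost in the flattened extraction) and checks that the rows coincide, up to ordering, with those of the 4×4 Hadamard matrix. Because that matrix is symmetric, the same conclusion holds if the entries are read column by column. The helper name `sylvester_hadamard` is hypothetical.

```python
import numpy as np

def sylvester_hadamard(k):
    """Sylvester construction: the 2^k x 2^k Hadamard matrix as a k-fold Kronecker power."""
    H = np.array([[1]])
    H2 = np.array([[1, 1], [1, -1]])
    for _ in range(k):
        H = np.kron(H, H2)
    return H

# Entries of r3 as they appear in the caption, read row by row (an assumption).
r3 = np.array([
    [1,  1,  1,  1],
    [1, -1, -1,  1],
    [1,  1, -1, -1],
    [1, -1,  1, -1],
])

H4 = sylvester_hadamard(2)
same_rows = {tuple(r) for r in r3.tolist()} == {tuple(r) for r in H4.tolist()}
orthogonal = np.array_equal(r3 @ r3.T, 4 * np.eye(4, dtype=int))
print("rows match H4 up to order:", same_rows)
print("rows mutually orthogonal:", orthogonal)
```

If the observation holds for larger n as well, the learned weights would be a row-permuted Walsh-Hadamard transform, which is one way to read the "linear relationship" mentioned under Figure 23.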

Original abstract

Machine learning nowadays becomes a useful instrument in many subjects. In this paper we use interpretable machine learning to build quantum algorithm. By studying the parameters of the machine learning algorithm we were able to construct universal shortest analytic quantum algorithm for arbitrary diagonal matrix of any size.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper claims ML parameters yield a universal analytic circuit for any-size diagonal matrices, but offers no inductive proof or large-n checks to support the generalization.


The main thing to know is that the authors used interpretable machine learning on circuit parameters to extract what they present as a universal shortest analytic quantum circuit for diagonal matrices of arbitrary size. They frame this as a general construction rather than a numerical fit for specific cases.

They do something useful by pushing for interpretability instead of leaving the ML as a black box. In quantum circuit work, turning numerical results into closed-form expressions is often more practical than repeated optimization, and diagonal matrices appear often enough in algorithms like phase estimation that a general form could save effort if it actually works. The approach shows honest engagement with the goal of analytic rather than case-by-case solutions.

The soft spot is the missing support for the size-independent claim. The central assertion requires that the observed parameter patterns produce a closed-form circuit whose correctness and minimality hold for every dimension n without re-optimization or adjustments. No inductive argument is supplied to establish this, and there are no reported checks on instances much larger than the training cases. Without those, the universality rests on an unverified extrapolation. The math and data details would need direct inspection in the full text, but the abstract and stress-test note make the gap clear.

This paper is for researchers working on quantum circuit synthesis who are open to hybrid ML-analytic methods. A reader already thinking about diagonal unitaries or parameter extraction techniques could pick up usable ideas, though they would want the explicit circuit and verification steps before adopting anything.

It deserves peer review because the topic is relevant and the method has enough structure that referees could usefully press on the generalization and ask for the missing proofs or tests.

Referee Report

2 major / 1 minor

Summary. The manuscript proposes using interpretable machine learning to derive quantum circuits for sets of matrices. By analyzing the parameters of the trained ML model, the authors claim to obtain a universal shortest analytic quantum algorithm that works for arbitrary diagonal matrices of any size.

Significance. If substantiated, the result would be significant for quantum computing, offering a closed-form, size-independent construction for diagonal unitaries that avoids per-instance numerical optimization. Extracting analytic circuits from ML parameters is a promising direction that could generalize to other quantum algorithm synthesis tasks.

major comments (2)

The central claim (abstract) that ML-derived parameters directly produce a universal, shortest, size-independent analytic circuit lacks any inductive proof, explicit closed-form expression, or systematic verification on instances with n larger than those inspected during ML parameter study.
No circuit diagram, gate decomposition, or parameter table is supplied to demonstrate how the observed ML parameter patterns translate into an analytic construction whose gate count or depth remains minimal and correct for every dimension n.

minor comments (1)

Clarify the precise optimality metric (e.g., two-qubit gate count, circuit depth) implied by 'shortest' in the abstract.
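
The candidate metrics named in this comment can give different answers for the same circuit. As a minimal sketch, assuming a circuit is represented as a plain list of `(gate_name, qubits)` tuples (a hypothetical representation, not the paper's or qiskit's), the snippet below computes total gate count, two-qubit gate count, and depth. The example circuit is an illustrative CNOT, RZ, CNOT pattern with single-qubit rotations and is not taken from the paper's figures.

```python
def two_qubit_count(gates):
    """Number of gates acting on exactly two qubits."""
    return sum(1 for _, qubits in gates if len(qubits) == 2)

def depth(gates, n_qubits):
    """Circuit depth: each gate is placed one layer after the latest gate on any of its qubits."""
    level = [0] * n_qubits
    for _, qubits in gates:
        layer = max(level[q] for q in qubits) + 1
        for q in qubits:
            level[q] = layer
    return max(level, default=0)

# Hypothetical two-qubit example circuit.
example = [
    ("cx", (0, 1)),
    ("rz", (1,)),
    ("cx", (0, 1)),
    ("rz", (0,)),
    ("rz", (1,)),
]

total_gates = len(example)
cx_gates = two_qubit_count(example)
circuit_depth = depth(example, n_qubits=2)
print(total_gates, cx_gates, circuit_depth)
```

Here the three numbers differ (5 gates, 2 two-qubit gates, depth 4), which is why a claim of "shortest" needs to say which one is being minimized.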

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment of the work's potential significance and for the recommendation of major revision. We address each major comment below, providing clarifications on the derivation process and indicating where the manuscript will be updated.


Referee: The central claim (abstract) that ML-derived parameters directly produce a universal, shortest, size-independent analytic circuit lacks any inductive proof, explicit closed-form expression, or systematic verification on instances with n larger than those inspected during ML parameter study.

Authors: The analytic construction is obtained directly from the converged parameters of the interpretable ML model after training on diagonal matrices of multiple sizes; the observed parameter patterns are independent of n and directly yield the gate angles. While the original manuscript does not contain a formal inductive proof of correctness for all n, it demonstrates the pattern through explicit parameter inspection. In revision we will add an explicit closed-form expression for the parameters in terms of the diagonal entries together with verification on instances up to n=8. A full inductive proof of minimality remains an open question that the ML discovery alone does not resolve.

Revision: partial

Referee: No circuit diagram, gate decomposition, or parameter table is supplied to demonstrate how the observed ML parameter patterns translate into an analytic construction whose gate count or depth remains minimal and correct for every dimension n.

Authors: We agree that concrete illustrations are needed to show the mapping. The revised manuscript will include a circuit diagram for n=4, a table of the ML-derived parameters with their correspondence to rotation and controlled-phase angles, and a general decomposition statement establishing that the total number of gates is 2n-1 with depth O(log n) after parallelization, independent of the specific diagonal values.

Revision: yes

Circularity Check

0 steps flagged

No circularity: ML used as discovery tool for analytic form with no reduction to fitted inputs shown


The provided abstract and context describe using interpretable machine learning to inspect parameters and then construct a claimed universal analytic quantum circuit for diagonal matrices. No equations, self-citations, or explicit reductions are available in the given text that would make the analytic result equivalent to the ML fit by construction. The derivation chain treats the ML step as a heuristic for pattern discovery rather than the final result being a statistical renaming or self-referential definition of the inputs. Without load-bearing self-citations or fitted predictions presented as independent, the paper's central claim remains self-contained against external benchmarks for the purpose of this circularity check.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No technical content is available; the ledger is therefore empty by necessity.

pith-pipeline@v0.9.0 · 5551 in / 823 out tokens · 25915 ms · 2026-05-21T08:29:41.813704+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: "By studying the parameters of the machine learning algorithm we were able to construct universal shortest analytic quantum algorithm for arbitrary diagonal matrix of any size."

IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking · unclear

Paper passage: "We want to find a linear mapping from the parameters of the diagonal matrix Y data to the parameters of the quantum circuit X data."
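
The linear-mapping task described in the second quoted passage can be illustrated with a small least-squares fit. This is a sketch under stated assumptions: the data are synthetic, `W_true` is a made-up ground-truth map, and `numpy.linalg.lstsq` stands in for the gradient-descent training that the paper reportedly runs in PyTorch. The names `Y` and `X` follow the passage's "Y data" and "X data".

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_params = 200, 4

# Hypothetical ground truth: circuit parameters depend linearly on matrix parameters.
W_true = rng.normal(size=(n_params, n_params))
Y = rng.uniform(-np.pi, np.pi, size=(n_samples, n_params))  # stand-in diagonal-matrix parameters
X = Y @ W_true                                              # stand-in circuit parameters, noise-free

# Least-squares estimate of the map Y -> X.
W_fit, *_ = np.linalg.lstsq(Y, X, rcond=None)

# Coefficient of determination over all outputs.
residual = X - Y @ W_fit
r2 = 1.0 - np.sum(residual**2) / np.sum((X - X.mean(axis=0))**2)
print("max |W_fit - W_true| =", np.max(np.abs(W_fit - W_true)))
print("R^2 =", r2)
```

With noise-free synthetic data the fit recovers the map essentially exactly. Real decomposition data, where several circuits can realize the same operator, would need the filtering or clustering step that the figure captions mention before such a fit is meaningful.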

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

54 extracted references · 54 canonical work pages

[1] E. Witten, “Quantum Field Theory and the Jones Polynomial,” Commun. Math. Phys., vol. 121, pp. 351–399, 1989.
[2] N. Y. Reshetikhin and V. G. Turaev, “Ribbon graphs and their invariants derived from quantum groups,” Commun. Math. Phys., vol. 127, pp. 1–26, 1990.
[3] A. Kitaev, “Fault-tolerant quantum computation by anyons,” Annals of Physics, vol. 303, no. 1, pp. 2–30, 2003.
[4] C. Nayak, S. H. Simon, A. Stern, M. Freedman, and S. Das Sarma, “Non-Abelian anyons and topological quantum computation,” Rev. Mod. Phys., vol. 80, pp. 1083–1159, 2008.
[5] D. Gaurav and S. Tiwari, “Interpretability vs explainability: The black box of machine learning,” in 2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE), pp. 523–528, 2023.
[6] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi, “A survey of methods for explaining black box models,” ACM Comput. Surv., vol. 51, 8 2018.
[7] P. Zschech, S. Weinzierl, and M. Kraus, “Inherently interpretable machine learning: A contrasting paradigm to post-hoc explainable ai,” Business & Information Systems Engineering, 2025. Received: 2024-12-17; Accepted: 2025-07-24; Published: 2025-09-15.
[8] M. Garouani, J. Mothe, A. Barhrhouj, and J. Aligon, “Investigating the duality of interpretability and explainability in machine learning,” in 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI), p. 861–867, IEEE, 10 2024.
[9] C. Rowan and A. Doostan, “On the definition and importance of interpretability in scientific machine learning,” 2025.
[10] A. Joshi, S. Chakraborty, S. Akshay, S. Shah, H. Torfah, and S. Seshia, “Locally pareto-optimal interpretations for black-box machine learning models,” 2025.
[11] J. Watson, C. Song, O. Weeger, T. Gruner, A. T. Le, K. Pompetzki, A. Hendawy, O. Arenz, W. Trojak, M. Cranmer, C. D’Eramo, F. Bülow, T. Goyal, J. Peters, and M. W. Hoffman, “Machine learning with physics knowledge for prediction: A survey,” 2025.
[12] O. Fink, I. Nejjar, V. Sharma, K. F. Niresi, H. Sun, H. Dong, C. Xu, A. Wei, A. Bizzi, R. Theiler, Y. Tian, L. V. Krannichfeldt, Z. Ma, S. Garmaev, Z. Zhang, and M. Zhao, “From physics to machine learning and back: Part ii - learning and observational bias in phm,” 2025.
[13] J. Wu, X. Wang, Y. Wu, G. Zhang, J. Liu, and X. Li, “Physics-informed machine learning for combustion: A review,” 2025.
[14] T. Nguyen, A. Koneru, S. Li, and A. Grover, “Physix: A foundation model for physics simulations,” 2025.
[15] F. Wiesner, M. Wessling, and S. Baek, “Towards a physics foundation model,” 2025.
[16] V. Balashov, E. Savenkov, A. Khlyupin, and K. M. Gerke, “Two-phase regularized phase-field density gradient navier–stokes based flow model: Tuning for microfluidic and digital core applications,” Journal of Computational Physics, vol. 521, p. 113554, 2025.
[17] D. Duvenaud, J. R. Lloyd, R. Grosse, J. B. Tenenbaum, and Z. Ghahramani, “Structure discovery in nonparametric regression through compositional kernel search,” 2013.
[18] M. T. Ribeiro, S. Singh, and C. Guestrin, “"why should i trust you?": Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, (New York, NY, USA), p. 1135–1144, Association for Computing Machinery, 2016.
[19] W. J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, and B. Yu, “Definitions, methods, and applications in interpretable machine learning,” Proceedings of the National Academy of Sciences, vol. 116, no. 44, pp. 22071–22080, 2019.
[20] M. W. Craven and J. W. Shavlik, “Extracting Tree-structured Representations of Trained Networks,” in Advances in Neural Information Processing Systems 8 (D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, eds.), pp. 24–30, MIT Press, 1996.
[21] V. de Crécy-Lagard, R. Dias, N. Sexson, I. Friedberg, Y. Yuan, and M. A. Swairjo, “Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins,” bioRxiv, 2025.
[22] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 10th anniversary ed., 2010.
[23] IBM Quantum, IBM Quantum Documentation, 2024. Accessed: 2024-10-29.
[24] J. Terven, D.-M. Cordova-Esparza, J.-A. Romero-González, A. Ramírez-Pedraza, and E. A. Chávez-Urbiola, “A comprehensive survey of loss functions and metrics in deep learning,” Artificial Intelligence Review, vol. 58, 4 2025.
[25] N. M. Linke, D. Maslov, M. Roetteler, S. Debnath, C. Figgatt, K. A. Landsman, K. Wright, and C. Monroe, “Experimental comparison of two quantum computing architectures,” Proceedings of the National Academy of Sciences, vol. 114, no. 13, pp. 3305–3310, 2017.
[26] A. Y. Kitaev, “Quantum computations: algorithms and error correction,” Russian Mathematical Surveys, vol. 52, no. 6, pp. 1191–1249, 1997.
[27] A. Y. Kitaev, A. Shen, and M. N. Vyalyi, Classical and Quantum Computation, vol. 47 of Graduate Studies in Mathematics. Providence, Rhode Island: American Mathematical Society, 2002.
[28] G. E. Crooks, “Gates, states, and circuits: Quantum gates.” https://threeplusone.com/gates, 3 2024. Tech. Note 014 v0.11.0 beta.
[29] N. Kolganov, S. Mironov, and A. Morozov, “Large k topological quantum computer,” 2022.
[30] G. Vidal and C. M. Dawson, “A universal quantum circuit for two-qubit transformations with three cnot gates,” arXiv preprint, p. 3, 2008.
[31] M. Forets and C. Schilling, The Inverse Problem for Neural Networks, p. 241–255. Springer Nature Switzerland, 12 2023.
[32] Y. Liu, Z. Qin, S. Anwar, S. Caldwell, and T. Gedeon, “Are deep neural architectures losing information? invertibility is indispensable,” 2020.
[33] H. Jiang and N. Haghtalab, “On surjectivity of neural networks: Can you elicit any behavior from your model?,” 2025.
[34] M. M. Fedin and A. A. Morozov, “Mathematical aspects of the decomposition of diagonal u(n) operators,” 2025.
[35] M. M. Fedin, “diagonal_decomposition,” 2025. Available on GitHub.
[36] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, pp. 8024–8035, Curran Associates, Inc., 2019.
[37] B. Jin and Z. Kereta, “On the convergence of stochastic gradient descent for linear inverse problems in banach spaces,” arXiv preprint, p. 31, 2023.
[38] D. W. Harder, “4.05 perfect binary trees.” Lecture slides for ECE 250: Algorithms and Data Structures. Hosted on the course page of C. Moreno.
[39] D. E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms. Reading, MA: Addison-Wesley Professional, 3 ed., 1997.
[40] A. J. Kemme and C. A. Agyingi, “Persistent homology: A pedagogical introduction with biological applications,” 2025.
[41] T. Goto, S. Nomura, and T. G. Sano, “Twist deformation of physical trefoil knots,” 2025.
[42] M. Mottonen, J. J. Vartiainen, V. Bergholm, and M. M. Salomaa, “Transformation of quantum states using uniformly controlled rotations,” 2004.
[43] K. Pearson, “On lines and planes of closest fit to systems of points in space,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 2, no. 11, pp. 559–572, 1901.
[44] I. T. Jolliffe, Principal Component Analysis. Springer Series in Statistics, Springer, 2002.
[45] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71–86, 1991.
[46] J. Shlens, “A tutorial on principal component analysis,” 2014.
[47] S. Sachdeva, “A survey of knots and quivers,” 2025.
[48] P. Di Francesco, P. Mathieu, and D. Sénéchal, Conformal Field Theory. Graduate Texts in Contemporary Physics, Springer, 1 ed., 1997. 7 Appendix. 7.1 Proof of the correspondence of our problem to the convergence theorem. We are working in a Euclidean $m$-dimensional space, that is, $X = \mathbb{R}^m$. Our mapping $A$ is linear and one-to-one, which means it does not c...
[49] 1. The input data does not contain noise, that is, the observations $y$ correspond exactly to the model $y = Ax$. Proof. The data do not contain noise by construction: they are generated by a theoretical dependence of a deterministic nature. 2. $X$ is a Banach space that is strictly convex and smooth. Proof. Obvious for $\mathbb{R}^m$: for $x, y \in \mathbb{R}^m$, $\forall t \in [0,1]: tx + (1-t)y \in \mathbb{R}^m$.
[50] 3. The space allows for a dual mapping $J: X \to X^*$, which is continuous and strictly monotone. Proof. If $X \equiv \mathbb{R}^m$, there is a trivial isomorphism $(\mathbb{R}^m)^* \cong \mathbb{R}^m$. 4. $A$ is a linear and continuous operator $A: X \to Y$. Proof. $A$ is linear by definition, and a linear operator in Euclidean spaces is continuous.
[51] 5. The operator $A$ has a closed image, which guarantees the existence of a solution with a minimum norm. Proof. The image of a Euclidean space under a linear invertible map is a Euclidean space, a closed manifold.
[52] 6. The sequence $\{\gamma_k\} \subset \mathbb{R}_+$ is positive and satisfies the following conditions: $$\sum_{k=1}^{\infty} \gamma_k = \infty \quad \text{(divergence of the sum of steps)}, \tag{45}$$ $$\sum_{k=1}^{\infty} \gamma_k^2 < \infty \quad \text{(convergence of the sum of square steps)}. \tag{46}$$ Proof. According to our dependence of the gradient descent step on the step number, we can group the terms and obtain it... Therefore, $2\alpha - 1 < \beta < \alpha - 1$.
[53] 7. The initial approximation of the linear mapping is chosen arbitrarily. Proof. We initialize the parameters arbitrarily, according to Section 5.1.1.
[54] 8. At each iteration, a data element for SGD is randomly selected, which ensures the stochasticity of the method. Proof. This is implemented in the SGD algorithm in pytorch, which is used by us. 7.2 Principal Component Analysis (PCA). The Principal Component Analysis (PCA) method is a classic linear data transformation technique used to reduce dimensionality in multidimen...
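
The step-size conditions (45) and (46) quoted in entry [52] are satisfied, for example, by γ_k = γ_0 / k^p with 1/2 < p ≤ 1. The sketch below illustrates stochastic gradient descent with such a schedule on a noise-free linear problem. It is written in plain NumPy rather than the PyTorch scheduler the paper refers to, and the values γ_0 = 0.3, p = 0.6, the dimension, the sample distribution, and the target map `W_true` are all illustrative assumptions, not the paper's α and β.

```python
import numpy as np

def step_size(k, gamma0=0.3, p=0.6):
    """gamma_k = gamma0 / k**p; for 1/2 < p <= 1 the sum diverges while the sum of squares converges."""
    return gamma0 / k**p

rng = np.random.default_rng(2)
dim = 4
W_true = rng.normal(size=(dim, dim))   # hypothetical target linear map, x = W_true @ y
W = np.zeros((dim, dim))               # arbitrary initial approximation

for k in range(1, 20001):
    y = rng.uniform(-1.0, 1.0, size=dim)   # one randomly drawn sample per iteration
    x = W_true @ y                         # noise-free observation
    grad = np.outer(W @ y - x, y)          # gradient of 0.5 * ||W y - x||^2 with respect to W
    W -= step_size(k) * grad

error = np.linalg.norm(W - W_true)
print(f"Frobenius error after 20000 steps: {error:.2e}")
```

The schedule matters mostly for noisy data, where a constant step would leave a residual error floor; on noise-free data like this, the decaying steps still converge, only more slowly than a well-chosen constant step would.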

[1] E. Witten, “Quantum Field Theory and the Jones Polynomial,” Commun. Math. Phys., vol. 121, pp. 351–399, 1989.

[2] N. Y. Reshetikhin and V. G. Turaev, “Ribbon graphs and their invariants derived from quantum groups,” Commun. Math. Phys., vol. 127, pp. 1–26, 1990.

[3] A. Kitaev, “Fault-tolerant quantum computation by anyons,” Annals of Physics, vol. 303, no. 1, pp. 2–30, 2003.

[4] C. Nayak, S. H. Simon, A. Stern, M. Freedman, and S. Das Sarma, “Non-Abelian anyons and topological quantum computation,” Rev. Mod. Phys., vol. 80, pp. 1083–1159, 2008.

[5] D. Gaurav and S. Tiwari, “Interpretability vs explainability: The black box of machine learning,” in 2023 International Conference on Computer Science, Information Technology and Engineering (ICCoSITE), pp. 523–528, 2023.

[6] R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi, “A survey of methods for explaining black box models,” ACM Comput. Surv., vol. 51, Aug. 2018.

[7] P. Zschech, S. Weinzierl, and M. Kraus, “Inherently interpretable machine learning: A contrasting paradigm to post-hoc explainable ai,” Business & Information Systems Engineering, 2025. Received: 2024-12-17; Accepted: 2025-07-24; Published: 2025-09-15.

[8] M. Garouani, J. Mothe, A. Barhrhouj, and J. Aligon, “Investigating the duality of interpretability and explainability in machine learning,” in 2024 IEEE 36th International Conference on Tools with Artificial Intelligence (ICTAI), pp. 861–867, IEEE, Oct. 2024.

[9] C. Rowan and A. Doostan, “On the definition and importance of interpretability in scientific machine learning,” 2025.

[10] A. Joshi, S. Chakraborty, S. Akshay, S. Shah, H. Torfah, and S. Seshia, “Locally pareto-optimal interpretations for black-box machine learning models,” 2025.

[11] J. Watson, C. Song, O. Weeger, T. Gruner, A. T. Le, K. Pompetzki, A. Hendawy, O. Arenz, W. Trojak, M. Cranmer, C. D’Eramo, F. Bülow, T. Goyal, J. Peters, and M. W. Hoffman, “Machine learning with physics knowledge for prediction: A survey,” 2025.

[12] O. Fink, I. Nejjar, V. Sharma, K. F. Niresi, H. Sun, H. Dong, C. Xu, A. Wei, A. Bizzi, R. Theiler, Y. Tian, L. V. Krannichfeldt, Z. Ma, S. Garmaev, Z. Zhang, and M. Zhao, “From physics to machine learning and back: Part ii - learning and observational bias in phm,” 2025.

[13] J. Wu, X. Wang, Y. Wu, G. Zhang, J. Liu, and X. Li, “Physics-informed machine learning for combustion: A review,” 2025.

[14] T. Nguyen, A. Koneru, S. Li, and A. Grover, “Physix: A foundation model for physics simulations,” 2025.

[15] F. Wiesner, M. Wessling, and S. Baek, “Towards a physics foundation model,” 2025.

[16] V. Balashov, E. Savenkov, A. Khlyupin, and K. M. Gerke, “Two-phase regularized phase-field density gradient navier–stokes based flow model: Tuning for microfluidic and digital core applications,” Journal of Computational Physics, vol. 521, p. 113554, 2025.

[17] D. Duvenaud, J. R. Lloyd, R. Grosse, J. B. Tenenbaum, and Z. Ghahramani, “Structure discovery in nonparametric regression through compositional kernel search,” 2013.

[18] M. T. Ribeiro, S. Singh, and C. Guestrin, “"why should i trust you?": Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’16, (New York, NY, USA), pp. 1135–1144, Association for Computing Machinery, 2016.

[19] W. J. Murdoch, C. Singh, K. Kumbier, R. Abbasi-Asl, and B. Yu, “Definitions, methods, and applications in interpretable machine learning,” Proceedings of the National Academy of Sciences, vol. 116, no. 44, pp. 22071–22080, 2019.

[20] M. W. Craven and J. W. Shavlik, “Extracting Tree-structured Representations of Trained Networks,” in Advances in Neural Information Processing Systems 8 (D. S. Touretzky, M. C. Mozer, and M. E. Hasselmo, eds.), pp. 24–30, MIT Press, 1996.

[21] V. de Crécy-Lagard, R. Dias, N. Sexson, I. Friedberg, Y. Yuan, and M. A. Swairjo, “Limitations of current machine-learning models in predicting enzymatic functions for uncharacterized proteins,” bioRxiv, 2025.

[22] M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information. Cambridge University Press, 10th anniversary ed., 2010.

[23] IBM Quantum, IBM Quantum Documentation, 2024. Accessed: 2024-10-29.

[24] J. Terven, D.-M. Cordova-Esparza, J.-A. Romero-González, A. Ramírez-Pedraza, and E. A. Chávez-Urbiola, “A comprehensive survey of loss functions and metrics in deep learning,” Artificial Intelligence Review, vol. 58, Apr. 2025.

[25] N. M. Linke, D. Maslov, M. Roetteler, S. Debnath, C. Figgatt, K. A. Landsman, K. Wright, and C. Monroe, “Experimental comparison of two quantum computing architectures,” Proceedings of the National Academy of Sciences, vol. 114, no. 13, pp. 3305–3310, 2017.

[26] A. Y. Kitaev, “Quantum computations: algorithms and error correction,” Russian Mathematical Surveys, vol. 52, no. 6, pp. 1191–1249, 1997.

[27] A. Y. Kitaev, A. Shen, and M. N. Vyalyi, Classical and Quantum Computation, vol. 47 of Graduate Studies in Mathematics. Providence, Rhode Island: American Mathematical Society, 2002.

[28] G. E. Crooks, “Gates, states, and circuits: Quantum gates.” https://threeplusone.com/gates, Mar. 2024. Tech. Note 014 v0.11.0 beta.

[29] N. Kolganov, S. Mironov, and A. Morozov, “Large k topological quantum computer,” 2022.

[30] G. Vidal and C. M. Dawson, “A universal quantum circuit for two-qubit transformations with three cnot gates,” arXiv preprint, p. 3, 2008.

[31] M. Forets and C. Schilling, The Inverse Problem for Neural Networks, pp. 241–255. Springer Nature Switzerland, Dec. 2023.

[32] Y. Liu, Z. Qin, S. Anwar, S. Caldwell, and T. Gedeon, “Are deep neural architectures losing information? invertibility is indispensable,” 2020.

[33] H. Jiang and N. Haghtalab, “On surjectivity of neural networks: Can you elicit any behavior from your model?,” 2025.

[34] M. M. Fedin and A. A. Morozov, “Mathematical aspects of the decomposition of diagonal u(n) operators,” 2025.

[35] M. M. Fedin, “diagonal_decomposition,” 2025. Available on GitHub.

[36] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan, T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “Pytorch: An imperative style, high-performance deep learning library,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, pp. 8024–8035, Curran Associates, Inc., 2019.

[37] B. Jin and Z. Kereta, “On the convergence of stochastic gradient descent for linear inverse problems in banach spaces,” arXiv preprint, p. 31, 2023.

[38] D. W. Harder, “4.05 perfect binary trees.” Lecture slides for ECE 250: Algorithms and Data Structures. Hosted on the course page of C. Moreno.

[39] D. E. Knuth, The Art of Computer Programming, Volume 1: Fundamental Algorithms. Reading, MA: Addison-Wesley Professional, 3rd ed., 1997.

[40] A. J. Kemme and C. A. Agyingi, “Persistent homology: A pedagogical introduction with biological applications,” 2025.

[41] T. Goto, S. Nomura, and T. G. Sano, “Twist deformation of physical trefoil knots,” 2025.

[42] M. Mottonen, J. J. Vartiainen, V. Bergholm, and M. M. Salomaa, “Transformation of quantum states using uniformly controlled rotations,” 2004.

[43] K. Pearson, “On lines and planes of closest fit to systems of points in space,” The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science, vol. 2, no. 11, pp. 559–572, 1901.

[44] I. T. Jolliffe, Principal Component Analysis. Springer Series in Statistics, Springer, 2002.

[45] M. Turk and A. Pentland, “Eigenfaces for recognition,” Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71–86, 1991.

[46] J. Shlens, “A tutorial on principal component analysis,” 2014.

[47] S. Sachdeva, “A survey of knots and quivers,” 2025.

[48] P. Di Francesco, P. Mathieu, and D. Sénéchal, Conformal Field Theory. Graduate Texts in Contemporary Physics, Springer, 1st ed., 1997.

7 Appendix

7.1 Proof of the correspondence of our problem to the convergence theorem

We are working in a Euclidean $m$-dimensional space, that is, $X = \mathbb{R}^m$. Our mapping $A$ is linear and one-to-one, which means it does not c...

1. The input data does not contain noise, that is, the observations $y$ correspond exactly to the model $y = Ax$.
Proof. The data are noise-free by construction: they are generated by a deterministic theoretical dependence.

2. $X$ is a Banach space that is strictly convex and smooth.
Proof. For $\mathbb{R}^m$ this is obvious: if $x, y \in \mathbb{R}^m$, then $\forall t \in [0,1]: xt + (1-t)y \in \mathbb{R}^m$.

3. The space allows for a dual mapping $J: X \to X^*$, which is continuous and strictly monotone.
Proof. If $X \equiv \mathbb{R}^m$, there is the trivial isomorphism $(\mathbb{R}^m)^* \cong \mathbb{R}^m$.

4. $A$ is a linear and continuous operator $A: X \to Y$.
Proof. $A$ is linear by definition, and a linear operator in Euclidean spaces $\mathbb{R}$ is continuous.

5. The operator $A$ has a closed image, which guarantees the existence of a solution with minimum norm.
Proof. The image of a Euclidean space under a linear invertible map is a Euclidean space, which is a closed manifold.

6. The sequence $\{\gamma_k\} \subset \mathbb{R}_+$ is positive and satisfies the following conditions:

$$\sum_{k=1}^{\infty} \gamma_k = \infty \quad \text{(divergence of the sum of steps)}, \tag{45}$$

$$\sum_{k=1}^{\infty} \gamma_k^2 < \infty \quad \text{(convergence of the sum of squared steps)}. \tag{46}$$

Proof. According to our dependence of the gradient descent step on the step number, we can group the terms and obtain it... Therefore, 2α−1 < β < α−1

7. The initial approximation of the linear mapping is chosen arbitrarily.
Proof. We initialize the parameters arbitrarily, according to Section 5.1.1.

8. At each iteration, a data element for SGD is randomly selected, which ensures the stochasticity of the method.
Proof. This is implemented in the SGD algorithm in PyTorch, which we use.

7.2 Principal Component Analysis (PCA)

The Principal Component Analysis (PCA) method is a classic linear data transformation technique used to reduce dimensionality in multidimen...
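As a minimal sketch of PCA as a linear dimensionality-reduction transform, the NumPy example below centers a data matrix and projects it onto its leading principal directions, obtained here from a singular value decomposition. The function name `pca`, the synthetic data, and the use of SVD are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import numpy as np


def pca(data, n_components):
    """Project the rows of `data` onto the top `n_components` principal directions.

    Hypothetical helper for illustration: `data` has shape (n_samples, n_features).
    """
    mean = data.mean(axis=0)
    centered = data - mean
    # SVD of the centered data: the rows of vt are orthonormal principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Sample variance captured along each retained direction.
    variance = singular_values[:n_components] ** 2 / (data.shape[0] - 1)
    projected = centered @ components.T
    return projected, components, variance, mean


rng = np.random.default_rng(0)
# Synthetic data (an assumption): 3-D points lying close to a 2-D plane.
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.5, 0.0],
                   [0.0, 1.0, 2.0]])
data = latent @ mixing + 0.01 * rng.normal(size=(200, 3))

projected, components, variance, mean = pca(data, n_components=2)
# Map the reduced coordinates back to the original space.
reconstructed = projected @ components + mean
print(projected.shape)
```

Because the synthetic points lie almost exactly in a plane, two components are enough to reconstruct them up to the small added noise; for real data the number of retained components would be chosen from the variance spectrum.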