Criticality and Saturation in Orthogonal Neural Networks
Pith reviewed 2026-05-08 12:22 UTC · model grok-4.3
The pith
Orthogonal weight initializations stabilize the finite-width correction tensors of neural networks as depth increases.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We derive explicit layer-wise recursion relations for the tensors appearing in the finite-width expansion of the network statistics in the case of orthogonal initializations. We also provide an extension of Feynman diagrams for the corresponding recursions which are valid to all orders in 1/width. We show explicitly that the recursions reproduce the stability of the finite-width tensors observed for activation functions with vanishing fixed point. Numerical solutions of the recursions and their large-depth expansions agree with Monte-Carlo estimates from network ensembles.
What carries the argument
Layer-wise recursion relations for the tensors in the 1/width expansion of network statistics under orthogonal initialization, which track moment evolution across layers and produce saturation at large depth.
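The flavor of such a layer-wise recursion can be illustrated at infinite width by iterating the kernel map numerically. The snippet below is a hedged toy sketch, not the paper's recursions: the weight variance C_W = 1.2, the tanh nonlinearity, and the quadrature order are all assumptions, and the paper's finite-width tensors obey richer coupled recursions layered on top of this kernel flow.

```python
import numpy as np

def kernel_step(K, C_W=1.2, act=np.tanh, n_quad=101):
    """One layer of the infinite-width kernel recursion
    K_{l+1} = C_W * E_{z ~ N(0, K_l)}[act(z)^2],
    evaluated by Gauss-Hermite quadrature (toy illustration)."""
    # nodes/weights for the weight function exp(-x^2 / 2)
    x, w = np.polynomial.hermite_e.hermegauss(n_quad)
    z = np.sqrt(K) * x
    return C_W * np.sum(w * act(z) ** 2) / np.sqrt(2.0 * np.pi)

K, traj = 1.0, []
for _ in range(100):  # depth
    K = kernel_step(K)
    traj.append(K)

# tanh has a vanishing fixed point (tanh(0) = 0), and the iteration
# settles toward a depth-independent kernel value instead of diverging
```

The same pattern — a per-layer update whose fixed-point structure controls large-depth behavior — is what the paper's recursions implement for the finite-width correction tensors themselves.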
If this is right
- The recursions allow computation of network statistics at arbitrary depth without full ensemble simulation.
- Stability of the tensors holds specifically when the activation function has a vanishing fixed point.
- The diagrammatic extension covers all orders in the inverse-width expansion for the orthogonal case.
- The derived relations close the theoretical account of why orthogonal weights prevent divergence of finite-width corrections in deep networks.
Where Pith is reading between the lines
- The recursions could be solved analytically to predict the depth at which saturation begins for given width and activation.
- Similar recursion structures might appear under other structured initializations that preserve norm.
- The method supplies a route to study how criticality conditions interact with finite-width effects in deeper architectures.
Load-bearing premise
The leading terms of the inverse-width power series continue to dominate the network behavior even when depth becomes large.
What would settle it
The account would be refuted if numerical iteration of the derived recursions yielded tensors that grow without bound as depth increases while direct Monte-Carlo sampling from finite-width orthogonal networks produced saturating tensors; agreement between the two across depths is the decisive check.
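The Monte-Carlo side of such a test can be sketched as follows. Everything here is an illustrative assumption — the width, depth, ensemble size, gain, and the tracked statistic (the across-ensemble variance of the per-neuron second moment) are stand-ins, not the paper's exact tensors or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_orthogonal(n):
    """Haar-distributed orthogonal matrix via QR of a Gaussian matrix,
    with the standard sign correction from the diagonal of R."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

def ensemble_fluctuations(width=64, depth=30, n_nets=200, gain=1.1):
    """Propagate an ensemble of tanh networks with orthogonal weights
    and track the across-ensemble variance of q_l = |z_l|^2 / width,
    a crude stand-in for a finite-width correction tensor."""
    q = np.zeros((n_nets, depth))
    for k in range(n_nets):
        z = rng.standard_normal(width)
        for layer in range(depth):
            z = gain * haar_orthogonal(width) @ np.tanh(z)
            q[k, layer] = z @ z / width
    return q.var(axis=0)

var_by_layer = ensemble_fluctuations()
```

Comparing such ensemble estimates against numerical iteration of the derived recursions, layer by layer, is the shape of the validation the paper reports.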
Original abstract
It has been known for a long time that initializing weight matrices to be orthogonal instead of having i.i.d. Gaussian components can improve training performance. This phenomenon can be analyzed using finite-width corrections, where the infinite-width statistics are supplemented by a power series in $1/\mathrm{width}$. In particular, recent empirical results by Day et al. show that the tensors appearing in this treatment stabilize for large depth, as opposed to the tensors of i.i.d.-initialized networks. In this article, we derive explicit layer-wise recursion relations for the tensors appearing in the finite-width expansion of the network statistics in the case of orthogonal initializations. We also provide an extension of recently-introduced Feynman diagrams for the corresponding recursions in the i.i.d.-case which are valid to all orders in $1/\mathrm{width}$. Finally, we show explicitly that the recursions we derive reproduce the stability of the finite-width tensors which was observed for activation functions with vanishing fixed point. This work therefore provides a theoretical explanation for the stability of nonlinear networks of finite width initialized with orthogonal weights, closing a long-standing gap in the literature. We validate our theoretical results experimentally by showing that numerical solutions of our recursion relations and their analytical large-depth expansions agree excellently with Monte-Carlo estimates from network ensembles.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper derives explicit layer-wise recursion relations for the leading finite-width correction tensors in the statistics of deep neural networks initialized with orthogonal weights. It extends Feynman diagram techniques (previously for i.i.d. Gaussian weights) to the orthogonal case to all orders in 1/width, obtains closed recursions for the tensors, and shows that these recursions reproduce the large-depth stability of the tensors observed empirically for activations with vanishing fixed points. The theoretical predictions are validated by agreement between numerical solutions of the recursions (and their large-depth analytic expansions) and Monte-Carlo estimates from finite-width network ensembles.
Significance. If the derivations and perturbative control hold, the work supplies the missing theoretical account for why orthogonal initialization stabilizes finite-width corrections at large depth (in contrast to i.i.d. Gaussian initialization), thereby closing a documented gap. The explicit recursions and diagram extension are reusable tools; the Monte-Carlo match provides direct empirical support for the central claim.
major comments (2)
- [large-depth expansions and validation sections] The central claim that the derived recursions explain the observed stability rests on the assumption that the leading 1/width tensors remain dominant as depth L grows large. No explicit remainder bound or uniform-in-L control on the truncation error of the finite-width expansion is provided (see the large-depth analysis and the statement that 'the recursions reproduce the stability'). If higher-order terms accumulate or the effective expansion parameter grows with L via diagram combinatorics, the leading-tensor stability would not suffice to explain the finite-width behavior.
- [derivation of recursion relations] The recursion relations are stated to close at leading order after incorporating the orthogonal constraint. However, it is not shown whether orthogonality-induced correlations at finite width can feed back into the leading tensors at depths where the expansion parameter is no longer parametrically small (see the derivation of the layer-wise recursions and the diagram extension).
minor comments (2)
- [preliminaries] Notation for the tensors (e.g., the precise definition of the leading correction objects) should be introduced once with an explicit equation reference rather than relying on prior diagram papers.
- [experimental validation] The Monte-Carlo validation would benefit from reporting the range of widths and depths tested and the number of independent network realizations per point to allow assessment of statistical error.
Simulated Author's Rebuttal
We thank the referee for their careful reading of the manuscript and for the constructive major comments. We address each point below, providing clarifications on the scope of our perturbative analysis and indicating revisions where they strengthen the presentation without altering the central claims.
Point-by-point responses
-
Referee: [large-depth expansions and validation sections] The central claim that the derived recursions explain the observed stability rests on the assumption that the leading 1/width tensors remain dominant as depth L grows large. No explicit remainder bound or uniform-in-L control on the truncation error of the finite-width expansion is provided (see the large-depth analysis and the statement that 'the recursions reproduce the stability'). If higher-order terms accumulate or the effective expansion parameter grows with L via diagram combinatorics, the leading-tensor stability would not suffice to explain the finite-width behavior.
Authors: We agree that the analysis is perturbative in 1/width and that an explicit uniform-in-L remainder bound is not derived. The manuscript establishes closed recursions for the leading-order tensors, obtains their large-depth analytic expansions, and demonstrates quantitative agreement with Monte-Carlo estimates from finite-width ensembles. This agreement holds across the depths examined, indicating that higher-order contributions do not destabilize the leading tensors in practice. In the revised version we will add an explicit caveat in the large-depth section acknowledging the absence of a rigorous truncation bound and clarifying that the explanatory power for observed stability rests on the combination of exact leading-order closure and empirical validation. revision: partial
-
Referee: [derivation of recursion relations] The recursion relations are stated to close at leading order after incorporating the orthogonal constraint. However, it is not shown whether orthogonality-induced correlations at finite width can feed back into the leading tensors at depths where the expansion parameter is no longer parametrically small (see the derivation of the layer-wise recursions and the diagram extension).
Authors: The Feynman-diagram extension incorporates the orthogonal constraints order by order in 1/width. At leading order the diagram rules ensure that orthogonality-induced correlations are absorbed into the recursion kernels without introducing feedback from higher-order diagrams into the leading tensors. This decoupling follows from the structure of the orthogonal ensemble and holds at every depth because the recursion is derived by collecting all diagrams that contribute at O(1/width). We will insert a short clarifying paragraph in the derivation section that explicitly states this decoupling and references the diagram rules that prevent higher-order leakage into the leading tensors. revision: yes
- Unresolved after rebuttal: the absence of an explicit remainder bound or uniform-in-L control on the truncation error of the finite-width expansion
Circularity Check
Derivation of orthogonal recursions independent of stability observation; minor self-citation to prior diagrams
full rationale
The paper derives explicit layer-wise recursion relations for the finite-width tensors directly from the orthogonal weight initialization constraint combined with the 1/width perturbative expansion. These recursions are then solved to reproduce the large-depth stability previously observed for activations with vanishing fixed points, with results matching Monte-Carlo ensemble estimates. This chain does not reduce to a fitted input or self-definition by the paper's equations. A self-citation exists to recently-introduced Feynman diagrams for the i.i.d. case, which is extended here, but the orthogonal recursions constitute new independent content and are externally validated against simulations rather than relying on the citation as load-bearing support.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The finite-width expansion in powers of 1/width is a valid asymptotic description of network statistics for large but finite width.
- domain assumption Activation functions have a vanishing fixed point (σ(0) = 0, so that the zero kernel is a fixed point of the layer-wise recursion).
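The second assumption can be made concrete with a trivial check. Here "vanishing fixed point" is read as σ(0) = 0, so that K = 0 is a fixed point of the kernel recursion K → C_W E[σ(z)²]; this reading, and the sample activations, are assumptions of the sketch.

```python
import numpy as np

# "vanishing fixed point" read as sigma(0) = 0, so that K = 0 is a
# fixed point of the kernel recursion K -> C_W * E[sigma(z)^2]
activations = {
    "tanh": np.tanh,
    "relu": lambda x: np.maximum(x, 0.0),
    "shifted tanh": lambda x: np.tanh(x) + 0.5,  # no vanishing fixed point
}
has_vanishing_fp = {name: abs(f(0.0)) < 1e-12
                    for name, f in activations.items()}
# tanh and relu satisfy the condition; the shifted tanh does not
```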
Reference graph
Works this paper leans on
-
[1]
Understanding the Difficulty of Training Deep Feedforward Neural Networks
Xavier Glorot and Yoshua Bengio. “Understanding the Difficulty of Training Deep Feedforward Neural Networks”. In: Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics. JMLR Workshop and Conference Proceedings, Mar. 2010, pp. 249–256
-
[2]
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He et al. “Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification”. In: Proceedings of the IEEE International Conference on Computer Vision. 2015, pp. 1026–1034. arXiv: 1502.01852
-
[3]
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
Andrew M. Saxe, James L. McClelland, and Surya Ganguli. Exact Solutions to the Nonlinear Dynamics of Learning in Deep Linear Neural Networks. Feb. 2014. arXiv: 1312.6120
-
[4]
All You Need Is a Good Init
Dmytro Mishkin and Jiri Matas. “All You Need Is a Good Init”. In: International Conference on Learning Representations
- [5]
-
[6]
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
Wei Hu, Lechao Xiao, and Jeffrey Pennington. “Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks”. In: International Conference on Learning Representations. Sept. 2019. arXiv: 2001.05992
-
[7]
Resurrecting the Sigmoid in Deep Learning through Dynamical Isometry: Theory and Practice
Jeffrey Pennington, Samuel Schoenholz, and Surya Ganguli. “Resurrecting the Sigmoid in Deep Learning through Dynamical Isometry: Theory and Practice”. In: Advances in Neural Information Processing Systems. Vol. 30. Curran Associates, Inc., 2017. arXiv: 1711.04735
-
[8]
The Emergence of Spectral Universality in Deep Networks
Jeffrey Pennington, Samuel Schoenholz, and Surya Ganguli. “The Emergence of Spectral Universality in Deep Networks”. In: Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics. PMLR, Mar. 2018, pp. 1924–1932. arXiv: 1802.09979
-
[9]
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao et al. “Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks”. In: Proceedings of the 35th International Conference on Machine Learning. PMLR, July 2018, pp. 5393–5402
-
[10]
Neural tangent kernel: Convergence and generalization in neural networks
Arthur Jacot, Franck Gabriel, and Clement Hongler. “Neural Tangent Kernel: Convergence and Generalization in Neural Networks”. In: Advances in Neural Information Processing Systems. Vol. 31. Curran Associates, Inc., 2018. arXiv: 1806.07572
-
[11]
On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization
Wei Huang, Weitao Du, and Richard Yi Da Xu. “On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization”. In: Twenty-Ninth International Joint Conference on Artificial Intelligence. Vol. 3. Aug. 2021, pp. 2577–2583. doi: 10.24963/ijcai.2021/355. arXiv: 2004.05867
-
[12]
The Principles of Deep Learning Theory: An Effective Theory Approach to Understanding Neural Networks
Daniel A. Roberts and Sho Yaida, with Boris Hanin. The Principles of Deep Learning Theory: An Effective Theory Approach to Understanding Neural Networks. Cambridge: Cambridge University Press, 2022. isbn: 978-1-316-51933-2. doi: 10.1017/9781009023405. arXiv: 2106.10165
-
[13]
Asymptotic Behavior of Group Integrals in the Limit of Infinite Rank
Don Weingarten. “Asymptotic Behavior of Group Integrals in the Limit of Infinite Rank”. In: Journal of Mathematical Physics 19.5 (May 1978), pp. 999–1001. issn: 0022-2488. doi: 10.1063/1.523807
-
[14]
Finite-Width Neural Tangent Kernels from Feynman Diagrams
Max Guillen, Philipp Misof, and Jan E. Gerken. Finite-Width Neural Tangent Kernels from Feynman Diagrams. Aug. 2025. doi: 10.48550/arXiv.2508.11522. arXiv: 2508.11522
-
[15]
Tiled Convolutional Neural Networks
Jiquan Ngiam et al. “Tiled Convolutional Neural Networks”. In: Advances in Neural Information Processing Systems. Vol. 23. Curran Associates, Inc., 2010
-
[16]
Non-Gaussian Processes and Neural Networks at Finite Widths
Sho Yaida. “Non-Gaussian Processes and Neural Networks at Finite Widths”. In: Proceedings of The First Mathematical and Scientific Machine Learning Conference. PMLR, Aug. 2020, pp. 165–192. arXiv: 1910.00019
-
[17]
Symmetry-via-Duality: Invariant Neural Network Densities from Parameter-Space Correlators
Anindita Maiti, Keegan Stoner, and James Halverson. “Symmetry-via-Duality: Invariant Neural Network Densities from Parameter-Space Correlators”. In: Machine Learning in Pure Mathematics and Theoretical Physics. Chapter 8, pp. 293–330. doi: 10.1142/9781800613706_0008. arXiv: 2106.00694
-
[18]
Structures of Neural Network Effective Theories
Ian Banta et al. “Structures of Neural Network Effective Theories”. In: Physical Review D 109.10 (May 2024), p. 105007. doi: 10.1103/PhysRevD.109.105007. arXiv: 2305.02334
-
[19]
A Solvable Model of Neural Scaling Laws
Alexander Maloney, Daniel A. Roberts, and James Sully. A Solvable Model of Neural Scaling Laws. Oct. 2022. arXiv: 2210.16859
-
[20]
Neural Scaling Laws from Large-N Field Theory: Solvable Model beyond the Ridgeless Limit
Zhengkang Zhang. “Neural Scaling Laws from Large-N Field Theory: Solvable Model beyond the Ridgeless Limit”. In: Machine Learning: Science and Technology 6.2 (Apr. 2025), p. 025010. issn: 2632-2153. doi: 10.1088/2632-2153/adc872. arXiv: 2405.19398
-
[21]
Neural Networks and Quantum Field Theory
James Halverson, Anindita Maiti, and Keegan Stoner. “Neural Networks and Quantum Field Theory”. In: Machine Learning: Science and Technology 2.3 (Sept. 2021), p. 035002. issn: 2632-2153. doi: 10.1088/2632-2153/abeca3. arXiv: 2008.08601
-
[22]
The Edge of Chaos: Quantum Field Theory and Deep Neural Networks
Kevin Grosvenor and Ro Jefferson. “The Edge of Chaos: Quantum Field Theory and Deep Neural Networks”. In: SciPost Physics 12.3 (Mar. 2022), p. 081. issn: 2542-4653. doi: 10.21468/SciPostPhys.12.3.081. arXiv: 2109.13247
-
[23]
Neural Network Field Theories: Non-Gaussianity, Actions, and Locality
Mehmet Demirtas et al. “Neural Network Field Theories: Non-Gaussianity, Actions, and Locality”. In: Machine Learning: Science and Technology 5.1 (Jan. 2024), p. 015002. issn: 2632-2153. doi: 10.1088/2632-2153/ad17d3. arXiv: 2307.03223
-
[24]
Integration with Respect to the Haar Measure on Unitary, Orthogonal and Symplectic Groups
Benoît Collins and Piotr Śniady. “Integration with Respect to the Haar Measure on Unitary, Orthogonal and Symplectic Groups”. In: Communications in Mathematical Physics 264.3 (2006), pp. 773–795. doi: 10.1007/s00220-006-1554-3
-
[25]
On Some Properties of Orthogonal Weingarten Functions
Benoît Collins and Sho Matsumoto. “On Some Properties of Orthogonal Weingarten Functions”. In: Journal of Mathematical Physics 50.11 (2009), p. 113516. doi: 10.1063/1.3251304
-
[26]
Feature Learning and Generalization in Deep Networks with Orthogonal Weights
Hannah Day, Yonatan Kahn, and Daniel A. Roberts. “Feature Learning and Generalization in Deep Networks with Orthogonal Weights”. In: Machine Learning: Science and Technology 6.3 (Aug. 2025), p. 035027. issn: 2632-2153. doi: 10.1088/2632-2153/adf278. arXiv: 2310.07765