Recognition: unknown
New non-Euclidean neural quantum states from additional types of hyperbolic recurrent neural networks
Pith reviewed 2026-05-08 04:11 UTC · model grok-4.3
The pith
Hyperbolic RNN and GRU networks produce neural quantum states that outperform Euclidean versions on Heisenberg spin models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
All four hyperbolic RNN/GRU NQS variants outperform their Euclidean counterparts on 100-spin systems for every J2 and (J2,J3) coupling considered; Lorentz RNN and Poincaré RNN always beat Euclidean RNN, while Lorentz and Poincaré GRU beat Euclidean GRU except for one J2=0 case. Lorentz GRU and Poincaré GRU alternate as the single best ansatz in four of eight settings, yet the parameter-light Lorentz RNN surpasses Euclidean GRU in all eight settings and beats both other hyperbolic variants in four of them.
What carries the argument
Hyperbolic recurrent neural networks (Poincaré RNN, Lorentz RNN, Lorentz GRU, Poincaré GRU) serving as variational ansatzes for neural quantum states in VMC calculations.
If this is right
- Hyperbolic NQS can reach lower energies than Euclidean NQS even when the latter uses more parameters.
- The performance edge persists across different strengths of next-nearest-neighbor couplings, including the unfrustrated limit.
- Lorentz RNN offers a compact ansatz that can exceed both Euclidean GRU and other hyperbolic GRUs in some regimes.
- Hyperbolic embeddings appear useful for quantum models whose interaction graphs contain hierarchical or tree-like structure.
Where Pith is reading between the lines
- Similar curvature advantages might appear if the same hyperbolic RNN layers were inserted into other variational families such as tensor networks.
- The single exception for Poincaré GRU at J2=0 suggests that the benefit of hyperbolic geometry is not limited to long-range couplings.
- Because the models are one-dimensional chains, the same constructions could be tested on two-dimensional lattices where hierarchical patterns are stronger.
Load-bearing premise
The lower variational energies arise from the use of hyperbolic geometry rather than from unequal parameter counts, different optimization hyperparameters, or particular choices of initial conditions and training schedules.
What would settle it
Re-run the same VMC experiments on the 100-spin Heisenberg models while forcing every Euclidean and hyperbolic variant to have identical parameter counts, identical random seeds for initialization, and identical training schedules; the hyperbolic variants must still produce strictly lower energies.
Figures
read the original abstract
In this work, we extend the class of previously introduced non-Euclidean neural quantum states (NQS) which consists only of Poincar\'e hyperbolic GRU, to new variants including Poincar\'e RNN as well as Lorentz RNN and Lorentz GRU. In addition to constructing and introducing the new non-Euclidean hyperbolic NQS ansatzes, we generalized the results of our earlier work regarding the definitive outperformances delivered by hyperbolic Poincar\'e GRU NQS ansatzes when benchmarked against their Euclidean counterparts in the Variational Monte Carlo (VMC) experiments involving the quantum many-body settings of the Heisenberg $J_1J_2$ and $J_1J_2J_3$ models, which exhibit hierarchical structures in the forms of the different degrees of nearest-neighbor interactions. Here, in particular, using larger systems consisting of 100 spins, we found that all four hyperbolic RNN/GRU NQS variants always outperformed their respective Euclidean counterparts. Specifically, for all $J_2$ and $(J_2,J_3)$ couplings considered, including $J_2=0.0$, Lorentz RNN NQS and Poincar\'e RNN NQS always outperformd Euclidean RNN NQS, while Lorentz/Poincar\'e GRU NQS always outperformed Euclidean GRU NQS, with a single exception when $J_2=0.0$ for Poincar\'e GRU NQS. Furthermore, among the four hyperbolic NQS ansatzes, depending on the specific $J_2$ or $(J_2, J_3)$ couplings, on four out of eight experiment settings, Lorentz GRU and Poincar\'e GRU took turns to be the top performing variant among all Euclidean and hyperbolic NQS ansatzes considered, while Lorentz RNN, with up to three times fewer parameters, was capable of not only surpassing the Euclidean GRU eight out of eight times but also outperforming both Lorentz GRU and Poincar\'e GRU four out of eight times, to emerge as the best overall hyperbolic NQS ansatz.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces three new hyperbolic neural quantum state (NQS) architectures—Poincaré RNN, Lorentz RNN, and Lorentz GRU—extending prior work on Poincaré GRU NQS. It reports variational Monte Carlo results on the J1J2 and J1J2J3 Heisenberg models for 100-spin systems, claiming that all four hyperbolic variants (including the new ones) consistently achieve lower variational energies than their Euclidean RNN and GRU counterparts across eight coupling settings, with Lorentz RNN performing particularly well despite using up to three times fewer parameters.
Significance. If the reported energy improvements are shown to arise specifically from the hyperbolic geometry rather than differences in model capacity or optimization, the work would meaningfully expand the toolkit of non-Euclidean NQS ansatzes for frustrated spin systems that exhibit hierarchical structure. The extension to RNN variants and the scaling to 100 spins are concrete contributions that could be built upon in future variational studies.
major comments (2)
- [Results / abstract] Results section (and abstract): the central attribution of performance gains to hyperbolic geometry requires that total parameter counts, hidden dimensions, optimizer schedules, and initialization schemes are matched between each hyperbolic variant and its Euclidean counterpart. The abstract states that Lorentz RNN uses up to three times fewer parameters while still outperforming Euclidean GRU; without explicit scaling of the Euclidean baselines to enforce parity, the energy differences cannot be unambiguously assigned to the Lorentz or Poincaré metric rather than to differences in expressivity or training dynamics.
- [Results] Experimental protocol (presumably §4 or §5): the manuscript supplies no numerical variational energies, statistical error bars, number of independent runs, or convergence diagnostics for the eight 100-spin settings. The claim that hyperbolic variants “always outperformed” their Euclidean counterparts therefore rests on qualitative statements rather than quantitative, reproducible evidence that would allow assessment of effect size and robustness.
minor comments (2)
- [Abstract] Abstract: “outperformd” is a typographical error.
- [Figures / tables] Notation: the distinction between the four hyperbolic variants and their Euclidean baselines should be made explicit in every figure caption and table that reports energies, to avoid ambiguity when comparing RNN versus GRU families.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive feedback. We address each major comment below with clarifications based on our experimental setup and indicate the revisions we will make to strengthen the manuscript.
read point-by-point responses
-
Referee: [Results / abstract] Results section (and abstract): the central attribution of performance gains to hyperbolic geometry requires that total parameter counts, hidden dimensions, optimizer schedules, and initialization schemes are matched between each hyperbolic variant and its Euclidean counterpart. The abstract states that Lorentz RNN uses up to three times fewer parameters while still outperforming Euclidean GRU; without explicit scaling of the Euclidean baselines to enforce parity, the energy differences cannot be unambiguously assigned to the Lorentz or Poincaré metric rather than to differences in expressivity or training dynamics.
Authors: We thank the referee for this important observation on ensuring comparable model capacities. In our VMC experiments, the hidden dimensions, optimizer schedules (Adam with identical learning rates and decay), and initialization schemes were matched between each hyperbolic variant and its Euclidean counterpart. The parameter reduction in the Lorentz RNN is an intrinsic consequence of the Lorentz-group parameterization in hyperbolic space, which we view as evidence of the geometry's efficiency rather than a mismatch. To eliminate any ambiguity and allow direct attribution to the non-Euclidean structure, we will add an explicit table of parameter counts for all models and include supplementary results with Euclidean baselines scaled to equal or higher parameter counts. revision: yes
-
Referee: [Results] Experimental protocol (presumably §4 or §5): the manuscript supplies no numerical variational energies, statistical error bars, number of independent runs, or convergence diagnostics for the eight 100-spin settings. The claim that hyperbolic variants “always outperformed” their Euclidean counterparts therefore rests on qualitative statements rather than quantitative, reproducible evidence that would allow assessment of effect size and robustness.
Authors: We agree that quantitative details are necessary for full reproducibility and evaluation of robustness. While the manuscript emphasizes the consistent qualitative trend across the eight settings, the underlying runs included statistical analysis. In the revised manuscript we will insert tables reporting the variational energies (means and standard deviations) from five independent VMC runs per setting, together with convergence diagnostics and full optimization-protocol specifications. This will provide the requested numerical evidence and effect-size information. revision: yes
Circularity Check
Minor self-citation to prior Poincaré GRU work; new outperformance claims rest on independent VMC benchmarks
full rationale
The paper introduces new hyperbolic NQS architectures (Poincaré RNN, Lorentz RNN, Lorentz GRU) by direct construction and reports their performance via fresh VMC simulations on 100-spin J1J2 and J1J2J3 Heisenberg instances. These energy comparisons are external numerical measurements, not quantities defined inside the paper's equations or fitted parameters renamed as predictions. The abstract's reference to generalizing the authors' earlier Poincaré GRU results is acknowledged but does not carry the new claims, which involve distinct architectures, larger system sizes, and explicit outperformance counts (e.g., Lorentz RNN surpassing Euclidean GRU in 8/8 settings). No self-definitional loops, uniqueness theorems imported from self-citations, or ansatzes smuggled via prior work appear in the derivation or benchmarking chain. The work is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- RNN/GRU network weights and biases
axioms (1)
- domain assumption Hyperbolic geometry provides a more natural representation for systems with hierarchical or tree-like correlation structures than Euclidean geometry.
Reference graph
Works this paper leans on
-
[1]
Solving the Quantum Many-Body Problem with Artificial Neural Networks
G. Carleo and M. Troyer, Solving the Quantum Many-Body Problem with Artificial Neural Networks, Science 355, 602 (2017), arXiv:1606.02318 [cond-mat.dis-nn]
work page Pith review arXiv 2017
-
[2]
L. Huang and L. Wang, Accelerate Monte Carlo Simulations with Restricted Boltzmann Machines, arXiv:1610.02746v2
- [3]
-
[4]
H. Saito and M. Kato, Machine learning technique to find quantum many-body ground states of bosons on a lattice, J. Phys. Soc. Jpn. 87, 014001 (2018), arXiv:1709.05468 [cond-mat.dis-nn]
- [5]
- [6]
- [7]
-
[8]
O.-E. Ganea, G. Becigneul, and T. Hofmann, Hyperbolic Neural Networks, Advances in Neural Information Processing Systems 31, pages 5345–5355. Curran Associates, Inc. arXiv: 1805.09112 [cs.LG]
-
[9]
R. Sarkar. Low distortion Delaunay embedding of trees in hyperbolic plane, In Proc. of the International Symposium on Graph Drawing (GD 2011), pages 355–366, Eindhoven, Netherlands, 2011
2011
-
[10]
Hibat-Allah, M
M. Hibat-Allah, M. Ganahl, L. E. Hayward, R. G. Melko, and J. Carrasquill, Recurrent neural network wave functions, Physical Review Research 2, 023358 (2020)
2020
-
[11]
M. Hibat-Allah, R. G. Melko, J. Carrasquilla, Supplementing Recurrent Neural Network Wave Functions with Symmetry and Annealing to Improve Accuracy, Machine Learning and the Physical Sciences, NeurIPS 2021, arXiv:2207.14314v2 [cond-mat.dis-nn]
-
[12]
M. Hibat-Allah, E. Merali, G. Torlai, R. G. Melko and J. Carrasquilla, Recurrent neural network wave functions for Rydberg atom arrays on kagome lattice, arXiv:2405.20384v1 [cond-mat.quant-gas]
- [13]
- [14]
- [15]
-
[16]
K. Sprague and S. Czischek, Variational Monte Carlo with Large Patched Transformers, Commun Phys 7, 90 (2024), arXiv:2306.03921 [quant-ph]
- [17]
-
[18]
Rende and L
R. Rende and L. L. Viteritti, Are queries and keys always relevant? a case study on transformer wave functions, Machine Learning: Science and Technology 6, 010501 (2025)
2025
- [19]
-
[20]
F. Becca and S. Sorella, Quantum Monte Carlo approaches for correlated systems, Cambridge University Press 2017, DOI: 10.1017/9781316417041
- [21]
-
[22]
Chung, C
J. Chung, C. Gulcehre, K. Cho, Y. Bengio, Gated feedback recurrent neural networks, in: ICML, 2015
2015
- [23]
- [24]
- [25]
-
[26]
arXiv preprint arXiv:1805.09786 , year=
C. Gulcehre, M. Denil, M. Malinowski, A. Razavi, R. Pascanu, et. al., Hyperbolic Attention Networks, arXiv:1805.09786v1 [cs.NE]
-
[27]
F.Lopez and M. Strube, A Fully Hyperbolic Neural Model for Hierarchical Multi-Class Classification, Findings of EMNLP2020, arXiv:2010.02053 [cs.CL]
-
[28]
arXiv preprint arXiv:2006.08210 (2020)
R. Shimizu, Y. Mukuta and T. Harada, Hyperbolic Neural Networks++, The Ninth International Confer- ence on Learning Representations (ICLR 2021), arXiv:2006.08210 [cs.LG]
-
[29]
E. Mathieu, C. Le Lan, C. J. Maddison, R. Tomioka, and Y. W. Teh, Continuous Hierarchical Represen- tations with Poincar´ e Variational Auto-Encoders, arXiv:1901.06033
-
[30]
G. Bachmann, G. Becigneul, O.-E. Ganea, Constant Curvature Graph Convolutional Networks, arXiv:1911.05076v3
-
[31]
Linial, E
N. Linial, E. London, and Y. Rabinovich. The geometry of graphs and some of its algorithmic applications, Combinatorica, 15(2):215–245, 1995
1995
-
[32]
Krioukov, F
D. Krioukov, F. Papadopoulos, A. Vahdat, and M. Bogun´ a, Curvature and temperature of complex net- works, Physical Review E, 80(3):035101, 2009
2009
-
[33]
Krioukov, F
D. Krioukov, F. Papadopoulos, M. Kitsak, A. Vahdat, and M. Bogun´ a, Hyperbolic geometry of complex networks, Physical Review E, 82(3):036106, 2010
2010
-
[34]
F. Sala, C. De Sa, A. Gu, and C. R´ e. 2018. Representation tradeoffs for hyperbolic embeddings. In Inter- national Conference on Machine Learning, pages 4457–4466
2018
-
[35]
Bonnabel, Stochastic gradient descent on riemannian manifolds, IEEE Transactions on Automatic Con- trol, 58(9):2217–2229, Sept 2013
S. Bonnabel, Stochastic gradient descent on riemannian manifolds, IEEE Transactions on Automatic Con- trol, 58(9):2217–2229, Sept 2013
2013
-
[36]
Ganea, G
O.-E. Ganea, G. B´ ecigneul, and T. Hofmann. Hyperbolic entailment cones for learning hierarchical embed- dings. In Proceedings of the thirty-fifth international conference on machine learning (ICML), 2018
2018
-
[37]
Riemannian Adaptive Optimization Methods
G. B´ ecigneul and O.-E Ganea, Riemannian Adaptive Optimization Methods, International Conference on Learning Representations (ICLR) (2019), arXiv:1810.00760 [cs.LG]
work page Pith review arXiv 2019
-
[38]
S. R. White and I. Affleck, Dimerization and incommensurate spiral spin correlations in the zigzag spin chain: Analogies to the Kondo lattice, Phys. Rev. B 54, 9862 (1996)
1996
-
[39]
Marshall, Antiferromagnetism, Proc
W. Marshall, Antiferromagnetism, Proc. R. Soc. A 232, 48 (1955)
1955
-
[40]
Poincar\'e Embeddings for Learning Hierarchical Representations
M. Nickel and D. Kiela, Poincar´ e embeddings for learning hierarchical representations, arXiv:1705.08039 [cs.AI]
-
[41]
M. Nickel and D. Kiela, Learning Continuous Hierarchies, in the Lorentz Model of Hyperbolic Geometry, ICML 2018, arXiv:1806.03417 [cs.AI]
- [42]
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.