Probabilistic Computers for Neural Quantum States
Pith reviewed 2026-05-16 19:32 UTC · model grok-4.3
The pith
Sparse Boltzmann machines on FPGA probabilistic computers deliver accurate variational ground states for 80x80 Ising lattices.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Mapping energy-based neural quantum states to sparse Boltzmann machines and sampling them directly with probabilistic computers on FPGAs produces accurate ground-state energies for the critical two-dimensional transverse-field Ising model on lattices reaching 80 by 80. The same hardware supports a dual-sampling algorithm that trains deep Boltzmann machines by conditioning on auxiliary layers instead of performing intractable marginalization, demonstrated for systems up to 30 by 30 and shown to improve parameter efficiency relative to shallow networks.
What carries the argument
Sparse Boltzmann machine architectures executed as probabilistic samplers on FPGA hardware, which generate fast Monte Carlo samples for variational energy estimation and enable conditional training of deep models.
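The variational energy estimate that these samples feed can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the convention H = -J Σ σz σz - h Σ σx on a periodic square lattice and a real, positive amplitude supplied through a hypothetical `log_psi` callable. The function names and the `J`, `h` defaults are placeholders, and the paper's exact mapping from the Boltzmann machine energy to the amplitude is not reproduced here.

```python
import numpy as np

def tfim_local_energy(spins, log_psi, J=1.0, h=1.0):
    """Local energy E_loc(s) = <s|H|psi> / <s|psi> for one configuration.

    Assumed convention: H = -J sum_<ij> sz_i sz_j - h sum_i sx_i on an
    L x L periodic lattice with L >= 3 (for L = 2 the rolls double-count bonds).
    spins   : (L, L) array with entries +1 or -1
    log_psi : hypothetical callable returning log of a real, positive amplitude
    """
    # Diagonal term: count each nearest-neighbour bond once (down and right).
    bonds = spins * np.roll(spins, -1, axis=0) + spins * np.roll(spins, -1, axis=1)
    e_diag = -J * bonds.sum()

    # Off-diagonal term: sx_i connects s to the configuration with spin i flipped,
    # so each site contributes the amplitude ratio psi(s_flipped) / psi(s).
    base = log_psi(spins)
    ratio_sum = 0.0
    for idx in np.ndindex(spins.shape):
        flipped = spins.copy()
        flipped[idx] *= -1
        ratio_sum += np.exp(log_psi(flipped) - base)
    return e_diag - h * ratio_sum


def variational_energy(samples, log_psi, J=1.0, h=1.0):
    """Monte Carlo estimate: mean local energy over configurations drawn from |psi|^2."""
    return np.mean([tfim_local_energy(s, log_psi, J, h) for s in samples])
```

In this picture the hardware only has to supply `samples`; the estimator itself is unchanged, which is why sampler bias would pass straight into the reported energy.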
If this is right
- Variational calculations become feasible for quantum lattices containing thousands of spins without prohibitive classical sampling costs.
- Deep Boltzmann machines can be trained at scales previously blocked by marginalization overhead.
- Sparse architectures gain a concrete efficiency advantage over shallow ones when hardware sampling is available.
- The same probabilistic hardware platform supports both ground-state estimation and model training within a single framework.
Where Pith is reading between the lines
- Hardware samplers of this type could be retargeted to other energy-based variational ansatzes beyond Ising models.
- The dual-sampling approach may extend to hybrid quantum-classical training loops where conditional sampling replaces classical marginalization.
- Larger multi-FPGA clusters could push the reachable system size into regimes where classical tensor-network methods become impractical.
Load-bearing premise
Sampling the sparse Boltzmann machines on the probabilistic hardware produces unbiased Monte Carlo estimates that match those from conventional software sampling for the same neural quantum states.
What would settle it
A side-by-side comparison on lattices small enough for exact or high-precision software sampling that shows systematic deviation between ground-state energies obtained from the FPGA sampler and those from standard Monte Carlo runs with identical network parameters.
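A reference value for such a comparison could come from exact diagonalization on a very small lattice. The sketch below builds a dense Hamiltonian for a 3×3 periodic lattice under the assumed convention H = -J Σ σz σz - h Σ σx. The function names are hypothetical, and the dense approach only works for a handful of spins because the matrix dimension grows as 2^N.

```python
import numpy as np

def tfim_dense_hamiltonian(side, J=1.0, h=1.0):
    """Dense TFIM Hamiltonian on a side x side periodic lattice (side >= 3)."""
    n = side * side
    dim = 2 ** n
    states = np.arange(dim)
    # Bit k of a basis index encodes spin k: bit 0 -> +1, bit 1 -> -1.
    spins = 1 - 2 * ((states[:, None] >> np.arange(n)) & 1)

    # Diagonal part: one bond to the right and one bond down per site.
    diag = np.zeros(dim)
    for x in range(side):
        for y in range(side):
            i = x * side + y
            right = x * side + (y + 1) % side
            down = ((x + 1) % side) * side + y
            diag += -J * spins[:, i] * (spins[:, right] + spins[:, down])

    H = np.zeros((dim, dim))
    H[states, states] = diag
    # Off-diagonal part: sx_k flips bit k of the basis index.
    for k in range(n):
        H[states, states ^ (1 << k)] += -h
    return H


def exact_ground_energy(side, J=1.0, h=1.0):
    """Lowest eigenvalue of the dense Hamiltonian."""
    return np.linalg.eigvalsh(tfim_dense_hamiltonian(side, J, h))[0]


def relative_error(e_estimate, e_exact):
    """Relative deviation of a sampled energy from the exact reference."""
    return abs(e_estimate - e_exact) / abs(e_exact)
```

Feeding the same trained parameters to the FPGA sampler and to a software sampler, then comparing both energies against `exact_ground_energy`, would separate sampler bias from ansatz error.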
Original abstract
Neural quantum states efficiently represent many-body wavefunctions with neural networks, but the cost of Monte Carlo sampling limits their scaling to large system sizes. Here we address this challenge by combining sparse Boltzmann machine architectures with probabilistic computing hardware. We implement a probabilistic computer on field-programmable gate arrays (FPGAs) and use it as a fast sampler for energy-based neural quantum states. For the two-dimensional transverse-field Ising model at criticality, we obtain accurate ground-state energies for lattices up to 80×80 (6400 spins) using a custom multi-FPGA cluster. Furthermore, we introduce a dual-sampling algorithm to train deep Boltzmann machines, replacing intractable marginalization with conditional sampling over auxiliary layers. This enables the training of sparse deep models and improves parameter efficiency relative to shallow networks. We further implement this algorithm on a single FPGA, demonstrating the training of deep Boltzmann machines for systems as large as 30×30 (900 spins). Together, these results demonstrate that probabilistic hardware can overcome the sampling bottleneck in variational simulation of quantum many-body systems, opening a path to larger system sizes and deeper variational architectures.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to overcome the Monte Carlo sampling bottleneck in neural quantum states by combining sparse Boltzmann machine architectures with probabilistic computing hardware implemented on FPGAs. For the 2D transverse-field Ising model at criticality, accurate ground-state energies are reported for lattices up to 80×80 (6400 spins) using a custom multi-FPGA cluster; a dual-sampling algorithm is introduced to train deep Boltzmann machines up to 30×30 (900 spins) on a single FPGA by replacing intractable marginalization with conditional sampling over auxiliary layers.
Significance. If the hardware sampler produces unbiased Monte Carlo estimates equivalent to standard software sampling, the work would be significant for scaling variational Monte Carlo methods to larger system sizes and deeper architectures by addressing the sampling cost with specialized probabilistic hardware. The dual-sampling approach for deep models also offers a path to improved parameter efficiency.
major comments (2)
- [Hardware sampling and results sections] The central claim of accurate energies on 80×80 lattices at criticality rests on the unverified assumption that the FPGA-based probabilistic sampler for the sparse Boltzmann machine produces Monte Carlo estimates whose expectation values match those of exact software sampling from the same distribution; no direct statistical equivalence tests (e.g., Kolmogorov-Smirnov tests or energy estimator comparisons on trained models) are reported for large lattices, and any systematic deviation from finite-precision arithmetic or incomplete mixing would bias the variational energy.
- [Numerical results on 2D TFIM] No error bars, convergence criteria, or comparison tables (e.g., against exact diagonalization for small sizes or other NQS methods) are supplied for the reported ground-state energies, making it impossible to assess whether post-hoc data selection affects the accuracy claims.
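The equivalence test requested in the first major comment could look like the sketch below, which computes a two-sample Kolmogorov-Smirnov statistic on per-sample local energies from the two samplers. It is an illustration under stated assumptions: the function names are hypothetical, the critical value uses the standard large-sample approximation, and the test presumes independent samples, so correlated Monte Carlo chains would need thinning first.

```python
import numpy as np

def ks_two_sample_statistic(a, b):
    """Largest gap between the empirical CDFs of two 1D samples."""
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    grid = np.concatenate([a, b])
    # Empirical CDF value of each sample at every observed point.
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample rejection threshold for the two-sample KS statistic."""
    c_alpha = np.sqrt(-0.5 * np.log(alpha / 2.0))
    return float(c_alpha * np.sqrt((n + m) / (n * m)))


def samplers_agree(hw_energies, sw_energies, alpha=0.05):
    """True when the test does not reject equality of the two distributions."""
    d = ks_two_sample_statistic(hw_energies, sw_energies)
    return d <= ks_critical_value(len(hw_energies), len(sw_energies), alpha)
```

A passing test on local energies is necessary but not sufficient: two samplers can agree on this one observable while differing on others, so correlation functions are worth testing the same way.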
minor comments (2)
- [Abstract] The abstract asserts 'accurate' energies without quantitative measures, baselines, or error metrics.
- [Methods] The description of the sparse architecture and multi-FPGA cluster configuration would benefit from additional diagrams or pseudocode to clarify data flow and parallelism.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below, providing clarifications based on the work and indicating revisions where appropriate.
Point-by-point responses
- Referee: The central claim of accurate energies on 80×80 lattices at criticality rests on the unverified assumption that the FPGA-based probabilistic sampler for the sparse Boltzmann machine produces Monte Carlo estimates whose expectation values match those of exact software sampling from the same distribution; no direct statistical equivalence tests (e.g., Kolmogorov-Smirnov tests or energy estimator comparisons on trained models) are reported for large lattices, and any systematic deviation from finite-precision arithmetic or incomplete mixing would bias the variational energy.
  Authors: The FPGA probabilistic computer implements the exact same probabilistic update rules and sparse Boltzmann machine architecture as the software sampler, ensuring samples are drawn from the identical distribution by construction. We performed direct equivalence validations on smaller lattices (up to 16×16) where both hardware and software sampling are feasible, confirming matching energy estimators and distribution statistics within statistical fluctuations. For 80×80 systems, software sampling is computationally intractable, which motivates the hardware approach. In the revision we will add a dedicated validation subsection with these small-system comparisons, including Kolmogorov-Smirnov tests and energy estimator agreement, plus discussion of finite-precision and mixing considerations.
  Revision: partial
- Referee: No error bars, convergence criteria, or comparison tables (e.g., against exact diagonalization for small sizes or other NQS methods) are supplied for the reported ground-state energies, making it impossible to assess whether post-hoc data selection affects the accuracy claims.
  Authors: We agree that the numerical results section would benefit from additional statistical details and benchmarks. The reported energies were obtained after variational convergence followed by averaging over independent sampling runs. In the revised manuscript we will include error bars computed from the standard deviation across multiple independent runs, explicitly state the convergence criteria (energy stabilization below a threshold over a fixed number of iterations), and add a comparison table for small lattices against exact diagonalization as well as selected other NQS methods from the literature.
  Revision: yes
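The two statistical items promised in this response can be made concrete with a short sketch. It assumes the error bar is the standard error of the mean across independent runs (one reasonable reading of "standard deviation across multiple independent runs") and that convergence means the energy stays inside a tolerance band over a trailing window. The `window` and `tol` defaults are placeholders, not values from the paper.

```python
import numpy as np

def energy_with_error_bar(run_energies):
    """Mean energy and standard error of the mean over independent runs."""
    runs = np.asarray(run_energies, dtype=float)
    mean = runs.mean()
    # ddof=1 gives the unbiased sample standard deviation.
    sem = runs.std(ddof=1) / np.sqrt(runs.size)
    return mean, sem


def has_converged(energy_history, window=50, tol=1e-4):
    """True when the last `window` energies span less than `tol`."""
    if len(energy_history) < window:
        return False
    recent = np.asarray(energy_history[-window:], dtype=float)
    return bool(recent.max() - recent.min() < tol)
```

Stating both the window length and the tolerance alongside the results would let a reader check that the stopping rule was fixed before the energies were inspected.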
Circularity Check
No circularity: hardware sampler treated as independent accelerator for standard variational Monte Carlo
The derivation chain relies on standard variational Monte Carlo estimation of energies from sampled configurations of the neural quantum state (Boltzmann machine). The FPGA implementation is presented as a drop-in replacement for software sampling without redefining the target energy functional or the ground-state estimator in terms of any fitted parameter internal to the same derivation. No equation equates a reported energy to a quantity defined by construction from the hardware output itself, and no self-citation chain is invoked to establish uniqueness or to smuggle an ansatz that would collapse the central claim. The reported accuracies on 80×80 lattices are therefore obtained from an independent computational procedure whose correctness can be checked against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Sparse Boltzmann machines can faithfully represent the ground state of the transverse-field Ising model when sampled correctly.
Supplementary Information
Probabilistic Computers for Neural Quantum States
Shuvro Chowdhury, Jasper Pieterse, Navid Anjum Aadit, Johan H. Mentink, and Kerem Y. Camsari
A. Runtime Profilin...
Hyperparameters
The specific hyperparameters used for the training of the Sparse Deep Boltzmann Machine (DBM) and Sparse RBM results shown in Figs. 4 and 5 are summarized in Table S1.
- Dual Sampling Scheme: For the Sparse DBM, the gradient estimation involves two distinct sampling populations. The outer loop uses $N_s = 10{,}000$ samples to estimate expectations ...
Optimization Routine
The model parameters were updated using the Stochastic Reconfiguration (SR) method. Instead of inverting the curvature matrix $S$ directly (which scales as $O(N_p^3)$), we solved the linear system $S \cdot \delta\theta = g$ using a Preconditioned Conjugate Gradient (PCG) solver.
- Matrix-Free Implementation: The solver utilizes implicit matrix-vector products to c...
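The matrix-free solve described above can be sketched as follows. This is an illustration under assumptions, not the authors' code: parameters are taken to be real, `O` is a hypothetical array of per-sample log-derivatives with shape (samples, parameters), a diagonal `shift` regularizes S, and a Jacobi (diagonal) preconditioner stands in for whatever preconditioner the paper uses.

```python
import numpy as np

def sr_update_pcg(O, g, shift=1e-3, tol=1e-10, max_iter=200):
    """Solve (S + shift*I) x = g without forming S.

    S = <O^T O> - <O>^T <O> is applied through centered log-derivatives,
    so each product costs O(N_s * N_p) instead of an O(N_p^3) inversion.
    """
    g = np.asarray(g, dtype=float)
    n_samples = O.shape[0]
    Oc = O - O.mean(axis=0)          # centered log-derivatives

    def apply_S(v):
        return Oc.T @ (Oc @ v) / n_samples + shift * v

    # Jacobi preconditioner: inverse of diag(S) (an assumed choice).
    m_inv = 1.0 / ((Oc ** 2).mean(axis=0) + shift)

    x = np.zeros_like(g)
    r = g - apply_S(x)
    z = m_inv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Sp = apply_S(p)
        alpha = rz / (p @ Sp)
        x += alpha * p
        r -= alpha * Sp
        # Stop once the residual is small relative to the right-hand side.
        if np.linalg.norm(r) <= tol * np.linalg.norm(g):
            break
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

The returned vector plays the role of δθ up to the learning-rate factor, and the only objects held in memory are the sample matrix and a few parameter-sized vectors.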
Final Evaluation
Following the training phase (1,000 iterations), the optimized model parameters were frozen. A final evaluation run was performed using a significantly larger sample size of $N_\mathrm{eval} = 10^6$ to obtain the high-precision energy estimates and error bars reported in the main text figures.
TABLE S1. Summary of Training Hyperparameters
Parameter Sy...
Connectivity Metric
We define the connectivity between a neuron $i$ in layer $L$ at lattice coordinate $r_i$ and a neuron $j$ in layer $L+1$ at $r_j$ using the Euclidean distance on the periodic lattice (we assign to each hidden and deep layer a 2D geometry isomorphic to the visible lattice, such that every neuron has a defined spatial coordinate):
$d(i,j) = \min_{\delta \in \mathbb{Z}^2} \lVert r_i - r_j + \dots$
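A distance of this kind can be computed with the usual minimum-image rule, sketched below. The source formula is truncated, so the sketch assumes the minimization runs over shifts by whole lattice periods. The names `side` and `radius`, and the use of a radius cutoff to decide which weights are active, are hypothetical choices for illustration.

```python
import numpy as np

def periodic_distance(r_i, r_j, side):
    """Euclidean distance between two sites on a side x side torus."""
    delta = np.abs(np.asarray(r_i) - np.asarray(r_j)) % side
    # Minimum image: going around the boundary may be shorter.
    delta = np.minimum(delta, side - delta)
    return float(np.sqrt((delta ** 2).sum()))


def sparse_connectivity_mask(side, radius):
    """Boolean (N, N) mask: True where neurons in adjacent layers are linked.

    Both layers are assumed to share the same side x side geometry, so
    entry (i, j) compares site i in one layer with site j in the next.
    """
    coords = np.array([(x, y) for x in range(side) for y in range(side)])
    diff = np.abs(coords[:, None, :] - coords[None, :, :])
    diff = np.minimum(diff, side - diff)
    return np.sqrt((diff ** 2).sum(axis=-1)) <= radius
```

With `radius = 1.0` each neuron connects to the site directly above it and that site's four nearest neighbours, so every row of the mask has five active entries.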
Parameter Enumeration
For the 10×10 lattice ($N = 100$ spins) used in Figure 4:
- Sparse RBM: Consists of one visible layer ($N_v = 100$) and one hidden layer ($N_h = 100$). The total parameters $N_p$ include $N_v + N_h$ biases and the number of active weights defined by $k N_v$.
- Sparse DBM: Consists of one visible layer ($N_v = 100$), one hidden layer ($N_h = 100$), and one deep layer...
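The sparse RBM count stated above reduces to a one-line formula, shown here as a small sketch. The value `k = 5` in the usage note is purely illustrative, since the source excerpt does not give the connectivity used in the paper, and the DBM count is left out because its description is truncated.

```python
def sparse_rbm_param_count(n_visible, n_hidden, k):
    """Biases (n_visible + n_hidden) plus k active weights per visible unit."""
    n_biases = n_visible + n_hidden
    n_weights = k * n_visible
    return n_biases + n_weights
```

For the 10×10 lattice with an assumed `k = 5`, `sparse_rbm_param_count(100, 100, 5)` gives 200 biases plus 500 weights, or 700 parameters in total.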