Rapid training of Hamiltonian graph networks using random features
Pith reviewed 2026-05-19 10:12 UTC · model grok-4.3
The pith
Hamiltonian Graph Networks can be trained 150-600 times faster using random features instead of gradient descent while keeping comparable accuracy and physical symmetries.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Replacing iterative gradient-descent optimization with random feature-based parameter construction allows Hamiltonian Graph Networks to reach training speeds 150-600 times faster than with 15 standard optimizers, while delivering comparable accuracy on diverse N-body systems and retaining Hamiltonian structure plus physical invariances.
What carries the argument
Random feature-based parameter construction, which directly generates the network weights to enforce Hamiltonian dynamics and symmetries without any iterative refinement.
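As a rough illustration of what "constructing" parameters without iterative refinement can mean, the sketch below fits a generic random-feature regressor: hidden weights are sampled once and only the linear output layer is solved by least squares. This is a minimal sketch under stated assumptions, not the paper's sampling scheme or architecture. The function names, the Gaussian and uniform sampling distributions, the tanh activation, and the toy harmonic-oscillator target are all hypothetical choices made for the example.

```python
import numpy as np

def fit_random_features(x, y, width=200, seed=0):
    """Fit y ~ tanh(x W + b) c, with W and b sampled once and c solved linearly."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(x.shape[1], width))      # hidden weights: sampled, never trained
    b = rng.uniform(-np.pi, np.pi, size=width)    # hidden biases: sampled, never trained
    phi = np.tanh(x @ W + b)                      # random feature matrix
    # Only the output layer is fitted, in a single least-squares solve (no gradient steps).
    c, *_ = np.linalg.lstsq(phi, y, rcond=1e-8)
    return W, b, c

def predict(x, W, b, c):
    return np.tanh(x @ W + b) @ c

def toy_energy(z):
    """Hypothetical target: H(q, p) = q^2 / 2 + p^2 / 2 for rows z = (q, p)."""
    return 0.5 * np.sum(z**2, axis=1)

rng = np.random.default_rng(1)
x_train = rng.uniform(-1.0, 1.0, size=(500, 2))
x_test = rng.uniform(-1.0, 1.0, size=(200, 2))
params = fit_random_features(x_train, toy_energy(x_train))
rmse = np.sqrt(np.mean((predict(x_test, *params) - toy_energy(x_test)) ** 2))
print(f"test RMSE: {rmse:.2e}")
```

The cost of such a fit is dominated by one linear solve, which is the general reason random-feature methods can be much faster than iterative optimizers. Whether the paper's construction follows this exact pattern is not stated on this page.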
If this is right
- Training on systems with thousands of particles becomes practical in a short time.
- Models trained on minimal 8-node examples generalize zero-shot to 4096-node systems.
- Physical symmetries remain intact across different geometries and dimensions.
- The method applies to both mass-spring and molecular dynamics benchmarks.
- Performance stays robust on a benchmark taken from a NeurIPS 2022 Datasets and Benchmarks Track publication.
Where Pith is reading between the lines
- Similar random-construction shortcuts could apply to other physics-constrained network families.
- Lower training cost might enable on-the-fly adaptation of models in engineering simulations.
- Limits may appear in systems with dissipation or external driving forces not tested here.
Load-bearing premise
Randomly generated features can produce parameters that automatically preserve Hamiltonian structure, physical invariances, and predictive accuracy without gradient-based adjustment.
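One part of this premise follows from a standard fact rather than from training: if a model outputs a scalar Hamiltonian and the dynamics are obtained from Hamilton's equations, the model's own Hamiltonian is conserved along exact trajectories for any parameter values. The short derivation below uses assumed notation (state z, learned Hamiltonian H_θ, canonical matrix J) that does not appear on this page.

```latex
\dot{z} = J\,\nabla_z H_\theta(z),
\qquad
z = \begin{pmatrix} q \\ p \end{pmatrix},
\qquad
J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix},

\frac{\mathrm{d}}{\mathrm{d}t} H_\theta\bigl(z(t)\bigr)
  = \nabla_z H_\theta(z)^{\top} \dot{z}
  = \nabla_z H_\theta(z)^{\top} J \,\nabla_z H_\theta(z)
  = 0
\quad \text{since } J^{\top} = -J .
```

The conserved quantity here is the learned H_θ, not the true energy of the system, so predictive accuracy still depends on how well the constructed parameters approximate the true Hamiltonian.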
What would settle it
If the random-feature model on a large test trajectory shows substantially higher error or violates energy conservation compared to an optimized Hamiltonian Graph Network, the speedup claim would not hold.
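A check of this kind could be scripted along the following lines. The sketch integrates a toy harmonic oscillator with a Störmer-Verlet step and reports the maximum relative energy drift. The oscillator stands in for a trained model, and the function names, step size, and trajectory length are hypothetical choices for illustration only.

```python
import numpy as np

def stormer_verlet(grad_V, q, p, dt, steps):
    """Integrate H(q, p) = |p|^2 / 2 + V(q) with Störmer-Verlet (unit masses assumed)."""
    traj = [(q.copy(), p.copy())]
    for _ in range(steps):
        p_half = p - 0.5 * dt * grad_V(q)      # half kick
        q = q + dt * p_half                    # drift
        p = p_half - 0.5 * dt * grad_V(q)      # half kick with updated positions
        traj.append((q.copy(), p.copy()))
    return traj

def max_relative_energy_drift(H, traj):
    """Largest |E(t) - E(0)| / |E(0)| along the trajectory."""
    energies = np.array([H(q, p) for q, p in traj])
    return np.max(np.abs(energies - energies[0])) / abs(energies[0])

def grad_V(q):
    return q                                   # toy potential V(q) = |q|^2 / 2

def H(q, p):
    return 0.5 * (p @ p) + 0.5 * (q @ q)

traj = stormer_verlet(grad_V, np.array([1.0]), np.array([0.0]), dt=0.01, steps=10_000)
drift = max_relative_energy_drift(H, traj)
print(f"max relative energy drift: {drift:.2e}")
```

To compare two models, the same routine would be run with each model's force and energy functions on identical initial conditions, and the drift and trajectory errors reported side by side.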
read the original abstract
Learning dynamical systems that respect physical symmetries and constraints remains a fundamental challenge in data-driven modeling. Integrating physical laws with graph neural networks facilitates principled modeling of complex N-body dynamics and yields accurate and permutation-invariant models. However, training graph neural networks with iterative, gradient-descent-based optimization algorithms (e.g., Adam, RMSProp, LBFGS) often leads to slow training, especially for large, complex systems. In comparison to 15 different optimizers, we demonstrate that Hamiltonian Graph Networks (HGN) can be trained 150-600x faster - but with comparable accuracy - by replacing iterative optimization with random feature-based parameter construction. We show robust performance in diverse simulations, including N-body mass-spring and molecular dynamics systems in up to dimensions and 10,000 particles with different geometries, while retaining essential physical invariances with respect to permutation, rotation, and translation. Our proposed approach is benchmarked using a NeurIPS 2022 Datasets and Benchmarks Track publication to further demonstrate its versatility. We reveal that even when trained on minimal 8-node systems, the model can generalize in a zero-shot manner to systems as large as 4096 nodes without retraining. Our work challenges the dominance of iterative gradient-descent-based optimization algorithms for training neural network models for physical systems.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that Hamiltonian Graph Networks (HGN) for modeling N-body and molecular dynamics can be trained 150-600x faster with comparable accuracy by replacing gradient-based iterative optimization (e.g., Adam) with random feature-based parameter construction. It reports robust performance across simulations with up to 10,000 particles and different geometries, retention of permutation/rotation/translation invariances, and zero-shot generalization from 8-node training systems to 4096-node systems, benchmarked against a NeurIPS 2022 dataset.
Significance. If the central empirical result holds, the work would offer a practical alternative to standard optimizers for training physics-constrained graph networks, potentially enabling faster iteration on large-scale dynamical system modeling while maintaining symmetries. The reported generalization and invariance retention, if rigorously verified, would strengthen the case for structure-preserving random constructions in this domain.
major comments (3)
- [Method / random feature construction] The section on random feature construction (likely §3 or equivalent): the manuscript must include an explicit derivation or constraint showing how the random feature map produces parameters that lie on the Hamiltonian manifold and preserve symplectic structure/energy conservation. Random features typically approximate kernels but do not automatically enforce the required physical form; without this, the speedup claim rests on unverified empirical accuracy alone.
- [Experiments / results] Experimental setup and results (likely §4 and tables/figures): for the 150-600x speedup and 'comparable accuracy' claims against 15 optimizers, provide exact baseline code/implementation details, wall-clock measurements including any preprocessing for random features, and statistical error bars or multiple runs. The current support for the central claim is limited without these.
- [Generalization experiments] Generalization section (likely §4.3 or equivalent): clarify whether the random feature construction is system-size independent and how invariances are explicitly verified (e.g., via conservation checks or symmetry tests) when scaling from 8 to 4096 nodes or to 10,000-particle regimes; any deviation could undermine the zero-shot claim.
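The measurement protocol requested in the experiments comment could be organised roughly as follows: time each training routine over several seeds and report the mean and standard deviation. This is a minimal sketch. The `dummy_train` function is a hypothetical placeholder for a real training routine, and the default of five seeds is an illustrative choice.

```python
import time
import numpy as np

def time_over_seeds(train_fn, seeds=(0, 1, 2, 3, 4)):
    """Return mean, sample standard deviation, and raw wall-clock seconds across seeds."""
    durations = []
    for seed in seeds:
        start = time.perf_counter()
        train_fn(seed)                          # includes any preprocessing done inside train_fn
        durations.append(time.perf_counter() - start)
    durations = np.asarray(durations)
    return durations.mean(), durations.std(ddof=1), durations

def dummy_train(seed):
    """Hypothetical stand-in for a training routine: one least-squares solve."""
    rng = np.random.default_rng(seed)
    a = rng.normal(size=(200, 50))
    b = rng.normal(size=200)
    np.linalg.lstsq(a, b, rcond=None)

mean_s, std_s, runs = time_over_seeds(dummy_train)
print(f"wall-clock: {mean_s:.4f} s +/- {std_s:.4f} s over {len(runs)} seeds")
```

Timing the preprocessing step inside `train_fn` keeps the comparison fair, since a speedup figure that excludes feature construction would overstate the gain.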
minor comments (2)
- [Abstract] Abstract: the phrase 'in up to dimensions' appears truncated; specify the maximum spatial dimension tested.
- [Throughout] Notation and figures: ensure all symbols for the random feature map and Hamiltonian terms are defined before first use and that energy-drift plots (if present) are clearly labeled with scales.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments. We address each major comment below and indicate the revisions planned for the next version of the manuscript.
-
Referee: [Method / random feature construction] The section on random feature construction (likely §3 or equivalent): the manuscript must include an explicit derivation or constraint showing how the random feature map produces parameters that lie on the Hamiltonian manifold and preserve symplectic structure/energy conservation. Random features typically approximate kernels but do not automatically enforce the required physical form; without this, the speedup claim rests on unverified empirical accuracy alone.
Authors: We agree that an explicit derivation strengthens the theoretical foundation. In the revised manuscript we will add a dedicated paragraph (or short subsection) in Section 3 that derives the random feature construction from the requirement that the resulting parameters define a Hamiltonian vector field. The derivation shows that the chosen random feature distribution, combined with the graph-network architecture, ensures the output lies on the symplectic manifold and conserves energy up to the approximation error of the random features. We will also include a brief remark on why this construction differs from generic kernel approximation. revision: yes
-
Referee: [Experiments / results] Experimental setup and results (likely §4 and tables/figures): for the 150-600x speedup and 'comparable accuracy' claims against 15 optimizers, provide exact baseline code/implementation details, wall-clock measurements including any preprocessing for random features, and statistical error bars or multiple runs. The current support for the central claim is limited without these.
Authors: We accept that additional experimental detail is required. The revised version will (i) list the precise library versions and hyper-parameter settings for all 15 baseline optimizers, (ii) report wall-clock times that explicitly separate random-feature preprocessing from the subsequent forward-pass evaluation, and (iii) include mean and standard-deviation results computed over five independent random seeds for every reported metric. These additions will appear in the main text and in a new appendix containing the full experimental protocol. revision: yes
-
Referee: [Generalization experiments] Generalization section (likely §4.3 or equivalent): clarify whether the random feature construction is system-size independent and how invariances are explicitly verified (e.g., via conservation checks or symmetry tests) when scaling from 8 to 4096 nodes or to 10,000-particle regimes; any deviation could undermine the zero-shot claim.
Authors: The random-feature map is constructed locally on nodes and edges and therefore does not depend on total system size; we will state this explicitly in Section 4.3. In addition, we will augment the generalization experiments with quantitative invariance checks: we will report relative energy drift and linear-momentum conservation errors on the 4096-node and 10,000-particle test systems, together with a permutation-symmetry test obtained by randomly reordering node indices. These diagnostics will be added to the existing zero-shot generalization figures and tables. revision: yes
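The invariance diagnostics described in the last response could be sketched as below. The example uses a toy mass-spring energy built from edge lengths and momentum norms as a stand-in for a trained model. The function names, the chain graph, and the spring parameters are hypothetical, and a real check would substitute the model's own energy function.

```python
import numpy as np

def toy_hamiltonian(q, p, edges, k=1.0, rest=1.0):
    """Mass-spring energy from momentum norms and edge lengths (unit masses assumed)."""
    kinetic = 0.5 * np.sum(p**2)
    lengths = np.linalg.norm(q[edges[:, 0]] - q[edges[:, 1]], axis=1)
    return kinetic + 0.5 * k * np.sum((lengths - rest) ** 2)

def random_rotation(dim, rng):
    """Sample a proper rotation matrix (determinant +1) via QR decomposition."""
    Q, R = np.linalg.qr(rng.normal(size=(dim, dim)))
    Q = Q * np.sign(np.diag(R))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1
    return Q

def invariance_errors(H, q, p, edges, rng):
    """Absolute change in H under node relabelling, translation, and rotation."""
    n, dim = q.shape
    base = H(q, p, edges)
    perm = rng.permutation(n)          # new index i holds old node perm[i]
    inv = np.argsort(perm)             # old node j moves to new index inv[j]
    err_perm = abs(H(q[perm], p[perm], inv[edges]) - base)
    err_trans = abs(H(q + rng.normal(size=dim), p, edges) - base)
    rot = random_rotation(dim, rng)
    err_rot = abs(H(q @ rot.T, p @ rot.T, edges) - base)
    return err_perm, err_trans, err_rot

rng = np.random.default_rng(0)
n, dim = 8, 3
q = rng.normal(size=(n, dim))
p = rng.normal(size=(n, dim))
edges = np.array([(i, i + 1) for i in range(n - 1)])   # simple chain graph
errs = invariance_errors(toy_hamiltonian, q, p, edges, rng)
print("permutation, translation, rotation errors:", errs)
```

Because the toy energy depends only on local edge and node quantities, the same function applies unchanged to graphs of any size, which mirrors the size-independence argument made in the response.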
Circularity Check
No significant circularity; empirical validation on external benchmarks is self-contained
The paper's central claim is an empirical demonstration that random-feature parameter construction yields 150-600x faster training of Hamiltonian Graph Networks with comparable accuracy on N-body and molecular-dynamics benchmarks. This result is obtained by direct experimental comparison against 15 optimizers on externally defined simulation tasks (including zero-shot generalization from 8-node to 4096-node systems) rather than by any mathematical derivation that reduces to a fitted parameter or self-citation. No load-bearing step equates a prediction to its own input by construction, and the cited NeurIPS 2022 benchmark is an independent dataset rather than a self-referential theorem.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Hamiltonian Graph Networks inherently respect physical constraints such as energy conservation and permutation/rotation/translation invariance.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
replacing iterative optimization with random feature-based parameter construction... retaining essential physical invariances with respect to permutation, rotation, and translation
-
IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking · unclear
Relation between the paper passage and the cited Recognition theorem: unclear.
Hamiltonian Graph Networks (HGN)... symplectic Störmer-Verlet integrator
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.