Rapid training of Hamiltonian graph networks using random features

Ana Cukarska; Atamert Rahma; Chinmay Datar; Felix Dietrich

arxiv: 2506.06558 · v3 · submitted 2025-06-06 · 💻 cs.LG · cs.NE

Pith reviewed 2026-05-19 10:12 UTC · model grok-4.3

keywords Hamiltonian Graph Networks · Random Features · Fast Training · N-body Dynamics · Physical Symmetries · Graph Neural Networks · Dynamical Systems · Zero-shot Generalization

The pith

Hamiltonian Graph Networks can be trained 150-600 times faster using random features instead of gradient descent while keeping comparable accuracy and physical symmetries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper demonstrates that Hamiltonian Graph Networks, which combine graph neural networks with Hamiltonian mechanics to model physical N-body systems, can replace slow iterative optimizers like Adam with direct random feature-based parameter construction. This yields speedups of 150 to 600 times on mass-spring and molecular dynamics simulations involving up to 10,000 particles while preserving permutation, rotation, and translation invariance. The approach also shows zero-shot generalization from training on 8-node graphs to systems with 4096 nodes. A sympathetic reader would care because conventional training times have limited the use of such models for large-scale physical dynamics.

Core claim

Replacing iterative gradient-descent optimization with random feature-based parameter construction trains Hamiltonian Graph Networks 150-600 times faster than the 15 standard optimizers used as baselines, while delivering comparable accuracy on diverse N-body systems and retaining Hamiltonian structure plus physical invariances.

What carries the argument

Random feature-based parameter construction, which directly generates the network weights to enforce Hamiltonian dynamics and symmetries without any iterative refinement.
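
A minimal sketch of the general idea, under assumptions that do not come from the paper: a single hidden layer of tanh features with Gaussian weights, a 1D harmonic oscillator as the data source, and NumPy's `lstsq` for the linear solve. The paper's sampling scheme and graph architecture are not reproduced here. The sketch only illustrates that once the hidden weights are sampled and frozen, the gradient of the model Hamiltonian is linear in the output coefficients, so fitting it takes one least-squares solve instead of an optimizer loop.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training data for a 1D harmonic oscillator, H(q, p) = (q^2 + p^2) / 2,
# whose gradient is (dH/dq, dH/dp) = (q, p). (Illustrative choice.)
X = rng.uniform(-1.0, 1.0, size=(400, 2))   # columns: q, p
grad_H_true = X.copy()

# Hidden layer: weights and biases are sampled once and never trained.
n_features = 200
W = rng.normal(0.0, 1.0, size=(n_features, 2))
b = rng.uniform(-1.0, 1.0, size=n_features)

def features(x):
    """tanh random features, shape (n_samples, n_features)."""
    return np.tanh(x @ W.T + b)

def feature_gradients(x):
    """Derivative of every feature w.r.t. (q, p), shape (n_samples, n_features, 2)."""
    act = features(x)
    return (1.0 - act**2)[:, :, None] * W[None, :, :]

# Model: H_hat(x) = features(x) @ c. Its gradient is linear in c, so matching
# it to grad_H_true is an ordinary linear least-squares problem.
G = feature_gradients(X)
A = np.concatenate([G[:, :, 0], G[:, :, 1]], axis=0)          # (2n, m)
y = np.concatenate([grad_H_true[:, 0], grad_H_true[:, 1]])    # (2n,)
c, *_ = np.linalg.lstsq(A, y, rcond=None)

def H_hat(x):
    """Learned Hamiltonian, defined up to an additive constant."""
    return features(x) @ c

# Held-out check of the learned gradient field.
X_test = rng.uniform(-1.0, 1.0, size=(100, 2))
grad_pred = np.einsum("nmd,m->nd", feature_gradients(X_test), c)
rel_err = np.linalg.norm(grad_pred - X_test) / np.linalg.norm(X_test)
print(f"relative gradient error on held-out points: {rel_err:.2e}")
```

Because only the gradient is fitted, `H_hat` is determined up to an additive constant; comparing energies therefore requires subtracting a reference value such as `H_hat` at the origin.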

If this is right

Training for systems with thousands of particles becomes practical in a short time.
Models trained on minimal 8-node examples generalize zero-shot to 4096-node systems.
Physical symmetries remain intact across different geometries and dimensions.
The method applies to both mass-spring and molecular dynamics benchmarks.
Performance stays robust on a benchmark taken from a NeurIPS 2022 Datasets and Benchmarks Track publication.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar random-construction shortcuts could apply to other physics-constrained network families.
Lower training cost might enable on-the-fly adaptation of models in engineering simulations.
Limits may appear in systems with dissipation or external driving forces not tested here.

Load-bearing premise

Randomly generated features can produce parameters that automatically preserve Hamiltonian structure, physical invariances, and predictive accuracy without gradient-based adjustment.

What would settle it

If the random-feature model on a large test trajectory shows substantially higher error or violates energy conservation compared to an optimized Hamiltonian Graph Network, the speedup claim would not hold.
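
The energy-conservation half of this test can be sketched as follows. The sketch uses the exact Hamiltonian of a small 1D mass-spring chain as a stand-in for a learned model, and the function names, chain size, and step size are illustrative assumptions rather than the paper's setup. It integrates the system with the symplectic Störmer-Verlet scheme and with explicit Euler, then reports the maximum relative energy drift of each; a learned Hamiltonian could be dropped in by replacing `hamiltonian` and `force`.

```python
def hamiltonian(q, p, k=1.0, m=1.0):
    """Energy of a 1D open mass-spring chain with zero rest length."""
    kinetic = sum(pi * pi for pi in p) / (2.0 * m)
    potential = 0.5 * k * sum((q[i + 1] - q[i]) ** 2 for i in range(len(q) - 1))
    return kinetic + potential

def force(q, k=1.0):
    """-dH/dq for the chain above."""
    f = [0.0] * len(q)
    for i in range(len(q) - 1):
        stretch = q[i + 1] - q[i]
        f[i] += k * stretch
        f[i + 1] -= k * stretch
    return f

def stormer_verlet(q, p, dt, steps, k=1.0, m=1.0):
    """Kick-drift-kick Stoermer-Verlet; returns the energy after every step."""
    q, p = list(q), list(p)
    energies = [hamiltonian(q, p, k, m)]
    f = force(q, k)
    for _ in range(steps):
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]   # half kick
        q = [qi + dt * pi / m for qi, pi in zip(q, p)]     # full drift
        f = force(q, k)
        p = [pi + 0.5 * dt * fi for pi, fi in zip(p, f)]   # half kick
        energies.append(hamiltonian(q, p, k, m))
    return energies

def explicit_euler(q, p, dt, steps, k=1.0, m=1.0):
    """Non-symplectic baseline; returns the energy after every step."""
    q, p = list(q), list(p)
    energies = [hamiltonian(q, p, k, m)]
    for _ in range(steps):
        f = force(q, k)
        q = [qi + dt * pi / m for qi, pi in zip(q, p)]
        p = [pi + dt * fi for pi, fi in zip(p, f)]
        energies.append(hamiltonian(q, p, k, m))
    return energies

def max_relative_drift(energies):
    """Largest deviation from the initial energy, relative to that energy."""
    e0 = energies[0]
    return max(abs(e - e0) for e in energies) / abs(e0)

q0 = [0.0, 0.5, -0.3, 0.2]
p0 = [0.0, 0.0, 0.0, 0.0]
verlet_drift = max_relative_drift(stormer_verlet(q0, p0, dt=0.01, steps=5000))
euler_drift = max_relative_drift(explicit_euler(q0, p0, dt=0.01, steps=5000))
print(f"Stoermer-Verlet drift: {verlet_drift:.2e}")
print(f"explicit Euler drift:  {euler_drift:.2e}")
```

Running the same integrator on the random-feature model and on a gradient-trained model keeps integrator error out of the comparison, so any difference in drift can be attributed to the learned Hamiltonians.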

Figures

Figures reproduced from arXiv: 2506.06558 by Ana Cukarska, Atamert Rahma, Chinmay Datar, Felix Dietrich.

**Figure 2.** Illustration of train and test N-body system posi…

**Figure 3.** Random-feature Hamiltonian graph neural network architecture.

**Figure 4.** Graphs considered in the experiments: (a) 3D lattice (nodes arranged on a 2D grid with motion in a 3D space - see Section 4.1 and Section 4.2), (b) an open chain (nodes with motion in 2D space - see Section 4.2), and (c) 2D closed chain (nodes with motion in 2D space - see Section 4.3).

**Figure 5.** Illustration of accurate zero-shot generalization for 3D lattice (see Figure 4 (a)): Training on…

**Figure 6.** Zero-shot generalization in 2D open chain (see Figure 4 (b)): RF-HGN trained up to…

**Figure 7.** Illustration of position trajectories (first two columns), Hamiltonian predictions (third…

Original abstract

Learning dynamical systems that respect physical symmetries and constraints remains a fundamental challenge in data-driven modeling. Integrating physical laws with graph neural networks facilitates principled modeling of complex N-body dynamics and yields accurate and permutation-invariant models. However, training graph neural networks with iterative, gradient-descent-based optimization algorithms (e.g., Adam, RMSProp, LBFGS) often leads to slow training, especially for large, complex systems. In comparison to 15 different optimizers, we demonstrate that Hamiltonian Graph Networks (HGN) can be trained 150-600x faster - but with comparable accuracy - by replacing iterative optimization with random feature-based parameter construction. We show robust performance in diverse simulations, including N-body mass-spring and molecular dynamics systems in up to dimensions and 10,000 particles with different geometries, while retaining essential physical invariances with respect to permutation, rotation, and translation. Our proposed approach is benchmarked using a NeurIPS 2022 Datasets and Benchmarks Track publication to further demonstrate its versatility. We reveal that even when trained on minimal 8-node systems, the model can generalize in a zero-shot manner to systems as large as 4096 nodes without retraining. Our work challenges the dominance of iterative gradient-descent-based optimization algorithms for training neural network models for physical systems.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that Hamiltonian Graph Networks (HGN) for modeling N-body and molecular dynamics can be trained 150-600x faster with comparable accuracy by replacing gradient-based iterative optimization (e.g., Adam) with random feature-based parameter construction. It reports robust performance across simulations with up to 10,000 particles and different geometries, retention of permutation/rotation/translation invariances, and zero-shot generalization from 8-node training systems to 4096-node systems, benchmarked against a NeurIPS 2022 dataset.

Significance. If the central empirical result holds, the work would offer a practical alternative to standard optimizers for training physics-constrained graph networks, potentially enabling faster iteration on large-scale dynamical system modeling while maintaining symmetries. The reported generalization and invariance retention, if rigorously verified, would strengthen the case for structure-preserving random constructions in this domain.

major comments (3)

[Method / random feature construction] The section on random feature construction (likely §3 or equivalent): the manuscript must include an explicit derivation or constraint showing how the random feature map produces parameters that lie on the Hamiltonian manifold and preserve symplectic structure/energy conservation. Random features typically approximate kernels but do not automatically enforce the required physical form; without this, the speedup claim rests on unverified empirical accuracy alone.
[Experiments / results] Experimental setup and results (likely §4 and tables/figures): for the 150-600x speedup and 'comparable accuracy' claims against 15 optimizers, provide exact baseline code/implementation details, wall-clock measurements including any preprocessing for random features, and statistical error bars or multiple runs. The current support for the central claim is limited without these.
[Generalization experiments] Generalization section (likely §4.3 or equivalent): clarify whether the random feature construction is system-size independent and how invariances are explicitly verified (e.g., via conservation checks or symmetry tests) when scaling from 8 to 4096 nodes or to 10,000-particle regimes; any deviation could undermine the zero-shot claim.

minor comments (2)

[Abstract] Abstract: the phrase 'in up to dimensions' appears truncated; specify the maximum spatial dimension tested.
[Throughout] Notation and figures: ensure all symbols for the random feature map and Hamiltonian terms are defined before first use and that energy-drift plots (if present) are clearly labeled with scales.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed comments. We address each major comment below and indicate the revisions planned for the next version of the manuscript.

Referee: [Method / random feature construction] The section on random feature construction (likely §3 or equivalent): the manuscript must include an explicit derivation or constraint showing how the random feature map produces parameters that lie on the Hamiltonian manifold and preserve symplectic structure/energy conservation. Random features typically approximate kernels but do not automatically enforce the required physical form; without this, the speedup claim rests on unverified empirical accuracy alone.

Authors: We agree that an explicit derivation strengthens the theoretical foundation. In the revised manuscript we will add a dedicated paragraph (or short subsection) in Section 3 that derives the random feature construction from the requirement that the resulting parameters define a Hamiltonian vector field. The derivation shows that the chosen random feature distribution, combined with the graph-network architecture, ensures the output lies on the symplectic manifold and conserves energy up to the approximation error of the random features. We will also include a brief remark on why this construction differs from generic kernel approximation.
revision: yes

Referee: [Experiments / results] Experimental setup and results (likely §4 and tables/figures): for the 150-600x speedup and 'comparable accuracy' claims against 15 optimizers, provide exact baseline code/implementation details, wall-clock measurements including any preprocessing for random features, and statistical error bars or multiple runs. The current support for the central claim is limited without these.

Authors: We accept that additional experimental detail is required. The revised version will (i) list the precise library versions and hyper-parameter settings for all 15 baseline optimizers, (ii) report wall-clock times that explicitly separate random-feature preprocessing from the subsequent forward-pass evaluation, and (iii) include mean and standard-deviation results computed over five independent random seeds for every reported metric. These additions will appear in the main text and in a new appendix containing the full experimental protocol.
revision: yes

Referee: [Generalization experiments] Generalization section (likely §4.3 or equivalent): clarify whether the random feature construction is system-size independent and how invariances are explicitly verified (e.g., via conservation checks or symmetry tests) when scaling from 8 to 4096 nodes or to 10,000-particle regimes; any deviation could undermine the zero-shot claim.

Authors: The random-feature map is constructed locally on nodes and edges and therefore does not depend on total system size; we will state this explicitly in Section 4.3. In addition, we will augment the generalization experiments with quantitative invariance checks: we will report relative energy drift and linear-momentum conservation errors on the 4096-node and 10,000-particle test systems, together with a permutation-symmetry test obtained by randomly reordering node indices. These diagnostics will be added to the existing zero-shot generalization figures and tables.
revision: yes
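
The symmetry tests described in the last response can be sketched as below. This is a hedged illustration, not the authors' diagnostic code: `graph_energy` is a toy Hamiltonian built only from momentum magnitudes and edge lengths, standing in for a trained model, and the 8-node closed chain, rotation angle, and shift are arbitrary choices. Each check applies a transformation to the inputs and records how much the predicted energy changes.

```python
import math
import random

def graph_energy(q, p, edges):
    """Toy graph Hamiltonian that depends only on |p_i|^2 and edge lengths."""
    kinetic = 0.5 * sum(px * px + py * py for px, py in p)
    potential = 0.0
    for i, j in edges:
        dist = math.hypot(q[i][0] - q[j][0], q[i][1] - q[j][1])
        potential += 0.5 * (dist - 1.0) ** 2
    return kinetic + potential

def rotate(vectors, angle):
    """Rotate every 2D vector by the same angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in vectors]

def translate(vectors, shift):
    """Shift every 2D vector by the same offset."""
    return [(x + shift[0], y + shift[1]) for x, y in vectors]

def permute(q, p, edges, order):
    """Relabel nodes so that new node k is old node order[k]."""
    new_index = {old: new for new, old in enumerate(order)}
    q_new = [q[old] for old in order]
    p_new = [p[old] for old in order]
    edges_new = [(new_index[i], new_index[j]) for i, j in edges]
    return q_new, p_new, edges_new

random.seed(0)
n = 8
q = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
p = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]
edges = [(i, (i + 1) % n) for i in range(n)]   # closed chain

base = graph_energy(q, p, edges)
order = random.sample(range(n), n)             # random node reordering
errors = {
    "permutation": abs(graph_energy(*permute(q, p, edges, order)) - base),
    "rotation": abs(graph_energy(rotate(q, 0.7), rotate(p, 0.7), edges) - base),
    "translation": abs(graph_energy(translate(q, (3.0, -2.0)), p, edges) - base),
}
for name, err in errors.items():
    print(f"{name} invariance error: {err:.2e}")
```

For a model that is invariant by construction, all three errors should sit at floating-point round-off; larger values on the 4096-node or 10,000-particle systems would point to a size-dependent component in the pipeline.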

Circularity Check

0 steps flagged

No significant circularity; empirical validation on external benchmarks is self-contained

The paper's central claim is an empirical demonstration that random-feature parameter construction yields 150-600x faster training of Hamiltonian Graph Networks with comparable accuracy on N-body and molecular-dynamics benchmarks. This result is obtained by direct experimental comparison against 15 optimizers on externally defined simulation tasks (including zero-shot generalization from 8-node to 4096-node systems) rather than by any mathematical derivation that reduces to a fitted parameter or self-citation. No load-bearing step equates a prediction to its own input by construction, and the cited NeurIPS 2022 benchmark is an independent dataset rather than a self-referential theorem.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The claim rests on the domain assumption that Hamiltonian Graph Networks already encode physical constraints and that random features can substitute for learned parameters without violating those constraints.

axioms (1)

domain assumption: Hamiltonian Graph Networks inherently respect physical constraints such as energy conservation and permutation/rotation/translation invariance.
This property is invoked as the basis for why the random-feature construction still yields valid physical models.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel (relation: unclear)
Paper passage: "replacing iterative optimization with random feature-based parameter construction... retaining essential physical invariances with respect to permutation, rotation, and translation"

IndisputableMonolith/Foundation/AlexanderDuality.lean alexander_duality_circle_linking (relation: unclear)
Paper passage: "Hamiltonian Graph Networks (HGN)... symplectic Störmer-Verlet integrator"

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

108 extracted references · 108 canonical work pages · 3 internal anchors

[1]

End-to-end differentiable physics for learning and control

Filipe de Avila Belbute-Peres et al. “End-to-end differentiable physics for learning and control”. In: Advances in neural information processing systems 31 (2018)

work page 2018
[2]

On learning Hamiltonian systems from data

Tom Bertalan et al. “On learning Hamiltonian systems from data”. In:Chaos: An Interdisci- plinary Journal of Nonlinear Science 29.12 (2019)

work page 2019
[3]

Learning Articulated Rigid Body Dynamics with Lagrangian Graph Neural Network

Ravinder Bhattoo, Sayan Ranu, and N M Anoop Krishnan. “Learning Articulated Rigid Body Dynamics with Lagrangian Graph Neural Network”. In: Advances in Neural Information Processing Systems. Ed. by S. Koyejo et al. V ol. 35. Curran Associates, Inc., 2022, pp. 29789– 29800

work page 2022
[4]

Sampling Weights of Deep Neural Networks

Erik L Bolager et al. “Sampling Weights of Deep Neural Networks”. In:Advances in Neural Information Processing Systems. V ol. 36. Curran Associates, Inc., 2023, pp. 63075–63116

work page 2023
[5]

Gradient-Free Training of Recurrent Neural Networks

Erik Lien Bolager et al. Gradient-Free Training of Recurrent Neural Networks. Oct. 30, 2024. arXiv: 2410.23467 [cs]. Pre-published

work page arXiv 2024
[6]

A unifying framework for spectrum- preserving graph sparsification and coarsening

Gecia Bravo-Hermsdorff and Lee M. Gunderson. “A unifying framework for spectrum- preserving graph sparsification and coarsening”. In:Proceedings of the 33rd International Conference on Neural Information Processing Systems. Red Hook, NY , USA: Curran Asso- ciates Inc., 2019

work page 2019
[7]

GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modu- lation

Marc Brockschmidt. “GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modu- lation”. In: Proceedings of the 37th International Conference on Machine Learning. Ed. by Hal Daumé III and Aarti Singh. V ol. 119. Proceedings of Machine Learning Research. PMLR, July 2020, pp. 1144–1152. 10

work page 2020
[8]

DGCL: an efficient communication library for distributed GNN training

Zhenkun Cai et al. “DGCL: an efficient communication library for distributed GNN training”. In: Proceedings of the Sixteenth European Conference on Computer Systems. EuroSys ’21. Online Event, United Kingdom: Association for Computing Machinery, 2021, pp. 130–144. ISBN : 9781450383349. DOI: 10.1145/3447786.3456233

work page doi:10.1145/3447786.3456233 2021
[9]

Building a knowledge graph to enable precision medicine

Payal Chandak, Kexin Huang, and Marinka Zitnik. “Building a knowledge graph to enable precision medicine”. In: Scientific Data 10.1 (2023), p. 67

work page 2023
[10]

A Compositional Object-Based Approach to Learning Physical Dynamics

Michael B Chang et al. “A compositional object-based approach to learning physical dynam- ics”. In: arXiv (2016). eprint: 1612.00341. Pre-published

work page internal anchor Pith review Pith/arXiv arXiv 2016
[11]

Taming graph kernels with random features

Krzysztof Marcin Choromanski. “Taming graph kernels with random features”. In:Proceed- ings of the 40th International Conference on Machine Learning. Ed. by Andreas Krause et al. V ol. 202. Proceedings of Machine Learning Research. PMLR, July 2023, pp. 5964–5977

work page 2023
[12]

Graph Neural Networks

Gabriele Corso et al. “Graph Neural Networks”. In: Nature Reviews Methods Primers 4.1 (Mar. 2024), pp. 1–13. ISSN : 2662-8449. DOI: 10.1038/s43586-024-00294-7

work page doi:10.1038/s43586-024-00294-7 2024
[13]

Lagrangian Neural Networks

Miles Cranmer et al. “Lagrangian Neural Networks”. In:ICLR 2020 Workshop on Integration of Deep Neural Models and Differential Equations. 2019

work page 2020
[14]

Fast training of accurate physics-informed neural networks without gradient descent

Chinmay Datar et al. Solving Partial Differential Equations with Sampled Neural Networks. May 31, 2024. arXiv: 2405.20836 [math]. Pre-published

work page internal anchor Pith review Pith/arXiv arXiv 2024
[15]

Robust deep learning–based protein sequence design using Protein- MPNN

Justas Dauparas et al. “Robust deep learning–based protein sequence design using Protein- MPNN”. In: Science 378.6615 (2022), pp. 49–56

work page 2022
[16]

Port-Hamiltonian neural networks for learning explicit time-dependent dynamical systems

Shaan A Desai et al. “Port-Hamiltonian neural networks for learning explicit time-dependent dynamical systems”. In: Physical Review E 104.3 (2021), p. 034312

work page 2021
[17]

Graph neural networks at the Large Hadron Collider

Gage DeZoort et al. “Graph neural networks at the Large Hadron Collider”. In: Nature Reviews Physics 5.5 (2023), pp. 281–303

work page 2023
[18]

Hamiltonian Neural Networks with Automatic Symmetry Detection

Eva Dierkes et al. “Hamiltonian Neural Networks with Automatic Symmetry Detection”. In: Chaos: An Interdisciplinary Journal of Nonlinear Science 33.6 (June 1, 2023), p. 063115. ISSN : 1054-1500, 1089-7682

work page 2023
[19]

Incorporating Nesterov momentum into Adam

Timothy Dozat. “Incorporating Nesterov momentum into Adam”. In:Proceedings of the 4th International Conference on Learning Representations, Workshop Track(May 2016)

work page 2016
[20]

Adaptive Subgradient Methods for Online Learning and Stochastic Optimization

John Duchi, Elad Hazan, and Yoram Singer. “Adaptive Subgradient Methods for Online Learning and Stochastic Optimization”. In: Journal of Machine Learning Research 12.61 (2011), pp. 2121–2159

work page 2011
[21]

Random Projection Neural Networks of Best Approximation: Convergence Theory and Practical Applications

Gianluca Fabiani. Random Projection Neural Networks of Best Approximation: Convergence Theory and Practical Applications. Feb. 2024. arXiv: 2402.11397 [cs]. Pre-published

work page arXiv 2024
[22]

Numerical Solution and Bifurcation Analysis of Nonlinear Partial Differential Equations with Extreme Learning Machines

Gianluca Fabiani et al. “Numerical Solution and Bifurcation Analysis of Nonlinear Partial Differential Equations with Extreme Learning Machines”. In:Journal of Scientific Computing 89.2 (Nov. 2021), p. 44. ISSN : 0885-7474, 1573-7691. DOI: 10.1007/s10915-021-01650- 5

work page doi:10.1007/s10915-021-01650- 2021
[23]

RandONets: Shallow Networks with Random Projections for Learn- ing Linear and Nonlinear Operators

Gianluca Fabiani et al. “RandONets: Shallow Networks with Random Projections for Learn- ing Linear and Nonlinear Operators”. In: Journal of Computational Physics 520 (Jan. 2025), p. 113433. ISSN : 00219991. DOI: 10.1016/j.jcp.2024.113433

work page doi:10.1016/j.jcp.2024.113433 2025
[24]

Structure-Aware Random Fourier Kernel for Graphs

Jinyuan Fang et al. “Structure-Aware Random Fourier Kernel for Graphs”. In:Advances in Neural Information Processing Systems. Ed. by M. Ranzato et al. V ol. 34. Curran Associates, Inc., 2021, pp. 17681–17694

work page 2021
[25]

Numerical Bifurcation Analysis of PDEs From Lattice Boltzmann Model Simulations: A Parsimonious Machine Learning Approach

Evangelos Galaris et al. “Numerical Bifurcation Analysis of PDEs From Lattice Boltzmann Model Simulations: A Parsimonious Machine Learning Approach”. In: Journal of Scientific Computing 92.2 (Aug. 2022), p. 34. ISSN : 0885-7474, 1573-7691. DOI: 10.1007/s10915- 022-01883-y

work page doi:10.1007/s10915- 2022
[26]

Fast and deep graph neural networks

Claudio Gallicchio and Alessio Micheli. “Fast and deep graph neural networks”. In: Pro- ceedings of the AAAI conference on artificial intelligence . V ol. 34. 04. 2020, pp. 3898– 3905

work page 2020
[27]

Graph echo state networks

Claudio Gallicchio and Alessio Micheli. “Graph echo state networks”. In: The 2010 interna- tional joint conference on neural networks (IJCNN). IEEE. 2010, pp. 1–8

work page 2010
[28]

Neural Message Passing for Quantum Chemistry

Justin Gilmer et al. “Neural Message Passing for Quantum Chemistry”. In:Proceedings of the 34th International Conference on Machine Learning. Ed. by Doina Precup and Yee Whye Teh. V ol. 70. Proceedings of Machine Learning Research. PMLR, Aug. 2017, pp. 1263–1272. 11

work page 2017
[29]

SGD: General Analysis and Improved Rates

Robert Mansel Gower et al. SGD: General Analysis and Improved Rates . 2019. arXiv: 1901.09401 [cs.LG]. Pre-published

work page arXiv 2019
[30]

Hamiltonian Neural Networks

Samuel Greydanus, Misko Dzamba, and Jason Yosinski. “Hamiltonian Neural Networks”. In: Advances in Neural Information Processing Systems. Ed. by H. Wallach et al. V ol. 32. Curran Associates, Inc., 2019

work page 2019
[31]

Efficiently Parameterized Neural Metriplectic Systems

Anthony Gruber et al. Efficiently Parameterized Neural Metriplectic Systems. Jan. 27, 2025. arXiv: 2405.16305 [cs]. Pre-published

work page arXiv 2025
[32]

GraphScale: A Framework to Enable Machine Learning over Billion-node Graphs

Vipul Gupta et al. “GraphScale: A Framework to Enable Machine Learning over Billion-node Graphs”. In: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management. CIKM ’24. Boise, ID, USA: Association for Computing Machinery, 2024, pp. 4514–4521. ISBN : 9798400704369. DOI: 10.1145/3627673.3680021

work page doi:10.1145/3627673.3680021 2024
[33]

Geometric numerical integration illustrated by the Störmer–Verlet method

Ernst Hairer, Christian Lubich, and Gerhard Wanner. “Geometric numerical integration illustrated by the Störmer–Verlet method”. In:Acta numerica 12 (2003), pp. 399–450

work page 2003
[34]

On a General Method in Dynamics

William Rowan Hamilton. “On a General Method in Dynamics”. In:Philosophical Transac- tions of the Royal Society 124 (1834), pp. 247–308

work page
[35]

Second Essay on a General Method in Dynamics

William Rowan Hamilton. “Second Essay on a General Method in Dynamics”. In:Philosoph- ical Transactions of the Royal Society 125 (1835), pp. 95–144

work page
[36]

A Comprehensive Survey on Graph Reduction: Sparsification, Coarsening, and Condensation

Mohammad Hashemi et al. A Comprehensive Survey on Graph Reduction: Sparsification, Coarsening, and Condensation. 2024. arXiv: 2402.03358 [cs.SI]

work page arXiv 2024
[37]

Structure-Preserving Neural Networks

Quercus Hernández et al. “Structure-Preserving Neural Networks”. In: Journal of Computa- tional Physics 426 (Feb. 2021), p. 109950. ISSN : 00219991

work page 2021
[38]

Universal Approximation Using Incremental Constructive Feedforward Networks With Random Hidden Nodes

Guang-Bin Huang, Lei Chen, and Chee Siew. “Universal Approximation Using Incremental Constructive Feedforward Networks With Random Hidden Nodes”. In:IEEE transactions on neural networks / a publication of the IEEE Neural Networks Council 17 (2006), pp. 879–92

work page 2006
[39]

Extreme learning machine: a new learning scheme of feedforward neural networks

Guang-Bin Huang, Qin-Yu Zhu, and Chee-Kheong Siew. “Extreme learning machine: a new learning scheme of feedforward neural networks”. In: 2004 IEEE international joint conference on neural networks (IEEE Cat. No. 04CH37541). V ol. 2. 2004, pp. 985–990

work page 2004
[40]

Condensing Graphs via One-Step Gradient Matching

Wei Jin et al. “Condensing Graphs via One-Step Gradient Matching”. In: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. KDD ’22. Washington DC, USA: Association for Computing Machinery, 2022, pp. 720–730. ISBN : 9781450393850. DOI: 10.1145/3534678.3539429

work page doi:10.1145/3534678.3539429 2022
[41]

Graph Condensation for Graph Neural Networks

Wei Jin et al. “Graph Condensation for Graph Neural Networks”. In:International Confer- ence on Learning Representations. 2022. URL: https://openreview.net/forum?id= WLEx3Jo4QaB

work page 2022
[42]

Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining

Tim Kaler et al. “Accelerating Training and Inference of Graph Neural Networks with Fast Sampling and Pipelining”. In: Proceedings of Machine Learning and Systems . Ed. by D. Marculescu, Y . Chi, and C. Wu. V ol. 4. 2022, pp. 172–189

work page 2022
[43]

Adam: A Method for Stochastic Optimization

D. P. Kingma and L. J. Ba. “Adam: A Method for Stochastic Optimization”. In:International Conference on Learning Representations ICLR 2015. 2015

work page 2015
[44]

Directional Message Passing for Molecular Graphs

Johannes Klicpera, Janek Groß, Stephan Günnemann, et al. “Directional Message Passing for Molecular Graphs.” In: ICLR. 2020, pp. 1–13

work page 2020
[45]

Fast&Fair: Training Acceleration and Bias Mitigation for GNNs

Oyku Deniz Kose and Yanning Shen. “Fast&Fair: Training Acceleration and Bias Mitigation for GNNs”. In: Transactions on Machine Learning Research (2023). ISSN : 2835-8856. URL: https://openreview.net/forum?id=nOk4XEB7Ke

work page 2023
[46]

Featured Graph Coarsening with Similarity Guarantees

Manoj Kumar et al. “Featured Graph Coarsening with Similarity Guarantees”. In:Proceedings of the 40th International Conference on Machine Learning . Ed. by Andreas Krause et al. V ol. 202. Proceedings of Machine Learning Research. PMLR, July 2023, pp. 17953–17975

work page 2023
[47]

Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude

“Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude”. In: COURSERA: Neural networks for machine learning 4.2 (2012), p. 26

work page 2012
[48]

Machine learning structure preserving brack- ets for forecasting irreversible processes

Kookjin Lee, Nathaniel Trask, and Panos Stinis. “Machine learning structure preserving brack- ets for forecasting irreversible processes”. In:Advances in Neural Information Processing Systems 34 (2021), pp. 5696–5707

work page 2021
[49]

Fault and Noise Tolerance in the Incremental Extreme Learning Machine

Ho Chun Leung, Chi Sing Leung, and Eric Wing Ming Wong. “Fault and Noise Tolerance in the Incremental Extreme Learning Machine”. In: IEEE Access 7 (2019), pp. 155171–155183

work page 2019
[50]

Physics-constrained and flow-field-message-informed graph neural network for solving unsteady compressible flows

Siye Li et al. “Physics-constrained and flow-field-message-informed graph neural network for solving unsteady compressible flows”. In:Physics of Fluids 36.4 (2024). 12

work page 2024
[51]

PaGraph: Scaling GNN training on large graphs via computation-aware caching

Zhiqi Lin et al. “PaGraph: Scaling GNN training on large graphs via computation-aware caching”. In: Proceedings of the 11th ACM Symposium on Cloud Computing . SoCC ’20. Virtual Event, USA: Association for Computing Machinery, 2020, pp. 401–415. ISBN : 9781450381376. DOI: 10.1145/3419111.3421281

work page doi:10.1145/3419111.3421281 2020
[52]

On the limited memory BFGS method for large scale optimization

Dong C Liu and Jorge Nocedal. “On the limited memory BFGS method for large scale optimization”. In: Mathematical programming 45.1 (1989), pp. 503–528

work page 1989
[53]

On the Variance of the Adaptive Learning Rate and Beyond

Liyuan Liu et al. On the Variance of the Adaptive Learning Rate and Beyond. 2021. arXiv: 1908.03265 [cs.LG]. Pre-published

work page arXiv 2021
[54]

Decoupled Weight Decay Regularization

Ilya Loshchilov and Frank Hutter. Decoupled Weight Decay Regularization. 2019. arXiv: 1711.05101 [cs.LG]

work page internal anchor Pith review Pith/arXiv arXiv 2019
[55]

Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning

Michael Lutter, Christian Ritter, and Jan Peters. “Deep Lagrangian Networks: Using Physics as Model Prior for Deep Learning”. In:International Conference on Learning Representations. 2019

work page 2019
[56]

A Gated Graph Neural Network Approach to Fast-Convergent Dynamic Average Estimation

Antonio Marino, Claudio Pacchierotti, and Paolo Robuffo Giordano. “A Gated Graph Neural Network Approach to Fast-Convergent Dynamic Average Estimation”. In:ACM Trans. Intell. Syst. Technol. (Mar. 2025). Just Accepted. ISSN : 2157-6904. DOI: 10.1145/3725857

work page doi:10.1145/3725857 2025
[57]

LSRN: A Parallel Iterative Solver for Strongly Over- or Underdetermined Systems

Xiangrui Meng, Michael A. Saunders, and Michael W. Mahoney. “LSRN: A Parallel Iterative Solver for Strongly Over- or Underdetermined Systems”. In: SIAM Journal on Scientific Computing 36.2 (Jan. 2014), pp. C95–C118. ISSN : 1064-8275, 1095-7197. DOI: 10.1137/ 120866580

work page 2014
[58]

X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation

Mohammad Amin Nabian et al. X-MeshGraphNet: Scalable Multi-Scale Graph Neural Networks for Physics Simulation. 2024. arXiv: 2411.17164 [cs.LG]. Pre-published

work page arXiv 2024
[59]

FASTRAIN-GNN: Fast and Accurate Self-Training for Graph Neural Networks

Amrit Nagarajan and Anand Raghunathan. “FASTRAIN-GNN: Fast and Accurate Self-Training for Graph Neural Networks”. In: Transactions on Machine Learning Research (2023). ISSN: 2835-8856. URL: https://openreview.net/forum?id=1IYJfwJtjQ

work page 2023
[60]

Variational Learning of Euler–Lagrange Dynamics from Data

Sina Ober-Bloebaum and Christian Offen. “Variational Learning of Euler–Lagrange Dynamics from Data”. In: Journal of Computational and Applied Mathematics 421 (2023), p. 114780

work page 2023
[61]

Symplectic Integration of Learned Hamiltonian Systems

C. Offen and S. Ober-Bloebaum. “Symplectic Integration of Learned Hamiltonian Systems”. In: Chaos: An Interdisciplinary Journal of Nonlinear Science 32.1 (2022), p. 013122

work page 2022
[62]

Functional-link net computing: theory, system architecture, and functionalities

Y-H Pao and Yoshiyasu Takefuji. “Functional-link net computing: theory, system architecture, and functionalities”. In: Computer 25.5 (1992), pp. 76–79

work page 1992
[63]

PyTorch: An Imperative Style, High-Performance Deep Learning Library

Adam Paszke et al. “PyTorch: An Imperative Style, High-Performance Deep Learning Library”. In: Advances in Neural Information Processing Systems 32. Curran Associates, Inc., 2019, pp. 8024–8035

work page 2019
[64]

Physics-informed graph convolutional neural network for modeling fluid flow and heat convection

Jiang-Zhou Peng et al. “Physics-informed graph convolutional neural network for modeling fluid flow and heat convection”. In: Physics of Fluids 35.8 (2023)

work page 2023
[65]

Learning Mesh-Based Simulation with Graph Networks

Tobias Pfaff et al. “Learning Mesh-Based Simulation with Graph Networks”. In: International Conference on Learning Representations. 2021

work page 2021
[66]

Uniform approximation of functions with random bases

Ali Rahimi and Benjamin Recht. “Uniform approximation of functions with random bases”. In: 2008 46th annual allerton conference on communication, control, and computing. IEEE. 2008, pp. 555–561

work page 2008
[67]

Training Hamiltonian Neural Networks without Backpropagation

Atamert Rahma, Chinmay Datar, and Felix Dietrich. “Training Hamiltonian Neural Networks without Backpropagation”. In: NeurIPS 2024 Workshop on Machine Learning and the Physical Sciences. NeurIPS 2024, Nov. 26, 2024

work page 2024
[68]

Quasi-Monte Carlo Graph Random Features

Isaac Reid, Krzysztof M Choromanski, and Adrian Weller. “Quasi-Monte Carlo Graph Random Features”. In: Advances in Neural Information Processing Systems. Ed. by A. Oh et al. Vol. 36. Curran Associates, Inc., 2023, pp. 14770–14796

work page 2023
[69]

General Graph Random Features

Isaac Reid et al. General Graph Random Features. 2023. arXiv: 2310.04859 [stat.ML]. Pre-published

work page arXiv 2023
[70]

A direct adaptive method for faster backpropagation learning: The RPROP algorithm

Martin Riedmiller and Heinrich Braun. “A direct adaptive method for faster backpropagation learning: The RPROP algorithm”. In: IEEE international conference on neural networks. IEEE. 1993, pp. 586–591

work page 1993
[71]

A Stochastic Approximation Method

Herbert E. Robbins. “A Stochastic Approximation Method”. In: Annals of Mathematical Statistics 22 (1951), pp. 400–407

work page 1951
[72]

Stable Port-Hamiltonian Neural Networks

Fabian J. Roth et al. Stable Port-Hamiltonian Neural Networks. Feb. 4, 2025. arXiv: 2502.02480 [cs]. Pre-published

work page 2025
[73]

Hamiltonian Graph Networks with ODE Integrators

Alvaro Sanchez-Gonzalez et al. “Hamiltonian Graph Networks with ODE Integrators”. In: Second Workshop on Machine Learning and the Physical Sciences (NeurIPS 2019), Vancouver, Canada. NeurIPS 2019, Sept. 27, 2019

work page 2019
[74]

Learning to simulate complex physics with graph networks

Alvaro Sanchez-Gonzalez et al. “Learning to simulate complex physics with graph networks”. In: International conference on machine learning. PMLR. 2020, pp. 8459–8468

work page 2020
[75]

Modeling Relational Data with Graph Convolutional Networks

Michael Schlichtkrull et al. “Modeling Relational Data with Graph Convolutional Networks”. In: The Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece, June 3–7, 2018, Proceedings. Heraklion, Greece: Springer-Verlag, 2018, pp. 593–607. ISBN: 978-3-319-93416-7

work page 2018
[76]

Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers

Robin M Schmidt, Frank Schneider, and Philipp Hennig. “Descending through a Crowded Valley - Benchmarking Deep Learning Optimizers”. In: Proceedings of the 38th International Conference on Machine Learning. Ed. by Marina Meila and Tong Zhang. Vol. 139. Proceedings of Machine Learning Research. PMLR, 2021, pp. 9367–9376

work page 2021
[77]

Feed forward neural networks with random weights

Wouter F Schmidt, Martin A Kraaijveld, Robert PW Duin, et al. “Feed forward neural networks with random weights”. In: International conference on pattern recognition. IEEE Computer Society Press. 1992, pp. 1–1

work page 1992
[78]

Schnet: A continuous-filter convolutional neural network for modeling quantum interactions

Kristof Schütt et al. “Schnet: A continuous-filter convolutional neural network for modeling quantum interactions”. In: Advances in neural information processing systems 30 (2017)

work page 2017
[79]

Distributed Graph Neural Network Training: A Survey

Yingxia Shao et al. “Distributed Graph Neural Network Training: A Survey”. In: ACM Comput. Surv. 56.8 (Apr. 2024). ISSN: 0360-0300. DOI: 10.1145/3648358

work page doi:10.1145/3648358 2024
[80]

Dynami-CAL GraphNet: A Physics-Informed Graph Neural Network Conserving Linear and Angular Momentum for Dynamical Systems

Vinay Sharma and Olga Fink. Dynami-CAL GraphNet: A Physics-Informed Graph Neural Network Conserving Linear and Angular Momentum for Dynamical Systems. 2025. arXiv: 2501.07373 [cs.LG]. Pre-published

work page arXiv 2025

