pith. machine review for the scientific record.

arXiv: 2603.10992 · v5 · submitted 2026-03-11 · 📊 stat.ML · cs.LG · physics.chem-ph · physics.comp-ph


A Tutorial Review of Bayesian Optimization with Gaussian Processes to Accelerate Stationary Point Searches


Pith reviewed 2026-05-15 12:49 UTC · model grok-4.3

classification 📊 stat.ML · cs.LG · physics.chem-ph · physics.comp-ph

keywords Bayesian optimization · Gaussian processes · stationary point searches · potential energy surfaces · active learning · saddle point location · path optimization · surrogate models

The pith

Bayesian optimization with Gaussian processes unifies minimization, saddle searches, and double-ended paths on potential energy surfaces via one shared surrogate loop.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents a unified Bayesian optimization framework for three stationary point tasks on potential energy surfaces: simple energy minimization, locating single saddle points, and tracing double-ended reaction paths. All three tasks reduce to the same six-step loop that builds local Gaussian process surrogates from derivative observations and inverse-distance kernels, then uses an acquisition function to decide the next electronic structure evaluation. A sympathetic reader would care because the expensive quantum calculations are the dominant cost in computational chemistry, and the surrogate approach claims to cut those evaluations by roughly an order of magnitude without changing the final accuracy. The review supplies pedagogical Rust code showing that only the inner optimization target and acquisition criterion need to change between the three applications.
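The shared loop is easiest to see as code. The sketch below is Pith's own illustration, not the paper's Rust implementation: the `Surrogate` here is a deliberately crude nearest-neighbor stand-in for the GP posterior, and the trait boundary marks what the paper says changes between tasks (the inner target is folded into the acquisition score for brevity).

```rust
// Pith's illustrative sketch, not the paper's API. Nearest-neighbor mean and
// distance-to-data "variance" stand in for the GP posterior.
struct Surrogate {
    data: Vec<(f64, f64)>, // (x, oracle energy) observations
}

impl Surrogate {
    fn mean(&self, x: f64) -> f64 {
        self.data
            .iter()
            .min_by(|a, b| (a.0 - x).abs().partial_cmp(&(b.0 - x).abs()).unwrap())
            .map(|p| p.1)
            .unwrap()
    }
    fn variance(&self, x: f64) -> f64 {
        self.data.iter().map(|p| (p.0 - x).abs()).fold(f64::INFINITY, f64::min)
    }
}

/// The piece that differs between tasks: what the next query should optimize.
trait Acquisition {
    fn score(&self, s: &Surrogate, x: f64) -> f64; // lower is better
}

struct ExploitMean; // minimization-style: descend the posterior mean
struct ExploreVariance; // active-learning-style: chase uncertainty

impl Acquisition for ExploitMean {
    fn score(&self, s: &Surrogate, x: f64) -> f64 {
        s.mean(x)
    }
}
impl Acquisition for ExploreVariance {
    fn score(&self, s: &Surrogate, x: f64) -> f64 {
        -s.variance(x)
    }
}

/// The shared loop: optimize on the surrogate, query the oracle (the only
/// expensive step), update the training set, repeat.
fn surrogate_loop<A: Acquisition>(
    acq: &A,
    oracle: impl Fn(f64) -> f64,
    steps: usize,
) -> Vec<(f64, f64)> {
    let grid: Vec<f64> = (0..=40).map(|i| -2.0 + 0.1 * i as f64).collect();
    let mut s = Surrogate {
        data: vec![(-2.0, oracle(-2.0)), (2.0, oracle(2.0))],
    };
    for _ in 0..steps {
        let x_next = *grid
            .iter()
            .min_by(|a, b| acq.score(&s, **a).partial_cmp(&acq.score(&s, **b)).unwrap())
            .unwrap();
        let y = oracle(x_next); // stand-in for an electronic structure call
        s.data.push((x_next, y));
    }
    s.data
}

fn main() {
    let oracle = |x: f64| x * x; // toy 1D "PES"
    let explore = surrogate_loop(&ExploreVariance, oracle, 5);
    let exploit = surrogate_loop(&ExploitMean, oracle, 5);
    println!("explore queried x = {:?}", explore.iter().skip(2).map(|p| p.0).collect::<Vec<_>>());
    println!("exploit queried x = {:?}", exploit.iter().skip(2).map(|p| p.0).collect::<Vec<_>>());
}
```

Swapping the `Acquisition` implementation is the only change between the two runs; the loop body is untouched, which is the structural point the paper makes.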

Core claim

Minimization, single-point saddle searches, and double-ended path searches all follow the identical six-step Bayesian optimization surrogate loop that employs Gaussian process regression with derivative observations and inverse-distance kernels; the tasks differ solely in the choice of inner optimization target and acquisition criterion, with optional extensions such as farthest-point sampling via Earth Mover's Distance, MAP regularization, adaptive trust radius, and random Fourier features available for production scaling.

What carries the argument

The six-step surrogate loop of Gaussian process regression with derivative observations and inverse-distance kernels that generates acquisition-guided proposals for the next electronic structure calculation.
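A minimal 1D instance of the conditioning step, written independently of the paper's code: an energy-only squared-exponential GP with a hand-rolled linear solve. The paper's machinery adds derivative observations and inverse-distance features on top of this same posterior computation; the hyperparameter values below are illustrative, not fitted.

```rust
// Energy-only 1D GP with a squared-exponential kernel: a minimal sketch of the
// conditioning step, independent of the paper's implementation.

fn kernel(a: f64, b: f64, sf2: f64, ell: f64) -> f64 {
    sf2 * (-(a - b).powi(2) / (2.0 * ell * ell)).exp()
}

/// Solve A x = b by Gaussian elimination with partial pivoting (fine for tiny n).
fn solve(mut a: Vec<Vec<f64>>, mut b: Vec<f64>) -> Vec<f64> {
    let n = b.len();
    for k in 0..n {
        let p = (k..n)
            .max_by(|&i, &j| a[i][k].abs().partial_cmp(&a[j][k].abs()).unwrap())
            .unwrap();
        a.swap(k, p);
        b.swap(k, p);
        for i in k + 1..n {
            let f = a[i][k] / a[k][k];
            for j in k..n {
                let t = a[k][j];
                a[i][j] -= f * t;
            }
            let t = b[k];
            b[i] -= f * t;
        }
    }
    let mut x = vec![0.0; n];
    for i in (0..n).rev() {
        let s: f64 = (i + 1..n).map(|j| a[i][j] * x[j]).sum();
        x[i] = (b[i] - s) / a[i][i];
    }
    x
}

/// Posterior mean and variance at query point xq, conditioned on (xs, ys):
/// mean = k*^T K^{-1} y,  var = k(xq, xq) - k*^T K^{-1} k*.
fn gp_predict(xs: &[f64], ys: &[f64], xq: f64, sf2: f64, ell: f64, noise: f64) -> (f64, f64) {
    let n = xs.len();
    let k_mat: Vec<Vec<f64>> = (0..n)
        .map(|i| {
            (0..n)
                .map(|j| kernel(xs[i], xs[j], sf2, ell) + if i == j { noise } else { 0.0 })
                .collect()
        })
        .collect();
    let ks: Vec<f64> = xs.iter().map(|&x| kernel(x, xq, sf2, ell)).collect();
    let alpha = solve(k_mat.clone(), ys.to_vec());
    let v = solve(k_mat, ks.clone());
    let mean: f64 = ks.iter().zip(&alpha).map(|(a, b)| a * b).sum();
    let var = kernel(xq, xq, sf2, ell) - ks.iter().zip(&v).map(|(a, b)| a * b).sum::<f64>();
    (mean, var)
}

fn main() {
    let xs = [0.0, 1.0, 2.0];
    let ys: Vec<f64> = xs.iter().map(|x| (x - 1.0) * (x - 1.0)).collect(); // toy PES slice
    let (m, v) = gp_predict(&xs, &ys, 1.0, 1.0, 0.7, 1e-8);
    let (_, v_far) = gp_predict(&xs, &ys, 5.0, 1.0, 0.7, 1e-8);
    // The posterior collapses near training data and stays wide far away (cf. Figure 2).
    println!("at a training point: mean = {m:.4}, var = {v:.2e}; far away: var = {v_far:.3}");
}
```

The variance landscape this produces is exactly the quantity the acquisition step consumes: near zero at training points, approaching the prior variance far from them.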

If this is right

  • The same code structure implements minimization, saddle searches, and path searches, with only the target function and acquisition criterion changed.
  • Electronic structure evaluations drop by roughly an order of magnitude depending on oracle cost, search distance, and availability of analytical forces.
  • Optional production extensions such as adaptive trust radius and random Fourier features maintain the core loop while improving scalability.
  • The accompanying Rust code demonstrates direct translation from the unified theoretical formulation to executable searches.
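As a concrete reading of the adaptive-trust-radius bullet, the mechanism the review illustrates in Figure 10 (step clipping plus exponential saturation of the radius) can be sketched as follows. This is Pith's illustration: the Euclidean step norm stands in for the paper's EMD-based distance, and the constants are placeholders.

```rust
// Sketch of the trust-region mechanics; Euclidean norm stands in for the
// EMD-based distance, constants are illustrative placeholders.

/// Clip a proposed step so its length stays within the trust radius theta.
fn clip_step(step: &[f64], theta: f64) -> Vec<f64> {
    let norm = step.iter().map(|s| s * s).sum::<f64>().sqrt();
    if norm <= theta {
        step.to_vec()
    } else {
        step.iter().map(|s| s * theta / norm).collect() // same direction, shorter
    }
}

/// Trust radius earned from accumulated data, capped by a physical ceiling:
/// theta(n) = theta_phys * (1 - exp(-n / n0)), an exponential saturation curve.
fn trust_radius(n_data: usize, theta_phys: f64, n0: f64) -> f64 {
    theta_phys * (1.0 - (-(n_data as f64) / n0).exp())
}

fn main() {
    let proposed = [0.6, 0.8]; // GP-proposed step of length 1.0
    let theta = trust_radius(5, 0.5, 3.0); // grows with data, capped at 0.5
    let clipped = clip_step(&proposed, theta);
    let len = clipped.iter().map(|s| s * s).sum::<f64>().sqrt();
    println!("theta = {theta:.3}, clipped step length = {len:.3}");
}
```

The oracle is then evaluated at the clipped location, so the surrogate is never asked to extrapolate beyond the region its data supports.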

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same surrogate loop could be adapted to other local optimization problems in quantum chemistry where derivative information is available.
  • Combining the framework with pre-trained machine-learned potentials might further lower the cost for very large systems.
  • Systematic tests on surfaces with varying curvature would clarify how the observed speedup scales with problem difficulty.

Load-bearing premise

Gaussian process regression with inverse-distance kernels and derivative observations can serve as reliable local surrogates that reduce electronic structure evaluations by roughly an order of magnitude while preserving the accuracy of the underlying theory.
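The invariance property this premise leans on is cheap to check. Below is a sketch of the inverse-distance feature map (not the paper's code): the same three-atom geometry, rigidly rotated and translated, maps to identical features, which is what lets the SE kernel operate in feature space.

```rust
// Sketch of the inverse-distance feature map phi: R^{3N} -> R^{N(N-1)/2}
// (cf. Figure 6); illustrative, not the paper's implementation.

fn inv_dist_features(coords: &[[f64; 3]]) -> Vec<f64> {
    let n = coords.len();
    let mut phi = Vec::with_capacity(n * (n - 1) / 2);
    for i in 0..n {
        for j in i + 1..n {
            let d2: f64 = (0..3).map(|k| (coords[i][k] - coords[j][k]).powi(2)).sum();
            phi.push(1.0 / d2.sqrt()); // one feature per atom pair
        }
    }
    phi
}

fn main() {
    let a = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0]];
    // Same geometry rotated 90 degrees about z, then translated by (3, -1, 5).
    let b = [[3.0, -1.0, 5.0], [3.0, 0.0, 5.0], [1.0, -1.0, 5.0]];
    let (fa, fb) = (inv_dist_features(&a), inv_dist_features(&b));
    let max_diff = fa.iter().zip(&fb).map(|(x, y)| (x - y).abs()).fold(0.0, f64::max);
    println!("features {fa:?}, max difference after rigid motion: {max_diff:e}");
}
```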

What would settle it

Running the surrogate loop and a standard search side-by-side on the same molecular benchmarks would settle it: if the surrogate version needs more than about half as many electronic structure calls to reach stationary points of equivalent accuracy, the claimed order-of-magnitude reduction is falsified.

Figures

Figures reproduced from arXiv: 2603.10992 by Rohit Goswami (Lab-COSMO, Institute IMX, École polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland).

Figure 1: Geometry of the dimer and Householder reflection. The dimer pair (R1, R2) straddles the midpoint R0 with axis N̂. The true force F (blue) is reflected about the hyperplane perpendicular to N̂, producing the modified force F† (coral) that climbs along the minimum mode while relaxing perpendicular to it. The first case (C < 0) applies the Householder reflection just described: the dimer is already in a re…

Figure 2: GP conditioning in three panels. (Left) Before any data, the prior (Eq. 3.1) admits a wide family of smooth functions around a chosen mean function. (Center) Oracle evaluations supply energies and forces at selected configurations. (Right) Conditioning on the data collapses the posterior near training points while preserving wide uncertainty elsewhere; the posterior mean serves as the surrogate surface V_GP…

Figure 3: GP surrogate fidelity as a function of training set size on the Müller-Brown surface. Each panel shows the GP posterior mean contours after training on N = 3, 8, 15, 30 Latin hypercube-sampled configurations (white markers). With three points the surrogate captures only crude basin structure; by 30 points the contours closely match the true PES…

Figure 4: GP predictive variance on the Müller-Brown surface after 20 training evaluations clustered near minimum A and saddle S1 (black dots). The variance is near zero close to training data and grows with distance, reaching a maximum (coral diamond) in the unexplored region. This landscape is a map of sampling density in the kernel geometry, not of accuracy against the true surface: it tells the active-learning l…

Figure 5: Block structure of the full covariance matrix K_full. The base kernel in feature space generates four Cartesian-space blocks through differentiation via the feature Jacobian J. Darker shading indicates higher computational cost. Energies and forces have different magnitudes and units, so separate noise variances σ²_E and σ²_F are assigned to each block. Because the electronic structure data is determinist…

Figure 6: The inverse-distance feature map. Cartesian coordinates (ℝ^{3N}, not invariant) are mapped to pairwise inverse distances (N(N−1)/2 features, invariant to rotation and translation). The SE kernel operates in this feature space. The Jacobian J = ∂φ/∂x propagates through the kernel via the chain rule to produce the derivative blocks needed for force predictions: k(x, x′) = σ²_c + σ²_f exp(−½ Σ_i Σ_{j>…

Figure 7: Hyperparameter sensitivity on the Müller-Brown surface. Each panel shows a 1D slice at y = 0.5 with the true PES (black dashed), the GP posterior mean (teal), and the ±2σ confidence band (light blue), for nine combinations of length scale ℓ ∈ {0.05, 0.3, 2.0} (columns) and signal variance σ_f ∈ {0.1, 1.0, 100.0} (rows). Small ℓ produces noisy interpolation; large ℓ over-smooths and misses barrier structure.

Figure 8: Visual overview of the Bayesian surrogate loop (Algorithm 4). Numbered steps proceed clockwise: (1) train the GP, (2) optimize on the surrogate, (3) check trust constraints, (4) evaluate the oracle, (5) select the next query point, (6) update the training set. The oracle (coral) is the only expensive step; all others operate on the cheap surrogate. Method-specific annotations indicate how each algorithm in…

Figure 9: Convergence comparison of the standard dimer, GP-dimer, and OT-GP dimer (OTGPD) on a molecular system (C3H5 allyl radical, 8 atoms, 24 DOF) via eOn server mode. The maximum per-atom force (eV/Å) is plotted against oracle evaluations on a logarithmic scale. The OTGPD variant reaches the convergence threshold (gray dashed, 0.1 eV/Å) with the fewest oracle calls; the GP-dimer shows oscillations from surrogate…

Figure 10: Trust region mechanism. (Left) A GP-proposed step (coral) that exceeds the trust boundary d_EMD ≤ Θ is clipped to the boundary; the oracle evaluates at the clipped location. (Right) The trust radius grows with accumulated data via an exponential saturation curve (Θ_earned), capped by a system-size-dependent physical ceiling (Θ_phys).

Figure 11: Trust region mechanism on a 1D slice (y = 0.5) of the Müller-Brown surface. The GP posterior mean (teal) and ±2σ confidence band (light blue) are accurate near the training data (black dots) but diverge from the true surface (black dashed) outside the trust boundaries (magenta dotted verticals). A hypothetical GP-proposed step at x = 1.0 (coral cross, labeled "GP step") falls outside the trust region, whe…

Figure 12: Müller-Brown potential energy surface with NEB path overlay. Filled contours show the energy landscape with three local minima (A, B, C) and two saddle points (S1, S2). Eleven NEB images (coral circles, numbered) trace the minimum energy path from A to B through S2. The climbing image (highest-energy interior image) approximates the saddle point. Energy values are reported in the conventional Müller-Brown…

Figure 13: LEPS potential energy surface with NEB path overlay. The collinear atom-transfer reaction A + BC → AB + C is plotted as a function of bond distances r_AB and r_BC. Seven interior NEB images and two fixed endpoints (nine path points, coral circles, numbered; endpoints also marked by yellow stars) are optimized in the full 9-dimensional coordinate space and projected onto the (r_AB, r_BC) plane. The climbing i…

Figure 14: Convergence of NEB variants on the LEPS surface. Maximum per-atom force versus oracle evaluations on a logarithmic scale. Standard NEB (156 calls), AIE (100 calls), and OIE (42 calls) all reach the convergence threshold (dashed, 0.1 eV/Å). The OIE variant evaluates only the highest-variance image per cycle (Eq. 4.3) and converges fastest. Path initialization matters for the GP-NEB. The sequential image-de…

Figure 15: Convergence comparison of the GP-minimizer and classical L-BFGS on the LEPS surface. With a force convergence threshold of 10⁻² eV/Å, the GP surrogate reaches the threshold in 9 oracle calls, compared with 57 for direct L-BFGS on the same starting configuration. Force values are plotted on a logarithmic scale.

Figure 16: Decision flow of the OT-GP framework. The training pipeline (FPS, MAP regularization) and adaptive trust region (EMD-based) address the failure modes of the basic GP-dimer. The optional prior-mean branch shown here is included to place later extensions in the same design space, not to redefine the main algorithmic thread discussed in the text.

Figure 17: The FPS selection rule and the EMD-based structural comparison used for molecular configurations. (Left) Farthest point sampling: greedy selection in which each new point maximizes the minimum distance to the existing subset, reducing N candidates to M_sub ≪ N. (Right) Earth Mover's Distance between configurations, d_EMD = (1/N_t) min_Π Σ_ij Π_ij c_ij, an optimal-assignment cost invariant to rotation and atom indexing.

Figure 18: Farthest point sampling (FPS) on the LEPS surface. Approximately 50 candidate configurations from a GP-NEB run are projected onto their first two principal components in inverse-distance feature space. FPS selects 20 points (teal diamonds) that maximally cover the feature space; pruned points (gray circles) lie near already-selected configurations and would add redundancy to the training set without impro…

Figure 19: RFF approximation quality on the LEPS surface (H3, 3 inverse-distance features). Energy MAE (top) and gradient MAE (bottom) between RFF and exact GP predictions on held-out test points, plotted against D_rff. The kernel is well approximated by D_rff ∼ 100. The RFF extension scales the OT-GP framework to larger systems across the dimer, NEB, and minimization applications. In practice, molecular systems ben…

Figure 20: Convergence of GP-accelerated minimization on a real molecular system (PET-MAD potential). The maximum per-atom force is plotted against the number of oracle evaluations on a logarithmic scale. For NEB, a 9-atom cycloaddition system (C2H4 + N2O, 27 degrees of freedom, 36 inverse-distance features) on the PET-MAD surface illustrates how the GP-NEB variants scale to molecular reactions.

Figure 21: Illustrative convergence comparison of GP-NEB variants on a 9-atom cycloaddition (PET-MAD surface, 27 DOF). Climbing-image force is plotted against oracle evaluations on a logarithmic scale for the classical NEB, AIE, and OIE update patterns. The figure is used here to compare the qualitative behavior of the three outer-loop choices on the same molecular path; the tighter CI-targeted literature-style benc…

Figure 22: Energy profiles along the converged MEP for the cycloaddition system on the PET-MAD surface. All three NEB variants recover the same barrier and exothermic product basin. The AIE profile is slightly shifted near the saddle region but converges to the same endpoints.

Figure 23: Reaction valley projection [94] of the NEB paths on the PET-MAD surface. Reaction progress s and orthogonal deviation d are projected RMSD coordinates. Standard NEB (blue stars) and GP-NEB AIE (orange stars) saddle points are compared; both lie near the true saddle (black square). Dashed contours show GP variance (σ²); the paths remain within the well-sampled region where the surrogate has accumulated d…

Figure 24: Classical dimer (left) and GP-dimer (right) flowcharts. The classical version evaluates the true PES at every rotation step; the GP-accelerated version performs rotations and translations on the surrogate surface, querying the oracle only when trust radius violations occur or GP-level convergence is achieved.

Figure 25: Classical NEB (left) and GP-NEB (right) flowcharts. The classical version optimizes all images via L-BFGS; the GP-NEB variant uses one-image evaluation with a configurable acquisition rule (pure variance or UCB in the current implementation) to reduce oracle queries.

Figure 26: MAP-regularized negative log marginal likelihood landscape on the LEPS surface. The filled contours show the NLL as a function of the log-hyperparameters ln σ² and ln θ, computed on a 40 × 40 grid from 5 training points near the reactant. The dashed black contours show the gradient norm. The coral star marks the MAP optimum that SCG converges to. Regions where the Cholesky factorization fails (hyperparam…
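The greedy selection rule illustrated in Figures 17 and 18 reduces to a few lines. This is an illustrative re-implementation, with 2D points and Euclidean distance standing in for molecular configurations compared via the Earth Mover's Distance:

```rust
// Greedy farthest-point sampling: each new point maximizes its minimum
// distance to the already-selected subset. Euclidean distance on 2D points
// stands in for the EMD between configurations.

fn dist(a: &[f64; 2], b: &[f64; 2]) -> f64 {
    ((a[0] - b[0]).powi(2) + (a[1] - b[1]).powi(2)).sqrt()
}

fn fps(points: &[[f64; 2]], m: usize) -> Vec<usize> {
    let mut selected = vec![0usize]; // seed with the first candidate
    while selected.len() < m {
        let next = (0..points.len())
            .filter(|i| !selected.contains(i))
            .max_by(|&i, &j| {
                // min distance of each candidate to the selected subset
                let di = selected.iter().map(|&s| dist(&points[i], &points[s])).fold(f64::INFINITY, f64::min);
                let dj = selected.iter().map(|&s| dist(&points[j], &points[s])).fold(f64::INFINITY, f64::min);
                di.partial_cmp(&dj).unwrap()
            })
            .unwrap();
        selected.push(next);
    }
    selected
}

fn main() {
    // A cluster near the origin plus one far outlier: FPS grabs the outlier early.
    let pts = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [0.1, 0.1]];
    let picked = fps(&pts, 3);
    println!("selected indices: {picked:?}");
}
```

Points near already-selected configurations are exactly the "pruned" gray circles of Figure 18: they would add redundancy to the training set without improving coverage.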
Original abstract

Building local surrogates to accelerate stationary point searches on potential energy surfaces spans decades of effort. Done correctly, surrogates can reduce the number of expensive electronic structure evaluations by roughly an order of magnitude while preserving the accuracy of the underlying theory, with the gain depending on oracle cost, search distance, and the availability of analytical forces. We present a unified Bayesian optimization view of minimization, single-point saddle searches, and double-ended path searches: all three share one six-step surrogate loop and differ only in the inner optimization target and the acquisition criterion. The framework uses Gaussian process regression with derivative observations, inverse-distance kernels, and active learning, and we develop optional extensions for production use, including farthest-point sampling with the Earth Mover's Distance, MAP regularization, an adaptive trust radius, and random Fourier features for scaling. Accompanying pedagogical Rust code demonstrates that all three applications use the same Bayesian optimization loop, bridging the gap between theoretical formulation and practical execution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The manuscript presents a unified Bayesian optimization framework for accelerating stationary point searches on potential energy surfaces. It claims that minimization, single-point saddle searches, and double-ended path searches all follow the same six-step surrogate loop using Gaussian process regression with derivative observations and inverse-distance kernels, differing only in the inner optimization target and acquisition criterion. The work includes optional extensions (farthest-point sampling, MAP regularization, adaptive trust radius, random Fourier features) and provides accompanying pedagogical Rust code to demonstrate practical execution.

Significance. If the central unification holds and is verified by the provided code, the paper offers a coherent tutorial synthesis that could lower the barrier for applying surrogate methods in computational chemistry. The reproducible code is a clear strength, enabling independent verification of the shared loop structure across the three applications. The performance claim of roughly order-of-magnitude reduction in electronic structure evaluations is conditional on oracle cost and search distance, which appropriately limits overstatement.

major comments (2)
  1. [Abstract and §3] Abstract and §3 (six-step loop): the central unification claim that all three searches 'share one six-step surrogate loop and differ only in the inner optimization target and the acquisition criterion' is load-bearing but would be strengthened by an explicit comparison table listing the target function and acquisition function for each of the three cases; without it the 'differ only in' assertion remains conceptual rather than directly verifiable.
  2. [Abstract] Abstract: the statement that surrogates 'can reduce the number of expensive electronic structure evaluations by roughly an order of magnitude' is presented as a key practical benefit yet lacks a specific benchmark, timing table, or reference to a controlled comparison within the manuscript; this quantitative claim should be evidenced or qualified with the conditions under which it holds.
minor comments (3)
  1. [Methods] The inverse-distance kernel and its derivative observations are central but the explicit functional form and hyperparameter handling could be stated in a single equation block for clarity.
  2. [Introduction] Consider adding a short related-work subsection that distinguishes the present synthesis from earlier GP-based PES surrogate papers to better highlight the tutorial contribution.
  3. [Code availability] The Rust code repository link should be accompanied by a permanent archive (e.g., Zenodo DOI) to ensure long-term reproducibility.
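For reference, the kernel's functional form can be reconstructed from the truncated expression in the caption of Figure 6; a single shared length scale ℓ is assumed here, though the paper may use per-feature scales:

```latex
k(\mathbf{x}, \mathbf{x}') = \sigma_c^2 + \sigma_f^2 \exp\!\left( -\frac{1}{2} \sum_{i} \sum_{j>i} \frac{\bigl(\phi_{ij}(\mathbf{x}) - \phi_{ij}(\mathbf{x}')\bigr)^2}{\ell^2} \right),
\qquad
\phi_{ij}(\mathbf{x}) = \frac{1}{\lVert \mathbf{r}_i - \mathbf{r}_j \rVert}
```

Here φ_ij are the pairwise inverse-distance features, σ_c² a constant offset, and σ_f² the signal variance; the derivative blocks follow by differentiating k through the feature Jacobian ∂φ/∂x.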

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the positive assessment, the recommendation of minor revision, and the constructive comments that help strengthen the presentation of the unified framework. We address each major comment below.

Point-by-point responses
  1. Referee: [Abstract and §3] Abstract and §3 (six-step loop): the central unification claim that all three searches 'share one six-step surrogate loop and differ only in the inner optimization target and the acquisition criterion' is load-bearing but would be strengthened by an explicit comparison table listing the target function and acquisition function for each of the three cases; without it the 'differ only in' assertion remains conceptual rather than directly verifiable.

    Authors: We agree that an explicit table would make the unification directly verifiable rather than conceptual. In the revised manuscript we will insert a comparison table in §3 that enumerates, for each of the three applications (minimization, single-point saddle search, double-ended path search): the inner optimization target, the precise acquisition criterion, the form of the derivative observations used, and the specific instantiation of the six-step loop. This table will sit alongside the existing algorithmic description and will be cross-referenced from the abstract. revision: yes

  2. Referee: [Abstract] Abstract: the statement that surrogates 'can reduce the number of expensive electronic structure evaluations by roughly an order of magnitude' is presented as a key practical benefit yet lacks a specific benchmark, timing table, or reference to a controlled comparison within the manuscript; this quantitative claim should be evidenced or qualified with the conditions under which it holds.

    Authors: The manuscript already qualifies the claim by stating that the gain depends on oracle cost, search distance, and availability of analytical forces. To address the request for substantiation, we will add two short sentences in the abstract and §1 that cite representative controlled comparisons from the surrogate-assisted saddle-search literature (e.g., reductions of 5–20× reported for similar systems) and will include a one-paragraph illustrative example drawn from the pedagogical Rust code that reports the number of oracle calls with and without the surrogate for a model problem. Because the paper is a tutorial synthesis rather than a new benchmark study, we do not add a full timing table, but the added references and example provide the requested evidence. revision: partial

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper is a tutorial review synthesizing established Bayesian optimization and Gaussian process techniques for stationary point searches on potential energy surfaces. The central claim is a conceptual unification of minimization, saddle searches, and path searches under one standard six-step surrogate loop using derivative observations and inverse-distance kernels. No load-bearing step reduces by construction to a self-definition, fitted input renamed as prediction, or self-citation chain; the framework is presented as a reframing of known methods with optional extensions and accompanying code for independent verification. The presentation is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on standard domain assumptions from Bayesian optimization and Gaussian process modeling applied to potential energy surfaces; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption Gaussian process regression with derivative observations and inverse-distance kernels can accurately approximate local regions of potential energy surfaces
    This assumption underpins the surrogate acceleration described for all three search types.

pith-pipeline@v0.9.0 · 5504 in / 1215 out tokens · 52353 ms · 2026-05-15T12:49:22.213846+00:00 · methodology


Reference graph

Works this paper leans on

110 extracted references · 110 canonical work pages · 1 internal anchor

  1. [1]

    Hänggi, P

    Peter Hänggi, Peter Talkner, and Michal Borkovec. Reaction-rate theory: Fifty years after Kramers. Reviews of Modern Physics, 62(2):251–341, April 1990. doi:10.1103/RevModPhys.62.251

  2. [2]

    The activated complex in chemical reactions.Journal of Chemical Physics, 3(2):107–115, February 1935

    Henry Eyring. The activated complex in chemical reactions.Journal of Chemical Physics, 3(2):107–115, February 1935. ISSN 0021-9606. doi:10.1063/1.1749604

  3. [3]

    Lyon and W

    George H. Vineyard. Frequency factors and isotope effects in solid state rate processes.Journal of Physics and Chemistry of Solids, 3(1):121–127, January 1957. ISSN 0022-3697. doi:10.1016/0022- 3697(57)90059-8

  4. [4]

    Elsevier, Amsterdam ; Cambrige, MA, 2017

    Baron Peters.Reaction Rate Theory and Rare Events. Elsevier, Amsterdam ; Cambrige, MA, 2017. ISBN 978-0-444-56349-1

  5. [5]

    Bianchi, Magnus A

    Michele Re Fiorentin, Michele G. Bianchi, Magnus A. H. Christiansen, Anna Ciotti, Francesca Risplendi, Wei Wang, Elvar Ö. Jónsson, Hannes Jónsson, and Giancarlo Cicero. Methodological frameworks for computational electrocatalysis: From theory to practice.Small Methods, page e01542, February 2026. ISSN 2366-9608, 2366-9608. doi:10.1002/smtd.202501542

  6. [6]

    Springer New York, New York, NY, 2013

    Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani.An Introduction to Statistical Learning, volume 103 ofSpringer Texts in Statistics. Springer New York, New York, NY, 2013. ISBN 978-1-4614-7137-0 978-1-4614-7138-7. doi:10.1007/978-1-4614-7138-7

  7. [7]

    Madsen, R

    D. Madsen, R. Pearman, and M. Gruebele. Approximate factorization of molecular potential surfaces. I. Basic approach.Journal of Chemical Physics, 106(14):5874–5893, April 1997. ISSN 0021-9606, 1089-7690. doi:10.1063/1.473253

  8. [8]

    AiiDA: Automated interactive infrastructure and database for computational science.Computational Materials Science, 111:218–230, January 2016

    Giovanni Pizzi, Andrea Cepellotti, Riccardo Sabatini, Nicola Marzari, and Boris Kozinsky. AiiDA: Automated interactive infrastructure and database for computational science.Computational Materials Science, 111:218–230, January 2016. ISSN 0927-0256. doi:10.1016/j.commatsci.2015.09.013

  9. [9]

    Hall, Christopher H

    Felix Mölder, Kim Philipp Jablonski, Brice Letcher, Michael B. Hall, Christopher H. Tomkins-Tinch, Vanessa Sochat, Jan Forster, Soohyun Lee, Sven O. Twardziok, Alexander Kanitz, Andreas Wilm, Manuel Holtgrewe, Sven Rahmann, Sven Nahnsen, and Johannes Köster. Sustainable data analysis with Snakemake, April 2021

  10. [10]

    Long time scale kinetic Monte Carlo simulations without lattice approximation and predefined event table.The Journal of Chemical Physics, 115(21):9657–9666, November 2001

    Graeme Henkelman and Hannes Jónsson. Long time scale kinetic Monte Carlo simulations without lattice approximation and predefined event table.The Journal of Chemical Physics, 115(21):9657–9666, November 2001. ISSN 0021-9606. doi:10.1063/1.1415500

  11. [11]

    Efficient exploration of chemical kinetics, October 2025

    Rohit Goswami. Efficient exploration of chemical kinetics, October 2025

  12. [12]

    Bartók, Christoph Ortner, Gábor Csányi, and Michele Ceriotti

    Felix Musil, Andrea Grisafi, Albert P. Bartók, Christoph Ortner, Gábor Csányi, and Michele Ceriotti. Physics-Inspired Structural Representations for Molecules and Materials.Chemical Reviews, 121(16): 9759–9815, August 2021. ISSN 0009-2665, 1520-6890. doi:10.1021/acs.chemrev.1c00021

  13. [13]

    Wilkins, Michael J

    Andrea Grisafi, David M. Wilkins, Michael J. Willatt, and Michele Ceriotti. Atomic-Scale Representation and Statistical Learning of Tensorial Properties. In Edward O. Pyzer-Knapp and Teodoro Laino, editors, ACS Symposium Series, volume 1326, pages 1–21. American Chemical Society, Washington, DC, January

  14. [14]

    doi:10.1021/bk-2019-1326.ch001

    ISBN 978-0-8412-3505-2 978-0-8412-3504-5. doi:10.1021/bk-2019-1326.ch001

  15. [15]

    Vassilev, P.A

    Albert Pártay Bart\ok.The Gaussian Approximation Potential. Springer Theses. Springer Berlin Heidelberg, Berlin, Heidelberg, 2010. ISBN 978-3-642-14066-2 978-3-642-14067-9. doi:10.1007/978-3- 642-14067-9

  16. [16]

    Alexander V. Shapeev. Moment Tensor Potentials: A Class of Systematically Improvable Interatomic Potentials.Multiscale Modeling & Simulation, 14(3):1153–1173, January 2016. ISSN 1540-3459. doi:10.1137/15M1054183

  17. [17]

    Generalized Neural-Network Representation of High- Dimensional Potential-Energy Surfaces.Physical Review Letters, 98(14):146401, April 2007

    Jörg Behler and Michele Parrinello. Generalized Neural-Network Representation of High- Dimensional Potential-Energy Surfaces.Physical Review Letters, 98(14):146401, April 2007. doi:10.1103/PhysRevLett.98.146401. 60 A Preprint

  18. [18]

    Deringer, Albert P

    Volker L. Deringer, Albert P. Bartók, Noam Bernstein, David M. Wilkins, Michele Ceriotti, and Gábor Csányi. Gaussian Process Regression for Materials and Molecules.Chemical Reviews, 121(16): 10073–10141, August 2021. ISSN 0009-2665, 1520-6890. doi:10.1021/acs.chemrev.1c00022

  19. [19]

    PET-MAD as a lightweight universal interatomic potential for advanced materials modeling.Nature Communications, 16(1):10653, November 2025

    Arslan Mazitov, Filippo Bigi, Matthias Kellner, Paolo Pegolo, Davide Tisi, Guillaume Fraux, Sergey Pozdnyakov, Philip Loche, and Michele Ceriotti. PET-MAD as a lightweight universal interatomic potential for advanced materials modeling.Nature Communications, 16(1):10653, November 2025. ISSN 2041-1723. doi:10.1038/s41467-025-65662-7

  20. [20]

    Pushing the limits of unconstrained machine-learned interatomic potentials.Machine Learning: Science and Technology, April 2026

    Filippo Bigi, Paolo Pegolo, Arslan Mazitov, Jonathan Schmidt, and Michele Ceriotti. Pushing the limits of unconstrained machine-learned interatomic potentials.Machine Learning: Science and Technology, April 2026. doi:10.1088/2632-2153/ae6417. DOI: 10.1088/2632-2153/ae6417

  21. [21]

    Elena, Dávid P

    Ilyes Batatia, Philipp Benner, Yuan Chiang, Alin M. Elena, Dávid P. Kovács, Janosh Riebesell, Xavier R. Advincula, Mark Asta, Matthew Avaylon, William J. Baldwin, Fabian Berger, Noam Bernstein, Arghya Bhowmik, Filippo Bigi, Samuel M. Blau, Vlad Cărare, Michele Ceriotti, Sanggyu Chong, James P. Darby, Sandip De, Flaviano Della Pia, Volker L. Deringer, Roka...

  22. [22]

    Transition1x - a dataset for building generalizable reactive machine learning potentials.Scientific Data, 9(1):779, December

    Mathias Schreiner, Arghya Bhowmik, Tejs Vegge, Jonas Busk, and Ole Winther. Transition1x - a dataset for building generalizable reactive machine learning potentials.Scientific Data, 9(1):779, December

  23. [23]

    doi:10.1038/s41597-022-01870-w

    ISSN 2052-4463. doi:10.1038/s41597-022-01870-w

  24. [24]

    Probabilistic Machine Learning

    Philipp Hennig. Probabilistic Machine Learning. Lecture notes, Tübingen AI Center, University of Tübingen, September 2023

  25. [25]

    Richard S. Sutton and Andrew G. Barto. Reinforcement Learning: An Introduction. Adaptive Computation and Machine Learning Series. The MIT Press, Cambridge, Massachusetts, second edition, 2018. ISBN 978-0-262-03924-6

  26. [26]

    Jasper Snoek, Hugo Larochelle, and Ryan P. Adams. Practical Bayesian Optimization of Machine Learning Algorithms. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 2960–2968. Curran Associates, Inc., 2012

  27. [27]

    Rohit Goswami, Maxim Masterov, Satish Kamath, Alejandro Pena-Torres, and Hannes Jónsson. Efficient Implementation of Gaussian Process Regression Accelerated Saddle Point Searches with Application to Molecular Reactions. Journal of Chemical Theory and Computation, 21(16):7935–7943, July 2025. doi:10.1021/acs.jctc.5c00866

  28. [28]

    Rohit Goswami and Hannes Jónsson. Adaptive Pruning for Increased Robustness and Reduced Computational Overhead in Gaussian Process Accelerated Saddle Point Searches. ChemPhysChem, 27(4):e202500730, February 2026. ISSN 1439-7641. doi:10.1002/cphc.202500730

  29. [29]

    Jonathan Vandermause, Steven B. Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M. Kolpak, and Boris Kozinsky. On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events. npj Computational Materials, 6(1):20, 2020. doi:10.1038/s41524-020-0283-z

  30. [30]

    Albert P. Bartók. Gaussian Approximation Potential: An interatomic potential derived from first principles Quantum Mechanics. arXiv:1003.2817 [cond-mat, physics:physics], March 2010

  31. [31]

    Gerardo Raggi, Ignacio Fdez. Galván, Christian L. Ritterhoff, Morgane Vacher, and Roland Lindh. Restricted-Variance Molecular Geometry Optimization Based on Gradient-Enhanced Kriging. Journal of Chemical Theory and Computation, 16(6):3989–4001, June 2020. ISSN 1549-9618, 1549-9626. doi:10.1021/acs.jctc.0c00257

  32. [32]

    Ignacio Fdez. Galván, Gerardo Raggi, and Roland Lindh. Restricted-Variance Constrained, Reaction Path, and Transition State Molecular Optimizations Using Gradient-Enhanced Kriging. Journal of Chemical Theory and Computation, 17(1):571–582, January 2021. ISSN 1549-9618. doi:10.1021/acs.jctc.0c01163

  33. [33]

    Olli-Pekka Koistinen, Freyja B. Dagbjartsdóttir, Vilhjálmur Ásgeirsson, Aki Vehtari, and Hannes Jónsson. Nudged elastic band calculations accelerated with Gaussian process regression. The Journal of Chemical Physics, 147(15):152720, September 2017. ISSN 0021-9606. doi:10.1063/1.4986787

  34. [34]

    Olli-Pekka Koistinen, Vilhjálmur Ásgeirsson, Aki Vehtari, and Hannes Jónsson. Nudged Elastic Band Calculations Accelerated with Gaussian Process Regression Based on Inverse Interatomic Distances. Journal of Chemical Theory and Computation, 15(12):6738–6751, December 2019. ISSN 1549-9618. doi:10.1021/acs.jctc.9b00692

  35. [35]

    Alexander Denzel, Bernard Haasdonk, and Johannes Kästner. Gaussian Process Regression for Minimum Energy Path Optimization and Transition State Search. The Journal of Physical Chemistry A, 123(44):9600–9611, November 2019. ISSN 1089-5639. doi:10.1021/acs.jpca.9b08239

  36. [36]

    Andreas Lynge Vishart. Accelerating Catalysis Simulations Using Surrogate Machine Learning Models. PhD thesis, Technical University of Denmark, Kgs. Lyngby, 2023

  37. [37]

    José A. Garrido Torres, Paul C. Jennings, Martin H. Hansen, Jacob R. Boes, and Thomas Bligaard. Low-Scaling Algorithm for Nudged Elastic Band Calculations Using a Surrogate Machine Learning Model. Physical Review Letters, 122(15):156001, April 2019. doi:10.1103/PhysRevLett.122.156001

  38. [38]

    Chong Teng, Daniel Huang, and Junwei Lucas Bao. A spur to molecular geometry optimization: Gradient-enhanced universal kriging with on-the-fly adaptive ab initio prior mean functions in curvilinear coordinates. The Journal of Chemical Physics, 158(2):024112, 2023. doi:10.1063/5.0133675

  39. [39]

    Chong Teng, Yang Wang, and Junwei Lucas Bao. Physical prior mean function-driven gaussian processes search for minimum-energy reaction paths with a climbing-image nudged elastic band: A general method for gas-phase, interfacial, and bulk-phase reactions. Journal of Chemical Theory and Computation, 20(10):4308–4324, 2024. doi:10.1021/acs.jctc.4c00291

  40. [40]

    Chong Teng, Daniel Huang, Elizabeth Donahue, and Junwei Lucas Bao. Exploring torsional conformer space with physical prior mean function-driven meta-gaussian processes. The Journal of Chemical Physics, 159(21):214111, 2023. doi:10.1063/5.0176709

  41. [41]

    Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. Bayesian Data Analysis. Chapman & Hall/CRC Texts in Statistical Science. CRC Press, Boca Raton, third edition, 2013. ISBN 978-1-4398-4095-5. doi:10.1201/b16018

  42. [42]

    Robert B. Gramacy. Surrogates: Gaussian Process Modeling, Design, and Optimization for the Applied Sciences. CRC Press; Taylor & Francis Group, New York, NY, 2020. ISBN 978-0-367-41542-6

  43. [43]

    Miguel A. Caro. Optimizing many-body atomic descriptors for enhanced computational performance of machine learning based interatomic potentials. Physical Review B, 100(2):024112, July 2019. doi:10.1103/PhysRevB.100.024112

  44. [44]

    Errol G. Lewars. Computational Chemistry. Springer International Publishing, Cham, 2016. ISBN 978-3-319-30914-9, 978-3-319-30916-3. doi:10.1007/978-3-319-30916-3

  45. [45]

    Klaus Müller and Leo D. Brown. Location of saddle points and minimum energy paths by a constrained simplex optimization procedure. Theoretica chimica acta, 53(1):75–93, March 1979. ISSN 1432-2234. doi:10.1007/BF00547608

  46. [46]

    Graeme Henkelman and Hannes Jónsson. A dimer method for finding saddle points on high dimensional potential surfaces using only first derivatives. The Journal of Chemical Physics, 111(15):7010–7022, October 1999. ISSN 0021-9606, 1089-7690. doi:10.1063/1.480097

  47. [47]

    Hannes Jónsson, Greg Mills, and Karsten W. Jacobsen. Nudged elastic band method for finding minimum energy paths of transitions. In Classical and Quantum Dynamics in Condensed Phase Simulations, pages 385–404. World Scientific, June 1998. ISBN 978-981-02-3498-0. doi:10.1142/9789812839664_0016

  48. [48]

    Graeme Henkelman, Blas P. Uberuaga, and Hannes Jónsson. A climbing image nudged elastic band method for finding saddle points and minimum energy paths. The Journal of Chemical Physics, 113(22):9901–9904, November 2000. ISSN 0021-9606. doi:10.1063/1.1329672

  49. [49]

    Dong C. Liu and Jorge Nocedal. On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45(1):503–528, August 1989. ISSN 1436-4646. doi:10.1007/BF01589116

  50. [50]

    Richard H. Byrd, Peihuang Lu, Jorge Nocedal, and Ciyou Zhu. A limited memory algorithm for bound constrained optimization. SIAM Journal on Scientific Computing, 16:1190–1208, September 1995. ISSN 1064-8275. doi:10.1137/0916069

  51. [51]

    Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Springer Series in Operations Research. Springer, New York, 2nd edition, 2006. ISBN 978-0-387-30303-1

  52. [52]

    Olli-Pekka Koistinen, Vilhjálmur Ásgeirsson, Aki Vehtari, and Hannes Jónsson. Minimum Mode Saddle Point Searches Using Gaussian Process Regression with Inverse-Distance Covariance Function. Journal of Chemical Theory and Computation, 16(1):499–509, January 2020. ISSN 1549-9618. doi:10.1021/acs.jctc.9b01038

  53. [53]

    Alexander Denzel and Johannes Kästner. Gaussian Process Regression for Transition State Search. Journal of Chemical Theory and Computation, 14(11):5777–5786, November 2018. ISSN 1549-9618. doi:10.1021/acs.jctc.8b00708

  54. [54]

    R. A. Olsen, G. J. Kroes, G. Henkelman, A. Arnaldsson, and H. Jónsson. Comparison of methods for finding saddle points without knowledge of the final states. The Journal of Chemical Physics, 121(20):9776–9792, November 2004. ISSN 0021-9606. doi:10.1063/1.1809574

  55. [55]

    Rohit Goswami, Miha Gunde, and Hannes Jónsson. Enhanced climbing image nudged elastic band method with hessian eigenmode alignment, January 2026

  56. [56]

    James F. Epperson. An Introduction to Numerical Methods and Analysis. Wiley, Hoboken, NJ, second edition, 2012

  57. [57]

    E. Polak and G. Ribiere. Note sur la convergence de méthodes de directions conjuguées. Revue française d’informatique et de recherche opérationnelle. Série rouge, 3(16):35–43, 1969. ISSN 0373-8000. doi:10.1051/m2an/196903R100351

  58. [58]

    Andreas Heyden, Alexis T. Bell, and Frerich J. Keil. Efficient methods for finding transition states in chemical reactions: Comparison of improved dimer method and partitioned rational function optimization method. The Journal of Chemical Physics, 123(22):224101, December 2005. ISSN 0021-9606. doi:10.1063/1.2104507

  59. [59]

    Johannes Kästner and Paul Sherwood. Superlinearly converging dimer method for transition state search. The Journal of Chemical Physics, 128(1):014106, January 2008. ISSN 0021-9606. doi:10.1063/1.2815812

  60. [60]

    Rohit Goswami. Bayesian hierarchical models for quantitative estimates for performance metrics applied to saddle search algorithms. AIP Advances, 15(8):85210, August 2025. ISSN 2158-3226. doi:10.1063/5.0283639

  61. [61]

    Jing Leng, Weiguo Gao, Cheng Shang, and Zhi-Pan Liu. Efficient softest mode finding in transition states calculations. Journal of Chemical Physics, 138(9):94110, March 2013. ISSN 0021-9606. doi:10.1063/1.4792644

  62. [62]

    Marko Melander, Kari Laasonen, and Hannes Jónsson. Removing External Degrees of Freedom from Transition-State Search Methods using Quaternions. Journal of Chemical Theory and Computation, 11(3):1055–1062, March 2015. ISSN 1549-9618. doi:10.1021/ct501155k

  63. [63]

    Weinan E, Weiqing Ren, and Eric Vanden-Eijnden. String method for the study of rare events. Physical Review B, 66(5):52301, August 2002. doi:10.1103/PhysRevB.66.052301

  64. [64]

    Weinan E, Weiqing Ren, and Eric Vanden-Eijnden. Simplified and improved string method for computing the minimum energy paths in barrier-crossing events. The Journal of Chemical Physics, 126(16):164103, April 2007. ISSN 0021-9606, 1089-7690. doi:10.1063/1.2720838

  65. [65]

    Daniel Sheppard, Rye Terrell, and Graeme Henkelman. Optimization methods for finding minimum energy paths. The Journal of Chemical Physics, 128(13):134106, April 2008. ISSN 0021-9606. doi:10.1063/1.2841941

  66. [66]

    Vilhjálmur Ásgeirsson, Benedikt Orri Birgisson, Ragnar Bjornsson, Ute Becker, Frank Neese, Christoph Riplinger, and Hannes Jónsson. Nudged Elastic Band Method for Molecular Reactions Using Energy-Weighted Springs Combined with Eigenvector Following. Journal of Chemical Theory and Computation, 17(8):4929–4945, August 2021. ISSN 1549-9618. doi:10.1021/acs.j...

  67. [67]

    Graeme Henkelman and Hannes Jónsson. Improved tangent estimate in the nudged elastic band method for finding minimum energy paths and saddle points. The Journal of Chemical Physics, 113(22):9978–9985, December 2000. ISSN 0021-9606. doi:10.1063/1.1323224

  68. [68]

    Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge, Mass, 2006. ISBN 978-0-262-18253-9

  69. [69]

    Grace Wahba. Spline Models for Observational Data. CBMS-NSF Regional Conference Series in Applied Mathematics. Society for Industrial and Applied Mathematics, 1990. doi:10.1137/1.9781611970128

  70. [70]

    E. Solak, R. Murray-Smith, W. Leithead, D. Leith, and Carl Rasmussen. Derivative observations in gaussian process models of dynamic systems. In S. Becker, S. Thrun, and K. Obermayer, editors, Advances in Neural Information Processing Systems, volume 15. MIT Press, 2002

  71. [71]

    Mykel J. Kochenderfer and Tim A. Wheeler. Algorithms for Optimization. MIT Press, Cambridge, MA, 2019. ISBN 978-0-262-03942-0

  73. [73]

    Albert P. Bartók, Risi Kondor, and Gábor Csányi. On representing chemical environments. Physical Review B, 87(18):184115, May 2013. doi:10.1103/PhysRevB.87.184115

  74. [74]

    Ralf Drautz. Atomic cluster expansion for accurate and transferable interatomic potentials. Physical Review B, 99(1):014104, January 2019. doi:10.1103/PhysRevB.99.014104

  75. [75]

    Casper Larsen, Sami Kaappa, Andreas Lynge Vishart, Thomas Bligaard, and Karsten Wedel Jacobsen. Machine-learning-enabled optimization of atomic structures using atoms with fractional existence. Physical Review B, 107(21):214101, June 2023. doi:10.1103/PhysRevB.107.214101

  76. [76]

    Olli-Pekka Koistinen. Algorithms for Finding Saddle Points and Minimum Energy Paths Using Gaussian Process Regression. PhD thesis, Aalto University School of Science, Espoo, 2019

  77. [77]

    Matthias Rupp, Alexandre Tkatchenko, Klaus-Robert Müller, and O. Anatole von Lilienfeld. Fast and Accurate Modeling of Molecular Atomization Energies with Machine Learning. Physical Review Letters, 108(5):058301, January 2012. ISSN 0031-9007, 1079-7114. doi:10.1103/PhysRevLett.108.058301

  78. [78]

    Kai Cheng and Ralf Zimmermann. Sliced gradient-enhanced kriging for high-dimensional function approximation. SIAM Journal on Scientific Computing, 45(6):A2858–A2885, 2023. doi:10.1137/22M154315X

  79. [79]

    Selvakumar Ulaganathan, Ivo Couckuyt, Tom Dhaene, Joris Degroote, and Eric Laermans. Performance study of gradient-enhanced kriging. Engineering with Computers, 32(1):15–34, February 2015. ISSN 1435-5663. doi:10.1007/s00366-015-0397-y

  80. [80]

    Martin Fodslette Møller. A scaled conjugate gradient algorithm for fast supervised learning. Neural Networks, 6(4):525–533, January 1993. ISSN 0893-6080. doi:10.1016/S0893-6080(05)80056-5

Showing first 80 references.