pith. the verified trust layer for science.

arxiv: 2510.03046 · v2 · submitted 2025-10-03 · 💻 cs.LG

Bayesian E(3)-Equivariant Interatomic Potential with Iterative Restratification of Many-body Message Passing

Pith reviewed 2026-05-18 09:58 UTC · model grok-4.3

classification 💻 cs.LG
keywords Bayesian machine learning · equivariant neural networks · interatomic potentials · uncertainty quantification · active learning · molecular dynamics · atomistic simulations · E(3) equivariance

The pith

Bayesian E(3)-equivariant interatomic potentials trained with a joint energy-force loss and iterative restratification deliver competitive accuracy while supplying usable uncertainty estimates for active learning and calibration.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Machine learning potentials for atomistic simulations have long lacked built-in ways to report when their predictions of energy or forces are unreliable. This paper introduces Bayesian versions of E(3)-equivariant models that treat uncertainty explicitly through a new joint negative log-likelihood loss on both energies and forces. It adds iterative restratification inside the many-body message-passing layers to improve representation of complex interactions. Multiple Bayesian inference methods are benchmarked on uncertainty prediction, out-of-distribution detection, calibration, and active learning, with the joint loss and restratification shown to support more efficient data selection via Bayesian active learning by disagreement. A reader would care because these additions let large-scale simulations flag their own limits without sacrificing the accuracy needed for reliable materials or molecular modeling.

Core claim

The central claim is that Bayesian E(3)-equivariant machine learning potentials (MLPs), equipped with iterative restratification of many-body message passing and trained under the joint energy-force negative log-likelihood loss, achieve accuracy comparable to leading non-Bayesian models while enabling effective uncertainty quantification that improves active learning, out-of-distribution detection, and energy/force calibration.

What carries the argument

The joint energy-force negative log-likelihood (NLL_JEF) loss, which models predictive uncertainty on energies and interatomic forces together, combined with iterative restratification inside the many-body message-passing layers of Bayesian E(3)-equivariant networks.
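
As a rough illustration of what a joint energy-force likelihood can look like, the sketch below assumes independent Gaussian likelihoods for the total energy and for each force component, each with its own predicted variance. The names `gaussian_nll` and `joint_ef_nll`, the weight `force_weight`, and the averaging over force components are assumptions made for illustration; the paper's exact NLL_JEF definition is not reproduced on this page and may differ.

```python
import math


def gaussian_nll(target, mean, var):
    """Negative log-likelihood of `target` under a Gaussian N(mean, var)."""
    if var <= 0.0:
        raise ValueError("variance must be positive")
    return 0.5 * (math.log(2.0 * math.pi * var) + (target - mean) ** 2 / var)


def joint_ef_nll(e_true, e_mean, e_var, f_true, f_mean, f_var, force_weight=1.0):
    """Hypothetical joint loss: energy NLL plus a weighted mean force NLL.

    f_true, f_mean and f_var are flat lists of force components
    (three per atom); force_weight is an assumed balancing factor.
    """
    energy_term = gaussian_nll(e_true, e_mean, e_var)
    force_terms = [gaussian_nll(t, m, v) for t, m, v in zip(f_true, f_mean, f_var)]
    force_term = sum(force_terms) / len(force_terms)
    return energy_term + force_weight * force_term
```

In this form, a force error is penalized less where the predicted force variance is large, which is what lets the variance act as a learned uncertainty estimate rather than a fixed loss weight.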

If this is right

  • The joint NLL_JEF loss yields substantially better accuracy than conventional negative log-likelihood losses when uncertainty is modeled.
  • Bayesian active learning by disagreement using both energy and force uncertainties outperforms random sampling and energy-only uncertainty sampling.
  • Multiple Bayesian techniques, including deep ensembles and Laplace approximation, can be systematically compared for their utility in atomistic tasks.
  • Uncertainty estimates support reliable out-of-distribution detection and energy/forces calibration without loss of predictive accuracy.
  • Iterative restratification improves the many-body message passing component while preserving E(3) equivariance.
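
The disagreement-based selection in the second bullet can be sketched for an ensemble of Gaussian predictors. BALD scores a candidate by the mutual information between the prediction and the model parameters. For regression, one common approximation is the entropy of the moment-matched predictive Gaussian minus the average entropy of the individual members. The function names and the moment-matching approximation are assumptions for illustration, not the paper's stated procedure.

```python
import math


def gaussian_entropy(var):
    """Differential entropy of a one-dimensional Gaussian with variance `var`."""
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


def bald_score(means, variances):
    """Approximate BALD score for one scalar output.

    `means` and `variances` hold one predicted mean and variance per
    ensemble member (or per posterior sample).
    """
    n = len(means)
    mean = sum(means) / n
    aleatoric = sum(variances) / n                       # average predicted noise
    epistemic = sum((m - mean) ** 2 for m in means) / n  # member disagreement
    total_entropy = gaussian_entropy(aleatoric + epistemic)
    expected_entropy = sum(gaussian_entropy(v) for v in variances) / n
    return total_entropy - expected_entropy


def select_batch(candidates, k):
    """Return the indices of the k candidates with the highest BALD score."""
    scores = [bald_score(means, variances) for means, variances in candidates]
    order = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return order[:k]
```

A score for a whole structure could be built by combining the energy score with scores over force components; how to weight the two is a design choice this sketch leaves open.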

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same uncertainty framework could be applied to guide sampling of rare events in long molecular-dynamics trajectories where forces are especially uncertain.
  • If the joint loss generalizes, it might reduce the number of expensive ab initio calculations needed to train potentials for new material classes.
  • Extending the approach to other symmetry groups beyond E(3) would allow uncertainty-aware potentials for systems with different invariances.
  • Deployment in automated materials-discovery loops becomes safer because the model can request new data precisely when its force predictions are least trustworthy.

Load-bearing premise

The chosen datasets and evaluation protocols for uncertainty, calibration, and active learning tasks accurately reflect the real-world performance gains delivered by the joint loss and iterative restratification.

What would settle it

A new benchmark dataset of complex or out-of-domain atomic configurations on which the Bayesian models show no gain in active-learning sample efficiency or produce miscalibrated uncertainty estimates would falsify the central claim.

Original abstract

Machine learning potentials (MLPs) have become essential for large-scale atomistic simulations, enabling ab initio-level accuracy with computational efficiency. However, current MLPs struggle with uncertainty quantification, limiting their reliability for active learning, calibration, and out-of-distribution (OOD) detection. We address these challenges by developing Bayesian E(3) equivariant MLPs with iterative restratification of many-body message passing. Our approach introduces the joint energy-force negative log-likelihood (NLL$_\text{JEF}$) loss function, which explicitly models uncertainty in both energies and interatomic forces, yielding substantially improved accuracy compared to conventional NLL losses. We systematically benchmark multiple Bayesian approaches, including deep ensembles with mean-variance estimation, stochastic weight averaging Gaussian, improved variational online Newton, and Laplace approximation by evaluating their performance on uncertainty prediction, OOD detection, calibration, and active learning tasks. We further demonstrate that NLL$_\text{JEF}$ facilitates efficient active learning by quantifying energy and force uncertainties. Using Bayesian active learning by disagreement (BALD), our framework outperforms random sampling and energy-uncertainty-based sampling. Our results demonstrate that Bayesian MLPs achieve competitive accuracy with state-of-the-art models while enabling uncertainty-guided active learning, OOD detection, and energy/forces calibration. This work establishes Bayesian equivariant neural networks as a powerful framework for developing uncertainty-aware MLPs for atomistic simulations at scale.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Bayesian E(3)-equivariant interatomic potentials that incorporate iterative restratification of many-body message passing and a joint energy-force negative log-likelihood (NLL_JEF) loss. It benchmarks several Bayesian methods (deep ensembles with mean-variance estimation, SWAG, improved variational online Newton, Laplace approximation) on uncertainty prediction, OOD detection, calibration, and active learning tasks, claiming competitive accuracy with SOTA models and superior performance of BALD over random or energy-uncertainty sampling for active learning.

Significance. If the reported gains from NLL_JEF and iterative restratification hold under rigorous protocols, the work would strengthen the case for uncertainty-aware equivariant MLPs in atomistic modeling, directly supporting practical applications in active learning and reliable extrapolation. The systematic comparison of multiple Bayesian inference techniques on force and energy uncertainty tasks is a useful contribution to the field.

major comments (2)
  1. [Abstract] The abstract states that NLL_JEF 'yields substantially improved accuracy compared to conventional NLL losses' and that BALD 'outperforms random sampling and energy-uncertainty-based sampling,' yet no quantitative metrics (e.g., MAE, RMSE, ECE, or active-learning curve areas) appear in the provided text. This absence is load-bearing for the central empirical claim and prevents assessment of effect sizes or statistical significance.
  2. [Experimental section] §4 (or equivalent experimental section): the skeptic's concern about OOD construction is material. If the OOD sets are limited to mild interpolations within the same chemical family rather than shifts in elements, temperatures, or structural motifs, the reported superiority of BALD for uncertainty-guided active learning may not generalize. Concrete details on how OOD datasets are generated and how calibration is quantified (e.g., ECE on forces or NLL on held-out data) are required to substantiate the practical gains.
minor comments (2)
  1. [Method] Clarify the precise definition and implementation of 'iterative restratification' of many-body message passing, including any new hyperparameters it introduces.
  2. [Experiments] Ensure all benchmark datasets, splits, and evaluation protocols are fully specified with references to standard repositories or prior works.
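
The calibration question raised in the second major comment can be made concrete with a small sketch. For Gaussian predictive distributions, one regression analogue of ECE compares each nominal confidence level with the fraction of targets whose predictive CDF value falls at or below it. The function names and the choice of nine evenly spaced levels are assumptions; the paper's actual calibration metric is not specified on this page.

```python
import math


def normal_cdf(x, mean, std):
    """CDF of a Gaussian N(mean, std**2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def calibration_error(targets, means, stds, levels=None):
    """Mean absolute gap between nominal and observed coverage.

    For a calibrated model, the predictive CDF evaluated at the true
    target is uniform on [0, 1], so the observed fraction at or below
    each level p should be close to p.
    """
    if levels is None:
        levels = [i / 10 for i in range(1, 10)]
    pits = [normal_cdf(t, m, s) for t, m, s in zip(targets, means, stds)]
    gaps = []
    for p in levels:
        observed = sum(1 for u in pits if u <= p) / len(pits)
        gaps.append(abs(observed - p))
    return sum(gaps) / len(gaps)
```

The same function can be applied to energies or to flattened force components, which is one way to report the two calibrations separately.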

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive comments on our manuscript. We address each of the major comments point by point below, providing clarifications and committing to revisions that will strengthen the presentation of our results.

  1. Referee: [Abstract] The abstract states that NLL_JEF 'yields substantially improved accuracy compared to conventional NLL losses' and that BALD 'outperforms random sampling and energy-uncertainty-based sampling,' yet no quantitative metrics (e.g., MAE, RMSE, ECE, or active-learning curve areas) appear in the provided text. This absence is load-bearing for the central empirical claim and prevents assessment of effect sizes or statistical significance.

    Authors: We acknowledge that the abstract, as currently written, does not include specific quantitative metrics to support the claims of improved accuracy and superior performance of BALD. To address this, we will revise the abstract to include key quantitative results from our experiments, such as specific MAE/RMSE values demonstrating the improvement from NLL_JEF, and metrics like ECE or active learning curve improvements for BALD. This will make the central empirical claims more concrete and allow for better assessment of effect sizes. revision: yes

  2. Referee: [Experimental section] §4 (or equivalent experimental section): the skeptic's concern about OOD construction is material. If the OOD sets are limited to mild interpolations within the same chemical family rather than shifts in elements, temperatures, or structural motifs, the reported superiority of BALD for uncertainty-guided active learning may not generalize. Concrete details on how OOD datasets are generated and how calibration is quantified (e.g., ECE on forces or NLL on held-out data) are required to substantiate the practical gains.

    Authors: We agree that providing concrete details on OOD dataset construction is essential for assessing the generalizability of our findings. In the revised manuscript, we will expand §4 to include explicit descriptions of how the OOD sets were generated, specifying whether they involve shifts in elements, temperatures, structural motifs, or other factors beyond mild interpolations within the same chemical family. We will also detail the quantification of calibration, including Expected Calibration Error (ECE) computed on forces and negative log-likelihood (NLL) on held-out data. These additions will help substantiate the practical gains and address concerns about the scope of our OOD evaluations. revision: yes
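
One way to score the OOD evaluation discussed in this exchange is the area under the ROC curve of the uncertainty score, treating OOD samples as positives. The sketch below computes it directly from its pairwise definition. The function name is hypothetical, and nothing on this page confirms that the paper reports AUROC rather than another OOD metric.

```python
def auroc(id_scores, ood_scores):
    """AUROC of an uncertainty score for separating OOD from in-distribution.

    Equals the probability that a randomly chosen OOD sample receives a
    higher uncertainty than a randomly chosen in-distribution sample,
    with ties counted as one half.
    """
    wins = 0.0
    for ood in ood_scores:
        for ind in id_scores:
            if ood > ind:
                wins += 1.0
            elif ood == ind:
                wins += 0.5
    return wins / (len(id_scores) * len(ood_scores))
```

A value near 0.5 means the uncertainty carries no information about distribution shift, which is the outcome the referee's concern about mild OOD sets would predict on harder shifts.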

Circularity Check

0 steps flagged

No significant circularity; empirical benchmarks are self-contained

The paper proposes a Bayesian E(3)-equivariant MLP architecture with a new joint energy-force NLL loss and iterative restratification, then validates via systematic benchmarks on uncertainty quantification, OOD detection, calibration, and active learning using standard datasets and protocols. No derivation chain reduces predictions to fitted inputs by construction, nor does any load-bearing claim rest on self-citations that themselves presuppose the target result. The central results are externally falsifiable through the reported experimental comparisons rather than being tautological with the model definition or prior author work.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework rests on standard domain assumptions from equivariant ML and Bayesian neural networks; no new free parameters or invented entities are described in the abstract beyond the introduced loss function and restratification procedure.

axioms (2)
  • domain assumption: E(3) equivariance is a suitable inductive bias for interatomic potentials
    Invoked implicitly as the base architecture for the MLPs; standard in the field but not derived in the abstract.
  • domain assumption: Bayesian methods can be effectively combined with message-passing networks for uncertainty quantification
    Central to the benchmarking of multiple Bayesian approaches.
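
To make the first axiom concrete: an E(3)-equivariant potential returns the same energy when the whole structure is rotated or translated, and its forces rotate with the structure. The sketch below checks this on a classical Lennard-Jones toy potential in reduced units, which stands in for the paper's network purely for illustration. The function names are hypothetical.

```python
import math


def lj_energy_forces(positions):
    """Lennard-Jones energy and forces (epsilon = sigma = 1) for a small cluster."""
    n = len(positions)
    energy = 0.0
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            inv6 = 1.0 / r2 ** 3
            energy += 4.0 * (inv6 * inv6 - inv6)
            # Force on atom i is -dE/dr along d; atom j gets the opposite force.
            coef = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2
            for k in range(3):
                forces[i][k] += coef * d[k]
                forces[j][k] -= coef * d[k]
    return energy, forces


def rotate_z(vec, angle):
    """Rotate a 3-vector about the z axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return [c * x - s * y, s * x + c * y, z]


# Usage: rotate and translate a three-atom cluster, then compare outputs.
cluster = [[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.3, 1.2, 0.4]]
shift = [0.5, -0.2, 1.0]
moved = [[c + t for c, t in zip(rotate_z(p, 0.7), shift)] for p in cluster]
energy_a, forces_a = lj_energy_forces(cluster)
energy_b, forces_b = lj_energy_forces(moved)
# energy_a and energy_b agree; forces_b matches rotate_z applied to forces_a.
```

An equivariant network is built so that this property holds by construction for every input, rather than being learned approximately from data.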

pith-pipeline@v0.9.0 · 5823 in / 1396 out tokens · 42657 ms · 2026-05-18T09:58:18.521459+00:00 · methodology

Reference graph

Works this paper leans on

76 extracted references · 76 canonical work pages · 5 internal anchors

  1. [1]

    Batatia, D

    Batatia, I., Kov´ acs, D.P., Simm, G.N.C., Ortner, C., & Cs´ anyi,G. MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate For ce Fields. arXiv (2023). https://doi.org/10.48550/arXiv.2206.07697

  2. [2]

    A foundation model for atomistic materials chemistry

    Batatia, I., et al. A foundation model for atomistic materials chemistr y. Preprint at https://arxiv.org/abs/2401.00096 (2023)

  3. [3]

    Active sparse Bayesian committee machine potential for isotherm al–isobaric molecular dynamics simulations

    Willow, S.Y., Kim, D.G., Sundheep, R., Hajibabaei, A., Kim, K.S ., Myung, C.W. Active sparse Bayesian committee machine potential for isotherm al–isobaric molecular dynamics simulations. Phys. Chem. Chem. Phys. 26, 22073–22082 (2024)

  4. [4]

    No ´e, S

    Coley, C.W., Thomas, D.A., Lummiss, J.A.M., Jaworski, J.N., Breen , C.P., Schultz, V., Hart, T., Fishman, J.S., Rogers, L., Gao, H., Hicklin, R.W., Ple- hiers, P.P., Byington, J., Piotti, J.S., Green, W.H., Hart, A.J., Jam ison, T.F., Jensen, K.F. A robotic platform for flow synthesis of organic compounds in formed by AI planning. Science 365(6453), 1566 (...

  5. [5]

    Chemical reaction networks and opportunities for mac hine learning

    Wen, M., Spotte-Smith, E.W.C., Blau, S.M., McDermott, M.J., Krishnapriyan, A.S., Persson, K.A. Chemical reaction networks and opportunities for mac hine learning. Nature Computational Science 3(1), 12–24 (2023) https://doi.org/10. 1038/s43588-022-00369-z

  6. [6]

    , Topol, E.J., Rajpurkar, P

    Moor, M., Banerjee, O., Abad, Z.S.H., Krumholz, H.M., Leskovec, J. , Topol, E.J., Rajpurkar, P. Foundation models for generalist medical artifici al intelligence. Nature 616(7956), 259–265 (2023) https://doi.org/10.1038/s41586-023-05881-4

  7. [7]

    Multireference Approaches for Excited States of Molecules

    R¨ ockert, A., Kullgren, J., Hermansson, K. Predicting frequency from the external chemical environment: OH vibrations on hydrated and hydroxylated su rfaces. J. Chem. Theory Comput. 18(12), 7683–7694 (2022) https://doi.org/10.1021/acs. jctc.2c00135

  8. [8]

    Machine-learning interatomic potentials for materials s cience

    Mishin, Y. Machine-learning interatomic potentials for materials s cience. Acta Materialia 214, 116980 (2021) https://doi.org/10.1016/j.actamat.2021.116980

  9. [9]

    Materials property predi ction with uncertainty quantification: A benchmark study

    Varivoda, D., Dong, R., Omee, S.S., Hu, J. Materials property predi ction with uncertainty quantification: A benchmark study. Applied Physics Reviews 10(2), 021409 (2023) https://doi.org/10.1063/5.0133528

  10. [10]

    Uncertainty quanti fi- cation in multivariable regression for material property prediction wi th Bayesian neural networks

    Li, L., Chang, J., Vakanski, A., Wang, Y., Yao, T., Xian, M. Uncertainty quanti fi- cation in multivariable regression for material property prediction wi th Bayesian neural networks. Scientific Reports 14(1), 10543 (2024) https://doi.org/10.1038/ s41598-024-61189-x 30

  11. [11]

    doi:10.1007/s10462-023-10562-9

    Gawlikowski, J., Tassi, C.R.N., Ali, M., Lee, J., Humt, M., Feng, J., Kruspe, A., Triebel, R., Jung, P., Roscher, R., Shahzad, M., Yang, W., Bamler, R. , Zhu, X.X. A survey of uncertainty in deep neural networks. Artificial Intelligence Review 56(S1), 1513–1589 (2023) https://doi.org/10.1007/s10462-023-10562-9

  12. [12]

    Accurate Uncertainties for Dee p Learning Using Calibrated Regression

    Kuleshov, V., Fenner, N., Ermon, S. Accurate Uncertainties for Dee p Learning Using Calibrated Regression. arXiv (2018). https://doi.org/10.48550/arXiv.1807. 00263

  13. [13]

    A Simple Base- line for Bayesian Uncertainty in Deep Learning

    Maddox, W., Garipov, T., Izmailov, P., Vetrov, D., Wilson, A.G. A Simple Base- line for Bayesian Uncertainty in Deep Learning. arXiv (2019). https://doi.org/ 10.48550/arXiv.1902.02476

  14. [14]

    B ayesian statistics and modelling

    Van De Schoot, R., Depaoli, S., King, R., Kramer, B., M¨ artens, K., Tadesse, M.G., Vannucci, M., Gelman, A., Veen, D., Willemsen, J., Yau, C. B ayesian statistics and modelling. Nature Reviews Methods Primers 1(1), 1 (2021) https: //doi.org/10.1038/s43586-020-00001-2

  15. [15]

    Bayesian Optimization with Gradients

    Wu, J., Poloczek, M., Wilson, A.G., Frazier, P.I. Bayesian Opti mization with Gradients. arXiv (2018). https://doi.org/10.48550/arXiv.1703.04389

  16. [16]

    Bayesian Neural Networks: An Introduction and Sur vey

    Goan, E., Fookes, C. Bayesian Neural Networks: An Introduction and Sur vey. arXiv (2020). https://doi.org/10.1007/978-3-030-42553-1 3

  17. [17]

    Bayesian Deep Learning and a Probabilist ic Per- spective of Generalization

    Wilson, A.G., Izmailov, P. Bayesian Deep Learning and a Probabilist ic Per- spective of Generalization. arXiv (2022). https://doi.org/10.48550/arXiv.2002. 08791

  18. [18]

    Ba yesian uncertainty quantification for machine-learned models in physics

    Gal, Y., Koumoutsakos, P., Lanusse, F., Louppe, G., Papadimitriou, C. Ba yesian uncertainty quantification for machine-learned models in physics. Nature Reviews Physics 4(9), 573–577 (2022) https://doi.org/10.1038/s42254-022-00498-4

  19. [19]

    Beyond deep ens embles: A large-scale evaluation of bayesian deep learning under distribution s hift

    Seligmann, F., Becker, P., Volpp, M., Neumann, G. Beyond deep ens embles: A large-scale evaluation of bayesian deep learning under distribution s hift. Advances in Neural Information Processing Systems 36, 29372–29405 (2023)

  20. [20]

    Learning Probabi listic Symmetrization for Architecture Agnostic Equivariance

    Kim, J., Nguyen, T.D., Suleymanzade, A., An, H., Hong, S. Learning Probabi listic Symmetrization for Architecture Agnostic Equivariance. arXiv (2024). https:// doi.org/10.48550/arXiv.2306.02866

  21. [21]

    A practical bayesian framework for backpropagation netw orks

    MacKay, D.J.C. A practical bayesian framework for backpropagation netw orks. Neural Comput. 4(3), 448–472 (1992) https://doi.org/10.1162/neco.1992.4.3.448

  22. [22]

    Bayesian Learning for Neural Networks

    Neal, R.M. Bayesian Learning for Neural Networks. Lecture Notes in Statis tics, vol. 118. Springer. https://doi.org/10.1007/978-1-4612-0745-0

  23. [23]

    MCMC Using Hamiltonian Dynamics, pp

    Neal, R.M. MCMC Using Hamiltonian Dynamics, pp. 113–162. Chapman and 31 Hall/CRC. https://doi.org/10.1201/b10905-6

  24. [24]

    We ight uncer- tainty in neural network

    Blundell, C., Cornebise, J., Kavukcuoglu, K., Wierstra, D. We ight uncer- tainty in neural network. In: Bach, F., Blei, D. (eds.) Proceedin gs of the 32nd International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 37, pp. 1613–1622. PMLR, Lille, France (2015). https://proceedings.mlr.press/v37/blundell15.html

  25. [25]

    Dropout as a bayesian approximation: Represent ing model uncertainty in deep learning

    Gal, Y., Ghahramani, Z. Dropout as a bayesian approximation: Represent ing model uncertainty in deep learning. In: Balcan, M.F., Weinberger, K.Q. (eds.) Proceedings of The 33rd International Conference on Machine Learning. Pro ceed- ings of Machine Learning Research, vol. 48, pp. 1050–1059. PMLR, New York, New York, USA (2016). https://proceedings.mlr.pre...

  26. [26]

    Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles

    Lakshminarayanan, B., Pritzel, A., Blundell, C. Simple and Scalabl e Predictive Uncertainty Estimation Using Deep Ensembles. arXiv (2017). https://doi.org/10. 48550/arXiv.1612.01474

  27. [27]

    Variati onal Learning is Effective for Large Deep Networks

    Shen, Y., Daheim, N., Cong, B., Nickl, P., Marconi, G.M., Bazan, C., Yok ota, R., Gurevych, I., Cremers, D., Khan, M.E., M¨ ollenhoff, T. Variati onal Learning is Effective for Large Deep Networks. arXiv (2024). https://doi.org/10.48550/ arXiv.2402.17641

  28. [28]

    Laplace Redux – Effortless Bayesian Deep Learning

    Daxberger, E., Kristiadi, A., Immer, A., Eschenhagen, R., Bauer, M ., Hennig, P. Laplace Redux – Effortless Bayesian Deep Learning. arXiv (2022). https://doi. org/10.48550/arXiv.2106.14806

  29. [29]

    Estimating the mean and variance of the target p rob- ability distribution

    Nix, D.A., Weigend, A.S. Estimating the mean and variance of the target p rob- ability distribution. In: Proceedings of 1994 IEEE International Confe rence on Neural Networks (ICNN’94), pp. 55–601. IEEE. https://doi.org/10.1109/ICNN. 1994.374138

  30. [30]

    Atom-centered symmetry functions for constructin g high-dimensional neural network potentials

    Behler, J. Atom-centered symmetry functions for constructin g high-dimensional neural network potentials. J. Chem. Phys. 134(7), 074106 (2011) https://doi. org/10.1063/1.3553717

  31. [31]

    Atomic cluster expansion for accurate and transferable in ter- atomic potentials

    Drautz, R. Atomic cluster expansion for accurate and transferable in ter- atomic potentials. Phys. Rev. B 99(1), 014104 (2019) https://doi.org/10.1103/ PhysRevB.99.014104

  32. [32]

    Dusson, M

    Dusson, G., Bachmayr, M., Csanyi, G., Drautz, R., Etter, S., Oor d, C., Ortner, C. Atomic Cluster Expansion: Completeness, Efficiency and Stability. arXiv (2021). https://doi.org/10.48550/arXiv.1911.03550

  33. [33]

    Behler, M

    Behler, J., Parrinello, M. Generalized Neural-Network Represen tation of High- Dimensional Potential-Energy Surfaces. Physical Review Letters 98(14), 146401 (2007) https://doi.org/10.1103/PhysRevLett.98.146401 32

  34. [34]

    Gaussian Approx imation Potentials: The Accuracy of Quantum Mechanics, without the Electrons

    Bart´ ok, A.P., Payne, M.C., Kondor, R., Cs´ anyi, G. Gaussian Approx imation Potentials: The Accuracy of Quantum Mechanics, without the Electrons . Phys. Rev. Lett. 104(13), 136403 (2010) https://doi.org/10.1103/PhysRevLett.104. 136403

  35. [35]

    , M¨ uller, K.-R

    Sch¨ utt, K.T., Sauceda, H.E., Kindermans, P.-J., Tkatchenko, A. , M¨ uller, K.-R. SchNet – A deep learning architecture for molecules and materials. J. Chem. Phys. 148(24), 241722 (2018) https://doi.org/10.1063/1.5019779

  36. [36]

    The Te nsorMol-0.1 model chemistry: a neural network augmented with long-range physics

    Yao, K., Herr, J.E., Toth, D., Mckintyre, R., Parkhill, J. The Te nsorMol-0.1 model chemistry: a neural network augmented with long-range physics. Chem. Sci. 9(8), 2261–2269 (2018) https://doi.org/10.1039/C7SC04934J

  37. [37]

    PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges

    Unke, O.T., Meuwly, M. PhysNet: A neural network for predicting energies, forces, dipole moments, and partial charges. J. Chem. Theory Comput. 15(6), 3678–3693 (2019) https://doi.org/10.1021/acs.jctc.9b00181

  38. [38]

    End-to-end Sym- metry Preserving Inter-atomic Potential Energy Model for Finite an d Extended Systems

    Zhang, L., Han, J., Wang, H., Saidi, W.A., Car, R., E, W. End-to-end Sym- metry Preserving Inter-atomic Potential Energy Model for Finite an d Extended Systems. arXiv (2018). https://doi.org/10.48550/arXiv.1805.09003

  39. [39]

    Geiger and T

    Geiger, M., Smidt, T. E3nn: Euclidean Neural Networks. arXiv (2022). https: //doi.org/10.48550/arXiv.2207.09453

  40. [40]

    Equiformer: Equivariant graph attention transformer for 3d atomistic graphs,

    Liao, Y.-L., Smidt, T. Equiformer: Equivariant Graph Attention Trans former for 3D Atomistic Graphs. arXiv (2023). https://doi.org/10.48550/arXiv.2206.11990

  41. [41]

    Equiformerv2: Improved equivariant transformer for scaling to higher-degree representations,

    Liao, Y.-L., Wood, B., Das, A., Smidt, T. EquiformerV2: Improved Equi variant Transformer for Scaling to Higher-Degree Representations. arXiv (2024). https: //doi.org/10.48550/arXiv.2306.12059

  42. [42]

    Tensor field networks: Rotation- and translation-equivariant neural networks for 3D point clouds

    Thomas, N., Smidt, T., Kearnes, S., Yang, L., Li, L., Kohlhoff, K., Riley , P. Tensor Field Networks: Rotation- and Translation-Equivariant Neural Networks for 3D Point Clouds. arXiv (2018). https://doi.org/10.48550/arXiv.1802.08219

  43. [43]

    Cormorant: Covariant Molecular Neur al Networks

    Anderson, B., Hy, T.-S., Kondor, R. Cormorant: Covariant Molecular Neur al Networks. arXiv. https://doi.org/10.48550/arXiv.1906.04015 . http://arxiv.org/ abs/1906.04015 Accessed 2025-08-14

  44. [44]

    Batzner, A

    Batzner, S., Musaelian, A., Sun, L., Geiger, M., Mailoa, J.P., Kornbl uth, M., Molinari, N., Smidt, T.E., Kozinsky, B. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nat. Commun. 13(1), 2453 (2022) https://doi.org/10.1038/s41467-022-29939-5

  45. [45]

    , Cubuk, E.D

    Merchant, A., Batzner, S., Schoenholz, S.S., Aykol, M., Cheon, G. , Cubuk, E.D. Scaling deep learning for materials discovery. Nature 624(7990), 80–85 (2023) https://doi.org/10.1038/s41586-023-06735-9 33

  46. [46]

    Scalable parallel algorithm for graph neu ral network interatomic potentials in molecular dynamics simulations 20(11), 4857– 4868 (2024) https://doi.org/10.1021/acs.jctc.4c00190

    Park, Y., Kim, J., Hwang, S., Han, S. Scalable parallel algorithm for graph neu ral network interatomic potentials in molecular dynamics simulations 20(11), 4857– 4868 (2024) https://doi.org/10.1021/acs.jctc.4c00190

  47. [47]

    Learning local equivariant representations for large- scale atomistic dynamics

    Musaelian, A., Batzner, S., Johansson, A., Sun, L., Owen, C.J., Kornb luth, M., Kozinsky, B. Learning local equivariant representations for large- scale atomistic dynamics. Nat. Commun. 14(1), 579 (2023) https://doi.org/10.1038/ s41467-023-36329-y

  48. [48]

    Quality of uncertainty estimates from neural n etwork potential ensembles

    Kahle, L., Zipoli, F. Quality of uncertainty estimates from neural n etwork potential ensembles. Phys. Rev. E 105, 015311 (2022) https://doi.org/10.1103/ PhysRevE.105.015311

  49. [49]

    Uncertainty quantification in molecular simul ations with dropout neural network potentials

    Wen, M., Tadmor, E.B. Uncertainty quantification in molecular simul ations with dropout neural network potentials. Npj Comput. Mater. 6(1), 124 (2020) https: //doi.org/10.1038/s41524-020-00390-8

  50. [50]

    Fast uncertainty estimates in deep learning interatomic potentials

    Zhu, A., Batzner, S., Musaelian, A., Kozinsky, B. Fast uncertainty estimates in deep learning interatomic potentials. J. Chem. Phys. 158(16), 164111 (2023) https://doi.org/10.1063/5.0136574

  51. [51]

    Uncertainty quantification by direct p ropagation of shal- low ensembles

    Kellner, M., Ceriotti, M. Uncertainty quantification by direct p ropagation of shal- low ensembles. Mach. Learn.: Sci. Technol. 5(3), 035006 (2024) https://doi.org/ 10.1088/2632-2153/ad594a

  52. [52]

    Overcoming Systematic Softening in Universal Machi ne Learning Interatomic Potentials by Fine-Tuning

    Deng, B., Choi, Y., Zhong, P., Riebesell, J., Anand, S., Li, Z., Jun, K., Persson, K.A., Ceder, G. Overcoming Systematic Softening in Universal Machi ne Learning Interatomic Potentials by Fine-Tuning. arXiv (2024). https://doi.org/10.48550/ arXiv.2405.07105

  53. [53]

    Data-efficient fine-tuning of foundational mod els for first-principles quality sublimation enthalpies

    Kaur, H., Pia, F.D., Batatia, I., Advincula, X.R., Shi, B.X., Lan, J., Cs ´ anyi, G., Michaelides, A., Kapil, V. Data-efficient fine-tuning of foundational mod els for first-principles quality sublimation enthalpies. arXiv (2024). https://doi.org/10. 48550/arXiv.2405.20217

  54. [54]

    Enum eration of 166 billion organic small molecules in the chemical universe database GDB -17

    Ruddigkeit, L., Van Deursen, R., Blum, L.C., Reymond, J.-L. Enum eration of 166 billion organic small molecules in the chemical universe database GDB -17. J. Chem. Inf. Model. 52(11), 2864–2875 (2012) https://doi.org/10.1021/ci300415d

  55. [55]

    Quantum chemistry structures and properties of 134 kilo molecules

    Ramakrishnan, R., Dral, P.O., Rupp, M., Lilienfeld, O.A. Quantum chemistry structures and properties of 134 kilo molecules. Scientific Data 1, 140022 (2014)

  56. [56]

    On the role of gradients for machine learn- ing of molecular energies and forces

    Christensen, A.S., Von Lilienfeld, O.A. On the role of gradients for machine learn- ing of molecular energies and forces. Mach. Learn.: Sci. Technol. 1(4), 045018 (2020) https://doi.org/10.1088/2632-2153/abba6f

  57. [57]

    Moon, S.W., Willow, S.Y., Park, T.H., Min, S.K., Myung, C.W. Mach ine Learning 34 Nonadiabatic Dynamics: Eliminating Phase Freedom of Nonadiabatic Couplings with the State-Interaction State-Averaged Spin-Restricted Ensem ble-Referenced Kohn–Sham Approach. J. Chem. Theory Comput. 21(4), 1521–1529 (2025) https: //doi.org/10.1021/acs.jctc.4c01475

  58. [58]

    Linear atomic cluster expansion force fields for organic molecu les: Beyond RMSE

    Kov´ acs, D.P., Oord, C.v.d., Kucera, J., Allen, A.E.A., Cole, D.J ., Ortner, C., Cs´ anyi, G. Linear atomic cluster expansion force fields for organic molecu les: Beyond RMSE. J. Chem. Theory Comput. 17(12), 7696–7711 (2021) https://doi. org/10.1021/acs.jctc.1c00647

  59. [59]

    Gasteiger, J

    Gasteiger, J., Groß, J., G¨ unnemann, S. Directional Message Passing for Molecular Graphs. arXiv (2022). https://doi.org/10.48550/arXiv.2003.03123

  60. [60]

    Deep residual learning for image recognition

    He, K., Zhang, X., Ren, S., Sun, J. Deep residual learning for image re cognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE. https://doi.org/10.1109/CVPR.2016.90

  61. [61]

    E(n) Equivariant Graph Neural Networks

    Satorras, V.G., Hoogeboom, E., Welling, M. E(n) Equivariant Graph Neural Networks. arXiv (2022). https://doi.org/10.48550/arXiv.2102.09844

  62. [62]

    Equivariant message passing for the prediction of tensorial properties and molecular spectra

    Schütt, K.T., Unke, O.T., Gastegger, M. Equivariant message passing for the prediction of tensorial properties and molecular spectra. arXiv (2021). https://doi.org/10.48550/arXiv.2102.03150

  63. [63]

    TorchMD-NET: Equivariant Transformers for Neural Network based Molecular Potentials

    Thölke, P., Fabritiis, G.D. TorchMD-NET: Equivariant Transformers for Neural Network based Molecular Potentials. arXiv (2022). https://doi.org/10.48550/arXiv.2202.02541

  64. [64]

    Spherical Message Passing for 3D Graph Networks

    Liu, Y., Wang, L., Liu, M., Zhang, X., Oztekin, B., Ji, S. Spherical Message Passing for 3D Graph Networks. arXiv (2022). https://doi.org/10.48550/arXiv.2102.05013

  65. [65]

    Geometric and Physical Quantities Improve E(3) Equivariant Message Passing

    Brandstetter, J., Hesselink, R., Pol, E.v.d., Bekkers, E.J., Welling, M. Geometric and Physical Quantities Improve E(3) Equivariant Message Passing. arXiv (2022). https://doi.org/10.48550/arXiv.2110.02905

  66. [66]

    Equivariant Graph Attention Networks for Molecular Property Prediction

    Le, T., Noé, F., Clevert, D.-A. Equivariant Graph Attention Networks for Molecular Property Prediction. arXiv (2022). https://doi.org/10.48550/arXiv.2202.09891

  67. [67]

    Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing

    Wang, Y., Wang, T., Li, S., He, X., Li, M., Wang, Z., Zheng, N., Shao, B., Liu, T.-Y. Enhancing geometric representations for molecules with equivariant vector-scalar interactive message passing. Nat. Commun. 15(1), 313 (2024) https://doi.org/10.1038/s41467-023-43720-2

  68. [68]

    Surface hopping dynamics beyond non adiabatic couplings for quantum coherence

    Ha, J.-K., Lee, I.S., Min, S.K. Surface hopping dynamics beyond nonadiabatic couplings for quantum coherence. J. Phys. Chem. Lett. 9(5), 1097–1104 (2018) https://doi.org/10.1021/acs.jpclett.8b00060

  69. [69]

    Rohrdanz, M.A., Martins, K.M., Herbert, J.M. A long-range-corrected density functional that performs well for both ground-state properties and time-dependent density functional theory excitation energies, including charge-transfer excited states. J. Chem. Phys. 130(5), 054112 (2019) https://doi.org/10.1063/1.3073302

  70. [70]

    Self-consistent molecular orbital methods

    Krishnan, R., Binkley, J.S., Seeger, R., Pople, J.A. Self-consistent molecular orbital methods. XX. A basis set for correlated wave functions. J. Chem. Phys. 72(1), 650–654 (1980) https://doi.org/10.1063/1.438955

  71. [71]

    The design space of E(3)-equivariant atom-centred interatomic potentials

    Batatia, I., Batzner, S., Kovács, D.P., Musaelian, A., Simm, G.N.C., Drautz, R., Ortner, C., Kozinsky, B., Csányi, G. The design space of E(3)-equivariant atom-centred interatomic potentials. Nat. Mach. Intell. 7(1), 56–67 (2025) https://doi.org/10.1038/s42256-024-00956-x

  72. [72]

    Alchemical and structural distribution based representation for improved QML

    Faber, F.A., Christensen, A.S., Huang, B., Lilienfeld, O.A.v. Alchemical and structural distribution based representation for improved QML. J. Chem. Phys. 148(24), 241717 (2018) https://doi.org/10.1063/1.5020710

  73. [73]

    TorchANI: A free and open source PyTorch-based deep learning implementation of the ANI neural network potentials

    Gao, X., Ramezanghorbani, F., Isayev, O., Smith, J.S., Roitberg, A.E. TorchANI: A free and open source PyTorch-based deep learning implementation of the ANI neural network potentials. J. Chem. Inf. Model. 60(7), 3408–3415 (2020) https://doi.org/10.1021/acs.jcim.0c00451

    Supplementary Information: Bayesian E(3)-Equivariant Interatomic Potential with I...

  74. [74]

    Epistemic uncertainty $\sigma^2_{\text{epistemic}}(x)$ is large (models disagree)

  75. [75]

    Aleatoric uncertainty varies across models, with geometric mean smaller than arithmetic mean

  76. [76]

    The ratio $\sigma^2_{\text{epistemic}}(x)/\sigma^2_{\text{aleatoric}}(x)$ is large

    Proof. The acquisition function increases with $\sigma^2_{\text{total}}(x)$ in the numerator and decreases with the geometric mean of the variances in the denominator. By the AM-GM inequality, the geometric mean is maximized when all $\sigma^2_m(x)$ are equal, so disagreement in predicted variances also contributes to higher acquis...
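
The AM-GM argument in the proof can be checked numerically. The sketch below is illustrative only and rests on assumptions not confirmed by the text: it takes the acquisition function to be half the log of the total variance divided by the geometric mean of the per-model variances, with the total variance equal to the arithmetic mean of the predicted variances plus the variance of the predicted means. The function name `gaussian_bald` and all input values are hypothetical.

```python
import numpy as np

def gaussian_bald(mu, var):
    """Hypothetical acquisition: 0.5 * log(total variance / geometric mean of variances).

    mu  : per-model predicted means at one input x
    var : per-model predicted variances at the same input
    """
    mu = np.asarray(mu, dtype=float)
    var = np.asarray(var, dtype=float)
    aleatoric = var.mean()              # arithmetic mean of the predicted variances
    epistemic = mu.var()                # population variance of the predicted means
    total = aleatoric + epistemic       # assumed decomposition of the total variance
    log_geo_mean = np.log(var).mean()   # log of the geometric mean of the variances
    return 0.5 * (np.log(total) - log_geo_mean)

# Three ensemble members, made-up numbers.
agree = gaussian_bald([1.0, 1.0, 1.0], [0.2, 0.2, 0.2])          # full agreement
mean_disagree = gaussian_bald([0.5, 1.0, 1.5], [0.2, 0.2, 0.2])  # means differ
var_disagree = gaussian_bald([1.0, 1.0, 1.0], [0.05, 0.2, 0.35]) # variances differ

print(agree, mean_disagree, var_disagree)
```

Under these assumptions the score is zero (up to rounding) when all members agree, and it rises both when the means disagree and when only the predicted variances disagree, since the geometric mean then falls below the arithmetic mean.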