Force-Aware Neural Tangent Kernels for Scalable and Robust Active Learning of MLIPs

Eszter Varga-Umbrich; Jules Tilly; Olivier Peltre; Paul Duckworth; Shikha Surana; Zachary Weller-Davies

arxiv: 2605.13788 · v2 · pith:3DZXNGPVnew · submitted 2026-05-13 · 💻 cs.LG

Pith reviewed 2026-05-19 16:40 UTC · model grok-4.3

classification 💻 cs.LG

keywords active learning · neural tangent kernel · machine learning interatomic potentials · force prediction · OC20 dataset · scalable acquisition · robust active learning

The pith

Extending the Neural Tangent Kernel to forces yields scalable, robust active learning for MLIPs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors develop a chunked feature-space posterior-variance method that lets acquisition functions screen hundreds of thousands of candidate structures in hours without materializing full kernels. They extend the Neural Tangent Kernel through mixed parameter-coordinate derivatives to produce both a force NTK and a joint energy-force NTK. These kernels supply natural similarity measures for predicting energies together with vector forces. On the OC20 dataset the joint kernel produces the lowest energy and force errors across all reported metrics and splits. The same approach stays competitive with committee baselines on other benchmarks while showing lower variance when the candidate pool differs from the target distribution.

Core claim

By extending the Neural Tangent Kernel via mixed parameter-coordinate derivatives, the work obtains a force NTK and a joint energy-force NTK that serve as natural similarity metrics for vector-field prediction. When these kernels are used inside a linearly scaling acquisition framework, the resulting force-aware selection achieves the lowest energy and force MAE and RMSE on the OC20 dataset and remains more stable than committee methods under controlled distribution shifts on T1x.

What carries the argument

The joint energy-force Neural Tangent Kernel obtained from mixed derivatives with respect to model parameters and atomic coordinates, serving as a similarity metric that jointly scores energy and force predictions.
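To make the mixed-derivative construction concrete, the sketch below builds an energy kernel and a force kernel for a toy one-dimensional model. Everything here is an assumption for illustration and not the authors' implementation: the model `energy`, the helper names, and the use of finite differences in place of automatic differentiation are all hypothetical. A real MLIP has 3N coordinates per structure, so its force features form a matrix rather than a short vector.

```python
from math import tanh
from typing import Callable, List, Sequence

Params = Sequence[float]


def energy(theta: Params, r: float) -> float:
    """Hypothetical toy model: E(r; theta) = theta[0] * tanh(theta[1] * r)."""
    return theta[0] * tanh(theta[1] * r)


def force(theta: Params, r: float, h: float = 1e-4) -> float:
    """F = -dE/dr, approximated by a central difference in the coordinate."""
    return -(energy(theta, r + h) - energy(theta, r - h)) / (2.0 * h)


def param_grad(fn: Callable[[Params, float], float], theta: Params, r: float,
               h: float = 1e-4) -> List[float]:
    """Central-difference gradient of fn with respect to each parameter."""
    grad = []
    for k in range(len(theta)):
        up, down = list(theta), list(theta)
        up[k] += h
        down[k] -= h
        grad.append((fn(up, r) - fn(down, r)) / (2.0 * h))
    return grad


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def energy_ntk(theta: Params, r1: float, r2: float) -> float:
    """k_E: inner product of parameter gradients of the energy."""
    return dot(param_grad(energy, theta, r1), param_grad(energy, theta, r2))


def force_ntk(theta: Params, r1: float, r2: float) -> float:
    """k_F: inner product of parameter gradients of the force,
    i.e. of mixed parameter-coordinate derivatives of the energy."""
    return dot(param_grad(force, theta, r1), param_grad(force, theta, r2))


theta = [1.5, 0.7]
print(energy_ntk(theta, 0.4, -0.3), force_ntk(theta, 0.4, -0.3))
```

The two kernels generally disagree about which pairs of structures are similar, which is the motivation for combining them into a joint energy-force kernel.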

If this is right

Candidate pools of roughly 200,000 structures can be screened in hours using the chunked shortlisting approach.
Force-aware acquisition simultaneously lowers both energy and force prediction errors on large datasets such as OC20.
The NTK-based methods match or exceed committee performance on T1x, PMechDB, and RGD while running significantly faster.
Acquisition remains stable under shifts between the candidate pool and the target distribution, unlike committee baselines.
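The chunked shortlisting mentioned in the first item can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation. It assumes the posterior variance takes the standard Bayesian-linear-regression form λ ϕ(x)ᵀ(ΦᵀΦ + λI)⁻¹ϕ(x), where Φ stacks the training features; the function names, the ridge value λ, and the toy features are hypothetical. Only a d × d matrix is ever stored, so neither the train-train nor the candidate-train kernel is materialized, and the cost grows linearly with the number of training points and candidates.

```python
from typing import List, Sequence

Vector = Sequence[float]
Matrix = List[List[float]]


def regularized_gram(train_feats: Sequence[Vector], ridge: float, chunk: int) -> Matrix:
    """Accumulate A = ridge * I + sum_i f_i f_i^T chunk by chunk (only d x d is stored)."""
    d = len(train_feats[0])
    a = [[ridge if i == j else 0.0 for j in range(d)] for i in range(d)]
    for start in range(0, len(train_feats), chunk):
        for f in train_feats[start:start + chunk]:
            for i in range(d):
                for j in range(d):
                    a[i][j] += f[i] * f[j]
    return a


def cholesky(a: Matrix) -> Matrix:
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = s ** 0.5 if i == j else s / low[j][j]
    return low


def posterior_variance(low: Matrix, feat: Vector, ridge: float) -> float:
    """ridge * f^T A^{-1} f. Solving L z = f gives f^T A^{-1} f = ||z||^2."""
    z: List[float] = []
    for i, row in enumerate(low):
        z.append((feat[i] - sum(row[k] * z[k] for k in range(i))) / row[i])
    return ridge * sum(v * v for v in z)


def shortlist(train_feats: Sequence[Vector], cand_feats: Sequence[Vector], k: int,
              ridge: float = 1e-3, chunk: int = 2) -> List[int]:
    """Indices of the k candidates with the largest posterior variance."""
    low = cholesky(regularized_gram(train_feats, ridge, chunk))
    scored = []
    for start in range(0, len(cand_feats), chunk):  # stream the pool in chunks
        for offset, f in enumerate(cand_feats[start:start + chunk]):
            scored.append((posterior_variance(low, f, ridge), start + offset))
    scored.sort(reverse=True)
    return [index for _, index in scored[:k]]


train = [[1.0, 0.0], [1.0, 0.0]]
pool = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(shortlist(train, pool, k=2))
```

In this toy pool the candidate orthogonal to the training features scores highest and the one that duplicates them scores lowest, so candidates 1 and 2 are shortlisted. The chunk size changes only how the data is streamed, not the scores.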

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same chunked kernel technique could accelerate active learning for any kernel method that scores candidates by feature-space similarity.
Force-aware kernels may improve sample efficiency in other domains where models predict both scalars and vector fields, such as fluid dynamics or robotics control.
A single pretrained MLIP could serve as a reusable starting point for fine-tuning across many different chemical systems without full retraining.

Load-bearing premise

Mixed parameter-coordinate derivatives of the Neural Tangent Kernel produce similarity metrics that correctly capture the joint statistics of energies and forces.

What would settle it

On the OC20 dataset, replace the joint energy-force NTK acquisition with an energy-only version and observe whether it still records the lowest MAE and RMSE for both energy and force predictions across all distribution splits.

Figures

Figures reproduced from arXiv: 2605.13788 by Eszter Varga-Umbrich, Jules Tilly, Olivier Peltre, Paul Duckworth, Shikha Surana, Zachary Weller-Davies.

**Figure 1.** OC20 final-round test errors on val_is (ID) and the mean of the three out-of-distribution splits val_oos_ads, val_oos_bulk, and val_oos_ads_bulk. The joint energy–force NTK achieves the lowest energy and force errors across all panels. We consider active learning on a randomly selected 200k subset of OC20 (Chanussot et al. [2021]). To handle the pool size we use feature-space PV to shortlist candidates, foll…

**Figure 2.** Cost, accuracy, and memory scaling of feature-space acquisition on OC20. (a,b) Mean…

**Figure 3.** Energy RMSE learning curves (meV) for the methods presented in the main text.

**Figure 4.** Energy MAE learning curves (meV) for the methods presented in the main text.

**Figure 5.** Force RMSE learning curves (meV/Å) for the methods presented in the main text.

**Figure 6.** Force MAE learning curves (meV/Å) for the methods presented in the main text.

**Figure 7.** Inter-reaction bias. Left: energy RMSE AUC (meV…

**Figure 8.** Intra-reaction (frame) bias. Left: energy RMSE AUC (meV…

**Figure 9.** Global kernel matrices on a five-reaction T1x subset. Structures are sorted by reaction…

**Figure 10.** OC20 test error vs. training-set size on the…

**Figure 11.** OC20 final-round test errors resolved by validation split (…

Original abstract

Active learning for machine-learning interatomic potentials (MLIPs) must address several challenges to be practical: scaling to large candidate pools, leveraging energy-force supervision, and maintaining robustness when candidate pools are biased relative to the target distribution. In this work, we jointly address these challenges. We first introduce a linearly scaling acquisition framework based on chunked feature-space posterior-variance shortlisting. By avoiding materialisation of the candidate and train set kernels, this approach enables screening of ~200k structures within hours and applies broadly to acquisition strategies that score candidates based on molecular similarity metrics. We then extend the Neural Tangent Kernel (NTK) to a force-aware setting via mixed parameter-coordinate derivatives, yielding a force NTK and a joint energy-force NTK that provide natural similarity metrics for vector-field prediction. We demonstrate the effectiveness of the joint energy-force NTK on the OC20 dataset, where force-aware acquisition is crucial: it achieves the lowest energy and force MAE and RMSE across all metrics and distribution splits. Across T1x, PMechDB, and RGD benchmarks, our force NTK methods remain competitive with established baselines while being significantly more efficient than committee-based approaches. Under a controlled candidate-pool shift case study on T1x, acquisition based on pretrained MLIP embeddings and NTKs remains robust, whereas committee-based methods exhibit higher variance. Overall, these results show that a single pretrained MLIP can enable scalable, force-aware, and distribution-robust active learning for foundation-model fine-tuning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The force-aware NTK extension plus chunked posterior-variance shortlisting gives a practical, committee-free route to scalable active learning for MLIPs with solid OC20 numbers and better shift robustness.

Colleague, the core advance here is a force-aware Neural Tangent Kernel built from mixed parameter-coordinate derivatives, combined with a chunked feature-space trick that lets them score hundreds of thousands of candidates without ever forming the full kernel matrices. That combination directly targets the scaling and force-supervision problems that make active learning slow for interatomic potentials. They show the joint energy-force version delivers the lowest energy and force errors on OC20 across multiple distribution splits, stays competitive on T1x, PMechDB and RGD, and runs far cheaper than committee baselines. In the controlled T1x shift experiment the NTK-based acquisition also exhibits lower variance than the committee approach. The engineering is straightforward once the mixed-derivative step is accepted, and the empirical ordering is reported cleanly on public data. A softer spot is that the abstract leaves the exact implementation of the chunking, the choice of pretrained embedding model, and the statistical significance of the gains a bit thin; without those details it is hard to judge how much of the win comes from the force NTK itself versus the shortlisting or the base model. The robustness claim rests on one controlled case study, which is useful but narrow. This work is aimed at people building active-learning loops for MLIPs or foundation-model fine-tuning in chemistry and materials. It is coherent enough on its own terms to merit referee time even if some sections will need expansion.

Referee Report

3 major / 2 minor

Summary. The paper introduces a linearly scaling chunked posterior-variance acquisition framework for active learning of MLIPs that avoids materializing full kernels, and extends the NTK to force-aware and joint energy-force variants via mixed parameter-coordinate derivatives. It reports that the joint energy-force NTK yields the lowest energy and force MAE/RMSE on OC20 across distribution splits, remains competitive with baselines on T1x/PMechDB/RGD while being more efficient than committees, and shows greater robustness than committees under a controlled candidate-pool shift on T1x.

Significance. If the empirical ordering holds, the work supplies a practical, single-model alternative to committee-based uncertainty for force-aware active learning at scale, directly addressing the three stated bottlenecks (pool size, force supervision, distribution shift) for foundation-model fine-tuning of interatomic potentials.

major comments (3)

[§3.2] §3.2 and Eq. (force-NTK definition): the claim that the mixed derivative yields a 'natural similarity metric for vector-field prediction' requires an explicit statement of the resulting kernel form and a short proof that it is positive semi-definite; without this, the extension from scalar NTK to force NTK is not fully load-bearing.
[Table 2] Table 2 (OC20 results): the reported lowest MAE/RMSE for the joint NTK must be accompanied by the exact committee size, the force-aware baseline variants, and standard errors over at least three random seeds; the current ordering is central to the main claim but cannot be assessed for statistical significance from the given numbers.
[§4.3] §4.3 (shift case study): the controlled T1x shift experiment needs an explicit description of how the candidate-pool bias is constructed and whether the same pretrained MLIP is used for both embedding and NTK acquisition; otherwise the robustness comparison to committees is not isolated from model choice.

minor comments (2)

[Abstract] The abstract packs four distinct contributions into one paragraph; separating the acquisition framework, the NTK extension, the OC20 results, and the robustness study would improve readability.
[§2] Notation for the chunked posterior variance (e.g., the short-list size and chunk size) should be introduced once in §2 and used consistently in the experimental section.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment below and will incorporate revisions to strengthen the manuscript.

Referee: [§3.2] §3.2 and Eq. (force-NTK definition): the claim that the mixed derivative yields a 'natural similarity metric for vector-field prediction' requires an explicit statement of the resulting kernel form and a short proof that it is positive semi-definite; without this, the extension from scalar NTK to force NTK is not fully load-bearing.

Authors: We agree that the force-NTK extension requires additional mathematical detail for full rigor. In the revised manuscript we will explicitly state the resulting kernel forms obtained from the mixed parameter-coordinate derivatives (both the pure force NTK and the joint energy-force NTK) and include a short proof of positive semi-definiteness. The proof will note that the NTK is PSD by construction as a Gram matrix and that the relevant mixed derivatives preserve this property because they correspond to inner products in an appropriately differentiated feature space.

Revision: yes

Referee: [Table 2] Table 2 (OC20 results): the reported lowest MAE/RMSE for the joint NTK must be accompanied by the exact committee size, the force-aware baseline variants, and standard errors over at least three random seeds; the current ordering is central to the main claim but cannot be assessed for statistical significance from the given numbers.

Authors: We acknowledge the need for greater transparency in Table 2. We will update the table to report the exact committee size used for the baseline (5 models), explicitly list the force-aware variants of each baseline, and add standard errors computed over three independent random seeds for all methods. These changes will enable readers to assess the statistical significance of the reported performance ordering.

Revision: yes

Referee: [§4.3] §4.3 (shift case study): the controlled T1x shift experiment needs an explicit description of how the candidate-pool bias is constructed and whether the same pretrained MLIP is used for both embedding and NTK acquisition; otherwise the robustness comparison to committees is not isolated from model choice.

Authors: We thank the referee for highlighting this point. In the revised §4.3 we will provide a detailed description of the candidate-pool bias construction, including the precise procedure used to create the controlled distribution shift. We will also clarify that the identical pretrained MLIP is used both to generate the embeddings and to compute the NTK-based acquisition scores, thereby isolating the comparison from differences in the underlying model.

Revision: yes
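
The positive semi-definiteness argument promised in the first response can be sketched in a few lines. The notation is an assumption of this sketch and is not taken from the paper: $E_\theta$ is the model energy with $P$ parameters, $r$ the $3N$ atomic coordinates of a structure $x$, and the force features $\phi_F(x) \in \mathbb{R}^{3N \times P}$ are taken to be the parameter Jacobian of the force $F_\theta = -\nabla_r E_\theta$.

```latex
\begin{aligned}
\phi_E(x) &= \nabla_\theta E_\theta(x), &
\phi_F(x) &= \nabla_\theta F_\theta(x) = -\nabla_\theta \nabla_r E_\theta(x), \\
k_E(x,x') &= \phi_E(x)^\top \phi_E(x'), &
K_F(x,x') &= \phi_F(x)\,\phi_F(x')^\top .
\end{aligned}

% For any structures x_1,\dots,x_n and coefficient vectors c_i \in \mathbb{R}^{3N_i}:
\sum_{i,j=1}^{n} c_i^\top K_F(x_i,x_j)\, c_j
  = \Big\| \sum_{i=1}^{n} \phi_F(x_i)^\top c_i \Big\|^2 \;\ge\; 0 .
```

The same computation with scalar coefficients shows that $k_E$ is positive semi-definite. Any scalar force kernel formed as an inner product of flattened or pooled force features is a Gram kernel for the same reason, and a combination with non-negative weights of two positive semi-definite kernels remains positive semi-definite. How the paper reduces the matrix-valued force kernel to a scalar is not stated on this page.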

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper presents a methodological pipeline consisting of a chunked posterior-variance acquisition framework and a direct mathematical extension of the NTK via mixed parameter-coordinate derivatives to obtain force and joint energy-force kernels. These steps are defined explicitly in terms of standard NTK constructions and linear algebra operations without reducing to fitted parameters or self-referential definitions. The central claims are supported by empirical evaluations on public benchmarks (OC20, T1x, etc.) comparing against external baselines, with no load-bearing self-citations or ansatzes that collapse the derivation to its inputs by construction. The approach remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that NTK approximations extend meaningfully to force predictions and that the chunked variance computation preserves the necessary ranking properties for acquisition. No free parameters or invented entities are explicitly introduced in the abstract.

axioms (1)

domain assumption: The Neural Tangent Kernel can be extended to force-aware settings via mixed parameter-coordinate derivatives to yield valid similarity metrics for vector-field prediction.
This is the core technical premise stated in the abstract for creating the force NTK and joint energy-force NTK.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear

Paper passage: extend the Neural Tangent Kernel (NTK) to a force-aware setting via mixed parameter–coordinate derivatives, yielding a force NTK and a joint energy–force NTK

IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · J_uniquely_calibrated_via_higher_derivative · unclear

Paper passage: $\phi_{EF}(x) = [\sqrt{w_E}\,\phi_E(x),\ \sqrt{w_F}\,\phi_F(x)]$ … $K_{EF}(x,x') = w_E\,k_E(x,x') + w_F\,k_F(x,x')$
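
The weighted form in the passage quoted above says that concatenating scaled energy and force features reproduces a weighted sum of the two kernels. A small sketch of that identity follows. Treating the force features as a flattened vector is an assumption of this sketch, and the function names, feature values, and weights are hypothetical and chosen only for illustration.

```python
from math import isclose, sqrt
from typing import List, Sequence, Tuple

Features = Tuple[Sequence[float], Sequence[float]]  # (phi_E, flattened phi_F)


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def joint_features(phi_e: Sequence[float], phi_f: Sequence[float],
                   w_e: float, w_f: float) -> List[float]:
    """phi_EF = [sqrt(w_E) * phi_E, sqrt(w_F) * phi_F] as one concatenated vector."""
    return [sqrt(w_e) * v for v in phi_e] + [sqrt(w_f) * v for v in phi_f]


def joint_kernel(x: Features, y: Features, w_e: float, w_f: float) -> float:
    """K_EF = w_E * k_E + w_F * k_F, with each k an inner product of features."""
    (phi_e_x, phi_f_x), (phi_e_y, phi_f_y) = x, y
    return w_e * dot(phi_e_x, phi_e_y) + w_f * dot(phi_f_x, phi_f_y)


# Hypothetical features for two structures and hypothetical weights.
x = ([1.0, 2.0], [0.5, -1.0, 0.0])
y = ([0.0, 1.0], [2.0, 0.5, 3.0])
w_e, w_f = 0.25, 4.0

via_kernels = joint_kernel(x, y, w_e, w_f)
via_features = dot(joint_features(*x, w_e, w_f), joint_features(*y, w_e, w_f))
print(via_kernels, via_features, isclose(via_kernels, via_features))
```

Because the joint kernel is an ordinary inner product of the concatenated features, any feature-space acquisition routine can consume the joint features directly without forming the kernel matrix.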

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

118 extracted references · 118 canonical work pages · 2 internal anchors

[1]

Bartók, Risi Kondor, and Gábor Csányi

Albert P. Bartók, Risi Kondor, and Gábor Csányi. On representing chemical environments. Physical Review B, 87 0 (18): 0 184115, 2013. doi:10.1103/PhysRevB.87.184115

work page doi:10.1103/physrevb.87.184115 2013
[3]

A foundation model for atomistic materials chemistry

Ilyes Batatia, Philipp Benner, Yuan Chiang, Alin M. Elena, Dávid P. Kovács, Janosh Riebesell, Xavier R. Advincula, Mark Asta, Matthew Avaylon, et al. A foundation model for atomistic materials chemistry, 2023 a . URL https://arxiv.org/abs/2401.00096

work page internal anchor Pith review Pith/arXiv arXiv 2023
[4]

Ilyes Batatia, Dávid Péter Kovács, Gregor N. C. Simm, Christoph Ortner, and Gábor Csányi. Mace: Higher order equivariant message passing neural networks for fast and accurate force fields, 2023 b . URL https://arxiv.org/abs/2206.07697

work page arXiv 2023
[5]

Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E

Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris Kozinsky. E(3) -equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nature Communications, 13: 0 2453, 2022. doi:10.1038/s41467-022-29939-5

work page doi:10.1038/s41467-022-29939-5 2022
[6]

Schaaf, Ondrej Marsalek, and Christoph Schran

Hubert Beck, Pavol Simko, Lars L. Schaaf, Ondrej Marsalek, and Christoph Schran. Multi-head committees enable direct uncertainty prediction for atomistic foundation models. The Journal of Chemical Physics, 163 0 (23): 0 234103, 2025. doi:10.1063/5.0288994

work page doi:10.1063/5.0288994 2025
[7]

Generalized neural-network representation of high-dimensional potential-energy surfaces

Jörg Behler and Michele Parrinello. Generalized neural-network representation of high-dimensional potential-energy surfaces. Physical Review Letters, 98 0 (14): 0 146401, 2007. doi:10.1103/PhysRevLett.98.146401

work page doi:10.1103/physrevlett.98.146401 2007
[8]

JAX : composable transformations of P ython+ N um P y programs, 2018

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Yash Katariya, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake Vander P las, Skye Wanderman- M ilne, and Qiao Zhang. JAX : composable transformations of P ython+ N um P y programs, 2018. URL http://github.com/jax-ml/jax

work page 2018
[9]

Machine learning interatomic potentials: library for efficient training, model development and simulation of molecular systems, 2025

Christoph Brunken, Olivier Peltre, Heloise Chomet, Lucien Walewski, Manus McAuliffe, Valentin Heyraud, Solal Attias, Martin Maarand, Yessine Khanfir, Edan Toledo, Fabio Falcioni, Marie Bluntzer, Silvia Acosta-Gutiérrez, and Jules Tilly. Machine learning interatomic potentials: library for efficient training, model development and simulation of molecular s...

work page arXiv 2025
[11]

BLIPs: Bayesian Learned Interatomic Potentials

Dario Coscia, Pim de Haan, and Max Welling. Blips: Bayesian learned interatomic potentials, 2026. URL https://arxiv.org/abs/2508.14022

work page arXiv 2026
[13]

Nature Machine Intelligence5(9), 1031– 1041 (2023)

Bowen Deng, Peichen Zhong, KyuJung Jun, Janosh Riebesell, Kevin Han, Christopher J. Bartel, and Gerbrand Ceder. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nature Machine Intelligence, 5: 0 1031--1041, 2023. doi:10.1038/s42256-023-00716-3

work page doi:10.1038/s42256-023-00716-3 2023
[14]

Persson, and Gerbrand Ceder

Bowen Deng, Yunyeong Choi, Peichen Zhong, Janosh Riebesell, Shashwat Anand, Zhuoying Li, KyuJung Jun, Kristin A. Persson, and Gerbrand Ceder. Systematic softening in universal machine learning interatomic potentials. npj Computational Materials, 11: 0 9, 2025. doi:10.1038/s41524-024-01500-6

work page doi:10.1038/s41524-024-01500-6 2025
[18]

Lauri Himanen, Marc O. J. J \"a ger, Eiaki V. Morooka, Filippo Federici Canova, Yashasvi S. Ranawat, David Z. Gao, Patrick Rinke, and Adam S. Foster. DScribe: Library of descriptors for machine learning in materials science . Computer Physics Communications, 247: 0 106949, 2020. ISSN 0010-4655. doi:10.1016/j.cpc.2019.106949. URL https://doi.org/10.1016/j....

work page doi:10.1016/j.cpc.2019.106949 2020
[19]

A framework and benchmark for deep batch active learning for regression

David Holzmüller, Viktor Zaverkin, Johannes Kästner, and Ingo Steinwart. A framework and benchmark for deep batch active learning for regression. Journal of Machine Learning Research, 24 0 (164): 0 1--81, 2023. URL https://www.jmlr.org/papers/v24/22-0937.html

work page 2023
[22]

Neural Tangent Kernel: Convergence and Generalization in Neural Networks

Arthur Jacot, Franck Gabriel, and Clément Hongler. Neural tangent kernel: Convergence and generalization in neural networks. In Advances in Neural Information Processing Systems, 2018. URL https://arxiv.org/abs/1806.07572

work page arXiv 2018
[23]

On-the-fly machine learning force field generation: Application to melting points

Ryosuke Jinnouchi, Ferenc Karsai, and Georg Kresse. On-the-fly machine learning force field generation: Application to melting points. Physical Review B, 100 0 (1): 0 014105, 2019. doi:10.1103/PhysRevB.100.014105

work page doi:10.1103/physrevb.100.014105 2019
[26]

Uncertainty quantification by direct propagation of shallow ensembles

Matthias Kellner and Michele Ceriotti. Uncertainty quantification by direct propagation of shallow ensembles. Machine Learning: Science and Technology, 5 0 (3): 0 035006, 2024. doi:10.1088/2632-2153/ad594a

work page doi:10.1088/2632-2153/ad594a 2024
[27]

Smith, and Benjamin Nebgen

Maksim Kulichenko, Kipton Barros, Nicholas Lubbers, Ying Wai Li, Richard Messerly, Sergei Tretiak, Justin S. Smith, and Benjamin Nebgen. Uncertainty-driven dynamics for active learning of interatomic potentials. Nature Computational Science, 3: 0 230--239, 2023. doi:10.1038/s43588-023-00406-5

work page doi:10.1038/s43588-023-00406-5 2023
[29]

How accurate are dft forces? unexpectedly large uncertainties in molecular datasets, 2025

Domantas Kuryla, Fabian Berger, Gábor Csányi, and Angelos Michaelides. How accurate are dft forces? unexpectedly large uncertainties in molecular datasets, 2025. URL https://arxiv.org/abs/2510.19774

work page arXiv 2025
[32]

Levine, Muhammed Shuaibi, Evan Walter Clark Spotte-Smith, Michael G

Daniel S. Levine, Muhammed Shuaibi, Evan Walter Clark Spotte-Smith, Michael G. Taylor, Muhammad R. Hasyim, Kyle Michel, Ilyes Batatia, Gábor Csányi, Misko Dzamba, Peter Eastman, Nathan C. Frey, Xiang Fu, Vahe Gharakhanyan, Aditi S. Krishnapriyan, Joshua A. Rackers, Sanjeev Raja, Ammar Rizvi, Andrew S. Rosen, Zachary Ulissi, Santiago Vargas, C. Lawrence Zi...

work page arXiv 2025
[33]

A critical review of machine learning interatomic potentials and hamiltonian

Yifan Li, Xiuying Zhang, Mingkang Liu, and Lei Shen. A critical review of machine learning interatomic potentials and hamiltonian. Journal of Materials Informatics, 5 0 (4), 2025. ISSN 2770-372X. doi:10.20517/jmi.2025.17. URL https://www.oaepublish.com/articles/jmi.2025.17

work page doi:10.20517/jmi.2025.17 2025
[37]

Peterson, Rune Christensen, and Alireza Khorshidi

Andrew A. Peterson, Rune Christensen, and Alireza Khorshidi. Addressing uncertainty in atomistic machine learning. Physical Chemistry Chemical Physics, 19: 0 10978--10985, 2017. doi:10.1039/C7CP00375G

work page doi:10.1039/c7cp00375g 2017
[38]

Podryabinkin, A.V

Evgeny V. Podryabinkin and Alexander V. Shapeev. Active learning of linearly parametrized interatomic potentials. Computational Materials Science, 140: 0 171--180, 2017. doi:10.1016/j.commatsci.2017.08.031

work page doi:10.1016/j.commatsci.2017.08.031 2017
[41]

Orb-v3: atomistic simulation at scale, 2025

Benjamin Rhodes, Sander Vandenhaute, Vaidotas Šimkus, James Gin, Jonathan Godwin, Tim Duignan, and Mark Neumann. Orb-v3: atomistic simulation at scale, 2025. URL https://arxiv.org/abs/2504.06231

work page arXiv 2025
[42]

Extended-connectivity fingerprints.J

David Rogers and Mathew Hahn. Extended-connectivity fingerprints. Journal of Chemical Information and Modeling, 50 0 (5): 0 742--754, 2010. doi:10.1021/ci100050t

work page doi:10.1021/ci100050t 2010
[43]

Levine, Zachary Ulissi, C

Sushree Jagriti Sahoo, Mikael Maraschin, Daniel S. Levine, Zachary Ulissi, C. Lawrence Zitnick, Joel B Varley, Joseph A. Gauthier, Nitish Govindarajan, and Muhammed Shuaibi. The open catalyst 2025 (oc25) dataset and models for solid-liquid interfaces, 2025. URL https://arxiv.org/abs/2509.17862

work page arXiv 2025
[45]

Committee neural network potentials control generalization errors and enable active learning

Christoph Schran, Krystof Brezina, and Ondrej Marsalek. Committee neural network potentials control generalization errors and enable active learning. The Journal of Chemical Physics, 153 0 (10): 0 104105, 2020. doi:10.1063/5.0016004

work page doi:10.1063/5.0016004 2020
[47]

Sch¨ utt, Huziel E

Kristof T. Schütt, Huziel E. Sauceda, Pieter-Jan Kindermans, Alexandre Tkatchenko, and Klaus-Robert Müller. SchNet -- a deep learning architecture for molecules and materials. The Journal of Chemical Physics, 148 0 (24): 0 241722, 2018. doi:10.1063/1.5019779

work page doi:10.1063/1.5019779 2018
[49]

Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, and Adrian E

Justin S. Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, and Adrian E. Roitberg. Less is more: Sampling chemical space with active learning. The Journal of Chemical Physics, 148 0 (24): 0 241733, 2018. doi:10.1063/1.5023802

work page doi:10.1063/1.5023802 2018
[51]

Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M

Jonathan Vandermause, Steven B. Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M. Kolpak, and Boris Kozinsky. On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events. npj Computational Materials, 6: 0 20, 2020. doi:10.1038/s41524-020-0283-z

work page doi:10.1038/s41524-020-0283-z 2020
[52]

Pretrained Model Representations as Acquisition Signals for Active Learning of MLIPs

Eszter Varga-Umbrich, Shikha Surana, Paul Duckworth, Jules Tilly, Olivier Peltre, and Zachary Weller-Davies. Pretrained model representations as acquisition signals for active learning of mlips, 2026. URL https://arxiv.org/abs/2605.03964

work page internal anchor Pith review Pith/arXiv arXiv 2026
[55]

Uma: A family of universal models for atoms

Brandon M. Wood, Misko Dzamba, Xiang Fu, Meng Gao, Muhammed Shuaibi, Luis Barroso-Luque, Kareem Abdelmaqsoud, Vahe Gharakhanyan, John R. Kitchin, Daniel S. Levine, Kyle Michel, Anuroop Sriram, Taco Cohen, Abhishek Das, Ammar Rizvi, Sushree Jagriti Sahoo, Zachary W. Ulissi, and C. Lawrence Zitnick. Uma: A family of universal models for atoms, 2026. URL htt...

work page arXiv 2026
[56]

Exploring chemical and conformational spaces by batch mode deep active learning

Viktor Zaverkin, David Holzmüller, Ingo Steinwart, and Johannes Kästner. Exploring chemical and conformational spaces by batch mode deep active learning. Digital Discovery, 1: 0 605--620, 2022. doi:10.1039/D2DD00034B

work page doi:10.1039/d2dd00034b 2022
[57]

Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials

Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, and Johannes Kästner. Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials. npj Computational Materials, 10: 0 83, 2024. doi:10.1038/s41524-024-01254-1

work page doi:10.1038/s41524-024-01254-1 2024
[58]

Active learning of uniformly accurate interatomic potentials for materials simulation

Linfeng Zhang, De-Ye Lin, Han Wang, Roberto Car, and Weinan E. Active learning of uniformly accurate interatomic potentials for materials simulation. Physical Review Materials, 3 0 (2): 0 023804, 2019. doi:10.1103/PhysRevMaterials.3.023804

work page doi:10.1103/physrevmaterials.3.023804 2019
[60]

Data curation for machine learning interatomic potentials by determinantal point processes, 2026

Joanna Zou and Youssef Marzouk. Data curation for machine learning interatomic potentials by determinantal point processes, 2026. URL https://arxiv.org/abs/2603.22160

work page arXiv 2026
[61]

and Becke, A

Kohn, W. and Becke, A. D. and Parr, R. G. , title =. The Journal of Physical Chemistry , year =. doi:10.1021/jp960669l , url =

work page doi:10.1021/jp960669l
[62]

2023 , eprint=

MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields , author=. 2023 , eprint=

work page 2023
[63]

2025 , eprint=

Machine Learning Interatomic Potentials: library for efficient training, model development and simulation of molecular systems , author=. 2025 , eprint=

work page 2025
[64]

Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations , journal =

Guanjie Wang and Changrui Wang and Xuanguang Zhang and Zefeng Li and Jian Zhou and Zhimei Sun , keywords =. Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations , journal =. 2024 , issn =. doi:https://doi.org/10.1016/j.isci.2024.109673 , url =

work page doi:10.1016/j.isci.2024.109673 2024
[65]

Quality of uncertainty estimates from neural network potential ensembles , volume=

Kahle, Leonid and Zipoli, Federico , year=. Quality of uncertainty estimates from neural network potential ensembles , volume=. Physical Review E , publisher=. doi:10.1103/physreve.105.015311 , number=

work page doi:10.1103/physreve.105.015311
[66]

Yifan Li, Xiuying Zhang, Mingkang Liu, and Lei Shen. Journal of Materials Informatics, 2025.

[67]

Ryan Jacobs, Dane Morgan, Siamak Attarian, Jun Meng, Chen Shen, Zhenghao Wu, Clare Yijia Xie, Julia H. Yang, Nongnuch Artrith, Ben Blaiszik, Gerbrand Ceder, Kamal Choudhary, Gabor Csanyi, Ekin Dogus Cubuk, Bowen Deng, Ralf Drautz, Xiang Fu, Jonathan Godwin, Vasant Honavar, Isayev,... doi:10.1016/j.cossms.2025.101214, 2025.

[68]

Gaussian Approximation Potentials: The Accuracy of Quantum Mechanics, without the Electrons. Physical Review Letters, 2010.

[69]

Deep Potential Molecular Dynamics: A Scalable Model with the Accuracy of Quantum Mechanics. Physical Review Letters, 2018.

[70]

Kristof T. Schütt, Huziel E. Sauceda, Pieter-Jan Kindermans, Alexandre Tkatchenko, and Klaus-Robert Müller, 2018.

[71]

Atomic cluster expansion for accurate and transferable interatomic potentials. Physical Review B, 2019.

[72]

Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris Kozinsky, 2022.

[73]

Learning local equivariant representations for large-scale atomistic dynamics. Nature Communications, 2023.

[74]

UMA: A Family of Universal Models for Atoms, 2026.

[75]

Arslan Mazitov, Filippo Bigi, Matthias Kellner, Paolo Pegolo, Davide Tisi, Guillaume Fraux, Sergey Pozdnyakov, Philip Loche, and Michele Ceriotti. Nature Communications. doi:10.1038/s41467-025-65662-7

[76]

Generalized Neural-Network Representation of High-Dimensional Potential-Energy Surfaces. Physical Review Letters, 2007.

[77]

On representing chemical environments. Physical Review B, 2013.

[78]

Albert P. Bartók, Sandip De, Carl Poelking, Noam Bernstein, James R. Kermode, Gábor Csányi, and Michele Ceriotti. Machine learning unifies the modeling of materials and molecules. Science Advances. doi:10.1126/sciadv.1701816

[79]

Extended-Connectivity Fingerprints. Journal of Chemical Information and Modeling, 2010.

[80]

Maksim Kulichenko, Benjamin Nebgen, Nicholas Lubbers, Justin S. Smith, Kipton Barros, Alice E. A. Allen, Adela Habib, Emily Shinkle, Nikita Fedik, Ying Wai Li, Richard A. Messerly, and Sergei Tretiak. Data Generation for Machine Learning Interatomic Potentials and Beyond. Chemical Reviews... doi:10.1021/acs.chemrev.4c00572

[81]

Less is more: Sampling chemical space with active learning. The Journal of Chemical Physics, 2018.

[82]

Active learning of linearly parametrized interatomic potentials. Computational Materials Science, 2017.

[83]

On-the-fly machine learning force field generation: Application to melting points. Physical Review B, 2019.

[84]

Jonathan Vandermause, Steven B. Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M. Kolpak, and Boris Kozinsky. On-the-fly active learning of interpretable. 2020.

[85]

Active learning of uniformly accurate interatomic potentials for materials simulation. Physical Review Materials, 2019.

[86]

Jinghou Bi, Yuanhao Xu, Felix Conrad, Hajo Wiemer, and Steffen Ihlenfeldt. A comprehensive benchmark of active learning strategies with AutoML for small-sample regression in materials science. Scientific Reports. doi:10.1038/s41598-025-24613-4

[87]

Uncertainty-driven dynamics for active learning of interatomic potentials. Nature Computational Science, 2023.

[88]

De novo exploration and self-guided learning of potential-energy surfaces. npj Computational Materials, 2019.

[89]

An entropy-maximization approach to automated training set generation for interatomic potentials. The Journal of Chemical Physics, 2020.

[90]

Valdas Vitartas, Hanwen Zhang, Veronika Juraskova, Tristan Johnston-Wood, and Fernanda Duarte. Active learning meets metadynamics: automated workflow for reactive machine learning interatomic potentials. Digital Discovery. doi:10.1039/d5dd00261c

[91]

Committee neural network potentials control generalization errors and enable active learning. The Journal of Chemical Physics, 2020.

[92]

Addressing uncertainty in atomistic machine learning. Physical Chemistry Chemical Physics, 2017.

[93]

Fast uncertainty estimates in deep learning interatomic potentials. The Journal of Chemical Physics, 2023.

[94]

Uncertainty quantification by direct propagation of shallow ensembles. Machine Learning: Science and Technology, 2024.

[95]

Multi-head committees enable direct uncertainty prediction for atomistic foundation models. The Journal of Chemical Physics, 2025.

[96]

Xinjian Ouyang, Zhilong Wang, Xiao Jie, Feng Zhang, Yanxing Zhang, Laijun Liu, and Dawei Wang. Physical Review Materials, October 2024. doi:10.1103/PhysRevMaterials.8.103804

[97]

Cutting Through the Noise: On-the-fly Outlier Detection for Robust Training of Machine Learning Interatomic Potentials, 2026.

[98]

BLIPs: Bayesian Learned Interatomic Potentials, 2026.

[99]

Orb-v3: atomistic simulation at scale, 2025.

[100]

Aik Rui Tan, Shingo Urata, Samuel Goldman, Johannes C. B. Dietschreit, and Rafael Gómez-Bombarelli. Single-model uncertainty quantification in neural network potentials does not consistently outperform model ensembles. npj Computational Materials. doi:10.1038/s41524-023-01180-8

[101]

Neural Tangent Kernel: Convergence and Generalization in Neural Networks. Advances in Neural Information Processing Systems.

[102]

A Framework and Benchmark for Deep Batch Active Learning for Regression. Journal of Machine Learning Research, 2023.

[103]

Black-Box Batch Active Learning for Regression, 2023.

[104]

Vijay V. Vazirani. k-Center. doi:10.1007/978-3-662-04565-7_5

[105]

Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian Processes for Machine Learning. doi:10.7551/mitpress/3206.001.0001


[1]

Albert P. Bartók, Risi Kondor, and Gábor Csányi. On representing chemical environments. Physical Review B, 87(18):184115, 2013. doi:10.1103/PhysRevB.87.184115

[3]

Ilyes Batatia, Philipp Benner, Yuan Chiang, Alin M. Elena, Dávid P. Kovács, Janosh Riebesell, Xavier R. Advincula, Mark Asta, Matthew Avaylon, et al. A foundation model for atomistic materials chemistry, 2023a. URL https://arxiv.org/abs/2401.00096

[4]

Ilyes Batatia, Dávid Péter Kovács, Gregor N. C. Simm, Christoph Ortner, and Gábor Csányi. MACE: Higher order equivariant message passing neural networks for fast and accurate force fields, 2023b. URL https://arxiv.org/abs/2206.07697

[5]

Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris Kozinsky. E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials. Nature Communications, 13:2453, 2022. doi:10.1038/s41467-022-29939-5

[6]

Hubert Beck, Pavol Simko, Lars L. Schaaf, Ondrej Marsalek, and Christoph Schran. Multi-head committees enable direct uncertainty prediction for atomistic foundation models. The Journal of Chemical Physics, 163(23):234103, 2025. doi:10.1063/5.0288994

[7]

Jörg Behler and Michele Parrinello. Generalized neural-network representation of high-dimensional potential-energy surfaces. Physical Review Letters, 98(14):146401, 2007. doi:10.1103/PhysRevLett.98.146401

[8]

James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Yash Katariya, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. JAX: composable transformations of Python+NumPy programs, 2018. URL http://github.com/jax-ml/jax

[9]

Christoph Brunken, Olivier Peltre, Heloise Chomet, Lucien Walewski, Manus McAuliffe, Valentin Heyraud, Solal Attias, Martin Maarand, Yessine Khanfir, Edan Toledo, Fabio Falcioni, Marie Bluntzer, Silvia Acosta-Gutiérrez, and Jules Tilly. Machine learning interatomic potentials: library for efficient training, model development and simulation of molecular s... arXiv, 2025.

[11]

Dario Coscia, Pim de Haan, and Max Welling. BLIPs: Bayesian learned interatomic potentials, 2026. URL https://arxiv.org/abs/2508.14022

[13]

Bowen Deng, Peichen Zhong, KyuJung Jun, Janosh Riebesell, Kevin Han, Christopher J. Bartel, and Gerbrand Ceder. CHGNet as a pretrained universal neural network potential for charge-informed atomistic modelling. Nature Machine Intelligence, 5(9):1031–1041, 2023. doi:10.1038/s42256-023-00716-3

[14]

Bowen Deng, Yunyeong Choi, Peichen Zhong, Janosh Riebesell, Shashwat Anand, Zhuoying Li, KyuJung Jun, Kristin A. Persson, and Gerbrand Ceder. Systematic softening in universal machine learning interatomic potentials. npj Computational Materials, 11:9, 2025. doi:10.1038/s41524-024-01500-6

[18]

Lauri Himanen, Marc O. J. Jäger, Eiaki V. Morooka, Filippo Federici Canova, Yashasvi S. Ranawat, David Z. Gao, Patrick Rinke, and Adam S. Foster. DScribe: Library of descriptors for machine learning in materials science. Computer Physics Communications, 247:106949, 2020. ISSN 0010-4655. doi:10.1016/j.cpc.2019.106949. URL https://doi.org/10.1016/j....

[19]

David Holzmüller, Viktor Zaverkin, Johannes Kästner, and Ingo Steinwart. A framework and benchmark for deep batch active learning for regression. Journal of Machine Learning Research, 24(164):1–81, 2023. URL https://www.jmlr.org/papers/v24/22-0937.html

[22]

Arthur Jacot, Franck Gabriel, and Clément Hongler. Neural tangent kernel: Convergence and generalization in neural networks. In Advances in Neural Information Processing Systems, 2018. URL https://arxiv.org/abs/1806.07572

[23]

Ryosuke Jinnouchi, Ferenc Karsai, and Georg Kresse. On-the-fly machine learning force field generation: Application to melting points. Physical Review B, 100(1):014105, 2019. doi:10.1103/PhysRevB.100.014105

[26]

Matthias Kellner and Michele Ceriotti. Uncertainty quantification by direct propagation of shallow ensembles. Machine Learning: Science and Technology, 5(3):035006, 2024. doi:10.1088/2632-2153/ad594a

[27]

Maksim Kulichenko, Kipton Barros, Nicholas Lubbers, Ying Wai Li, Richard Messerly, Sergei Tretiak, Justin S. Smith, and Benjamin Nebgen. Uncertainty-driven dynamics for active learning of interatomic potentials. Nature Computational Science, 3:230–239, 2023. doi:10.1038/s43588-023-00406-5

[29]

Domantas Kuryla, Fabian Berger, Gábor Csányi, and Angelos Michaelides. How accurate are DFT forces? Unexpectedly large uncertainties in molecular datasets, 2025. URL https://arxiv.org/abs/2510.19774

[32]

Daniel S. Levine, Muhammed Shuaibi, Evan Walter Clark Spotte-Smith, Michael G. Taylor, Muhammad R. Hasyim, Kyle Michel, Ilyes Batatia, Gábor Csányi, Misko Dzamba, Peter Eastman, Nathan C. Frey, Xiang Fu, Vahe Gharakhanyan, Aditi S. Krishnapriyan, Joshua A. Rackers, Sanjeev Raja, Ammar Rizvi, Andrew S. Rosen, Zachary Ulissi, Santiago Vargas, C. Lawrence Zi... arXiv, 2025.

[33]

Yifan Li, Xiuying Zhang, Mingkang Liu, and Lei Shen. A critical review of machine learning interatomic potentials and Hamiltonian. Journal of Materials Informatics, 5(4), 2025. ISSN 2770-372X. doi:10.20517/jmi.2025.17. URL https://www.oaepublish.com/articles/jmi.2025.17

[37]

Andrew A. Peterson, Rune Christensen, and Alireza Khorshidi. Addressing uncertainty in atomistic machine learning. Physical Chemistry Chemical Physics, 19:10978–10985, 2017. doi:10.1039/C7CP00375G

[38]

Evgeny V. Podryabinkin and Alexander V. Shapeev. Active learning of linearly parametrized interatomic potentials. Computational Materials Science, 140:171–180, 2017. doi:10.1016/j.commatsci.2017.08.031

[41]

Benjamin Rhodes, Sander Vandenhaute, Vaidotas Šimkus, James Gin, Jonathan Godwin, Tim Duignan, and Mark Neumann. Orb-v3: atomistic simulation at scale, 2025. URL https://arxiv.org/abs/2504.06231

[42]

David Rogers and Mathew Hahn. Extended-connectivity fingerprints. Journal of Chemical Information and Modeling, 50(5):742–754, 2010. doi:10.1021/ci100050t

[43]

Sushree Jagriti Sahoo, Mikael Maraschin, Daniel S. Levine, Zachary Ulissi, C. Lawrence Zitnick, Joel B. Varley, Joseph A. Gauthier, Nitish Govindarajan, and Muhammed Shuaibi. The Open Catalyst 2025 (OC25) dataset and models for solid-liquid interfaces, 2025. URL https://arxiv.org/abs/2509.17862

[45]

Christoph Schran, Krystof Brezina, and Ondrej Marsalek. Committee neural network potentials control generalization errors and enable active learning. The Journal of Chemical Physics, 153(10):104105, 2020. doi:10.1063/5.0016004

[47]

Kristof T. Schütt, Huziel E. Sauceda, Pieter-Jan Kindermans, Alexandre Tkatchenko, and Klaus-Robert Müller. SchNet – a deep learning architecture for molecules and materials. The Journal of Chemical Physics, 148(24):241722, 2018. doi:10.1063/1.5019779

[49]

Justin S. Smith, Ben Nebgen, Nicholas Lubbers, Olexandr Isayev, and Adrian E. Roitberg. Less is more: Sampling chemical space with active learning. The Journal of Chemical Physics, 148(24):241733, 2018. doi:10.1063/1.5023802

[51]

Jonathan Vandermause, Steven B. Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M. Kolpak, and Boris Kozinsky. On-the-fly active learning of interpretable Bayesian force fields for atomistic rare events. npj Computational Materials, 6:20, 2020. doi:10.1038/s41524-020-0283-z

[52]

Eszter Varga-Umbrich, Shikha Surana, Paul Duckworth, Jules Tilly, Olivier Peltre, and Zachary Weller-Davies. Pretrained model representations as acquisition signals for active learning of MLIPs, 2026. URL https://arxiv.org/abs/2605.03964

[55]

Brandon M. Wood, Misko Dzamba, Xiang Fu, Meng Gao, Muhammed Shuaibi, Luis Barroso-Luque, Kareem Abdelmaqsoud, Vahe Gharakhanyan, John R. Kitchin, Daniel S. Levine, Kyle Michel, Anuroop Sriram, Taco Cohen, Abhishek Das, Ammar Rizvi, Sushree Jagriti Sahoo, Zachary W. Ulissi, and C. Lawrence Zitnick. UMA: A family of universal models for atoms, 2026. URL htt...

[56]

Viktor Zaverkin, David Holzmüller, Ingo Steinwart, and Johannes Kästner. Exploring chemical and conformational spaces by batch mode deep active learning. Digital Discovery, 1:605–620, 2022. doi:10.1039/D2DD00034B

[57]

Viktor Zaverkin, David Holzmüller, Henrik Christiansen, Federico Errica, Francesco Alesiani, Makoto Takamoto, Mathias Niepert, and Johannes Kästner. Uncertainty-biased molecular dynamics for learning uniformly accurate interatomic potentials. npj Computational Materials, 10:83, 2024. doi:10.1038/s41524-024-01254-1
