Recognition: 2 theorem links
Comparing the latent features of universal machine-learning interatomic potentials
Pith reviewed 2026-05-17 01:17 UTC · model grok-4.3
The pith
Universal machine-learning interatomic potentials encode chemical space in significantly distinct latent features.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
uMLIPs encode the chemical space in significantly distinct ways, with substantial cross-model feature reconstruction errors. When variants of the same model architecture are considered, trends become dependent on the dataset, target, and training protocol of choice. Fine-tuning of a uMLIP retains a strong pre-training bias in the latent features. Atom-level features can be compressed into global structure-level features via concatenation of progressive cumulants, each adding significantly new information about the variability across the atomic environments within a given system.
What carries the argument
Feature reconstruction error, used to quantify the relative information content of latent features by testing how well one model's latent features can be recovered from another's.
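The paper uses the global and local feature reconstruction errors of Goscinski et al. [49]. A schematic form of such a measure, where the train/test split and ridge penalty are assumptions of this sketch rather than the paper's exact protocol:

```latex
% Reconstructing model B's (standardized) latent features X^B from model A's X^A:
\mathrm{FRE}(A \to B)
  = \frac{\lVert X^{B}_{\mathrm{test}} - X^{A}_{\mathrm{test}}\, W^{*} \rVert_F}
         {\lVert X^{B}_{\mathrm{test}} \rVert_F},
\qquad
W^{*} = \operatorname*{arg\,min}_{W}\;
        \lVert X^{B}_{\mathrm{train}} - X^{A}_{\mathrm{train}}\, W \rVert_F^{2}
        + \lambda \lVert W \rVert_F^{2}.
```

The measure is asymmetric: a low FRE(A → B) together with a high FRE(B → A) would indicate that model A's features subsume model B's rather than merely overlap with them.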
If this is right
- Different uMLIPs cannot be substituted for one another without loss of information about certain chemical environments.
- Fine-tuned models carry forward biases from their pre-training data into downstream applications.
- Global structure descriptors built from stacked cumulants capture additional variability not present in single-atom features.
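A minimal sketch of the cumulant compression, assuming "progressive cumulants" means the per-feature mean plus central moments of increasing order (these coincide with cumulants up to order three); the function name and conventions are illustrative, not the authors' code:

```python
import numpy as np

def structure_features(atom_feats: np.ndarray, max_order: int = 3) -> np.ndarray:
    """Compress per-atom latent features (n_atoms, d) into one structure-level
    vector by concatenating progressive cumulants over the atoms: the mean,
    then central moments of order 2..max_order, each adding information about
    the variability across atomic environments within the structure."""
    mean = atom_feats.mean(axis=0)
    centered = atom_feats - mean
    blocks = [mean]
    for k in range(2, max_order + 1):
        blocks.append((centered ** k).mean(axis=0))  # k-th central moment per feature
    return np.concatenate(blocks)  # shape: (max_order * d,)

# Toy usage: a structure with 32 atoms and 8-dimensional latent features.
feats = np.random.default_rng(0).normal(size=(32, 8))
print(structure_features(feats).shape)  # (24,)
```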
Where Pith is reading between the lines
- Ensembles that draw on multiple uMLIPs could cover a wider range of chemical environments by exploiting their non-overlapping latent information.
- Aligning latent spaces across models might improve transfer learning and reduce the need for full retraining.
- Task-specific selection of a uMLIP based on its latent biases could improve performance in targeted areas such as catalysis or defect chemistry.
Load-bearing premise
Feature reconstruction error serves as a reliable and unbiased proxy for the distinct information content of latent features across models that differ in architecture, dataset, and training protocol.
What would settle it
Training a mapping from the latent features of one uMLIP to those of another and finding that reconstruction error remains low on a chemically diverse test set of atomic configurations would falsify the claim of substantially distinct encodings.
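A minimal sketch of that test, assuming z-scored features, a 50/50 split, and a ridge-regularized linear map (all hypothetical choices; the paper itself uses the reconstruction metrics of Goscinski et al., available in scikit-matter):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def cross_model_fre(X_a, X_b, alpha=1e-6, seed=0):
    """Error of reconstructing model B's latent features from model A's via a
    linear map fit on a train split. Persistently high held-out error supports
    the claim of distinct encodings; low error on a chemically diverse test
    set would falsify it."""
    Xa_tr, Xa_te, Xb_tr, Xb_te = train_test_split(
        X_a, X_b, test_size=0.5, random_state=seed)
    sa = StandardScaler().fit(Xa_tr)
    sb = StandardScaler().fit(Xb_tr)
    Xa_tr, Xa_te = sa.transform(Xa_tr), sa.transform(Xa_te)
    Xb_tr, Xb_te = sb.transform(Xb_tr), sb.transform(Xb_te)
    W = Ridge(alpha=alpha, fit_intercept=False).fit(Xa_tr, Xb_tr)
    resid = Xb_te - W.predict(Xa_te)
    return np.linalg.norm(resid) / np.linalg.norm(Xb_te)  # 0 = perfect recovery
```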
Original abstract
The past few years have seen the development of “universal” machine-learning interatomic potentials (uMLIPs) capable of approximating the ground-state potential energy surface across a wide range of chemical structures and compositions with reasonable accuracy. While these models differ in the architecture and the dataset used, they share the ability to compress a staggering amount of chemical information into descriptive latent features. Herein, we systematically analyze what the different uMLIPs have learned by quantitatively assessing the relative information content of their latent features with feature reconstruction errors, and observing how the trends are affected by the choice of training set and training protocol. We find that uMLIPs encode the chemical space in significantly distinct ways, with substantial cross-model feature reconstruction errors. When variants of the same model architecture are considered, trends become dependent on the dataset, target, and training protocol of choice. We also observe that fine-tuning of a uMLIP retains a strong pre-training bias in the latent features. Finally, we discuss how atom-level features, which are directly output by MLIPs, can be compressed into global structure-level features via concatenation of progressive cumulants, each adding significantly new information about the variability across the atomic environments within a given system.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript analyzes latent features of several universal machine-learning interatomic potentials (uMLIPs). Using feature reconstruction errors as a quantitative proxy, the authors conclude that these models encode chemical space in significantly distinct ways, with large cross-model reconstruction errors. For architecture variants, observed trends depend on dataset, target property, and training protocol. Fine-tuning preserves a strong pre-training bias in the latent space. The work also introduces a cumulant-based procedure to compress atom-level features into global structure descriptors.
Significance. If the reconstruction-based comparisons are robust, the results would provide a practical basis for assessing information overlap among uMLIPs, informing model selection, ensembling, and fine-tuning strategies. The retention of pre-training bias and the cumulant compression method are potentially useful for transfer-learning studies and for deriving system-level descriptors from local atomic environments.
major comments (2)
- [Feature reconstruction procedure] The central claim that uMLIPs encode chemical space in significantly distinct ways rests on cross-model feature reconstruction errors. The manuscript does not describe whether these reconstructions employ dimension-matched linear maps, nonlinear probes, or explicit normalization to correct for differences in latent dimensionality, scale, and basis across architectures. Without such controls, the reported errors may be dominated by representational incompatibility rather than genuine differences in encoded chemistry (see abstract and the section describing the reconstruction procedure).
- [Results on architecture variants] The observation that trends for same-architecture variants become dataset- and protocol-dependent is noted, yet the paper provides no quantitative decomposition (e.g., variance partitioning or ablation across training sets) to separate the contribution of these factors from intrinsic model differences. This weakens the ability to interpret the magnitude of cross-model distinctions.
minor comments (2)
- [Discussion] The cumulant compression method is introduced in the final paragraph; a short methods subsection or supplementary note clarifying the exact definition of progressive cumulants and their information gain would improve reproducibility.
- [Abstract] Error bars or statistical significance measures on the reported reconstruction errors are not mentioned in the abstract; adding them would strengthen the quantitative claims.
Simulated Author's Rebuttal
We thank the referee for their insightful comments, which have helped us improve the clarity of our analysis on the latent features of uMLIPs. Below we provide point-by-point responses to the major comments.
Point-by-point responses
-
Referee: The central claim that uMLIPs encode chemical space in significantly distinct ways rests on cross-model feature reconstruction errors. The manuscript does not describe whether these reconstructions employ dimension-matched linear maps, nonlinear probes, or explicit normalization to correct for differences in latent dimensionality, scale, and basis across architectures. Without such controls, the reported errors may be dominated by representational incompatibility rather than genuine differences in encoded chemistry.
Authors: We agree that the description of the reconstruction procedure requires more detail to address potential concerns about representational incompatibility. In the original manuscript, the procedure is outlined in the methods section, but we have now expanded it to explicitly state that we employ dimension-matched linear maps using least-squares regression, along with z-score normalization for each feature set to account for scale and basis differences. These controls were chosen to provide a conservative assessment of information overlap. The revised text includes the mathematical definition and additional validation that the high errors persist under these conditions, supporting our conclusions about distinct encodings. revision: yes
-
Referee: The observation that trends for same-architecture variants become dataset- and protocol-dependent is noted, yet the paper provides no quantitative decomposition (e.g., variance partitioning or ablation across training sets) to separate the contribution of these factors from intrinsic model differences. This weakens the ability to interpret the magnitude of cross-model distinctions.
Authors: We acknowledge that a quantitative decomposition would strengthen the claims regarding the relative importance of dataset and protocol versus intrinsic model differences. Although our study already examines multiple datasets and protocols to illustrate the dependence, we have added a new analysis in the revised manuscript. This includes an ablation study where we systematically vary one factor while holding others constant and perform a variance partitioning to quantify contributions. The results indicate that while dataset and protocol influence the trends, the cross-model distinctions remain substantial, consistent with our original findings. revision: yes
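A minimal sketch of the promised decomposition, assuming reconstruction errors labeled by (dataset, target, protocol); the between-group sum-of-squares shares below are one simple choice of variance partitioning, not the authors' method:

```python
import numpy as np

FACTORS = ("dataset", "target", "protocol")

def variance_shares(errors: dict) -> dict:
    """One-way variance partitioning: for each factor, the share of total
    variance in reconstruction error explained by grouping on that factor
    alone. Keys of `errors` are (dataset, target, protocol) tuples; in an
    unbalanced design the shares need not sum to one."""
    keys = list(errors)
    vals = np.array([errors[k] for k in keys])
    total = ((vals - vals.mean()) ** 2).sum()
    shares = {}
    for i, name in enumerate(FACTORS):
        between = 0.0
        for level in {k[i] for k in keys}:
            group = vals[np.array([k[i] == level for k in keys])]
            between += len(group) * (group.mean() - vals.mean()) ** 2
        shares[name] = between / total
    return shares

# Toy usage with hypothetical labels.
errs = {("MPtrj", "energy", "scratch"): 0.42,
        ("MPtrj", "energy", "fine-tuned"): 0.35,
        ("OMat24", "energy", "scratch"): 0.51,
        ("OMat24", "forces", "scratch"): 0.58}
print(variance_shares(errs))
```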
Circularity Check
No significant circularity in empirical latent feature comparison
Full rationale
The paper's analysis rests on direct empirical computation of cross-model feature reconstruction errors to compare latent spaces of uMLIPs. No derivation chain, first-principles prediction, or equation is presented that reduces a claimed result to its own inputs by construction. Trends are reported as observed outcomes of varying datasets, targets, and protocols rather than fitted quantities renamed as predictions. No self-citation is invoked as a load-bearing uniqueness theorem or ansatz source. The methodology is self-contained against external benchmarks via explicit reconstruction metrics, consistent with a non-circular empirical study.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
unclear: Relation between the paper passage and the cited Recognition theorem.
We find that uMLIPs encode the chemical space in significantly distinct ways, with substantial cross-model feature reconstruction errors... using the global and local feature reconstruction errors of Goscinski et al.
-
IndisputableMonolith/Foundation/AlexanderDuality.lean · alexander_duality_circle_linking
unclear: Relation between the paper passage and the cited Recognition theorem.
We also demonstrate that fine-tuning of the uMLIP exhibits a strong pre-training bias in the latent feature space... concatenation of progressive cumulants
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
(SI Fig. S7) shows that smaller models can still approximate the principal manifold of larger ones reasonably well, with the medium-sized variant offering a good information content/size tradeoff. Finally, we assessed the impact of single-task versus multi-head readout architectures using two MACE models trained on the same OMat24 dataset: the original si...
-
[2]
within the Materials Cloud [82] Archive (DOI: 10.24435/materialscloud:r5-vh). Model weights used in this work were obtained from their public repositories: mace-foundations (https://github.com/ACEsuit/mace-foundations), PET-MAD (https://github.com/lab-cosmo/pet-mad), DPA-3.1-3M (https://www.aissquare.com/models/detail?name=DPA-3.1-3M&id=343&pageType=mode...
-
[3]
A practical guide to machine learning interatomic potentials – status and future,
Ryan Jacobs, Dane Morgan, Siamak Attarian, Jun Meng, Chen Shen, Zhenghao Wu, Clare Yijia Xie, Julia H. Yang, Nongnuch Artrith, Ben Blaiszik, Gerbrand Ceder, Kamal Choudhary, Gábor Csányi, Ekin Dogus Cubuk, Bowen Deng, Ralf Drautz, Xiang Fu, Jonathan Godwin, Vasant Honavar, Olexandr Isayev, Anders Johansson, Boris Kozinsky, Stefano Martiniani, Shyue Ping...
work page 2025
-
[4]
Recent advances and applications of machine learning in solid-state materials science,
Jonathan Schmidt, Mário R. G. Marques, Silvana Botti, and Miguel A. L. Marques, “Recent advances and applications of machine learning in solid-state materials science,” npj Computational Materials 5, 83 (2019)
work page 2019
-
[5]
Eric Qu and Aditi S. Krishnapriyan, “The importance of being scalable: Improving the speed and accuracy of neural network interatomic potentials across chemical domains,” in Advances in Neural Information Processing Systems, Vol. 37, edited by A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, and C. Zhang (Curran Associates, Inc., 2024...
work page 2024
-
[6]
Learning the electronic density of states in condensed matter,
Chiheb Ben Mahmoud, Andrea Anelli, Gábor Csányi, and Michele Ceriotti, “Learning the electronic density of states in condensed matter,” Phys. Rev. B 102, 235130 (2020)
work page 2020
-
[7]
Adaptive energy reference for machine-learning models of the electronic density of states,
Wei Bin How, Sanggyu Chong, Federico Grasselli, Kevin K. Huguenin-Dumittan, and Michele Ceriotti, “Adaptive energy reference for machine-learning models of the electronic density of states,” Phys. Rev. Mater. 9, 013802 (2025)
work page 2025
-
[8]
K. T. Schütt, M. Gastegger, A. Tkatchenko, K. R. Müller, and R. J. Maurer, “Unifying machine learning and quantum chemistry with a deep neural network for molecular wavefunctions,” Nature Communications 10, 5024 (2019)
work page 2019
-
[9]
Divya Suman, Jigyasa Nigam, Sandra Saade, Paolo Pegolo, Hanna Türk, Xing Zhang, Garnet Kin-Lic Chan, and Michele Ceriotti, “Exploring the design space of machine learning models for quantum chemistry with a fully differentiable framework,” Journal of Chemical Theory and Computation 21, 6505–6516 (2025)
work page 2025
-
[10]
A graph neural network for the era of large atomistic models,
Duo Zhang, Anyang Peng, Chun Cai, Wentao Li, Yuanchang Zhou, Jinzhe Zeng, Mingyu Guo, Chengqian Zhang, Bowen Li, Hong Jiang, Tong Zhu, Weile Jia, Linfeng Zhang, and Han Wang, “A graph neural network for the era of large atomistic models,” (2025), arXiv:2506.01686 [physics.comp-ph]
-
[11]
Uma: A family of universal models for atoms,
Brandon M. Wood, Misko Dzamba, Xiang Fu, Meng Gao, Muhammed Shuaibi, Luis Barroso-Luque, Kareem Abdelmaqsoud, Vahe Gharakhanyan, John R. Kitchin, Daniel S. Levine, Kyle Michel, Anuroop Sriram, Taco Cohen, Abhishek Das, Ammar Rizvi, Sushree Jagriti Sahoo, Zachary W. Ulissi, and C. Lawrence Zitnick, “Uma: A family of universal models for atoms,” (2025), ar...
-
[12]
A foundation model for atomistic materials chemistry,
Ilyes Batatia, Philipp Benner, Yuan Chiang, Alin M. Elena, Dávid P. Kovács, Janosh Riebesell, Xavier R. Advincula, Mark Asta, Matthew Avaylon, William J. Baldwin, Fabian Berger, Noam Bernstein, Arghya Bhowmik, Filippo Bigi, Samuel M. Blau, Vlad Cărare, Michele Ceriotti, Sanggyu Chong, James P. Darby, Sandip De, Flaviano Della Pia, Volker L. Deri...
work page 2025
-
[13]
Ilyes Batatia, Chen Lin, Joseph Hart, Elliott Kasoar, Alin M. Elena, Sam Walton Norwood, Thomas Wolf, and Gábor Csányi, “Cross learning between electronic structure theories for unifying molecular, surface, and inorganic crystal foundation force fields,” (2025), arXiv:2510.25380 [physics.chem-ph]
-
[14]
Scaling deep learning for materials discovery,
Amil Merchant, Simon Batzner, Samuel S. Schoenholz, Muratahan Aykol, Gowoon Cheon, and Ekin Dogus Cubuk, “Scaling deep learning for materials discovery,” Nature 624, 80–85 (2023)
work page 2023
-
[15]
Pet-mad as a lightweight universal interatomic potential for advanced materials modeling,
Arslan Mazitov, Filippo Bigi, Matthias Kellner, Paolo Pegolo, Davide Tisi, Guillaume Fraux, Sergey Pozdnyakov, Philip Loche, and Michele Ceriotti, “Pet-mad as a lightweight universal interatomic potential for advanced materials modeling,” Nature Communications 16, 10653 (2025)
work page 2025
-
[16]
MatterSim: A Deep Learning Atomistic Model Across Elements, Temperatures and Pressures
Han Yang, Chenxi Hu, Yichi Zhou, Xixian Liu, Yu Shi, Jielan Li, Guanzhi Li, Zekun Chen, Shuizhou Chen, Claudio Zeni, Matthew Horton, Robert Pinsler, Andrew Fowler, Daniel Zügner, Tian Xie, Jake Smith, Lixin Sun, Qian Wang, Lingyu Kong, Chang Liu, Hongxia Hao, and Ziheng Lu, “Mattersim: A deep learning atomistic model across elements, temperatures and ...
work page 2024
-
[18]
Chgnet as a pretrained universal neural network potential for charge-informed atomistic modelling,
Bowen Deng, Peichen Zhong, KyuJung Jun, Janosh Riebesell, Kevin Han, Christopher J. Bartel, and Gerbrand Ceder, “Chgnet as a pretrained universal neural network potential for charge-informed atomistic modelling,” Nature Machine Intelligence 5, 1031–1041 (2023)
work page 2023
-
[19]
Improving machine-learning models in materials science through large datasets,
Jonathan Schmidt, Tiago F.T. Cerqueira, Aldo H. Romero, Antoine Loew, Fabian Jäger, Hai-Chen Wang, Silvana Botti, and Miguel A.L. Marques, “Improving machine-learning models in materials science through large datasets,” Materials Today Physics 48, 101560 (2024)
work page 2024
-
[20]
Massive atomic diversity: a compact universal dataset for atomistic machine learning,
Arslan Mazitov, Sofiia Chorna, Guillaume Fraux, Marnik Bercx, Giovanni Pizzi, Sandip De, and Michele Ceriotti, “Massive atomic diversity: a compact universal dataset for atomistic machine learning,” Scientific Data 12, 1857 (2025)
work page 2025
-
[21]
A foundational potential energy surface dataset for materials,
Aaron D. Kaplan, Runze Liu, Ji Qi, Tsz Wai Ko, Bowen Deng, Janosh Riebesell, Gerbrand Ceder, Kristin A. Persson, and Shyue Ping Ong, “A foundational potential energy surface dataset for materials,” (2025), arXiv:2503.04070 [cond-mat.mtrl-sci]
-
[22]
Spice, a dataset of drug-like molecules and peptides for training machine learning potentials,
Peter Eastman, Pavan Kumar Behara, David L. Dotson, Raimondas Galvelis, John E. Herr, Josh T. Horton, Yuezhi Mao, John D. Chodera, Benjamin P. Pritchard, Yuanqing Wang, Gianni De Fabritiis, and Thomas E. Markland, “Spice, a dataset of drug-like molecules and peptides for training machine learning potentials,” Scientific Data 10, 11 (2023)
work page 2023
-
[23]
Open catalyst 2020 (oc20) dataset and community challenges,
Lowik Chanussot, Abhishek Das, Siddharth Goyal, Thibaut Lavril, Muhammed Shuaibi, Morgane Riviere, Kevin Tran, Javier Heras-Domingo, Caleb Ho, Weihua Hu, Aini Palizhati, Anuroop Sriram, Brandon Wood, Junwoong Yoon, Devi Parikh, C. Lawrence Zitnick, and Zachary Ulissi, “Open catalyst 2020 (oc20) dataset and community challenges,” ACS Catalysis 11, 6059–6...
work page 2020
-
[24]
The open dac 2023 dataset and challenges for sorbent discovery in direct air capture,
Anuroop Sriram, Sihoon Choi, Xiaohan Yu, Logan M. Brabson, Abhishek Das, Zachary Ulissi, Matt Uyttendaele, Andrew J. Medford, and David S. Sholl, “The open dac 2023 dataset and challenges for sorbent discovery in direct air capture,” (2023), arXiv:2311.00341 [cond-mat.mtrl-sci]
-
[25]
Open Materials 2024 (OMat24) Inorganic Materials Dataset and Models
Luis Barroso-Luque, Muhammed Shuaibi, Xiang Fu, Brandon M. Wood, Misko Dzamba, Meng Gao, Ammar Rizvi, C. Lawrence Zitnick, and Zachary W. Ulissi, “Open materials 2024 (omat24) inorganic materials dataset and models,” (2024), arXiv:2410.12771 [cond-mat.mtrl-sci]
work page 2024
-
[26]
Open molecular crystals 2025 (omc25) dataset and models,
Vahe Gharakhanyan, Luis Barroso-Luque, Yi Yang, Muhammed Shuaibi, Kyle Michel, Daniel S. Levine, Misko Dzamba, Xiang Fu, Meng Gao, Xingyu Liu, Haoran Ni, Keian Noori, Brandon M. Wood, Matt Uyttendaele, Arman Boromand, C. Lawrence Zitnick, Noa Marom, Zachary W. Ulissi, and Anuroop Sriram, “Open molecular crystals 2025 (omc25) dataset and models,” (2025), a...
-
[27]
The open molecules 2025 (omol25) dataset, evaluations, and models,
Daniel S. Levine, Muhammed Shuaibi, Evan Walter Clark Spotte-Smith, Michael G. Taylor, Muhammad R. Hasyim, Kyle Michel, Ilyes Batatia, Gábor Csányi, Misko Dzamba, Peter Eastman, Nathan C. Frey, Xiang Fu, Vahe Gharakhanyan, Aditi S. Krishnapriyan, Joshua A. Rackers, Sanjeev Raja, Ammar Rizvi, Andrew S. Rosen, Zachary Ulissi, Santiago Vargas, C. Lawrenc...
-
[28]
A framework to evaluate machine learning crystal stability predictions,
Janosh Riebesell, Rhys E. A. Goodall, Philipp Benner, Yuan Chiang, Bowen Deng, Gerbrand Ceder, Mark Asta, Alpha A. Lee, Anubhav Jain, and Kristin A. Persson, “A framework to evaluate machine learning crystal stability predictions,” Nature Machine Intelligence 7, 836–847 (2025)
work page 2025
-
[29]
Lambench: A benchmark for large atomistic models,
Anyang Peng, Chun Cai, Mingyu Guo, Duo Zhang, Chengqian Zhang, Wanrun Jiang, Yinan Wang, Antoine Loew, Chengkun Wu, Weinan E, Linfeng Zhang, and Han Wang, “Lambench: A benchmark for large atomistic models,” (2025), arXiv:2504.19578 [physics.comp-ph]
-
[30]
Yuan Chiang, Tobias Kreiman, Christine Zhang, Matthew C. Kuner, Elizabeth Weaver, Ishan Amin, Hyunsoo Park, Yunsung Lim, Jihan Kim, Daryl Chrzan, Aron Walsh, Samuel M. Blau, Mark Asta, and Aditi S. Krishnapriyan, “Mlip arena: Advancing fairness and transparency in machine learning interatomic potentials via an open, accessible benchmark platform,” (2025),...
-
[31]
Klara Bonneau, Jonas Lederer, Clark Templeton, David Rosenberger, Lorenzo Giambagli, Klaus-Robert Müller, and Cecilia Clementi, “Peering inside the black box by learning the relevance of many-body functions in neural network potentials,” Nature Communications 16, 9898 (2025)
work page 2025
-
[32]
A prediction rigidity formalism for low-cost uncertainties in trained neural networks,
Filippo Bigi, Sanggyu Chong, Michele Ceriotti, and Federico Grasselli, “A prediction rigidity formalism for low-cost uncertainties in trained neural networks,” Machine Learning: Science and Technology 5, 045018 (2024)
work page 2024
-
[33]
Fast uncertainty estimates in deep learning interatomic potentials,
Albert Zhu, Simon Batzner, Albert Musaelian, and Boris Kozinsky, “Fast uncertainty estimates in deep learning interatomic potentials,” The Journal of Chemical Physics 158, 164111 (2023)
work page 2023
-
[34]
Uncertainty quantification for predictions of atomistic neural networks,
Luis Itza Vazquez-Salazar, Eric D. Boittier, and Markus Meuwly, “Uncertainty quantification for predictions of atomistic neural networks,” Chem. Sci. 13, 13068–13084 (2022)
work page 2022
-
[35]
Uncertainty-driven dynamics for active learning of interatomic potentials,
Maksim Kulichenko, Kipton Barros, Nicholas Lubbers, Ying Wai Li, Richard Messerly, Sergei Tretiak, Justin S. Smith, and Benjamin Nebgen, “Uncertainty-driven dynamics for active learning of interatomic potentials,” Nature Computational Science 3, 230–239 (2023)
work page 2023
-
[36]
A quantitative uncertainty metric controls error in neural network-driven chemical discovery,
Jon Paul Janet, Chenru Duan, Tzuhsiung Yang, Aditya Nandy, and Heather J. Kulik, “A quantitative uncertainty metric controls error in neural network-driven chemical discovery,” Chem. Sci. 10, 7913–7922 (2019)
work page 2019
-
[37]
Analysis of uncertainty of neural fingerprint-based models,
Christian W. Feldmann, Jochen Sieg, and Miriam Mathea, “Analysis of uncertainty of neural fingerprint-based models,” Faraday Discuss. 256, 551–567 (2025)
work page 2025
-
[38]
Outlier-detection for reactive machine learned potential energy surfaces,
Luis Itza Vazquez-Salazar, Silvan Käser, and Markus Meuwly, “Outlier-detection for reactive machine learned potential energy surfaces,” npj Computational Materials 11, 33 (2025)
work page 2025
-
[39]
Daniel Schwalbe-Koda, Sebastien Hamel, Babak Sadigh, Fei Zhou, and Vincenzo Lordi, “Model-free estimation of completeness, uncertainties, and outliers in atomistic machine learning using information theory,” Nature Communications 16 (2025), 10.1038/s41467-025-59232-0
-
[40]
Yu Xie, Jonathan Vandermause, Lixin Sun, Andrea Cepellotti, and Boris Kozinsky, “Bayesian force fields from active learning for simulation of inter-dimensional transformation of stanene,” npj Computational Materials 7, 40 (2021)
work page 2021
-
[41]
On-the-fly active learning of interpretable bayesian force fields for atomistic rare events,
Jonathan Vandermause, Steven B. Torrisi, Simon Batzner, Yu Xie, Lixin Sun, Alexie M. Kolpak, and Boris Kozinsky, “On-the-fly active learning of interpretable bayesian force fields for atomistic rare events,” npj Computational Materials 6, 20 (2020)
work page 2020
-
[42]
Data generation for machine learning interatomic potentials and beyond,
M. Kulichenko, B. Nebgen, N. Lubbers, J. S. Smith, K. Barros, A. E. A. Allen, A. Habib, E. Shinkle, N. Fedik, Y. W. Li, R. A. Messerly, and S. Tretiak, “Data generation for machine learning interatomic potentials and beyond,” Chemical Reviews 124, 13681–13714 (2024)
work page 2024
-
[43]
Data curation for machine learning interatomic potentials by determinantal point processes,
Joanna Zou and Youssef Marzouk, “Data curation for machine learning interatomic potentials by determinantal point processes,” in Proceedings of the International Conference on Learning Representations (ICLR) (2025)
work page 2025
-
[44]
Ranking the information content of distance measures,
Aldo Glielmo, Claudio Zeni, Bingqing Cheng, Gábor Csányi, and Alessandro Laio, “Ranking the information content of distance measures,” PNAS Nexus 1, pgac039 (2022)
work page 2022
-
[45]
Mapping and classifying molecules from a high-throughput structural database,
Sandip De, Felix Musil, Teresa Ingram, Carsten Baldauf, and Michele Ceriotti, “Mapping and classifying molecules from a high-throughput structural database,” Journal of Cheminformatics 9, 6 (2017)
work page 2017
-
[46]
A critical examination of compound stability predictions from machine-learned formation energies,
Christopher J. Bartel, Amalie Trewartha, Qi Wang, Alexander Dunn, Anubhav Jain, and Gerbrand Ceder, “A critical examination of compound stability predictions from machine-learned formation energies,” npj Computational Materials 6, 97 (2020)
work page 2020
-
[47]
Universally converging representations of matter across scientific foundation models,
Sathya Edamadaka, Soojung Yang, Ju Li, and Rafael Gómez-Bombarelli, “Universally converging representations of matter across scientific foundation models,” (2025), arXiv:2512.03750 [cs.LG]
-
[48]
Platonic representation of foundation machine learning interatomic potentials,
Zhenzhu Li and Aron Walsh, “Platonic representation of foundation machine learning interatomic potentials,” (2025), arXiv:2512.05349 [cond-mat.mtrl-sci]
-
[49]
The role of feature space in atomistic learning,
Alexander Goscinski, Guillaume Fraux, Giulio Imbalzano, and Michele Ceriotti, “The role of feature space in atomistic learning,” Machine Learning: Science and Technology 2, 025028 (2021)
work page 2021
-
[50]
A universal machine learning model for the electronic density of states,
Wei Bin How, Pol Febrer, Sanggyu Chong, Arslan Mazitov, Filippo Bigi, Matthias Kellner, Sergey Pozdnyakov, and Michele Ceriotti, “A universal machine learning model for the electronic density of states,” (2025), arXiv:2508.17418 [physics.chem-ph]
-
[51]
Smooth, exact rotational symmetrization for deep learning on point clouds,
Sergey Pozdnyakov and Michele Ceriotti, “Smooth, exact rotational symmetrization for deep learning on point clouds,” in Advances in Neural Information Processing Systems (NeurIPS 2023) (2023)
work page 2023
-
[52]
DeePMD-kit v3: A Multiple-Backend Framework for Machine Learning Potentials,
Jinzhe Zeng, Duo Zhang, Anyang Peng, Xiangyu Zhang, Sensen He, Yan Wang, Xinzijian Liu, Hangrui Bi, Yifan Li, Chun Cai, Chengqian Zhang, Yiming Du, Jia-Xin Zhu, Pinghui Mo, Zhengtao Huang, Qiyu Zeng, Shaochen Shi, Xuejian Qin, Zhaoxi Yu, Chenxing Luo, Ye Ding, Yun-Pei Liu, Ruosong Shi, Zhenyu Wang, Sigbjørn Løland Bore, Junhan Chang, Zhe Deng, Zhaohan Din...
work page 2025
-
[53]
Learning smooth and expressive interatomic potentials for physical property prediction,
Xiang Fu, Brandon M Wood, Luis Barroso-Luque, Daniel S. Levine, Meng Gao, Misko Dzamba, and C. Lawrence Zitnick, “Learning smooth and expressive interatomic potentials for physical property prediction,” in Proceedings of the 42nd International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 267, edited by Aarti Singh, Marya...
work page 2025
-
[54]
Alexander Goscinski, Victor Paul Principe, Guillaume Fraux, Sergei Kliavinek, Benjamin Aaron Helfrecht, Philip Loche, Michele Ceriotti, and Rose Kathleen Cersonsky, “scikit-matter: A suite of generalisable machine learning methods born out of chemistry and materials science,” Open Research Europe 3, 81 (2023)
work page 2023
-
[55]
Principal covariates regression: Part i. theory,
Sijmen de Jong and Henk A.L. Kiers, “Principal covariates regression: Part i. theory,” Chemometrics and Intelligent Laboratory Systems 14, 155–164 (1992), Proceedings of the 2nd Scandinavian Symposium on Chemometrics
work page 1992
-
[56]
Improving sample and feature selection with principal covariates regression,
Rose K Cersonsky, Benjamin A Helfrecht, Edgar A Engel, Sergei Kliavinek, and Michele Ceriotti, “Improving sample and feature selection with principal covariates regression,” Machine Learning: Science and Technology 2, 035038 (2021)
work page 2021
-
[57]
Rich Caruana, “Multitask learning,” Machine Learning 28, 41–75 (1997)
work page 1997
-
[58]
An Overview of Multi-Task Learning in Deep Neural Networks
Sebastian Ruder, “An overview of multi-task learning in deep neural networks,” (2017), arXiv:1706.05098 [cs.LG]
work page 2017
-
[59]
Adaptive mixtures of local experts,
Robert A. Jacobs, Michael I. Jordan, Steven J. Nowlan, and Geoffrey E. Hinton, “Adaptive mixtures of local experts,” Neural Computation 3, 79–87 (1991)
work page 1991
-
[60]
Scaling vision with sparse mixture of experts,
Carlos Riquelme, Joan Puigcerver, Basil Mustafa, Maxim Neumann, Rodolphe Jenatton, André Susano Pinto, Daniel Keysers, and Neil Houlsby, “Scaling vision with sparse mixture of experts,” in Advances in Neural Information Processing Systems, Vol. 34 (2021) pp. 8583–8595
work page 2021
-
[61]
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer,
Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc V. Le, Geoffrey Hinton, and Jeff Dean, “Outrageously large neural networks: The sparsely-gated mixture-of-experts layer,” in Proceedings of the International Conference on Learning Representations (ICLR) (2017)
work page 2017
-
[62]
Generalized Gradient Approximation made simple,
J. P. Perdew, K. Burke, and M. Ernzerhof, “Generalized Gradient Approximation made simple,” Phys. Rev. Lett. 77, 3865 (1996)
work page 1996
-
[63]
Accurate and numerically efficient r2scan meta-generalized gradient approximation,
James W. Furness, Aaron D. Kaplan, Jinliang Ning, John P. Perdew, and Jianwei Sun, “Accurate and numerically efficient r2scan meta-generalized gradient approximation,” The Journal of Physical Chemistry Letters 11, 8208–8215 (2020)
work page 2020
-
[64]
Yury Lysogorskiy, Anton Bochkarev, and Ralf Drautz, “Graph atomic cluster expansion for foundational machine learning interatomic potentials,” (2025), arXiv:2508.17936 [cond-mat.mtrl-sci]
-
[65]
Orb-v3: atomistic simulation at scale,
Benjamin Rhodes, Sander Vandenhaute, Vaidotas Šimkus, James Gin, Jonathan Godwin, Tim Duignan, and Mark Neumann, “Orb-v3: atomistic simulation at scale,” (2025), arXiv:2504.06231 [cond-mat.mtrl-sci]
-
[66]
Mace-off: Short-range transferable machine learning force fields for organic molecules,
Dávid Péter Kovács, J. Harry Moore, Nicholas J. Browning, Ilyes Batatia, Joshua T. Horton, Yixuan Pu, Venkat Kapil, William C. Witt, Ioan-Bogdan Magdău, Daniel J. Cole, and Gábor Csányi, “Mace-off: Short-range transferable machine learning force fields for organic molecules,” Journal of the American Chemical Society 147, 17598–17611 (2025)
work page 2025
-
[67]
UPET: Universal interatomic potentials for advanced materials modeling,
Laboratory of Computational Science and Modeling (COSMO), EPFL, “UPET: Universal interatomic potentials for advanced materials modeling,” https://github.com/lab-cosmo/upet (2026), online; accessed Jan. 2026
work page 2026
-
[68]
Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning,
Bowen Deng, Yunyeong Choi, Peichen Zhong, Janosh Riebesell, Shashwat Anand, Zhuohan Li, KyuJung Jun, Kristin A. Persson, and Gerbrand Ceder, “Overcoming systematic softening in universal machine learning interatomic potentials by fine-tuning,” (2024), arXiv:2405.07105 [cond-mat.mtrl-sci]
-
[69]
Fine-tuning foundation models of materials interatomic potentials with frozen transfer learning,
Mariia Radova, Wojciech G. Stark, Connor S. Allen, Reinhard J. Maurer, and Albert P. Bartók, “Fine-tuning foundation models of materials interatomic potentials with frozen transfer learning,” npj Computational Materials 11, 237 (2025)
work page 2025
-
[70]
Harveen Kaur, Flaviano Della Pia, Ilyes Batatia, Xavier R. Advincula, Benjamin X. Shi, Jinggang Lan, Gábor Csányi, Angelos Michaelides, and Venkat Kapil, “Data-efficient fine-tuning of foundational models for first-principles quality sublimation enthalpies,” Faraday Discuss. 256, 120–138 (2025)
work page 2025
-
[71]
Anomalous High Ionic Conductivity of Nanoporous β-Li3PS4,
Zengcai Liu, Wujun Fu, E. Andrew Payzant, Xiang Yu, Zili Wu, Nancy J. Dudney, Jim Kiggans, Kunlun Hong, Adam J. Rondinone, and Chengdu Liang, “Anomalous High Ionic Conductivity of Nanoporous β-Li3PS4,” J. Am. Chem. Soc. 135, 975–978 (2013)
work page 2013
-
[72]
Li-p-s electrolyte materials as a benchmark for machine-learned interatomic potentials,
Natascia L. Fragapane and Volker L. Deringer, “Li-p-s electrolyte materials as a benchmark for machine-learned interatomic potentials,” (2025), arXiv:2511.16569 [cond-mat.mtrl-sci]
-
[73]
Atsutaka Kato, Mari Yamamoto, Futoshi Utsuno, Hiroyuki Higuchi, and Masanari Takahashi, “Lithium-ion-conductive sulfide polymer electrolyte with disulfide bond-linked PS4 tetrahedra for all-solid-state batteries,” Commun Mater 2, 112 (2021)
work page 2021
-
[74]
Lithium battery chemistries enabled by solid-state electrolytes,
Arumugam Manthiram, Xingwen Yu, and Shaofei Wang, “Lithium battery chemistries enabled by solid-state electrolytes,” Nat Rev Mater 2, 16103 (2017)
work page 2017
-
[75]
Thermal conductivity of Li3PS4 solid electrolytes with ab initio accuracy,
Davide Tisi, Federico Grasselli, Lorenzo Gigli, and Michele Ceriotti, “Thermal conductivity of Li3PS4 solid electrolytes with ab initio accuracy,” Phys. Rev. Materials 8, 065403 (2024)
work page 2024
-
[76]
Mechanism of Charge Transport in Lithium Thiophosphate,
Lorenzo Gigli, Davide Tisi, Federico Grasselli, and Michele Ceriotti, “Mechanism of Charge Transport in Lithium Thiophosphate,” Chem. Mater. 36, 1482–1496 (2024)
work page 2024
-
[77]
Metatensor and metatomic: Foundational libraries for interoperable atomistic machine learning,
Filippo Bigi, Joseph W. Abbott, Philip Loche, Arslan Mazitov, Davide Tisi, Marcel F. Langer, Alexander Goscinski, Paolo Pegolo, Sanggyu Chong, Rohit Goswami, Sofiia Chorna, Matthias Kellner, Michele Ceriotti, and Guillaume Fraux, “Metatensor and metatomic: Foundational libraries for interoperable atomistic machine learning,” (2025)
work page 2025
-
[78]
Reconstructions and dynamics of β-lithium thiophosphate surfaces,
Hanna Türk, Davide Tisi, and Michele Ceriotti, “Reconstructions and dynamics of β-lithium thiophosphate surfaces,” PRX Energy 4, 033010 (2025)
work page 2025
-
[79]
Using sketch-map coordinates to analyze and bias molecular dynamics simulations
Gareth A Tribello, Michele Ceriotti, and Michele Parrinello, “Using sketch-map coordinates to analyze and bias molecular dynamics simulations.” Proc. Natl. Acad. Sci. USA 109, 5196–201 (2012)
work page 2012
-
[80]
Olexandr Isayev, Denis Fourches, Eugene N. Muratov, Corey Oses, Kevin Rasch, Alexander Tropsha, and Stefano Curtarolo, “Materials cartography: Representing and mining materials space using structural and electronic fingerprints,” Chemistry of Materials 27, 735–743 (2015)
work page 2015
-
[81]
Haoyue Guo, Qian Wang, Alexander Urban, and Nongnuch Artrith, “Artificial intelligence-aided mapping of the structure–composition–conductivity relationships of glass–ceramic lithium thiophosphate electrolytes,” Chemistry of Materials 34, 6702–6712 (2022)
work page 2022