Orb: A fast, scalable neural network potential. arXiv preprint arXiv:2410.22570 (2024)
Canonical reference. 80% of citing Pith papers cite this work as background.
representative citing papers
SciVerseGym is a new open Gymnasium environment that frames sequential crystal discovery as an MDP with local/global actions, configurable evaluators, and support for RL, Bayesian optimization, and related workflows.
Machine learning models, especially certain deep neural networks, can predict lattice thermal conductivity with useful accuracy across different generalization tests while being orders of magnitude faster than first-principles calculations.
citing papers explorer
- Surrogate-Gated Generation and Foundation-Model Embeddings for Bayesian Materials Design
A Gaussian process surrogate gate inserted between generative crystal models and property oracles matches or exceeds ungated fine-tuning while using roughly one-fifth the oracle calls for heat capacity and bulk modulus.
- Uncertainty-aware Machine Learning Interatomic Potentials via Learned Functional Perturbations
Learned functional perturbations plus CRPS training convert deterministic ML interatomic potentials into probabilistic ones, improving CRPS by 19-32% on N-body benchmarks and uncertainty-error correlation from 0.75 to 0.84 on silica.
- Efficient Large-Scale STEM-EELS Simulations With Torched-TACAW
Torched-TACAW enables efficient large-scale STEM-EELS simulations of vibrational and magnon excitations in defective materials by combining ML-driven molecular dynamics with supercell partitioning and on-the-fly multislice processing.
- Robust and Interpretable Adaptation of Equivariant Materials Foundation Models via Sparsity-promoting Fine-tuning
Sparsity-promoting fine-tuning adapts equivariant materials foundation models by selectively updating ~3% of parameters to match full fine-tuning on molecular and crystalline benchmarks while revealing interpretable physical patterns.
- Non-covalent Interactions at cm$^{-1}$ Accuracy: Data Efficient Physics-Informed Distillation for Machine Learning Interatomic Potentials
Physics-informed distillation from a universal MLIP plus limited CCSD(T) fine-tuning yields cm$^{-1}$-accurate potentials for non-covalent interactions, with teacher choice strongly affecting accuracy on some systems.
- Geometry-based Discovery of Calcium Battery Cathodes Accelerated by Foundational Machine-Learned Models
High-throughput screening combining Voronoi polyhedral volumes and foundational ML models identifies 37 promising Ca cathode candidates from the Materials Project database.
- Benchmarking Compositional Generalisation for Machine Learning Interatomic Potentials
A new benchmark finds that state-of-the-art ML interatomic potentials struggle with compositional generalization, producing errors an order of magnitude higher on unseen molecular combinations than on training-like cases.
- CrystalREPA: Transferring Physical Priors from Universal MLIPs to Crystal Generative Models
CrystalREPA closes the representation gap between crystal generators and universal MLIPs via contrastive alignment, yielding more stable and valid generated crystals while revealing that MLIP teacher quality is better predicted by representation distinguishability than by leaderboard accuracy.
- A foundation model for atomistic materials chemistry
MACE-MP-0 is a general-purpose atomistic ML force field trained on public data that enables stable simulations of diverse chemical systems with qualitative and sometimes quantitative accuracy, serving as a starting point for fine-tuning.
- Universal Interatomic Potentials as Configuration-Space Generators for One-Shot and Iterative Fine-Tuning of Ab Initio-Accurate Material-Specific Models
Universal MLIPs serve as configuration generators whose DFT-relabeled subsamples enable one-shot or iterative training of material-specific MLIPs that recover accurate reactive energy profiles with 600-2000 DFT calculations.
- Fine-tuning MLIP foundation models: strategies for accuracy and transferability
Systematic tests show naive fine-tuning excels for single-task accuracy while multihead replay best preserves out-of-distribution robustness in MLIP adaptation.
- Synthetic pre-training of graph-network models for predicting solid-state NMR parameters
Synthetic pre-training on ML-generated tensor data followed by fine-tuning on ground-truth calculations improves data efficiency for graph models of solid-state NMR parameters when the pre-training and fine-tuning domains match.
- Scalable Prediction of Complex Surface Reconstructions under Operating Conditions via Harmony-Search-Based Global Optimization
HASGO combines harmony search with universal MLIPs and multi-head replay fine-tuning to locate operando surface reconstructions, demonstrated by identifying the square-pyramidal O5 subsurface motif on Ag(100) during ethylene epoxidation.
- Benchmarking empirical and machine-learned interatomic potentials using phase diagram predictions for Lead
The EDDP machine-learned potential for lead predicts the observed FCC-HCP phase transition at ~15 GPa, unlike EAM and MEAM models, when paired with nested sampling.
- MatterSim-MT: A multi-task foundation model for in silico materials characterization
MatterSim-MT is a multi-task ML foundation model pretrained on 35M+ structures for in silico materials property prediction and complex simulations.
- OptiMat Alloys: a FAIR, living database of multi-principal element alloys enabled by a conversational agent
OptiMat Alloys is a conversational AI system that maintains a living FAIR database of multi-principal element alloy calculations and enables natural-language, on-demand computations with built-in uncertainty checks.
- Additive binding energies in asphalt on a quantum processor via quantum-selected configuration interaction (QSCI)
A hybrid quantum workflow on the IQM Emerald processor computes a -3.52 kcal/mol binding energy for the pyridine-phenol complex via QSCI in a (10e,10o) space, matching CASCI but underbinding relative to the CCSD(T) benchmark of -8.5 to -9.5 kcal/mol.
- Training speedups via batching for geometric learning: an analysis of static and dynamic algorithms
Experiments on QM9 and AFLOW datasets show that static and dynamic batching for GNNs can yield up to 2.7x training speedups depending on data, model, batch size, hardware, and training steps, with occasional differences in learning metrics.