Molecule Attention Transformer
read the original abstract
Designing a single neural network architecture that performs competitively across a range of molecule property prediction tasks remains largely an open challenge, and its solution may unlock a widespread use of deep learning in the drug discovery industry. To move towards this goal, we propose Molecule Attention Transformer (MAT). Our key innovation is to augment the attention mechanism in Transformer using inter-atomic distances and the molecular graph structure. Experiments show that MAT performs competitively on a diverse set of molecular prediction tasks. Most importantly, with a simple self-supervised pretraining, MAT requires tuning of only a few hyperparameter values to achieve state-of-the-art performance on downstream tasks. Finally, we show that attention weights learned by MAT are interpretable from the chemical point of view.
This paper has not been read by Pith yet.
Forward citations
Cited by 3 Pith papers
-
Chem-GMNet: A Sphere-Native Geometric Transformer for Molecular Property Prediction
Chem-GMNet uses sphere-native embeddings, DualSKA attention, and SH-FFN layers to match or beat ChemBERTa-2 on MoleculeNet tasks with fewer parameters and sometimes no pretraining.
-
TIGER: Text-Informed Generalized Enzyme-Reaction Retrieval
TIGER is a text-informed ML framework that improves bidirectional enzyme-reaction retrieval by distilling semantic knowledge from sequences and aligning representations across tasks and distributions.
-
A Systematic Survey and Benchmark of Deep Learning for Molecular Property Prediction in the Foundation Model Era
A systematic survey and benchmark of four deep learning paradigms for molecular property prediction that organizes the field, critiques current data practices, and outlines three future directions.
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.