arXiv: 1906.09427 · v1 · pith:SDPY76KT · submitted 2019-06-22 · cs.LG · stat.ML

Alchemy: A Quantum Chemistry Dataset for Benchmarking AI Models

keywords: alchemy, dataset, models, chemistry, molecular, molecules, benchmarks, contest
Abstract

We introduce a new molecular dataset, named Alchemy, for developing machine learning models useful in chemistry and materials science. As of June 20th, 2019, the dataset comprises 12 quantum mechanical properties of 119,487 organic molecules with up to 14 heavy atoms, sampled from the GDB MedChem database. The Alchemy dataset expands the volume and diversity of existing molecular datasets. Our extensive benchmarks of state-of-the-art graph neural network models on Alchemy demonstrate the usefulness of new data in validating and developing machine learning models for chemistry and materials science. We further launch a contest to attract attention from researchers in related fields. More details can be found on the contest website (https://alchemy.tencent.com). At the time of the benchmarking experiments, we had generated 119,487 molecules in the Alchemy dataset. More molecular samples have been generated since then. Hence, we provide a list of the molecules used in the reported benchmarks.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Path-Based Gradient Boosting for Graph-Level Prediction

    cs.LG 2026-04 unverdicted novelty 6.0

    PathBoost extends path-based gradient boosting with logistic loss, prefix-based multi-attribute handling, and automatic anchor selection, achieving better or comparable results to GNNs and graph kernels on benchmark d...