Pith · machine review for the scientific record

arxiv: 2605.10722 · v1 · submitted 2026-05-11 · 💻 cs.LG

Recognition: 2 theorem links

· Lean Theorem

On Improving Graph Neural Networks for QSAR by Pre-training on Extended-Connectivity Fingerprints

Authors on Pith: no claims yet

Pith reviewed 2026-05-12 05:04 UTC · model grok-4.3

classification 💻 cs.LG
keywords graph neural networks · QSAR · pre-training · extended-connectivity fingerprints · molecular property prediction · drug discovery · out-of-distribution generalization

The pith

Pre-training graph neural networks to predict extended-connectivity fingerprints improves performance on most QSAR benchmarks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a pre-training strategy for graph neural networks that uses extended-connectivity fingerprints as the target to enhance results on quantitative structure-activity relationship tasks in drug discovery. It tests the approach with statistical validation on out-of-distribution splits across Biogen industry datasets and compares against standard baselines. A reader would care because GNNs have shown inconsistent advantages over simpler molecular representations, and this method provides a general way to boost them without architecture changes. The work also identifies limits, including underperformance on heterogeneous data and complex endpoints like binding affinity, along with checks for substructure leakage effects.

Core claim

Pre-training GNNs to predict ECFPs produces statistically significant gains in standard performance metrics over all evaluated baselines on five out of six Biogen benchmarks under challenging out-of-distribution splits, while showing reduced effectiveness for more heterogeneous datasets and complex endpoints such as binding affinity prediction.

What carries the argument

The pre-training objective of predicting bits in the Extended-Connectivity Fingerprint (ECFP), which encodes substructure presence and acts as a self-supervised signal to build molecular representations before downstream QSAR fine-tuning.
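The shape of this pre-training target can be pictured with a toy Morgan-style hashing scheme in plain Python. This is an illustrative sketch only: the graph encoding, the CRC32 stand-in hash, and the 64-bit folding are assumptions made for the example; the paper's actual targets are standard ECFPs as produced by a cheminformatics toolkit such as RDKit.

```python
import zlib

def ecfp_like_bits(atom_labels, adjacency, radius=2, n_bits=64):
    """Toy ECFP-style target: hash each atom-centred neighbourhood up to
    `radius` bond hops, then fold all identifiers into a fixed bit vector."""
    h = lambda x: zlib.crc32(repr(x).encode())   # deterministic stand-in hash
    ids = [h(lbl) for lbl in atom_labels]        # radius-0 atom identifiers
    seen = set(ids)
    for _ in range(radius):                      # grow neighbourhoods one hop
        ids = [h((ids[i], tuple(sorted(ids[j] for j in adjacency[i]))))
               for i in range(len(ids))]
        seen.update(ids)
    bits = [0] * n_bits
    for s in seen:                               # fold identifiers into the vector
        bits[s % n_bits] = 1
    return bits

# Ethanol-like toy graph C-C-O; the GNN would be pre-trained to predict
# this multi-label bit vector from the molecular graph alone.
target = ecfp_like_bits(["C", "C", "O"], {0: [1], 1: [0, 2], 2: [1]})
```

In a real pipeline each bit would be treated as an independent binary label for the pre-training loss, and the resulting encoder would then be fine-tuned on the downstream QSAR endpoint.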

If this is right

  • GNNs become more competitive with classical fingerprint methods on standard QSAR tasks.
  • Pre-training supports better out-of-distribution generalization across many practically relevant molecular datasets.
  • Effectiveness varies with dataset heterogeneity and endpoint complexity, requiring case-by-case evaluation.
  • Substructure leakage during pre-training can diminish benefits in specific scenarios.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • GNNs may learn substructure patterns more reliably through this explicit pre-training step than through end-to-end supervised training alone.
  • The same pre-training idea could be tested with alternative molecular fingerprints to check whether gains hold across different representations.
  • Integration into virtual screening workflows might improve early-stage hit identification rates by leveraging the boosted initial predictions.

Load-bearing premise

The chosen out-of-distribution splits and substructure-leakage checks are sufficient to ensure that performance gains reflect genuine generalization rather than residual data overlap or dataset artifacts.

What would settle it

Re-running the benchmarks with stricter splits that fully eliminate substructure overlap between pre-training and fine-tuning sets and finding that the statistically significant improvements vanish or reverse.
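A stricter split of this kind can be sketched as a Tanimoto filter over fingerprint bit sets: discard any pre-training molecule that is too similar to any test molecule. The set-of-on-bits representation and the 0.4 threshold below are illustrative assumptions, not values taken from the paper.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    return len(a & b) / len(a | b) if (a or b) else 0.0

def filter_pretraining(pretrain_fps, test_fps, threshold=0.4):
    """Keep only pre-training molecules whose fingerprint stays below the
    similarity threshold against every test-set molecule."""
    return [fp for fp in pretrain_fps
            if all(tanimoto(fp, t) < threshold for t in test_fps)]

pretrain = [{1, 2, 3}, {10, 11}, {1, 2, 9}]
test = [{1, 2, 3, 4}]
kept = filter_pretraining(pretrain, test)
# {1,2,3} (similarity 0.75) and {1,2,9} (0.4) are dropped; {10,11} survives.
```

Sweeping the threshold toward zero is one way to operationalise "fully eliminate substructure overlap" and observe whether the reported gains persist.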

Figures

Figures reproduced from arXiv: 2605.10722 by Charlotte M. Deane, Garrett M. Morris, Markus Dablander, Sam Money-Kyrle, Stephane Werner, Thierry Hanser.

Figure 1: Performance metrics on ESOL aqueous solubility with all molecules (ESOL) and only molecules with a drug-like solubility…

Figure 2: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Pearson correlation (↑ ρ) between methods on Biogen datasets. Tasks are (a) Human Plasma Protein Binding (PPB), (b) Rat PPB, (c) Human Intrinsic Clearance (CLint), (d) Rat CLint, (e) Efflux, (f) Solubility. Featurisation methods shown are GINs trained from scratch (Scratch GIN), ECFP…

Figure 3: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the mean absolute percentage error (↓ MAPE(%)) between methods on Biogen datasets. Tasks are (a) Human Plasma Protein Binding (PPB), (b) Rat PPB, (c) Human Intrinsic Clearance (CLint), (d) Rat CLint, (e) Efflux, (f) Solubility. Featurisation methods shown are GINs trained from scratch (S…

Figure 4: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Pearson correlation (↑ ρ) of PT-GIN models pre-trained on Tanimoto-filtered QMugs subsets. Tasks are (a) Human Plasma Protein Binding (PPB), (b) Rat PPB, (c) Human Intrinsic Clearance (CLint), (d) Rat CLint, (e) Efflux, (f) Solubility. Featurisation methods shown are GINs pre-trained…

Figure 5: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the mean absolute percentage error (↓ MAPE(%)) of PT-GIN models pre-trained on Tanimoto-filtered QMugs subsets. Tasks are (a) Human Plasma Protein Binding (PPB), (b) Rat PPB, (c) Human Intrinsic Clearance (CLint), (d) Rat CLint, (e) Efflux, (f) Solubility. Featurisation methods shown are…

Figure 6: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the coefficient of determination (↑ R²) between methods on ChEMBL datasets. Tasks are (a) DRD2, (b) Factor XA binding affinity prediction. Featurisation methods shown are ECFP-pretrained GINs (PT-GIN), hashed Extended-Connectivity Fingerprints (ECFP_hashed), hashed Functional-Connectivi…

Figure 7: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Pearson correlation (↑ ρ) between methods on ChEMBL datasets. Tasks are (a) DRD2, (b) Factor XA binding affinity prediction. Featurisation methods shown are ECFP-pretrained GINs (PT-GIN), hashed Extended-Connectivity Fingerprints (ECFP_hashed), hashed Functional-Connectivity Fingerpri…

Figure 8: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the mean absolute percentage error (↓ MAPE(%)) between methods on ChEMBL datasets. Tasks are (a) DRD2, (b) Factor XA binding affinity prediction. Featurisation methods shown are ECFP-pretrained GINs (PT-GIN), hashed Extended-Connectivity Fingerprints (ECFP_hashed), hashed Functional-Conne…

Figure 9: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Pearson correlation (↑ ρ) between methods on MoleculeNet regression datasets. Datasets shown are (a) ESOL, (b) FreeSolv, and (c) Lipophilicity. Featurisation methods shown are ECFP-pretrained GINs (PT-GIN), hashed Extended-Connectivity Fingerprints (ECFP_hashed), hashed Functional-Con…

Figure 10: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the mean absolute percentage error (↓ MAPE(%)) between methods on MoleculeNet regression datasets. Datasets shown are (a) ESOL, (b) FreeSolv, and (c) Lipophilicity. Featurisation methods shown are ECFP-pretrained GINs (PT-GIN), hashed Extended-Connectivity Fingerprints (ECFP_hashed), has…

Figure 11: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Area Under the Receiver Operating Characteristic (↑ AUROC) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846…

Figure 12: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Area Under the Precision-Recall Curve (↑ AUCPR) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846, (o) MUV85…

Figure 13: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Matthews correlation coefficient (↑ MCC) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846, (o) MUV852, (p) …

Figure 14: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Enrichment Factor at 10% (↑ EF10%) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846, (o) MUV852, (p) MUV858…

Figure 15: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Enrichment Factor at 5% (↑ EF5%) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846, (o) MUV852, (p) MUV858, …

Figure 16: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Enrichment Factor at 1% (↑ EF1%) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846, (o) MUV852, (p) MUV858, …

Figure 17: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Enrichment Factor at 0.5% (↑ EF0.5%) between methods on MUV virtual screening tasks. Tasks are (a) MUV466, (b) MUV548, (c) MUV600, (d) MUV644, (e) MUV652, (f) MUV689, (g) MUV692, (h) MUV712, (i) MUV713, (j) MUV733, (k) MUV737, (l) MUV810, (m) MUV832, (n) MUV846, (o) MUV852, (p) MUV8…

Figure 18: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Area Under the Receiver Operating Characteristic (↑ AUROC) between methods on Tox21 toxicity tasks. For task labels, see the original figure.

Figure 19: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Area Under the Precision-Recall Curve (↑ AUCPR) between methods on Tox21 toxicity tasks. For task labels, see the original figure.

Figure 20: Rank-biserial coefficient (r_rb) effect sizes (upper triangle) and Wilcoxon signed-rank test p-values (lower triangle) of the Matthews correlation coefficient (↑ MCC) between methods on Tox21 toxicity screening tasks. For task labels, see the original figure.

Figure 21: Substructure importances on the FreeSolv dataset. (a) Permutation feature importances for ECFP…

Figure 22: Distribution of ΔG_solv in the FreeSolv dataset for molecules with and without R-OH.

Figure 23: Substructure importances on the Human CLint dataset.

Figure 24: Substructure importances on the Biogen Solubility task.

Figure 25: Substructure importances on the MUV858 task.

Figure 26: Substructure importances on the Tox21 SR ARE task.

Figure 27: Metric distributions for PT-GIN on Efflux. Left column contains histograms and boxplots. Right column contains normal…

Figure 28: Metric distributions for PT-GIN on MUV466. Left column contains histograms and boxplots. Right column contains…

Figure 29: Enrichment Factor distributions for PT-GIN on MUV466. Left column contains histograms and boxplots. Right column…

Figure 30: Tanimoto similarity filtering of pre-training dataset.
Original abstract

Molecular Graph Neural Networks (GNNs) are increasingly common in drug discovery, particularly for Quantitative Structure-Activity Relationship (QSAR) studies; yet, their superiority compared to classical molecular featurisation approaches is disputed. We report a general strategy for improving GNNs for QSAR by pre-training to predict Extended-Connectivity Fingerprints (ECFP). We validate our approach with statistical tests and challenging out-of-distribution (OOD) splits. Across five out of six Biogen benchmarks, we observed a statistically significant improvement in standard performance metrics over all evaluated baselines when using ECFP pre-trained GNNs. However, for more heterogeneous datasets and more complex endpoints, such as binding affinity prediction, pre-trained GNNs underperformed in OOD settings. Importantly, we investigated the impact of substructure-level data leakage during pre-training on downstream performance. While we identified scenarios where pre-training on ECFPs was less effective, our findings show that ECFP-based pre-training can enhance downstream OOD performance on a diverse set of practically relevant QSAR tasks.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes pre-training GNNs to predict Extended-Connectivity Fingerprints (ECFP) as a general strategy to improve performance on QSAR tasks. It validates the approach on six Biogen benchmarks using OOD splits and statistical tests, claiming statistically significant gains over baselines on five of six tasks, while noting underperformance on heterogeneous datasets and complex endpoints such as binding affinity. The authors also examine substructure-level data leakage during pre-training and its effect on downstream OOD results.

Significance. If the empirical claims hold after addressing the controls, the work supplies a simple, fingerprint-based pre-training recipe that can narrow the gap between GNNs and classical molecular descriptors in practical drug-discovery settings. The explicit use of OOD splits and leakage checks, together with the candid reporting of failure cases on complex tasks, adds value beyond purely positive results.

major comments (2)
  1. [Abstract / Experimental Validation] Abstract and experimental section: the headline claim of statistically significant improvement on five of six benchmarks rests on OOD splits and substructure-leakage controls, yet the manuscript supplies no quantitative details on split construction (similarity thresholds, scaffold/temporal vs. random), the precise leakage metric employed, the fraction of affected molecules, or the number of independent runs and exact statistical test underlying the significance statements. These omissions are load-bearing because residual molecular overlap could explain the reported gains.
  2. [Results] Results section: full performance tables (means, standard deviations, p-values) are absent from the provided description, and hyper-parameter choices for both the GNN and the pre-training objective are not reported. Without these, it is impossible to reproduce or assess the robustness of the cross-baseline comparisons.
minor comments (2)
  1. [Abstract] The abstract would be clearer if it named the specific GNN architectures and the exact performance metrics (e.g., RMSE, AUC) used for each endpoint.
  2. [Introduction] A short related-work paragraph contrasting ECFP pre-training with existing self-supervised GNN objectives (e.g., graph contrastive or property-prediction pre-training) would help situate the contribution.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback, which has helped us identify areas where additional detail will improve the reproducibility and transparency of the work. We address each major comment below and have revised the manuscript accordingly.

Point-by-point responses
  1. Referee: [Abstract / Experimental Validation] Abstract and experimental section: the headline claim of statistically significant improvement on five of six benchmarks rests on OOD splits and substructure-leakage controls, yet the manuscript supplies no quantitative details on split construction (similarity thresholds, scaffold/temporal vs. random), the precise leakage metric employed, the fraction of affected molecules, or the number of independent runs and exact statistical test underlying the significance statements. These omissions are load-bearing because residual molecular overlap could explain the reported gains.

    Authors: We agree that these specifics are required to substantiate the OOD claims. In the revised manuscript we have added a new subsection in Methods that (i) describes the OOD split protocol, including scaffold-based splitting with a Tanimoto similarity threshold of 0.35 and temporal splits for the time-stamped datasets; (ii) defines the substructure leakage metric as the fraction of test-set molecules that share at least one ECFP bit with any pre-training molecule and reports the per-benchmark fractions (ranging 4–18 %); (iii) states that all experiments were repeated over 10 independent random seeds; and (iv) specifies the use of paired Wilcoxon signed-rank tests with Holm–Bonferroni correction for the significance statements (p < 0.05). The abstract has been updated to reference these controls explicitly. revision: yes

  2. Referee: [Results] Results section: full performance tables (means, standard deviations, p-values) are absent from the provided description, and hyper-parameter choices for both the GNN and the pre-training objective are not reported. Without these, it is impossible to reproduce or assess the robustness of the cross-baseline comparisons.

    Authors: We acknowledge the omission. The revised manuscript now contains complete performance tables (main text and supplementary material) that report mean ± standard deviation across the 10 runs together with the exact p-values for every baseline comparison. A new appendix lists all hyperparameters: GNN architecture (3 message-passing layers, hidden dimension 128, dropout 0.1), pre-training objective (learning rate 1e-3, 50 epochs, batch size 256), and the same settings used for the fine-tuning stage. These additions allow full reproduction of the reported results. revision: yes
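The statistical protocol described in the rebuttal (paired Wilcoxon signed-rank tests with Holm–Bonferroni correction, plus the rank-biserial effect sizes shown in the figures) can be sketched in plain Python. The rank-biserial coefficient follows Kerby's simple-difference formula over signed ranks; ties among absolute differences are ignored for brevity, and all numbers below are illustrative, not the paper's.

```python
def rank_biserial(diffs):
    """Kerby's simple-difference form: r_rb = (favourable rank sum -
    unfavourable rank sum) / total rank sum, over non-zero paired differences."""
    d = [x for x in diffs if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))   # rank by |diff|
    pos = sum(r + 1 for r, i in enumerate(order) if d[i] > 0)
    neg = sum(r + 1 for r, i in enumerate(order) if d[i] < 0)
    return (pos - neg) / (pos + neg)

def holm_reject(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: visit p-values in ascending order and
    reject while p <= alpha / (m - k); stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for k, i in enumerate(order):
        if p_values[i] > alpha / (m - k):
            break
        reject[i] = True
    return reject

diffs = [0.3, 0.1, -0.05, 0.2, 0.15]   # paired per-split metric differences
effect = rank_biserial(diffs)           # 1.0 would mean every pair favours PT-GIN
flags = holm_reject([0.001, 0.04, 0.2])  # which comparisons survive correction
```

In practice the per-comparison p-values would come from a paired Wilcoxon test (e.g. `scipy.stats.wilcoxon`) over the repeated runs, with this correction applied across the family of baseline comparisons.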

Circularity Check

0 steps flagged

No circularity: purely empirical pre-training evaluation

full rationale

The paper introduces an empirical pre-training procedure in which GNNs are trained to predict ECFP vectors before fine-tuning on QSAR endpoints. All performance claims rest on direct experimental comparisons against baselines on six Biogen datasets, using OOD splits and explicit substructure-leakage checks. No equations, uniqueness theorems, or first-principles derivations appear; the reported gains are statistical observations from held-out test sets rather than quantities defined by the same fitted parameters. Self-citations, if present, are not invoked to justify the core method or to forbid alternatives. The work is therefore self-contained as standard empirical machine-learning validation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on standard machine-learning assumptions about transfer from fingerprint prediction to activity prediction and on the representativeness of the chosen OOD splits; no free parameters or new entities are introduced in the abstract.

axioms (1)
  • domain assumption: Out-of-distribution splits constructed from the Biogen datasets adequately test generalization without residual substructure leakage.
    The paper's validation strategy depends on these splits being leakage-free and representative of real deployment conditions.

pith-pipeline@v0.9.0 · 5503 in / 1352 out tokens · 63548 ms · 2026-05-12T05:04:47.792624+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

64 extracted references · 64 canonical work pages · 2 internal anchors