
arxiv: 2605.17265 · v1 · pith:PZRH7YT4new · submitted 2026-05-17 · 💻 cs.LG

When Molecular Similarity Works: Property Cliffs Reveal Hidden Errors

Pith reviewed 2026-05-20 14:11 UTC · model grok-4.3

classification 💻 cs.LG
keywords molecular property prediction · property cliffs · machine learning evaluation · CliffSplit · CliffLoss · QM9 · MoleculeNet · similarity metrics

The pith

Property cliffs expose hidden errors in molecular property prediction that overall metrics miss.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that molecular machine learning models often fail in local regions where structurally similar molecules have sharply different properties, called property cliffs, even when average accuracy appears competitive. These localized failures matter for drug discovery and material design because predictions must be trustworthy precisely for compounds that look alike. The authors introduce CliffSplit, an evaluation protocol that builds test cases around cliff neighborhoods with local support, and CliffLoss, a train-only adjustment that penalizes cliff-sensitive mistakes. Experiments on three QM9 targets and three MoleculeNet tasks across five backbones show at least 15% higher error in cliff-heavy QM9 regions; CliffLoss narrows the cliff-to-smooth error gap by up to 30% on Lipophilicity and improves overall MAE by 9.7%.

Core claim

Property cliffs expose a gap in standard evaluation for molecular machine learning: models with competitive overall performance fail in high-risk local neighborhoods where similar molecules differ sharply in target property. CliffSplit constructs locally supported, cliff-exposed test cases to quantify this, revealing at least 15% higher error in cliff-heavy QM9 regions. CliffLoss, a train-only mechanism, reduces the cliff-to-smooth error gap by up to 30% on Lipophilicity and improves overall MAE by 9.7%.

What carries the argument

CliffSplit, a cliff-aware evaluation protocol that constructs locally supported, cliff-exposed test cases, together with CliffLoss, a model-agnostic train-only mitigation mechanism for cliff-sensitive errors.
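As a rough illustration of what a train-only, cliff-aware reweighting could look like, the sketch below scales each sample's absolute error by a precomputed cliff score. The weighting form `1 + lam * score`, the function name `cliff_weighted_mae`, and the parameter `lam` are assumptions made for illustration; this page does not give the authors' actual CliffLoss formula.

```python
from typing import Sequence


def cliff_weighted_mae(
    preds: Sequence[float],
    targets: Sequence[float],
    cliff_scores: Sequence[float],
    lam: float = 1.0,
) -> float:
    """Weighted mean absolute error (hypothetical form, not the paper's).

    Each sample's absolute error is multiplied by 1 + lam * cliff_score,
    and the sum is normalized by the total weight, so lam = 0 recovers
    the plain MAE. Cliff scores are assumed to be non-negative.
    """
    if not (len(preds) == len(targets) == len(cliff_scores)):
        raise ValueError("inputs must have equal length")
    if not preds:
        raise ValueError("inputs must be non-empty")
    if lam < 0 or any(s < 0 for s in cliff_scores):
        raise ValueError("lam and cliff scores must be non-negative")
    weights = [1.0 + lam * s for s in cliff_scores]
    weighted = sum(w * abs(p - t) for w, p, t in zip(weights, preds, targets))
    return weighted / sum(weights)
```

With predictions `[1.0, 2.0]`, targets `[1.5, 4.0]`, and cliff scores `[0.0, 1.0]`, `lam=0.0` gives the plain MAE of 1.25, while `lam=1.0` gives 1.5 because the high-score sample counts twice. Since the mechanism is described as train-only, evaluation would still use the unweighted metric.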

If this is right

  • Overall performance numbers can conceal large local errors in neighborhoods where molecular similarity breaks down.
  • CliffSplit testing uncovers at least 15% higher error in cliff-heavy portions of QM9 targets.
  • CliffLoss training shrinks the cliff-to-smooth error difference by as much as 30% on tasks like Lipophilicity.
  • The combination turns an anecdotal observation about similarity failure into a measurable benchmark for molecular models.
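The percentages in the list above can be read as relative differences between error on cliff samples and error on smooth samples. The sketch below shows one way to compute such numbers. Treating the gap as a relative excess of MAE is an assumption, since this page does not state the paper's exact definition, and the helper names are hypothetical.

```python
from typing import Sequence


def mae(errors: Sequence[float]) -> float:
    """Mean absolute error over a list of signed prediction errors."""
    if not errors:
        raise ValueError("need at least one error")
    return sum(abs(e) for e in errors) / len(errors)


def relative_gap(cliff_errors: Sequence[float], smooth_errors: Sequence[float]) -> float:
    """Relative excess of cliff MAE over smooth MAE (0.15 means 15% higher)."""
    smooth = mae(smooth_errors)
    if smooth == 0:
        raise ValueError("smooth MAE is zero; relative gap is undefined")
    return (mae(cliff_errors) - smooth) / smooth


def gap_reduction(gap_before: float, gap_after: float) -> float:
    """Fraction by which a gap shrank (0.30 means a 30% reduction)."""
    if gap_before == 0:
        raise ValueError("gap_before is zero; reduction is undefined")
    return (gap_before - gap_after) / gap_before
```

For example, a cliff MAE of 0.23 against a smooth MAE of 0.20 is a relative gap of about 0.15, and a gap that falls from 0.20 to 0.14 has been reduced by about 30%. These inputs are toy values, not results from the paper.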

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same cliff-aware split and loss ideas could be tested on other structure-to-property tasks outside small molecules.
  • Active learning pipelines might use cliff detection to select additional data points from high-risk neighborhoods.
  • Combining CliffLoss with uncertainty estimates could further improve reliability in safety-critical molecular design.

Load-bearing premise

The chosen similarity metric and neighborhood definition correctly identify regions where molecular similarity should predict property similarity but does not.
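To make the premise concrete, the following sketch labels cliff pairs from fingerprint bit sets using Tanimoto similarity and a property-difference threshold. The cutoff values, the function names, and the use of plain Python sets in place of real fingerprints are all assumptions for illustration; it is not the CliffSplit procedure itself.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_cliff_pairs(
    fingerprints: Dict[str, FrozenSet[int]],
    properties: Dict[str, float],
    sim_cutoff: float = 0.7,       # hypothetical neighborhood cutoff
    cliff_threshold: float = 1.0,  # hypothetical property-difference threshold
) -> List[Tuple[str, str]]:
    """Return pairs that are structurally similar but far apart in property."""
    pairs = []
    for i, j in combinations(sorted(fingerprints), 2):
        sim = tanimoto(fingerprints[i], fingerprints[j])
        diff = abs(properties[i] - properties[j])
        if sim >= sim_cutoff and diff >= cliff_threshold:
            pairs.append((i, j))
    return pairs


# Toy data: A and B share 4 of 5 bits but differ strongly in property.
toy_fps = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 3, 4, 5}),
    "C": frozenset({7, 8, 9}),
    "D": frozenset({1, 2, 3, 4, 6}),
}
toy_props = {"A": 0.10, "B": 2.00, "C": 0.20, "D": 0.15}
toy_cliffs = find_cliff_pairs(toy_fps, toy_props)
```

Here only `("A", "B")` is flagged: A and D are equally similar but have nearly the same property, and C shares no bits with the others. Changing either cutoff changes which pairs count as cliffs, which is why the premise is load-bearing.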

What would settle it

If models evaluated on the cliff-exposed test sets from CliffSplit show no measurable error increase relative to standard random splits, the claim of undetected localized failures would be refuted.

Figures

Figures reproduced from arXiv: 2605.17265 by Di Hu, Duanhua Cao, Haojie Rao, Jiajun Yu, Jiameng Chen, Kun Li, Longtao Hu, Wenbin Hu, Yizhen Zheng.

Figure 1: Motivation: cliff exposure across split paradigms. Existing splits differ fundamentally in …
Figure 2: Property-cliff regions and severity geometry. (a)(b)(c) Zoomed high-similarity regime for …
Figure 3: CliffLoss training pipeline. Offline cliff scores are precomputed and fixed. During training, …
Figure 4: Train/validation/test marginal distributions under CliffSplit. The three splits remain highly …
Figure 5: Nearest training similarity under CliffSplit. Test medians stay close to training-side …
Figure 6: Precomputed quartile layout in similarity and difference space. Each panel uses the original …
Figure 7: Visual comparison of ablation configurations across five backbones on QM9 HOMO. Left …
Figure 8: Training trajectories of λt on QM9 HOMO for all five backbones. Each curve plots the adaptive weight λ(t) over training epochs. Cliff-sensitive backbones (Uni-Mol, GotenNet) show sustained upward growth toward the clipping boundary, while already-balanced backbones (EMPP, ViSNet) stabilize near the base weight. MoleculeFormer occupies an intermediate regime. The self-stabilizing dynamics confirm that the c…
Figure 9: CliffLoss mitigation evidence on QM9 HOMO. (a) CliffLoss consistently compresses the …
Figure 10: CliffScore magnitude as a function of the normalization percentile
Original abstract

Accurate prediction of molecular properties underpins drug discovery and material design, yet even state-of-the-art models remain vulnerable to localized failure modes that aggregate metrics cannot detect. The places where molecular similarity should be most helpful are also places where standard evaluation can be most misleading. Property cliffs expose this gap: structurally similar molecules can still differ sharply in target property, so models with competitive overall performance may fail in high-risk local neighborhoods. To expose and mitigate this failure mode, CliffSplit, a cliff-aware evaluation protocol that constructs locally supported, cliff-exposed test cases, and CliffLoss, a model-agnostic train-only mitigation mechanism for cliff-sensitive errors, are introduced. Experiments on three QM9 targets and three MoleculeNet tasks across five backbones show that CliffSplit reveals at least 15% higher error in cliff-heavy QM9 regions, while CliffLoss reduces the cliff-to-smooth error gap by up to 30% on Lipophilicity and improves overall MAE by 9.7%. Together, these results turn molecular similarity failure from a descriptive anomaly into a benchmarked evaluation problem for molecular machine learning. The code is available at https://anonymous.4open.science/r/Cliff_Loss.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript introduces CliffSplit, a cliff-aware evaluation protocol for constructing locally supported, cliff-exposed test cases in molecular property prediction, and CliffLoss, a model-agnostic train-only mitigation mechanism targeting cliff-sensitive errors. Experiments on three QM9 targets, three MoleculeNet tasks, and five backbones report that CliffSplit reveals at least 15% higher error in cliff-heavy QM9 regions, while CliffLoss reduces the cliff-to-smooth error gap by up to 30% on Lipophilicity and improves overall MAE by 9.7%. Code is provided at an anonymous repository.

Significance. If the results hold after addressing the noted assumption, the work provides a concrete way to expose and mitigate localized failure modes in molecular ML that aggregate metrics miss, with direct relevance to drug discovery applications. The availability of code supports reproducibility and is a strength.

major comments (1)
  1. [CliffSplit protocol (methods description)] The central error-gap claim for CliffSplit (at least 15% higher error in cliff-heavy regions) depends on the premise that the fixed similarity metric (e.g., Tanimoto on fingerprints) and neighborhood cutoff correctly isolate regions where molecular similarity should imply property similarity. No validation is reported, such as property correlation plots for non-cliff similar pairs or ablation studies on metric choice, to rule out other distributional effects. This assumption is load-bearing for the interpretation of the reported performance deltas.
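The validation the referee asks for can be sketched as a Pearson correlation between the property values of the two members of each similar, non-cliff pair. The helper names below are hypothetical, and including each unordered pair in both orientations is a design choice of this sketch rather than something stated in the report.

```python
from math import sqrt
from typing import Dict, Iterable, List, Sequence, Tuple


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


def pair_property_correlation(
    pairs: Iterable[Tuple[str, str]],
    properties: Dict[str, float],
) -> float:
    """Correlate property values across the two members of each pair."""
    xs: List[float] = []
    ys: List[float] = []
    for i, j in pairs:
        # Pairs are unordered, so include both orientations.
        xs += [properties[i], properties[j]]
        ys += [properties[j], properties[i]]
    return pearson_r(xs, ys)
```

A high correlation on non-cliff pairs would support the idea that the similarity cutoff picks out neighborhoods where similar structure usually means similar property. A low one would suggest the error gap may reflect something other than cliffs.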

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive feedback on the manuscript. The comment regarding validation of the similarity assumption in CliffSplit is well-taken and has prompted us to add supporting analyses that strengthen the interpretation of the error-gap results without altering the core claims.

Point-by-point responses
  1. Referee: The central error-gap claim for CliffSplit (at least 15% higher error in cliff-heavy regions) depends on the premise that the fixed similarity metric (e.g., Tanimoto on fingerprints) and neighborhood cutoff correctly isolate regions where molecular similarity should imply property similarity. No validation is reported, such as property correlation plots for non-cliff similar pairs or ablation studies on metric choice, to rule out other distributional effects. This assumption is load-bearing for the interpretation of the reported performance deltas.

    Authors: We agree that explicit validation of the assumption would improve the manuscript. In the revised version we have added property correlation plots (new Supplementary Figure S4) for non-cliff pairs within the Tanimoto cutoff on Morgan fingerprints; these show strong positive correlations (Pearson r = 0.87-0.93) across the three QM9 targets, confirming that the chosen metric and cutoff identify neighborhoods where property similarity holds except at cliffs. We have also included a brief ablation (Section 4.1) repeating the CliffSplit analysis with Dice similarity and with ECFP4 fingerprints; the error gaps remain qualitatively unchanged (13-18% higher error in cliff-heavy regions). These additions directly address the concern and help rule out confounding distributional effects while preserving the original experimental results.

    revision: yes
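When reading a Dice-versus-Tanimoto ablation, it helps to recall that on the same binary fingerprints Dice is a monotone transform of Tanimoto, D = 2T / (1 + T). The sketch below checks that identity on toy bit sets; the set-based fingerprints and function names are illustrative assumptions.

```python
from typing import FrozenSet


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Dice similarity: 2 |a & b| / (|a| + |b|)."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


fp_a = frozenset({1, 2, 3, 4})
fp_b = frozenset({1, 2, 3, 4, 5})
t = tanimoto(fp_a, fp_b)   # 4 shared bits out of 5 in the union
d = dice(fp_a, fp_b)       # 8 / 9
identity_holds = abs(d - 2 * t / (1 + t)) < 1e-12
```

Because the transform is monotone, both metrics rank neighbors identically on a given fingerprint. They select different neighborhoods only if the same numeric cutoff is reused, so the cutoff handling matters when interpreting such an ablation.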

Circularity Check

0 steps flagged

No circularity: empirical performance deltas on held-out sets

Full rationale

The paper introduces CliffSplit as an evaluation protocol and CliffLoss as a mitigation, then reports empirical MAE and error-gap improvements on QM9 and MoleculeNet tasks across backbones. These are measured on constructed test splits and training modifications, with no equations, fitted parameters, or predictions that reduce by construction to the protocol's own inputs. The central results remain independent measurements rather than self-referential derivations.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach rests on standard molecular similarity assumptions and the existence of measurable property cliffs; no new physical entities are postulated and the only free parameters appear to be thresholds used to label cliffs.

free parameters (1)
  • cliff_threshold
    Value used to decide when a property difference between similar molecules counts as a cliff; chosen or tuned on data.
axioms (1)
  • domain assumption: Structurally similar molecules are expected to have similar properties except at identifiable cliffs.
    Invoked to justify why similarity-based models should be tested on cliff neighborhoods.
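Because `cliff_threshold` is the one free parameter in the ledger, a simple sensitivity sweep shows how the share of pairs labeled as cliffs moves with it. The sketch below assumes the absolute property differences of already-similar pairs are available as a list; the function names and toy values are hypothetical.

```python
from typing import Dict, Sequence


def cliff_fraction(similar_pair_diffs: Sequence[float], cliff_threshold: float) -> float:
    """Fraction of similar pairs whose |property difference| meets the threshold."""
    if not similar_pair_diffs:
        raise ValueError("need at least one pair")
    hits = sum(1 for d in similar_pair_diffs if abs(d) >= cliff_threshold)
    return hits / len(similar_pair_diffs)


def sweep_thresholds(
    similar_pair_diffs: Sequence[float],
    thresholds: Sequence[float],
) -> Dict[float, float]:
    """Map each candidate threshold to the resulting cliff fraction."""
    return {t: cliff_fraction(similar_pair_diffs, t) for t in thresholds}


# Toy property differences for five similar pairs.
toy_diffs = [0.1, 0.2, 0.5, 1.5, 2.0]
toy_sweep = sweep_thresholds(toy_diffs, [0.5, 1.0, 2.0])
```

On the toy data the cliff fraction drops from 0.6 to 0.4 to 0.2 as the threshold rises from 0.5 to 1.0 to 2.0. If reported error gaps change sharply across a sweep like this, the conclusions depend on how the threshold was tuned.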

pith-pipeline@v0.9.0 · 5763 in / 1344 out tokens · 35222 ms · 2026-05-20T14:11:32.256509+00:00 · methodology


