arxiv: 2606.00555 · v2 · pith:HOPF4JOMnew · submitted 2026-05-30 · 💻 cs.AI · q-bio.BM

Probe Before You Edit: Probing-Guided Molecular Optimization for LLM Agents in Structure-Based Drug Design

Pith reviewed 2026-06-28 18:58 UTC · model grok-4.3

classification 💻 cs.AI q-bio.BM
keywords LLM agents · structure-based drug design · molecular optimization · probing · binding affinity · druggability · CrossDocked2020 · multi-agent systems

The pith

PROBE uses controlled probe edits to build site maps that guide LLM agents toward edits improving both binding affinity and druggability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Current LLM-agent pipelines for structure-based drug design edit ligands without first testing local responses, so single steps rarely advance both binding affinity and druggability at once. The paper quantifies this with two new diagnostics: one tracking joint gains and one tracking trade-off cases. PROBE mimics medicinal chemists by decomposing ligands into editable sites, running controlled probe edits, and distilling the responses into a pocket-specific site map plus an EditManual. These artifacts then steer a three-agent loop (affinity, druggability, co-optimization) that produces more successful joint edits. On the CrossDocked2020 benchmark the method reaches state-of-the-art results while lowering the frequency of the diagnosed failure modes.
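
The two diagnostics can be illustrated with a short sketch. This page does not give the paper's exact definitions, so the names `EditOutcome`, `joint_improvement_rate`, and `trade_off_rate`, the sign convention (positive means the objective improved), and the `eps` tolerance are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditOutcome:
    """Change produced by one edit; positive means the objective improved (assumed convention)."""
    d_affinity: float      # e.g. the negated change in a docking score
    d_druggability: float  # e.g. the change in a drug-likeness score


def joint_improvement_rate(edits, eps=0.0):
    """Fraction of edits that improve both objectives by more than eps."""
    if not edits:
        return 0.0
    hits = sum(e.d_affinity > eps and e.d_druggability > eps for e in edits)
    return hits / len(edits)


def trade_off_rate(edits, eps=0.0):
    """Fraction of edits where one objective improves while the other worsens."""
    if not edits:
        return 0.0
    hits = sum(
        (e.d_affinity > eps and e.d_druggability < -eps)
        or (e.d_druggability > eps and e.d_affinity < -eps)
        for e in edits
    )
    return hits / len(edits)


# Hypothetical outcomes: one joint gain, two trade-offs, one joint loss.
edits = [
    EditOutcome(0.4, 0.05),
    EditOutcome(0.6, -0.10),
    EditOutcome(-0.2, 0.08),
    EditOutcome(-0.1, -0.02),
]
print(joint_improvement_rate(edits))  # 0.25
print(trade_off_rate(edits))          # 0.5
```

Because a lower docking score usually means stronger predicted binding, a real pipeline would likely negate the score change before filling `d_affinity`.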

Core claim

The central claim concerns edit-response probing. PROBE first decomposes the ligand into editable sites and constructs a pocket-specific site map that flags where joint affinity-druggability gains are plausible, where the objectives are likely in tension, and where liability substructures should change. It then distills the responses to controlled probe edits into an EditManual. Together, the map and the EditManual direct an iterative multi-agent loop of affinity, druggability, and co-optimization agents, which generates edits that satisfy both objectives simultaneously more often.
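
One way to picture the two artifacts is as plain data structures. The sketch below is hypothetical: `SiteFlag`, `ProbeResponse`, `distill_manual`, the site identifiers, and the verdict labels are illustrative names, and the distillation rule is a stand-in for whatever the paper actually does.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class SiteFlag(Enum):
    """The three kinds of site-map flag named in the text."""
    JOINT_GAIN = "joint gain plausible"
    TENSION = "objectives in tension"
    LIABILITY = "liability substructure"


@dataclass(frozen=True)
class ProbeResponse:
    site_id: str
    edit: str              # human-readable description of the probe edit
    d_affinity: float      # positive means improved (assumed convention)
    d_druggability: float  # positive means improved (assumed convention)


def distill_manual(site_map, responses):
    """Group probe responses by site and keep a one-line verdict per probe edit."""
    manual = defaultdict(list)
    for r in responses:
        if r.d_affinity > 0 and r.d_druggability > 0:
            verdict = "prefer"
        elif r.d_affinity < 0 and r.d_druggability < 0:
            verdict = "avoid"
        else:
            verdict = "mixed"  # opposite signs, or no change on at least one objective
        manual[r.site_id].append(f"[{site_map[r.site_id].value}] {r.edit}: {verdict}")
    return dict(manual)


# Hypothetical site map and probe responses.
site_map = {"S1": SiteFlag.JOINT_GAIN, "S2": SiteFlag.TENSION}
responses = [
    ProbeResponse("S1", "phenyl -> pyridyl", 0.3, 0.04),
    ProbeResponse("S2", "add tert-butyl", 0.5, -0.12),
]
manual = distill_manual(site_map, responses)
print(manual["S1"])  # ['[joint gain plausible] phenyl -> pyridyl: prefer']
```

Keeping the manual as short text lines reflects its stated role as guidance for LLM agents. A numeric table would serve equally well if the agents consumed structured input.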

What carries the argument

Edit-response probing, which builds a pocket-specific site map and an EditManual from controlled edits on editable sites and uses both to guide subsequent optimization.

If this is right

  • Single molecular edits improve both binding affinity and druggability together more often than in standard LLM pipelines.
  • Cases in which a gain on one objective produces a loss on the other become less frequent.
  • The method attains state-of-the-art performance on the CrossDocked2020 benchmark.
  • The diagnostic failure modes of unguided editing are substantially reduced.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same probing step could be inserted into other multi-objective molecular design loops that pit affinity against selectivity or ADMET properties.
  • Site-map construction may reduce the total number of LLM calls needed by pruning low-value edit directions early.
  • The approach invites testing whether the probe-derived guidance transfers across protein families without retraining the underlying language model.

Load-bearing premise

Responses from a limited set of controlled probe edits on editable sites will generalize to predict joint affinity-druggability outcomes for later optimization edits without systematic bias from probe selection or LLM prompting.

What would settle it

The claim that probing enables better joint optimization would be falsified by a controlled experiment on held-out ligands in which removing the site map and EditManual from the agent loop leaves the frequency of joint-improvement edits unchanged.
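
Such an ablation reduces to comparing two proportions. The following sketch uses a pooled two-proportion z-test; the function name `two_proportion_z` and the counts are hypothetical, and the paper may use a different statistical test.

```python
import math


def two_proportion_z(hits_a, n_a, hits_b, n_b):
    """Return (difference in proportions, two-sided p-value) from a pooled z-test."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return p_a - p_b, 1.0
    z = (p_a - p_b) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return p_a - p_b, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical counts of joint-improvement edits: guided loop vs. ablated loop.
diff, p_value = two_proportion_z(60, 200, 30, 200)
print(round(diff, 3), p_value < 0.01)  # 0.15 True
```

Edits on the same ligand are not independent, so a per-ligand paired comparison or a bootstrap over ligands would be a more careful analysis than this pooled test.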

Figures

Figures reproduced from arXiv: 2606.00555 by James Kwok, Weiyu Chen, Yaqing Wang, Zaifei Yang.

Figure 1: Overview of the PROBE. (a) Site Map Construction. (b) Pocket-Specific Edit Manual
Figure 2: Computational budget and performance comparison. CIDD (27 Rounds) is the baseline
Figure 3: Comparison on target PLCD1. Hydrogen bonds and hydrophobic contacts are marked
Original abstract

Structure-based drug design increasingly employs LLM agents to iteratively refine ligands against a target pocket, yet a viable ligand must satisfy two often-conflicting objectives, binding affinity and druggability, which single optimization steps rarely improve together. To quantify this difficulty, we introduce two diagnostic metrics: the first measures how often a single edit improves both objectives, and the second measures how often a gain on one objective comes with a loss on the other. Applying these diagnostics to current LLM-agent pipelines exposes a consistent failure mode: the agent performs molecular editing without knowing how the pocket-ligand complex responds to local modifications, thus rarely achieving joint improvement. Inspired by medicinal chemists, who probe the pocket-ligand complex with controlled analog edits before choosing an optimization direction, we propose PROBE, an optimization framework built around edit-response probing. PROBE first decomposes the ligand into editable sites and builds a pocket-specific site map that flags where joint gains are plausible, where the two objectives are likely in tension, and where liability substructures should be changed; it then performs controlled probe edits whose responses are distilled into an EditManual. Guided by the site map and EditManual, PROBE runs an iterative multi-agent loop in which an affinity agent, a druggability agent, and a co-optimization agent jointly produce edits. On the CrossDocked2020 benchmark, PROBE achieves state-of-the-art performance and substantially mitigates the failure modes exposed by our diagnostic metrics.
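
The abstract does not say how the three agents interact, so the following is only one possible arrangement, shown to make the control flow concrete. Everything here is an assumption: the agents are stand-in callables, molecules are opaque strings, `score` returns an (affinity, druggability) pair with higher meaning better, and the accept-only-joint-gains rule is a simplification that is not claimed to be the paper's.

```python
def run_loop(start, affinity_agent, druggability_agent, co_agent, score, guidance, rounds=3):
    """Iterate proposals from three agents and accept only edits that improve both objectives."""
    current = start
    for _ in range(rounds):
        proposals = list(affinity_agent(current, guidance))
        proposals += list(druggability_agent(current, guidance))
        # The co-optimization agent also sees what the other two proposed.
        proposals += list(co_agent(current, guidance, proposals))
        base_aff, base_drug = score(current)
        joint = [m for m in proposals if score(m)[0] > base_aff and score(m)[1] > base_drug]
        if not joint:
            break  # no joint improvement this round, so stop early
        # Arbitrary tie-break: largest unweighted sum of the two scores.
        current = max(joint, key=lambda m: sum(score(m)))
    return current


# Toy scores for hypothetical molecules m0..m3: (affinity, druggability).
SCORES = {"m0": (1.0, 0.5), "m1": (1.5, 0.4), "m2": (0.9, 0.7), "m3": (1.2, 0.6)}

result = run_loop(
    "m0",
    affinity_agent=lambda m, g: ["m1"] if m == "m0" else [],
    druggability_agent=lambda m, g: ["m2"] if m == "m0" else [],
    co_agent=lambda m, g, proposals: ["m3"] if m == "m0" else [],
    score=SCORES.__getitem__,
    guidance={},  # would carry the site map and EditManual
)
print(result)  # m3
```

In the toy run, the affinity-only proposal `m1` and the druggability-only proposal `m2` each sacrifice the other objective, so only `m3` is accepted.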

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that LLM agents for iterative ligand refinement in structure-based drug design rarely achieve simultaneous gains in binding affinity and druggability. It introduces two diagnostic metrics to quantify this failure mode, proposes the PROBE framework that decomposes ligands into editable sites, constructs a pocket-specific site map, performs controlled probe edits to build an EditManual, and then runs a guided multi-agent optimization loop (affinity, druggability, and co-optimization agents). On the CrossDocked2020 benchmark, PROBE is reported to reach state-of-the-art performance while substantially reducing the diagnosed failure modes.

Significance. If the empirical results and generalization hold, the work offers a concrete, benchmark-driven method to mitigate a recurring joint-optimization failure in LLM-based molecular design by emulating medicinal-chemistry probing. The independent definition of the diagnostic metrics (separate from the optimization loop) is a methodological strength, as is the explicit focus on falsifiable failure-mode quantification rather than post-hoc success rates alone.

major comments (2)
  1. [Abstract and §4] Abstract and §4 (Results): the central claim that PROBE 'substantially mitigates the failure modes' and achieves SOTA rests on the assumption that responses from a limited set of controlled probe edits on pre-identified editable sites generalize to predict joint affinity-druggability outcomes for subsequent multi-agent edits. No ablation or sensitivity analysis is presented that varies probe-site decomposition heuristics or LLM prompting strategies to test for systematic bias in the site map or EditManual.
  2. [§3.2] §3.2 (PROBE framework): the site map flags (joint-gain plausible, tension, liability) are derived from probe responses, yet the manuscript provides no direct validation that these flags remain predictive when the probe edits are replaced by the actual optimization edits produced by the multi-agent loop; this is load-bearing for the claim that probing reliably guides co-optimization.
minor comments (2)
  1. [Table captions and §4] Table captions and §4: exact definitions of the two diagnostic metrics (joint-improvement rate and trade-off rate), the precise baselines, data splits, and statistical tests used for the CrossDocked2020 comparison are not summarized in the abstract and should be stated explicitly in the results tables.
  2. [Methods] Notation: the terms 'site map' and 'EditManual' are used throughout without a concise formal definition or pseudocode in the methods section; a one-paragraph boxed definition would improve clarity.

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of validation that can strengthen the manuscript. We address each major comment below and commit to incorporating additional analyses in the revision.

Point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Results): the central claim that PROBE 'substantially mitigates the failure modes' and achieves SOTA rests on the assumption that responses from a limited set of controlled probe edits on pre-identified editable sites generalize to predict joint affinity-druggability outcomes for subsequent multi-agent edits. No ablation or sensitivity analysis is presented that varies probe-site decomposition heuristics or LLM prompting strategies to test for systematic bias in the site map or EditManual.

    Authors: We agree that the generalization assumption is central and that explicit sensitivity analyses would provide stronger support. The current CrossDocked2020 results demonstrate mitigation of the diagnosed failure modes, but to directly test for bias we will add, in the revised manuscript, ablations that vary (i) the ligand decomposition heuristics used to identify editable sites and (ii) the prompting strategies employed during probing. These will quantify any systematic effects on site-map construction and downstream performance. revision: yes

  2. Referee: [§3.2] §3.2 (PROBE framework): the site map flags (joint-gain plausible, tension, liability) are derived from probe responses, yet the manuscript provides no direct validation that these flags remain predictive when the probe edits are replaced by the actual optimization edits produced by the multi-agent loop; this is load-bearing for the claim that probing reliably guides co-optimization.

    Authors: The site-map flags are designed to be predictive on the basis of controlled probe responses, and the multi-agent loop is guided by them. We acknowledge that a direct head-to-head comparison between probe-derived flags and the outcomes of the subsequent multi-agent edits is absent. In revision we will add an analysis that extracts the flags applied during optimization runs and measures their predictive accuracy against the observed affinity-druggability changes, thereby validating the load-bearing assumption. revision: yes
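
The flag-validation analysis promised in the second response could be scored along the following lines. This is a sketch under assumptions: the record format, the flag labels, and the mapping from flags to expected outcomes in `EXPECTED` are hypothetical, and liability flags are skipped because no observable counterpart is defined for them here.

```python
from collections import Counter

# Assumed mapping from a site-map flag to the outcome it predicts.
EXPECTED = {"joint-gain": "joint", "tension": "trade-off"}


def observed(d_affinity, d_druggability):
    """Classify an optimization edit by the signs of its two deltas (positive means improved)."""
    if d_affinity > 0 and d_druggability > 0:
        return "joint"
    if d_affinity * d_druggability < 0:
        return "trade-off"
    return "other"


def flag_precision(records):
    """For each flag, the fraction of edits at flagged sites whose outcome matched the flag."""
    total, correct = Counter(), Counter()
    for flag, d_aff, d_drug in records:
        if flag not in EXPECTED:
            continue  # e.g. liability flags, which are not scored in this sketch
        total[flag] += 1
        correct[flag] += observed(d_aff, d_drug) == EXPECTED[flag]
    return {flag: correct[flag] / total[flag] for flag in total}


# Hypothetical records: (flag at the edited site, d_affinity, d_druggability).
records = [
    ("joint-gain", 0.3, 0.1),
    ("joint-gain", 0.2, -0.1),
    ("tension", 0.4, -0.2),
    ("tension", -0.1, 0.3),
    ("liability", 0.0, 0.2),
]
print(flag_precision(records))  # {'joint-gain': 0.5, 'tension': 1.0}
```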

Circularity Check

0 steps flagged

No significant circularity; empirical framework evaluated on external benchmark

full rationale

The paper defines two diagnostic metrics independently to quantify joint improvement failures, constructs the PROBE site map and EditManual from probe responses as a methodological step, and reports performance on the external CrossDocked2020 benchmark. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text that would reduce any claimed result to its own inputs by construction. The derivation chain is self-contained against the benchmark evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The approach rests on domain assumptions about LLM behavior in molecular editing and introduces two new constructs (site map, EditManual) whose utility is demonstrated only on the benchmark.

axioms (1)
  • [domain assumption] LLM agents can be prompted to perform controlled molecular edits and interpret their effects on affinity and druggability
    Core to the multi-agent loop and probing step described in the abstract.
invented entities (2)
  • site map (no independent evidence)
    purpose: Flags locations where joint gains are plausible, where objectives conflict, or where liabilities exist
    New construct built from probe-edit responses; no independent evidence outside the paper's benchmark.
  • EditManual (no independent evidence)
    purpose: Distilled guidance from probe-edit responses to direct subsequent optimization
    New construct introduced to guide the co-optimization agent; no independent evidence outside the paper's benchmark.

pith-pipeline@v0.9.1-grok · 5815 in / 1263 out tokens · 20925 ms · 2026-06-28T18:58:34.757912+00:00 · methodology


    preserve affinity while improving druggability

    Jingyuan Zhou, Dengwei Zhao, Hao Qian, Shikui Tu, and Lei Xu. Multi-objective structure- based drug design using causal discovery.IEEE Transactions on Computational Biology and Bioinformatics, 2025. 12 A Prompts Prompt A.1: MOO-awareness prompt injected into the design stage of CIDD. [Multi-Objective Optimization Guidance] You are optimizing a ligand agai...

  43. [43]

    Identify all optimization goals mentioned in the reasoning

  44. [44]

    If only affinity-related goals are mentioned, output: affinity

  45. [45]

    If only druggability-related goals are mentioned, output: druggability

  46. [46]

    the other is described as a side benefit, a constraint to preserve, or a minor consideration), output the dominant one

    If both are mentioned but one is the main driver of the edit (e.g. the other is described as a side benefit, a constraint to preserve, or a minor consideration), output the dominant one

  47. [47]

    Only output "both" when the reasoning treats affinity and druggability as co-equal targets of the same edit

  48. [48]

    general background, descriptions of the pocket, summaries of prior rounds) when deciding the intent

    Ignore meta-commentary that is not about this specific edit (e.g. general background, descriptions of the pocket, summaries of prior rounds) when deciding the intent. Reasoning text: {reasoning} Prompt A.3: Prompt for PROBE in site map construction. You are an elite Director of Medicinal Chemistry, Expert Toxicologist, and Structural Biologist. Your task ...

  49. [49]

    Fused ring systems should not be completely saturated (all sp3 carbons)

  50. [50]

    Ring carbon atoms should ideally have consistent hybridization

  51. [51]

    [Input Data]

    Reactive groups (e.g., Michael acceptors, anilines) count as structural alerts. [Input Data]

  52. [52]

    Structure SMILES : {smi}

  53. [53]

    Mapped SMILES : {mapped_smi}

  54. [54]

    Binding Metrics : Vina Score {score} | Ligand Efficiency (LE) {mc_data['le']:.3f}

  55. [55]

    Properties : MW {mc_data['mw']:.1f} | LogP {mc_data['logp']:.2f} | TPSA {mc_data['tpsa']:.1f}

  56. [56]

    Complexity : Chiral Centers {mc_data['chiral_centers']} | BertzCT {mc_data['sa_proxy']:.1f}

  57. [57]

    Flatten this complex sp3 ring system

    Structural Alerts: {','.join(mc_data['alerts']) if mc_data['alerts'] else 'Clean'} 14 [Diagnosis Signals] A. Interaction signals (PLIP): {interaction_text} B. Geometric signals -- Clashes: {clash_text} C. Geometric signals -- Voids / Buriedness: {void_text} D. Geometric signals -- Chemical Mismatch: {mismatch_text} E. Fragment-level interaction mapping (B...

  58. [58]

    Base each strategy strictly on the chemical nature of the sites in the Site Map

    NO PROMPT COPYING. Base each strategy strictly on the chemical nature of the sites in the Site Map

  59. [59]

    Each strategy must be expressed as a chemically explicit edit prescription (e.g., extending an H-bond donor, hopping to an achiral bioisostere, ring-opening, pruning)

    ABSTRACT OPERATIONS ONLY. Each strategy must be expressed as a chemically explicit edit prescription (e.g., extending an H-bond donor, hopping to an achiral bioisostere, ring-opening, pruning)

  60. [60]

    "" target_strategies = ['A','B','C'] for stg_key in target_strategies: stg_content = strategy_dict[stg_key] if stg_key =='A': tactic_instructions = r

    STRUCTURAL-ALERT / COMPLEXITY COMPLIANCE. Unless the strategy explicitly accepts druggability damage (Strategy A), structural modifications should reduce complexity, avoid unnecessary chiral centers, and avoid spiro / bridged / fully-sp3 fused ring systems. [Task: Emit three Strategies sigma_k = < V_k, pi_k, tau_k >] For each strategy, output exactly the ...

  61. [61]

    High intensity

    "High intensity" -- Fill pocket void with ring. Substantial volume addition (Delta N_heavy >= +4); insert a complete standard ring system (e.g., phenyl / pyridyl / morpholino) into a void identified for sigma_A

  62. [62]

    Medium intensity

    "Medium intensity" -- Add functional group. Moderate volume addition (+1 <= Delta N_heavy <= +3); attach a small group (isopropyl, CF3, amide linker)

  63. [63]

    Low intensity

    "Low intensity" -- Isosteric tweak. 17 Minimal volume change (Delta N_heavy in {0, +1}); bioisosteric substitution (-CH3 -> -CF3, phenyl -> pyridyl) to retune electrostatics

  64. [64]

    Counterfactual

    "Counterfactual" -- Delete functional group. Reverse-direction probe (Delta N_heavy <= -1): remove a key H-bond donor / acceptor or anchor. Expected signal: if Vina worsens, the original group is load-bearing; if Vina is unchanged, the strategy probes a non-causal direction; if Vina improves, the targeted group is actively harmful. """ elif stg_key =='B':...

  65. [65]

    High intensity

    "High intensity" -- Prune Liability / Tension sites. Aggressive pruning (Delta N_heavy <= -4); remove bulky lipophilic groups, open complex fused rings into a single ring, or strip multiple chiral centers in one move

  66. [66]

    Medium intensity

    "Medium intensity" -- Trim peripheral liabilities. Surgical pruning (-3 <= Delta N_heavy <= -1); drop redundant terminals or simplify a substituted ring into an unsubstituted standard ring

  67. [67]

    Low intensity

    "Low intensity" -- Shave solvent-exposed atoms. Minimal pruning (Delta N_heavy = -1); remove a single solvent-exposed atom (terminal methyl / halogen) that contributes no binding energy

  68. [68]

    Counterfactual

    "Counterfactual" -- Add sp3 bloat. Reverse-direction probe (Delta N_heavy >= +3): attach a synthetically difficult, unnecessary sp3-rich bulky group (tert-butyl / cyclopentyl) to a solvent-exposed atom. Expected signal: if SA / QED collapse without Vina gain, mass at this site is purely harmful, confirming the pruning direction is causal. """ else: # stg_...

  69. [69]

    High intensity

    "High intensity" -- Replace core scaffold. Major topological shift with bounded mass change (-2 <= Delta N_heavy <= +2); swap a central ring or core linker for a fundamentally different standard ring (phenyl -> tetrahydropyran / pyrimidine / piperazine) to reshape 3D geometry and Fsp3 without molecular-weight bloat

  70. [70]

    Medium intensity

    "Medium intensity" -- Peripheral bioisostere swap. Local topological shift (-1 <= Delta N_heavy <= +1); replace a peripheral group with a recognized bioisostere (carboxylic acid -> tetrazole, amide -> 1,2,4-triazole)

  71. [71]

    Low intensity

    "Low intensity" -- Regio-isomeric shift. No mass change (Delta N_heavy = 0); keep the groups, alter connectivity (ortho -> meta / para; reverse an amide -C(=O)NH- -> -NHC(=O)-)

  72. [72]

    Counterfactual

    "Counterfactual" -- Break geometric constraint. Deliberately violate a mandatory geometry: convert a planar aromatic ring essential for pi-stacking into a saturated ring (benzene -> cyclohexane), or rigidify a flexible linker via an alkyne to disrupt induced fit. Expected signal: if Vina collapses, the original rigid / planar topology is load-bearing for ...

  73. [73]

    Target: [one atom index]

    "Grow": Attach a new fragment by replacing an implicit hydrogen on the target. Target: [one atom index]

  74. [74]

    Replace_Terminal_Group

    "Replace_Terminal_Group": Cut a peripheral group (-OH, -CH3, halogen, ...) and swap it. Target: [one atom index of the group to be removed]

  75. [75]

    Replace_Sidechain_or_Ring

    "Replace_Sidechain_or_Ring": Cut an entire sidechain or ring system attached to the main scaffold. Target: [one atom index -- MUST be the anchoring atom of the sidechain / ring]

  76. [76]

    Delete_Group

    "Delete_Group": Remove a peripheral group without replacement (bond capped by H). Target: [one atom index]. [Fragment Constraints -- SEMANTIC TAGS ONLY] Do NOT output raw numbers for physicochemical properties. Use these tags: - size : "Small" (MW < 150) / "Medium" (150-250) / "Large" (> 250) / "Any" - polarity : "Hydrophilic" / "Neutral" / "Lipophilic" /...

  77. [77]

    Index every entry by the abstract site (s_1..s_n), not by atom indices and not by strategy name

    SITE-LEVEL ABSTRACTION. Index every entry by the abstract site (s_1..s_n), not by atom indices and not by strategy name. The molecule mutates across rounds; atom indices are not stable, but each site is named after its originating fragment / pharmacophore role

  78. [78]

    For each site, pull together every probe whose target_site equals this site -- possibly drawn from multiple strategies

    EVIDENCE AGGREGATION. For each site, pull together every probe whose target_site equals this site -- possibly drawn from multiple strategies. Cite probe_ids when stating a verdict

  79. [79]

    For each strategy sigma_k, state whether the observed forward_shape / counterfactual_signal validates or invalidates tau_k (the trade-off sigma_k claimed to accept)

    STRATEGY VERDICT. For each strategy sigma_k, state whether the observed forward_shape / counterfactual_signal validates or invalidates tau_k (the trade-off sigma_k claimed to accept)

  80. [80]

    Translate physical responses into the 5 semantic tags used by the Fragment Assembly Engine: size / polarity / flexibility / shape / charge

    SEMANTIC ENVELOPES. Translate physical responses into the 5 semantic tags used by the Fragment Assembly Engine: size / polarity / flexibility / shape / charge
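
The SITE-LEVEL ABSTRACTION and EVIDENCE AGGREGATION rules in entries 77 and 78 amount to grouping probe results by their abstract site rather than by atom index or strategy. A minimal sketch of that grouping follows, assuming each probe is a plain mapping. The keys `target_site` and `probe_id` are taken from the prompt text; the `strategy` key, the record layout, and the function name `aggregate_by_site` are hypothetical and used only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def aggregate_by_site(probes: Iterable[Mapping[str, str]]) -> Dict[str, List[str]]:
    """Collect probe_ids under the abstract site (s_1..s_n) each probe targeted.

    Probes from different strategies land in the same bucket when they share
    a target_site, so a site verdict can cite all of its supporting probe_ids.
    """
    by_site: Dict[str, List[str]] = defaultdict(list)
    for probe in probes:
        by_site[probe["target_site"]].append(probe["probe_id"])
    return dict(by_site)


# Hypothetical probe records; only the field names come from the prompt text.
probes = [
    {"probe_id": "p1", "target_site": "s_1", "strategy": "A"},
    {"probe_id": "p2", "target_site": "s_2", "strategy": "B"},
    {"probe_id": "p3", "target_site": "s_1", "strategy": "C"},
]
evidence = aggregate_by_site(probes)
```

Keying on the site name rather than an atom index reflects the stated reason for the rule: atom indices change as the molecule mutates across rounds, while the site label stays attached to its originating fragment or pharmacophore role.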

Showing first 80 references.
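
The heavy-atom budgets (Delta N_heavy) attached to the intensity tiers in entries 61 to 72 can be read as simple range checks on a proposed probe edit. The sketch below transcribes those ranges and tests whether an edit stays inside the budget of its declared tier. The table name `TIER_BUDGETS` and the function `within_budget` are hypothetical. Mapping Strategy B to the pruning tiers and Strategy C to the topology tiers is inferred from the `elif stg_key == 'B'` / `else` fragments, and the Strategy C counterfactual is omitted because the text gives no heavy-atom range for it.

```python
from typing import Callable, Dict, Tuple

# Delta N_heavy budgets transcribed from the tier descriptions.
# Keys are (strategy, tier); values are predicates on the heavy-atom change.
TIER_BUDGETS: Dict[Tuple[str, str], Callable[[int], bool]] = {
    ("A", "high"): lambda d: d >= 4,
    ("A", "medium"): lambda d: 1 <= d <= 3,
    ("A", "low"): lambda d: d in (0, 1),
    ("A", "counterfactual"): lambda d: d <= -1,
    ("B", "high"): lambda d: d <= -4,
    ("B", "medium"): lambda d: -3 <= d <= -1,
    ("B", "low"): lambda d: d == -1,
    ("B", "counterfactual"): lambda d: d >= 3,
    ("C", "high"): lambda d: -2 <= d <= 2,
    ("C", "medium"): lambda d: -1 <= d <= 1,
    ("C", "low"): lambda d: d == 0,
    # ("C", "counterfactual") has no stated Delta N_heavy range, so it is left out.
}


def within_budget(strategy: str, tier: str, n_heavy_before: int, n_heavy_after: int) -> bool:
    """Return True if the heavy-atom change fits the declared tier's budget."""
    key = (strategy, tier)
    if key not in TIER_BUDGETS:
        raise ValueError(f"no heavy-atom budget stated for {key}")
    delta = n_heavy_after - n_heavy_before
    return TIER_BUDGETS[key](delta)


# Example: inserting a six-membered ring (+6 heavy atoms) as a Strategy A high-intensity probe.
ring_insert_ok = within_budget("A", "high", 20, 26)
```

The check validates a declared tier instead of inferring the tier from the delta, because several of the stated ranges overlap (for example, a change of +1 satisfies both the medium and low tiers of Strategy A). In practice the two counts could come from a cheminformatics toolkit, for instance RDKit's `Mol.GetNumHeavyAtoms()`.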