Probe Before You Edit: Probing-Guided Molecular Optimization for LLM Agents in Structure-Based Drug Design

James Kwok; Weiyu Chen; Yaqing Wang; Zaifei Yang

arxiv: 2606.00555 · v2 · pith:HOPF4JOMnew · submitted 2026-05-30 · 💻 cs.AI · q-bio.BM

Pith reviewed 2026-06-28 18:58 UTC · model grok-4.3

keywords LLM agents · structure-based drug design · molecular optimization · probing · binding affinity · druggability · CrossDocked2020 · multi-agent systems

The pith

PROBE uses controlled probe edits to build site maps that guide LLM agents toward edits improving both binding affinity and druggability.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Current LLM-agent pipelines for structure-based drug design edit ligands without first testing local responses, so single steps rarely advance both binding affinity and druggability at once. The paper quantifies this with two new diagnostics: one tracking joint gains and one tracking trade-off cases. PROBE mimics medicinal chemists by decomposing ligands into editable sites, running controlled probe edits, and distilling the responses into a pocket-specific site map plus an EditManual. These artifacts then steer a three-agent loop (affinity, druggability, co-optimization) that produces more successful joint edits. On the CrossDocked2020 benchmark the method reaches state-of-the-art results while lowering the frequency of the diagnosed failure modes.
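As a rough illustration of the two diagnostics, the sketch below computes a joint-improvement rate and a trade-off rate from per-edit score changes. The exact definitions used in the paper are not given on this page, so the function name `edit_diagnostics`, the sign convention (a positive delta means the edit improved that objective), and the handling of zero deltas as "no change" are all assumptions.

```python
from typing import Iterable, Tuple


def edit_diagnostics(deltas: Iterable[Tuple[float, float]]) -> Tuple[float, float]:
    """Return (joint_improvement_rate, trade_off_rate) over single edits.

    Each item is (d_affinity, d_druggability), oriented so that a positive
    value means the edit improved that objective (an assumed convention;
    a raw docking-score change may need its sign flipped first).
    """
    pairs = list(deltas)
    if not pairs:
        raise ValueError("need at least one edit")
    # Joint improvement: both objectives strictly improve in the same edit.
    joint = sum(1 for da, dd in pairs if da > 0 and dd > 0)
    # Trade-off: one objective strictly improves while the other strictly worsens.
    trade = sum(1 for da, dd in pairs if (da > 0 > dd) or (dd > 0 > da))
    n = len(pairs)
    return joint / n, trade / n


# Made-up deltas for illustration only (not data from the paper).
edits = [(0.4, 0.1), (0.3, -0.2), (-0.1, 0.05), (0.0, 0.2)]
joint_rate, trade_rate = edit_diagnostics(edits)
print(joint_rate, trade_rate)  # 0.25 0.5
```

Keeping the two rates separate, rather than folding them into one score, mirrors the description above: one diagnostic tracks joint gains and the other tracks trade-off cases.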

Core claim

The central claim is that edit-response probing makes joint optimization more reliable. PROBE first decomposes the ligand into editable sites and constructs a pocket-specific site map that flags where joint affinity-druggability gains are plausible, where the objectives are likely in tension, and where liability substructures should change. The probe responses are then distilled into an EditManual that, together with the map, directs an iterative multi-agent loop of affinity, druggability, and co-optimization agents to generate edits that satisfy both objectives simultaneously more often.

What carries the argument

Edit-response probing: controlled edits on editable sites are used to build a pocket-specific site map and an EditManual, which then guide subsequent optimization.
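One way to picture the site map is as a small record type with one flag per editable site. The sketch below is a hypothetical data structure, not the paper's implementation: the class names `SiteFlag`, `Site`, and `SiteMap`, the field names, and the example SMILES fragments are all assumptions; only the three flag meanings come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class SiteFlag(Enum):
    JOINT_GAIN = "joint gains plausible"
    TENSION = "objectives likely in tension"
    LIABILITY = "liability substructure to change"


@dataclass
class Site:
    site_id: str          # hypothetical label for an editable site
    fragment_smiles: str  # fragment occupying the site (illustrative)
    flag: SiteFlag


@dataclass
class SiteMap:
    pocket_id: str
    sites: List[Site] = field(default_factory=list)

    def by_flag(self) -> Dict[SiteFlag, List[str]]:
        """Group site ids by flag so each agent can pick its candidates."""
        grouped: Dict[SiteFlag, List[str]] = {f: [] for f in SiteFlag}
        for s in self.sites:
            grouped[s.flag].append(s.site_id)
        return grouped


# Illustrative content only; these sites are not taken from the paper.
site_map = SiteMap("example_pocket", [
    Site("S1", "c1ccccc1", SiteFlag.JOINT_GAIN),
    Site("S2", "C(=O)N", SiteFlag.TENSION),
    Site("S3", "C=CC(=O)C", SiteFlag.LIABILITY),
])
print(site_map.by_flag()[SiteFlag.LIABILITY])  # ['S3']
```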

If this is right

Single molecular edits improve both binding affinity and druggability together more often than in standard LLM pipelines.
Cases in which a gain on one objective produces a loss on the other become less frequent.
The method attains state-of-the-art performance on the CrossDocked2020 benchmark.
The diagnostic failure modes of unguided editing are substantially reduced.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same probing step could be inserted into other multi-objective molecular design loops that pit affinity against selectivity or ADMET properties.
Site-map construction may reduce the total number of LLM calls needed by pruning low-value edit directions early.
The approach invites testing whether the probe-derived guidance transfers across protein families without retraining the underlying language model.

Load-bearing premise

Responses from a limited set of controlled probe edits on editable sites will generalize to predict joint affinity-druggability outcomes for later optimization edits without systematic bias from probe selection or LLM prompting.

What would settle it

A controlled experiment on held-out ligands in which the frequency of joint-improvement edits remains unchanged when the site map and EditManual are removed from the agent loop would falsify the claim that probing enables better joint optimization.
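A minimal sketch of how such an ablation could be scored, assuming each arm (with and without the site map and EditManual) yields a count of joint-improvement edits out of a total number of edits. It uses a standard two-proportion z-test with a normal approximation; the function names and the example counts are hypothetical, and the test treats edits as independent, which may not hold for edits on the same ligand.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int) -> float:
    """z statistic for H0: both arms share the same joint-improvement rate."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0
    return (p_a - p_b) / se


def two_sided_p(z: float) -> float:
    """Two-sided p-value under the standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))


# Illustrative counts only (not results from the paper):
# arm A = guided loop, arm B = loop with site map and EditManual removed.
z = two_proportion_z(40, 100, 20, 100)
p = two_sided_p(z)
print(f"z = {z:.2f}, p < 0.01: {p < 0.01}")
```

A p-value that stays large on held-out ligands would be the "unchanged frequency" outcome described above; a small one would support the claim that the probing artifacts matter.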

Figures

Figures reproduced from arXiv: 2606.00555 by James Kwok, Weiyu Chen, Yaqing Wang, Zaifei Yang.

**Figure 2.** Computational budget and performance comparison. CIDD (27 Rounds) is the baseline

**Figure 3.** Comparison on target PLCD1. Hydrogen bonds and hydrophobic contacts are marked

Original abstract

Structure-based drug design increasingly employs LLM agents to iteratively refine ligands against a target pocket, yet a viable ligand must satisfy two often-conflicting objectives -- binding affinity and druggability -- which single optimization steps rarely improve together. To quantify this difficulty, we introduce two diagnostic metrics: the first measures how often a single edit improves both objectives, and the second measures how often a gain on one objective comes with a loss on the other. Applying these diagnostics to current LLM-agent pipelines exposes a consistent failure mode: the agent performs molecular editing without knowing how the pocket-ligand complex responds to local modifications, thus rarely achieving joint improvement. Inspired by medicinal chemists, who probe the pocket-ligand complex with controlled analog edits before choosing an optimization direction, we propose PROBE, an optimization framework built around edit-response probing. PROBE first decomposes the ligand into editable sites and builds a pocket-specific site map that flags where joint gains are plausible, where the two objectives are likely in tension, and where liability substructures should be changed; it then performs controlled probe edits whose responses are distilled into an EditManual. Guided by the site map and EditManual, PROBE runs an iterative multi-agent loop in which an affinity agent, a druggability agent, and a co-optimization agent jointly produce edits. On the CrossDocked2020 benchmark, PROBE achieves state-of-the-art performance and substantially mitigates the failure modes exposed by our diagnostic metrics.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PROBE adds clear diagnostics for joint vs conflicting gains plus a probing step to guide LLM molecular edits, but the SOTA claim rests on thin abstract-level evidence.

The main thing to know is that this paper flags a practical failure mode in LLM agents for structure-based drug design: single edits rarely improve both affinity and druggability together. It introduces a probing phase that builds site maps and EditManuals to steer a multi-agent loop toward joint gains.

What they do well is define two straightforward diagnostic metrics that measure joint-improvement frequency and trade-off frequency. Those metrics expose the problem cleanly and give the field something concrete to track. The probing approach, which decomposes the ligand into editable sites, runs controlled edits to map where joint gains are plausible, and distills responses into guidance, is a reasonable step past the single-step editing in the cited prior work. Framing it around how medicinal chemists actually work is sensible, and the multi-agent split (affinity, druggability, co-optimization) follows logically from the diagnostics.

The soft spots are mostly around verification. The abstract states SOTA on CrossDocked2020 and substantial mitigation of the failure modes, yet supplies no numbers, baselines, data splits, or significance tests, so the size of the improvement is impossible to judge. The stress-test concern about generalization from a limited set of probe edits is worth taking seriously; if the paper does not test whether different site decompositions or prompt variations change the site-map flags or final performance, the central claim stays vulnerable to selection or prompting bias. The new terms are fine once defined, but they add to the need for clear methods.

This is for people building or evaluating LLM agents for computational molecular optimization. A reader in that niche would get value from the diagnostics even without adopting the full pipeline. It deserves a serious referee because the problem is real, the approach is distinct, and the benchmark claim is testable once the details are supplied.

I would send it to peer review.

Referee Report

2 major / 2 minor

Summary. The paper claims that LLM agents for iterative ligand refinement in structure-based drug design rarely achieve simultaneous gains in binding affinity and druggability. It introduces two diagnostic metrics to quantify this failure mode, proposes the PROBE framework that decomposes ligands into editable sites, constructs a pocket-specific site map, performs controlled probe edits to build an EditManual, and then runs a guided multi-agent optimization loop (affinity, druggability, and co-optimization agents). On the CrossDocked2020 benchmark, PROBE is reported to reach state-of-the-art performance while substantially reducing the diagnosed failure modes.

Significance. If the empirical results and generalization hold, the work offers a concrete, benchmark-driven method to mitigate a recurring joint-optimization failure in LLM-based molecular design by emulating medicinal-chemistry probing. The independent definition of the diagnostic metrics (separate from the optimization loop) is a methodological strength, as is the explicit focus on falsifiable failure-mode quantification rather than post-hoc success rates alone.

major comments (2)

[Abstract and §4 (Results)] The central claim that PROBE 'substantially mitigates the failure modes' and achieves SOTA rests on the assumption that responses from a limited set of controlled probe edits on pre-identified editable sites generalize to predict joint affinity-druggability outcomes for subsequent multi-agent edits. No ablation or sensitivity analysis is presented that varies probe-site decomposition heuristics or LLM prompting strategies to test for systematic bias in the site map or EditManual.
[§3.2 (PROBE framework)] The site map flags (joint-gain plausible, tension, liability) are derived from probe responses, yet the manuscript provides no direct validation that these flags remain predictive when the probe edits are replaced by the actual optimization edits produced by the multi-agent loop; this is load-bearing for the claim that probing reliably guides co-optimization.

minor comments (2)

[Table captions and §4] Exact definitions of the two diagnostic metrics (joint-improvement rate and trade-off rate), the precise baselines, data splits, and statistical tests used for the CrossDocked2020 comparison are not summarized in the abstract and should be stated explicitly in the results tables.
[Methods, notation] The terms 'site map' and 'EditManual' are used throughout without a concise formal definition or pseudocode in the methods section; a one-paragraph boxed definition would improve clarity.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback. The comments highlight important aspects of validation that can strengthen the manuscript. We address each major comment below and commit to incorporating additional analyses in the revision.

Referee: [Abstract and §4 (Results)] The central claim that PROBE 'substantially mitigates the failure modes' and achieves SOTA rests on the assumption that responses from a limited set of controlled probe edits on pre-identified editable sites generalize to predict joint affinity-druggability outcomes for subsequent multi-agent edits. No ablation or sensitivity analysis is presented that varies probe-site decomposition heuristics or LLM prompting strategies to test for systematic bias in the site map or EditManual.

Authors: We agree that the generalization assumption is central and that explicit sensitivity analyses would provide stronger support. The current CrossDocked2020 results demonstrate mitigation of the diagnosed failure modes, but to directly test for bias we will add, in the revised manuscript, ablations that vary (i) the ligand decomposition heuristics used to identify editable sites and (ii) the prompting strategies employed during probing. These will quantify any systematic effects on site-map construction and downstream performance.

revision: yes

Referee: [§3.2 (PROBE framework)] The site map flags (joint-gain plausible, tension, liability) are derived from probe responses, yet the manuscript provides no direct validation that these flags remain predictive when the probe edits are replaced by the actual optimization edits produced by the multi-agent loop; this is load-bearing for the claim that probing reliably guides co-optimization.

Authors: The site-map flags are designed to be predictive on the basis of controlled probe responses, and the multi-agent loop is guided by them. We acknowledge that a direct head-to-head comparison between probe-derived flags and the outcomes of the subsequent multi-agent edits is absent. In revision we will add an analysis that extracts the flags applied during optimization runs and measures their predictive accuracy against the observed affinity-druggability changes, thereby validating the load-bearing assumption.

revision: yes
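The flag-validation analysis promised in the rebuttal could be sketched as follows. This is an assumption-laden illustration, not the authors' planned analysis: the function names, the string labels, the positive-delta-means-improvement convention, and the choice to score only the joint-gain and tension flags are all hypothetical.

```python
from typing import List, Tuple


def observed_outcome(d_aff: float, d_drug: float) -> str:
    """Label an edit by its observed effect (positive delta = improvement, assumed)."""
    if d_aff > 0 and d_drug > 0:
        return "joint_gain"
    if (d_aff > 0 > d_drug) or (d_drug > 0 > d_aff):
        return "tension"
    return "other"


def flag_accuracy(records: List[Tuple[str, float, float]]) -> float:
    """Fraction of edits whose site flag matches the observed outcome.

    Each record is (flag, d_affinity, d_druggability), where flag is
    "joint_gain" or "tension".
    """
    if not records:
        raise ValueError("no records")
    hits = sum(1 for flag, da, dd in records if flag == observed_outcome(da, dd))
    return hits / len(records)


# Made-up records for illustration only.
records = [
    ("joint_gain", 0.5, 0.2),   # flag agrees with outcome
    ("joint_gain", 0.3, -0.1),  # flagged joint gain, observed trade-off
    ("tension", -0.2, 0.4),     # flag agrees with outcome
    ("tension", 0.1, 0.1),      # flagged tension, observed joint gain
]
print(flag_accuracy(records))  # 0.5
```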

Circularity Check

0 steps flagged

No significant circularity; empirical framework evaluated on external benchmark

The paper defines two diagnostic metrics independently to quantify joint improvement failures, constructs the PROBE site map and EditManual from probe responses as a methodological step, and reports performance on the external CrossDocked2020 benchmark. No equations, fitted parameters renamed as predictions, or self-citation chains appear in the provided text that would reduce any claimed result to its own inputs by construction. The derivation chain is self-contained against the benchmark evaluation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

The approach rests on domain assumptions about LLM behavior in molecular editing and introduces two new constructs (site map, EditManual) whose utility is demonstrated only on the benchmark.

axioms (1)

domain assumption: LLM agents can be prompted to perform controlled molecular edits and interpret their effects on affinity and druggability
Core to the multi-agent loop and probing step described in the abstract.

invented entities (2)

site map (no independent evidence)
purpose: Flags locations where joint gains are plausible, where objectives conflict, or where liabilities exist
New construct built from probe-edit responses; no independent evidence outside the paper's benchmark.
EditManual (no independent evidence)
purpose: Distilled guidance from probe-edit responses to direct subsequent optimization
New construct introduced to guide the co-optimization agent; no independent evidence outside the paper's benchmark.

pith-pipeline@v0.9.1-grok · 5815 in / 1263 out tokens · 20925 ms · 2026-06-28T18:58:34.757912+00:00 · methodology

Reference graph

Works this paper leans on

97 extracted references · 1 canonical work page

[1] Cele Abad-Zapatero and James T Metz. Ligand efficiency indices as guideposts for drug discovery. Drug discovery today, 10(7):464–469, 2005
[2] Reza Averly, Frazier N Baker, Ian A Watson, and Xia Ning. Liddia: Language-based intelligent drug discovery agent. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 12015–12039, 2025
[3] Jonathan B Baell and Georgina A Holloway. New substructure filters for removal of pan assay interference compounds (pains) from screening libraries and for their exclusion in bioassays. Journal of medicinal chemistry, 53(7):2719–2740, 2010
[4] Steven H Bertz. The first general index of molecular complexity. Journal of the American Chemical Society, 103(12):3599–3601, 1981
[5] G Richard Bickerton, Gaia V Paolini, Jérémy Besnard, Sorel Muresan, and Andrew L Hopkins. Quantifying the chemical beauty of drugs. Nature chemistry, 4(2):90–98, 2012
[6] Shreyas Bhat Brahmavar, Ashwin Srinivasan, Tirtharaj Dash, Sowmya Ramaswamy Krishnan, Lovekesh Vig, Arijit Roy, and Raviprasad Aduri. Generating novel leads for drug discovery using llms with logical feedback. In Proceedings of the AAAI conference on artificial intelligence, volume 38, pages 21–29, 2024
[7] Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, and Quanquan Gu. Decomposed direct preference optimization for structure-based drug design. Transactions on Machine Learning Research
[8] Jorg Degen, Christof Wegscheid-Gerlach, Andrea Zaliani, and Matthias Rarey. On the art of compiling and using 'drug-like' chemical fragment spaces. ChemMedChem, 3(10):1503, 2008
[9] Vineeth Dorna, D Subhalingam, Keshav Kolluru, Shreshth Tuli, Mrityunjay Singh, Saurabh Singal, NM Anoop Krishnan, and Sayan Ranu. Tagmol: Target-aware gradient-guided molecule generation. In ICML’24 Workshop ML for Life and Material Science: From Theory to Industry Applications
[10] Peter Ertl and Ansgar Schuffenhauer. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of cheminformatics, 1(1):8, 2009
[11] Paul G Francoeur, Tomohide Masuda, Jocelyn Sunseri, Andrew Jia, Richard B Iovanisci, Ian Snyder, and David R Koes. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. Journal of chemical information and modeling, 60(9):4200–4215, 2020
[12] Bowen Gao, Yanwen Huang, Yiqiao Liu, Wenxuan Xie, Bowei He, Haichuan Tan, Wei-Ying Ma, Ya-Qin Zhang, and Yanyan Lan. Cidd: Collaborative intelligence for structure-based drug design empowered by llms. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025
[13] Johann Gasteiger and Mario Marsili. Iterative partial equalization of orbital electronegativity—a rapid access to atomic charges. Tetrahedron, 36(22):3219–3228, 1980
[14] Arup K Ghose, Vellarkad N Viswanadhan, and John J Wendoloski. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. a qualitative and quantitative characterization of known drug databases. Journal of combinatorial chemistry, 1(1):55–68, 1999
[15] Jiaqi Guan, Wesley Wei Qian, Xingang Peng, Yufeng Su, Jian Peng, and Jianzhu Ma. 3d equivariant diffusion for target-aware molecule generation and affinity prediction. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=kJqXEPXMsE0
[16] Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, and Quanquan Gu. Decompdiff: Diffusion models with decomposed priors for structure-based drug design. In International Conference on Machine Learning, pages 11827–11846. PMLR, 2023
[17] Andreia P Guerreiro, Carlos M Fonseca, and Luís Paquete. The hypervolume indicator: Computational problems and algorithms. ACM Computing Surveys (CSUR), 54(6):1–42, 2021
[18] Charles Harris, Kieran Didi, Arian R Jamasb, Chaitanya K Joshi, Simon V Mathis, Pietro Lio, and Tom Blundell. Benchmarking generated poses: How rational is structure-based drug design with generative models? arXiv preprint arXiv:2308.07413, 2023
[19] Xuanning Hu, Anchen Li, Qianli Xing, Jinglong Ji, Hao Tuo, and Bo Yang. Empowering llms for structure-based drug design via exploration-augmented latent inference. In Proceedings of the ACM Web Conference 2026, pages 4244–4255, 2026
[20] Zhilin Huang, Ling Yang, Xiangxin Zhou, Zhilong Zhang, Wentao Zhang, Xiawu Zheng, Jie Chen, Yu Wang, Bin Cui, and Wenming Yang. Protein-ligand interaction prior for binding-aware 3d molecule diffusion models. In The Twelfth International Conference on Learning Representations, 2024
[21] Leukothea Ioakimidis, Loizos Thoukydidis, Amin Mirza, Saira Naeem, and Jóhannes Reynisson. Benchmarking the reliability of qikprop. correlation between experimental and predicted values. QSAR & Combinatorial Science, 27(4):445–456, 2008
[22] Clemens Isert, Kenneth Atz, and Gisbert Schneider. Structure-based drug design with geometric deep learning. Current Opinion in Structural Biology, 79:102548, 2023
[23] Jan H Jensen. A graph-based genetic algorithm and generative model/monte carlo tree search for the exploration of chemical space. Chemical science, 10(12):3567–3572, 2019
[24] Amit Kadan, Kevin Ryczko, Erika Lloyd, Adrian Roitberg, and Takeshi Yamazaki. Guided multi-objective generative ai to enhance structure-based drug design. Chemical Science, 16(29):13196–13210, 2025
[25] Greg Landrum. Rdkit documentation. Release, 1(1-79):4, 2013
[26] Christopher A Lipinski, Franco Lombardo, Beryl W Dominy, and Paul J Feeney. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced drug delivery reviews, 64:4–17, 2012
[27] Shitong Luo, Jiaqi Guan, Jianzhu Ma, and Jian Peng. A 3d generative model for structure-based drug design. Advances in Neural Information Processing Systems, 34:6229–6239, 2021
[28] Noel M O’Boyle, Michael Banck, Craig A James, Chris Morley, Tim Vandermeersch, and Geoffrey R Hutchison. Open babel: An open chemical toolbox. Journal of cheminformatics, 3(1):33, 2011
[29] Xingang Peng, Shitong Luo, Jiaqi Guan, Qi Xie, Jian Peng, and Jianzhu Ma. Pocket2mol: Efficient molecular sampling based on 3d protein pockets. In International conference on machine learning, pages 17644–17655. PMLR, 2022
[30] Alleyn T Plowright, Craig Johnstone, Jan Kihlberg, Jonas Pettersson, Graeme Robb, and Richard A Thompson. Hypothesis driven drug design: improving quality and effectiveness of the design-make-test-analyse cycle. Drug discovery today, 17(1-2):56–62, 2012
[31] Keyue Qiu, Yuxuan Song, Zhehuan Fan, Peidong Liu, Zhe Zhang, Mingyue Zheng, Hao Zhou, and Wei-Ying Ma. Piloting structure-based drug design via modality-specific optimal schedule. In International Conference on Machine Learning, pages 50619–50644. PMLR, 2025
[32] Keyue Qiu, Yuxuan Song, Jie Yu, Hongbo Ma, Ziyao Cao, Zhilong Zhang, Yushuai Wu, Mingyue Zheng, Hao Zhou, and Wei-Ying Ma. Empower structure-based molecule optimization with gradient guided bayesian flow networks. In International Conference on Machine Learning, pages 50645–50671. PMLR, 2025
[33] Yanru Qu, Keyue Qiu, Yuxuan Song, Jingjing Gong, Jiawei Han, Mingyue Zheng, Hao Zhou, and Wei-Ying Ma. Molcraft: Structure-based drug design in continuous parameter space. In International Conference on Machine Learning, pages 41749–41768. PMLR, 2024
[34] Nian Ran, Yue Wang, and Richard Allmendinger. Mollm: Multi-objective large language model for molecular design–optimizing with experts. 2025
[35] Sebastian Salentin, Sven Schreiber, V Joachim Haupt, Melissa F Adasme, and Michael Schroeder. Plip: fully automated protein–ligand interaction profiler. Nucleic acids research, 43(W1):W443–W447, 2015
[36] Oleg Trott and Arthur J Olson. Autodock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry, 31(2):455–461, 2010
[37] Mingyang Wang, Zhe Wang, Huiyong Sun, Jike Wang, Chao Shen, Gaoqi Weng, Xin Chai, Honglin Li, Dongsheng Cao, and Tingjun Hou. Deep learning approaches for de novo drug design: An overview. Current opinion in structural biology, 72:135–144, 2022
[38] Steven S Wesolowski and Dean G Brown. The strategies and politics of successful design, make, test, and analyze (dmta) cycles in lead generation. Lead Generation, pages 487–512, 2016
[39] Kehan Wu, Yingce Xia, Pan Deng, Renhe Liu, Yuan Zhang, Han Guo, Yumeng Cui, Qizhi Pei, Lijun Wu, Shufang Xie, et al. Tamgen: drug design with target-aware molecule generation through a chemical language model. Nature Communications, 15(1):9360, 2024
[40] Zaixi Zhang and Qi Liu. Learning subpocket prototypes for generalizable structure-based drug design. In International Conference on Machine Learning, pages 41382–41398. PMLR, 2023
[41] Zaixi Zhang, Jiaxian Yan, Yining Huang, Qi Liu, Enhong Chen, Mengdi Wang, and Marinka Zitnik. Structure-based drug design with geometric deep learning: A comprehensive survey. ACM Computing Surveys, 58(5):1–35, 2025
[42] Jingyuan Zhou, Dengwei Zhao, Hao Qian, Shikui Tu, and Lei Xu. Multi-objective structure-based drug design using causal discovery. IEEE Transactions on Computational Biology and Bioinformatics, 2025

A Prompts

Prompt A.1: MOO-awareness prompt injected into the design stage of CIDD.
[Multi-Objective Optimization Guidance] You are optimizing a ligand agai...
preserve affinity while improving druggability

Identify all optimization goals mentioned in the reasoning
If only affinity-related goals are mentioned, output: affinity
If only druggability-related goals are mentioned, output: druggability
If both are mentioned but one is the main driver of the edit (e.g. the other is described as a side benefit, a constraint to preserve, or a minor consideration), output the dominant one
Only output "both" when the reasoning treats affinity and druggability as co-equal targets of the same edit
Ignore meta-commentary that is not about this specific edit (e.g. general background, descriptions of the pocket, summaries of prior rounds) when deciding the intent.
Reasoning text: {reasoning}

Prompt A.3: Prompt for PROBE in site map construction.
You are an elite Director of Medicinal Chemistry, Expert Toxicologist, and Structural Biologist. Your task ...
Fused ring systems should not be completely saturated (all sp3 carbons)
Ring carbon atoms should ideally have consistent hybridization
Reactive groups (e.g., Michael acceptors, anilines) count as structural alerts.

[Input Data]
Structure SMILES : {smi}
Mapped SMILES : {mapped_smi}
Binding Metrics : Vina Score {score} | Ligand Efficiency (LE) {mc_data['le']:.3f}
Properties : MW {mc_data['mw']:.1f} | LogP {mc_data['logp']:.2f} | TPSA {mc_data['tpsa']:.1f}
Complexity : Chiral Centers {mc_data['chiral_centers']} | BertzCT {mc_data['sa_proxy']:.1f}
Structural Alerts: {','.join(mc_data['alerts']) if mc_data['alerts'] else 'Clean'}

[Diagnosis Signals]
A. Interaction signals (PLIP): {interaction_text}
B. Geometric signals -- Clashes: {clash_text}
C. Geometric signals -- Voids / Buriedness: {void_text}
D. Geometric signals -- Chemical Mismatch: {mismatch_text}
E. Fragment-level interaction mapping (B...
Flatten this complex sp3 ring system

NO PROMPT COPYING. Base each strategy strictly on the chemical nature of the sites in the Site Map
ABSTRACT OPERATIONS ONLY. Each strategy must be expressed as a chemically explicit edit prescription (e.g., extending an H-bond donor, hopping to an achiral bioisostere, ring-opening, pruning)
STRUCTURAL-ALERT / COMPLEXITY COMPLIANCE. Unless the strategy explicitly accepts druggability damage (Strategy A), structural modifications should reduce complexity, avoid unnecessary chiral centers, and avoid spiro / bridged / fully-sp3 fused ring systems.

[Task: Emit three Strategies sigma_k = < V_k, pi_k, tau_k >] For each strategy, output exactly the ...
"" target_strategies = ['A','B','C'] for stg_key in target_strategies: stg_content = strategy_dict[stg_key] if stg_key =='A': tactic_instructions = r

"High intensity" -- Fill pocket void with ring. Substantial volume addition (Delta N_heavy >= +4); insert a complete standard ring system (e.g., phenyl / pyridyl / morpholino) into a void identified for sigma_A
"Medium intensity" -- Add functional group. Moderate volume addition (+1 <= Delta N_heavy <= +3); attach a small group (isopropyl, CF3, amide linker)
"Low intensity" -- Isosteric tweak. Minimal volume change (Delta N_heavy in {0, +1}); bioisosteric substitution (-CH3 -> -CF3, phenyl -> pyridyl) to retune electrostatics
"Counterfactual" -- Delete functional group. Reverse-direction probe (Delta N_heavy <= -1): remove a key H-bond donor / acceptor or anchor. Expected signal: if Vina worsens, the original group is load-bearing; if Vina is unchanged, the strategy probes a non-causal direction; if Vina improves, the targeted group is actively harmful. """ elif stg_key =='B':...

"High intensity" -- Prune Liability / Tension sites. Aggressive pruning (Delta N_heavy <= -4); remove bulky lipophilic groups, open complex fused rings into a single ring, or strip multiple chiral centers in one move
"Medium intensity" -- Trim peripheral liabilities. Surgical pruning (-3 <= Delta N_heavy <= -1); drop redundant terminals or simplify a substituted ring into an unsubstituted standard ring
"Low intensity" -- Shave solvent-exposed atoms. Minimal pruning (Delta N_heavy = -1); remove a single solvent-exposed atom (terminal methyl / halogen) that contributes no binding energy
[68]

Counterfactual

"Counterfactual" -- Add sp3 bloat. Reverse-direction probe (Delta N_heavy >= +3): attach a synthetically difficult, unnecessary sp3-rich bulky group (tert-butyl / cyclopentyl) to a solvent-exposed atom. Expected signal: if SA / QED collapse without Vina gain, mass at this site is purely harmful, confirming the pruning direction is causal. """ else: # stg_...
[69]

High intensity

"High intensity" -- Replace core scaffold. Major topological shift with bounded mass change (-2 <= Delta N_heavy <= +2); swap a central ring or core linker for a 18 fundamentally different standard ring (phenyl -> tetrahydropyran / pyrimidine / piperazine) to reshape 3D geometry and Fsp3 without molecular-weight bloat
[70]

Medium intensity

"Medium intensity" -- Peripheral bioisostere swap. Local topological shift (-1 <= Delta N_heavy <= +1); replace a peripheral group with a recognized bioisostere (carboxylic acid -> tetrazole, amide -> 1,2,4-triazole)
[71]

Low intensity

"Low intensity" -- Regio-isomeric shift. No mass change (Delta N_heavy = 0); keep the groups, alter connectivity (ortho -> meta / para; reverse an amide -C(=O)NH- -> -NHC(=O)-)
[72]

Counterfactual

"Counterfactual" -- Break geometric constraint. Deliberately violate a mandatory geometry: convert a planar aromatic ring essential for pi-stacking into a saturated ring (benzene -> cyclohexane), or rigidify a flexible linker via an alkyne to disrupt induced fit. Expected signal: if Vina collapses, the original rigid / planar topology is load-bearing for ...
[73]

Target: [one atom index]

"Grow": Attach a new fragment by replacing an implicit hydrogen on the target. Target: [one atom index]
[74]

Replace_Terminal_Group

"Replace_Terminal_Group": Cut a peripheral group (-OH, -CH3, halogen, ...) and swap it. Target: [one atom index of the group to be removed]
[75]

Replace_Sidechain_or_Ring

"Replace_Sidechain_or_Ring": 19 Cut an entire sidechain or ring system attached to the main scaffold. Target: [one atom index -- MUST be the anchoring atom of the sidechain / ring]
[76]

Delete_Group

"Delete_Group": Remove a peripheral group without replacement (bond capped by H). Target: [one atom index]. [Fragment Constraints -- SEMANTIC TAGS ONLY] Do NOT output raw numbers for physicochemical properties. Use these tags: - size : "Small" (MW < 150) / "Medium" (150-250) / "Large" (> 250) / "Any" - polarity : "Hydrophilic" / "Neutral" / "Lipophilic" /...
[77]

Index every entry by the abstract site (s_1..s_n), not by atom indices and not by strategy name

SITE-LEVEL ABSTRACTION. Index every entry by the abstract site (s_1..s_n), not by atom indices and not by strategy name. The molecule mutates across rounds; atom indices are not stable, but each site is named after its originating fragment / pharmacophore role
[78]

For each site, pull together every probe whose target_site equals this site -- possibly drawn from multiple strategies

EVIDENCE AGGREGATION. For each site, pull together every probe whose target_site equals this site -- possibly drawn from multiple strategies. Cite probe_ids when stating a verdict
[79]

For each strategy sigma_k, state whether the observed forward_shape / counterfactual_signal validates or invalidates tau_k (the trade-off sigma_k claimed to accept)

STRATEGY VERDICT. For each strategy sigma_k, state whether the observed forward_shape / counterfactual_signal validates or invalidates tau_k (the trade-off sigma_k claimed to accept)
[80]

Translate physical responses into the 5 semantic tags used by the Fragment Assembly Engine: size / polarity / flexibility / shape / charge

SEMANTIC ENVELOPES. Translate physical responses into the 5 semantic tags used by the Fragment Assembly Engine: size / polarity / flexibility / shape / charge

References

[1] Cele Abad-Zapatero and James T Metz. Ligand efficiency indices as guideposts for drug discovery. Drug discovery today, 10(7):464–469, 2005.

[2] Reza Averly, Frazier N Baker, Ian A Watson, and Xia Ning. Liddia: Language-based intelligent drug discovery agent. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 12015–12039, 2025.

[3] Jonathan B Baell and Georgina A Holloway. New substructure filters for removal of pan assay interference compounds (pains) from screening libraries and for their exclusion in bioassays. Journal of medicinal chemistry, 53(7):2719–2740, 2010.

[4] Steven H Bertz. The first general index of molecular complexity. Journal of the American Chemical Society, 103(12):3599–3601, 1981.

[5] G Richard Bickerton, Gaia V Paolini, Jérémy Besnard, Sorel Muresan, and Andrew L Hopkins. Quantifying the chemical beauty of drugs. Nature chemistry, 4(2):90–98, 2012.

[6] Shreyas Bhat Brahmavar, Ashwin Srinivasan, Tirtharaj Dash, Sowmya Ramaswamy Krishnan, Lovekesh Vig, Arijit Roy, and Raviprasad Aduri. Generating novel leads for drug discovery using llms with logical feedback. In Proceedings of the AAAI conference on artificial intelligence, volume 38, pages 21–29, 2024.

[7] Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, and Quanquan Gu. Decomposed direct preference optimization for structure-based drug design. Transactions on Machine Learning Research.

[8] Jorg Degen, Christof Wegscheid-Gerlach, Andrea Zaliani, and Matthias Rarey. On the art of compiling and using 'drug-like' chemical fragment spaces. ChemMedChem, 3(10):1503, 2008.

[9] Vineeth Dorna, D Subhalingam, Keshav Kolluru, Shreshth Tuli, Mrityunjay Singh, Saurabh Singal, NM Anoop Krishnan, and Sayan Ranu. Tagmol: Target-aware gradient-guided molecule generation. In ICML’24 Workshop ML for Life and Material Science: From Theory to Industry Applications.

[10] Peter Ertl and Ansgar Schuffenhauer. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. Journal of cheminformatics, 1(1):8, 2009.

[11] Paul G Francoeur, Tomohide Masuda, Jocelyn Sunseri, Andrew Jia, Richard B Iovanisci, Ian Snyder, and David R Koes. Three-dimensional convolutional neural networks and a cross-docked data set for structure-based drug design. Journal of chemical information and modeling, 60(9):4200–4215, 2020.

[12] Bowen Gao, Yanwen Huang, Yiqiao Liu, Wenxuan Xie, Bowei He, Haichuan Tan, Wei-Ying Ma, Ya-Qin Zhang, and Yanyan Lan. Cidd: Collaborative intelligence for structure-based drug design empowered by llms. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025.

[13] Johann Gasteiger and Mario Marsili. Iterative partial equalization of orbital electronegativity—a rapid access to atomic charges. Tetrahedron, 36(22):3219–3228, 1980.

[14] Arup K Ghose, Vellarkad N Viswanadhan, and John J Wendoloski. A knowledge-based approach in designing combinatorial or medicinal chemistry libraries for drug discovery. 1. a qualitative and quantitative characterization of known drug databases. Journal of combinatorial chemistry, 1(1):55–68, 1999.

[15] Jiaqi Guan, Wesley Wei Qian, Xingang Peng, Yufeng Su, Jian Peng, and Jianzhu Ma. 3d equivariant diffusion for target-aware molecule generation and affinity prediction. In The Eleventh International Conference on Learning Representations, 2023. URL https://openreview.net/forum?id=kJqXEPXMsE0.

[16] Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, and Quanquan Gu. Decompdiff: Diffusion models with decomposed priors for structure-based drug design. In International Conference on Machine Learning, pages 11827–11846. PMLR, 2023.

[17] Andreia P Guerreiro, Carlos M Fonseca, and Luís Paquete. The hypervolume indicator: Computational problems and algorithms. ACM Computing Surveys (CSUR), 54(6):1–42, 2021.

[18] Charles Harris, Kieran Didi, Arian R Jamasb, Chaitanya K Joshi, Simon V Mathis, Pietro Lio, and Tom Blundell. Benchmarking generated poses: How rational is structure-based drug design with generative models? arXiv preprint arXiv:2308.07413, 2023.

[19] Xuanning Hu, Anchen Li, Qianli Xing, Jinglong Ji, Hao Tuo, and Bo Yang. Empowering llms for structure-based drug design via exploration-augmented latent inference. In Proceedings of the ACM Web Conference 2026, pages 4244–4255, 2026.

[20] Zhilin Huang, Ling Yang, Xiangxin Zhou, Zhilong Zhang, Wentao Zhang, Xiawu Zheng, Jie Chen, Yu Wang, Bin Cui, and Wenming Yang. Protein-ligand interaction prior for binding-aware 3d molecule diffusion models. In The Twelfth International Conference on Learning Representations, 2024.

[21] Leukothea Ioakimidis, Loizos Thoukydidis, Amin Mirza, Saira Naeem, and Jóhannes Reynisson. Benchmarking the reliability of qikprop. correlation between experimental and predicted values. QSAR & Combinatorial Science, 27(4):445–456, 2008.

[22] Clemens Isert, Kenneth Atz, and Gisbert Schneider. Structure-based drug design with geometric deep learning. Current Opinion in Structural Biology, 79:102548, 2023.

[23] Jan H Jensen. A graph-based genetic algorithm and generative model/monte carlo tree search for the exploration of chemical space. Chemical science, 10(12):3567–3572, 2019.

[24] Amit Kadan, Kevin Ryczko, Erika Lloyd, Adrian Roitberg, and Takeshi Yamazaki. Guided multi-objective generative ai to enhance structure-based drug design. Chemical Science, 16(29):13196–13210, 2025.

[25] Greg Landrum. Rdkit documentation. Release, 1(1-79):4, 2013.

[26] Christopher A Lipinski, Franco Lombardo, Beryl W Dominy, and Paul J Feeney. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Advanced drug delivery reviews, 64:4–17, 2012.

[27] Shitong Luo, Jiaqi Guan, Jianzhu Ma, and Jian Peng. A 3d generative model for structure-based drug design. Advances in Neural Information Processing Systems, 34:6229–6239, 2021.

[28] Noel M O’Boyle, Michael Banck, Craig A James, Chris Morley, Tim Vandermeersch, and Geoffrey R Hutchison. Open babel: An open chemical toolbox. Journal of cheminformatics, 3(1):33, 2011.

[29] Xingang Peng, Shitong Luo, Jiaqi Guan, Qi Xie, Jian Peng, and Jianzhu Ma. Pocket2mol: Efficient molecular sampling based on 3d protein pockets. In International conference on machine learning, pages 17644–17655. PMLR, 2022.

[30] Alleyn T Plowright, Craig Johnstone, Jan Kihlberg, Jonas Pettersson, Graeme Robb, and Richard A Thompson. Hypothesis driven drug design: improving quality and effectiveness of the design-make-test-analyse cycle. Drug discovery today, 17(1-2):56–62, 2012.

[31] Keyue Qiu, Yuxuan Song, Zhehuan Fan, Peidong Liu, Zhe Zhang, Mingyue Zheng, Hao Zhou, and Wei-Ying Ma. Piloting structure-based drug design via modality-specific optimal schedule. In International Conference on Machine Learning, pages 50619–50644. PMLR, 2025.

[32] Keyue Qiu, Yuxuan Song, Jie Yu, Hongbo Ma, Ziyao Cao, Zhilong Zhang, Yushuai Wu, Mingyue Zheng, Hao Zhou, and Wei-Ying Ma. Empower structure-based molecule optimization with gradient guided bayesian flow networks. In International Conference on Machine Learning, pages 50645–50671. PMLR, 2025.

[33] Yanru Qu, Keyue Qiu, Yuxuan Song, Jingjing Gong, Jiawei Han, Mingyue Zheng, Hao Zhou, and Wei-Ying Ma. Molcraft: Structure-based drug design in continuous parameter space. In International Conference on Machine Learning, pages 41749–41768. PMLR, 2024.

[34] Nian Ran, Yue Wang, and Richard Allmendinger. Mollm: Multi-objective large language model for molecular design–optimizing with experts. 2025.

[35] Sebastian Salentin, Sven Schreiber, V Joachim Haupt, Melissa F Adasme, and Michael Schroeder. Plip: fully automated protein–ligand interaction profiler. Nucleic acids research, 43(W1):W443–W447, 2015.

[36] Oleg Trott and Arthur J Olson. Autodock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of computational chemistry, 31(2):455–461, 2010.

[37] Mingyang Wang, Zhe Wang, Huiyong Sun, Jike Wang, Chao Shen, Gaoqi Weng, Xin Chai, Honglin Li, Dongsheng Cao, and Tingjun Hou. Deep learning approaches for de novo drug design: An overview. Current opinion in structural biology, 72:135–144, 2022.

[38] Steven S Wesolowski and Dean G Brown. The strategies and politics of successful design, make, test, and analyze (dmta) cycles in lead generation. Lead Generation, pages 487–512, 2016.

[39] Kehan Wu, Yingce Xia, Pan Deng, Renhe Liu, Yuan Zhang, Han Guo, Yumeng Cui, Qizhi Pei, Lijun Wu, Shufang Xie, et al. Tamgen: drug design with target-aware molecule generation through a chemical language model. Nature Communications, 15(1):9360, 2024.

[40] Zaixi Zhang and Qi Liu. Learning subpocket prototypes for generalizable structure-based drug design. In International Conference on Machine Learning, pages 41382–41398. PMLR, 2023.

[41] Zaixi Zhang, Jiaxian Yan, Yining Huang, Qi Liu, Enhong Chen, Mengdi Wang, and Marinka Zitnik. Structure-based drug design with geometric deep learning: A comprehensive survey. ACM Computing Surveys, 58(5):1–35, 2025.

[42] Jingyuan Zhou, Dengwei Zhao, Hao Qian, Shikui Tu, and Lei Xu. Multi-objective structure-based drug design using causal discovery. IEEE Transactions on Computational Biology and Bioinformatics, 2025.

A Prompts

Prompt A.1: MOO-awareness prompt injected into the design stage of CIDD.

[Multi-Objective Optimization Guidance]
You are optimizing a ligand agai...

preserve affinity while improving druggability

Identify all optimization goals mentioned in the reasoning.

If only affinity-related goals are mentioned, output: affinity

If only druggability-related goals are mentioned, output: druggability

If both are mentioned but one is the main driver of the edit (e.g. the other is described as a side benefit, a constraint to preserve, or a minor consideration), output the dominant one.

Only output "both" when the reasoning treats affinity and druggability as co-equal targets of the same edit.

Ignore meta-commentary that is not about this specific edit (e.g. general background, descriptions of the pocket, summaries of prior rounds) when deciding the intent.

Reasoning text: {reasoning}

Prompt A.3: Prompt for PROBE in site map construction.

You are an elite Director of Medicinal Chemistry, Expert Toxicologist, and Structural Biologist. Your task ...

Fused ring systems should not be completely saturated (all sp3 carbons).

Ring carbon atoms should ideally have consistent hybridization.

Reactive groups (e.g., Michael acceptors, anilines) count as structural alerts.

[Input Data]
Structure SMILES : {smi}
Mapped SMILES : {mapped_smi}
Binding Metrics : Vina Score {score} | Ligand Efficiency (LE) {mc_data['le']:.3f}
Properties : MW {mc_data['mw']:.1f} | LogP {mc_data['logp']:.2f} | TPSA {mc_data['tpsa']:.1f}
Complexity : Chiral Centers {mc_data['chiral_centers']} | BertzCT {mc_data['sa_proxy']:.1f}
Structural Alerts: {','.join(mc_data['alerts']) if mc_data['alerts'] else 'Clean'}

[Diagnosis Signals]
A. Interaction signals (PLIP): {interaction_text}
B. Geometric signals -- Clashes: {clash_text}
C. Geometric signals -- Voids / Buriedness: {void_text}
D. Geometric signals -- Chemical Mismatch: {mismatch_text}
E. Fragment-level interaction mapping (B...

Flatten this complex sp3 ring system
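The [Input Data] template above relies on Python format specifiers and on a conditional join for the alert list. The sketch below renders those fields from a plain dictionary to show what the model would actually see. Only the field names come from the template; the function name `render_input_block` and every value in `mc_data` are hypothetical and are not taken from the paper.

```python
# Hypothetical descriptor values; only the keys mirror the template above.
mc_data = {
    "le": 0.3127,
    "mw": 342.41,
    "logp": 2.768,
    "tpsa": 78.9,
    "chiral_centers": 1,
    "sa_proxy": 612.34,
    "alerts": [],
}


def render_input_block(smi, mapped_smi, score, mc_data):
    """Fill the [Input Data] fields using the same format specifiers as the prompt."""
    # An empty alert list is reported as 'Clean', as in the template.
    alerts = ",".join(mc_data["alerts"]) if mc_data["alerts"] else "Clean"
    lines = [
        "[Input Data]",
        f"Structure SMILES : {smi}",
        f"Mapped SMILES : {mapped_smi}",
        f"Binding Metrics : Vina Score {score} | Ligand Efficiency (LE) {mc_data['le']:.3f}",
        f"Properties : MW {mc_data['mw']:.1f} | LogP {mc_data['logp']:.2f} | TPSA {mc_data['tpsa']:.1f}",
        f"Complexity : Chiral Centers {mc_data['chiral_centers']} | BertzCT {mc_data['sa_proxy']:.1f}",
        f"Structural Alerts: {alerts}",
    ]
    return "\n".join(lines)


example = render_input_block(
    "Oc1ccccc1",
    "[OH:1][c:2]1[cH:3][cH:4][cH:5][cH:6][cH:7]1",
    -7.2,
    mc_data,
)
print(example)
```

Passing a non-empty list such as `["aniline", "michael_acceptor"]` under `alerts` would yield a comma-joined string in place of `Clean`.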

NO PROMPT COPYING. Base each strategy strictly on the chemical nature of the sites in the Site Map.

ABSTRACT OPERATIONS ONLY. Each strategy must be expressed as a chemically explicit edit prescription (e.g., extending an H-bond donor, hopping to an achiral bioisostere, ring-opening, pruning).

STRUCTURAL-ALERT / COMPLEXITY COMPLIANCE. Unless the strategy explicitly accepts druggability damage (Strategy A), structural modifications should reduce complexity, avoid unnecessary chiral centers, and avoid spiro / bridged / fully-sp3 fused ring systems.

[Task: Emit three Strategies sigma_k = < V_k, pi_k, tau_k >]
For each strategy, output exactly the ...

""
target_strategies = ['A','B','C']
for stg_key in target_strategies:
    stg_content = strategy_dict[stg_key]
    if stg_key == 'A':
        tactic_instructions = r

"High intensity" -- Fill pocket void with ring. Substantial volume addition (Delta N_heavy >= +4); insert a complete standard ring system (e.g., phenyl / pyridyl / morpholino) into a void identified for sigma_A.

"Medium intensity" -- Add functional group. Moderate volume addition (+1 <= Delta N_heavy <= +3); attach a small group (isopropyl, CF3, amide linker).

"Low intensity" -- Isosteric tweak. Minimal volume change (Delta N_heavy in {0, +1}); bioisosteric substitution (-CH3 -> -CF3, phenyl -> pyridyl) to retune electrostatics.

"Counterfactual" -- Delete functional group. Reverse-direction probe (Delta N_heavy <= -1): remove a key H-bond donor / acceptor or anchor. Expected signal: if Vina worsens, the original group is load-bearing; if Vina is unchanged, the strategy probes a non-causal direction; if Vina improves, the targeted group is actively harmful.
"""
    elif stg_key == 'B': ...
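The three expected signals of the Strategy A counterfactual amount to a small decision rule on the change in Vina score. A minimal sketch follows, assuming the usual AutoDock Vina convention that more negative scores indicate stronger predicted binding. The function name `interpret_deletion_probe` and the noise band `tol` are hypothetical; the prompt itself states no numeric threshold for "unchanged".

```python
def interpret_deletion_probe(vina_before, vina_after, tol=0.3):
    """Map a deletion probe's Vina change onto the three signals named in the prompt.

    tol is an assumed noise band in kcal/mol (not from the paper) inside which
    the score is treated as unchanged.
    """
    delta = vina_after - vina_before  # positive means weaker predicted binding
    if delta > tol:
        return "load-bearing"          # Vina worsens after deletion
    if delta < -tol:
        return "actively harmful"      # Vina improves after deletion
    return "non-causal direction"      # Vina effectively unchanged


print(interpret_deletion_probe(-8.0, -6.5))
print(interpret_deletion_probe(-8.0, -8.1))
print(interpret_deletion_probe(-8.0, -9.0))
```

In practice the band would need to reflect docking noise for the pocket at hand, since repeated Vina runs on the same ligand do not return identical scores.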

"High intensity" -- Prune Liability / Tension sites. Aggressive pruning (Delta N_heavy <= -4); remove bulky lipophilic groups, open complex fused rings into a single ring, or strip multiple chiral centers in one move.

"Medium intensity" -- Trim peripheral liabilities. Surgical pruning (-3 <= Delta N_heavy <= -1); drop redundant terminals or simplify a substituted ring into an unsubstituted standard ring.

"Low intensity" -- Shave solvent-exposed atoms. Minimal pruning (Delta N_heavy = -1); remove a single solvent-exposed atom (terminal methyl / halogen) that contributes no binding energy.

"Counterfactual" -- Add sp3 bloat. Reverse-direction probe (Delta N_heavy >= +3): attach a synthetically difficult, unnecessary sp3-rich bulky group (tert-butyl / cyclopentyl) to a solvent-exposed atom. Expected signal: if SA / QED collapse without Vina gain, mass at this site is purely harmful, confirming the pruning direction is causal.
"""
    else: # stg_...

"High intensity" -- Replace core scaffold. Major topological shift with bounded mass change (-2 <= Delta N_heavy <= +2); swap a central ring or core linker for a fundamentally different standard ring (phenyl -> tetrahydropyran / pyrimidine / piperazine) to reshape 3D geometry and Fsp3 without molecular-weight bloat.

"Medium intensity" -- Peripheral bioisostere swap. Local topological shift (-1 <= Delta N_heavy <= +1); replace a peripheral group with a recognized bioisostere (carboxylic acid -> tetrazole, amide -> 1,2,4-triazole).

"Low intensity" -- Regio-isomeric shift. No mass change (Delta N_heavy = 0); keep the groups, alter connectivity (ortho -> meta / para; reverse an amide -C(=O)NH- -> -NHC(=O)-).

"Counterfactual" -- Break geometric constraint. Deliberately violate a mandatory geometry: convert a planar aromatic ring essential for pi-stacking into a saturated ring (benzene -> cyclohexane), or rigidify a flexible linker via an alkyne to disrupt induced fit. Expected signal: if Vina collapses, the original rigid / planar topology is load-bearing for ...
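The Delta N_heavy bounds quoted in the three tactic blocks can be collected into one lookup table, which makes it easy to check whether a generated probe respects its declared tier. The sketch below is illustrative only: `TIER_RANGES` and `matching_tiers` are hypothetical names, and the Strategy C counterfactual is omitted because no heavy-atom bound is visible for it. Several ranges overlap (for example Delta N_heavy = +1 fits both the medium and low tiers of Strategy A), so the function returns every matching tier and Delta N_heavy alone does not identify a tier.

```python
import math

# Inclusive (low, high) bounds on Delta N_heavy, transcribed from the prompts above.
# math.inf marks a bound that the prompt leaves open.
TIER_RANGES = {
    "A": {
        "High intensity": (4, math.inf),
        "Medium intensity": (1, 3),
        "Low intensity": (0, 1),
        "Counterfactual": (-math.inf, -1),
    },
    "B": {
        "High intensity": (-math.inf, -4),
        "Medium intensity": (-3, -1),
        "Low intensity": (-1, -1),
        "Counterfactual": (3, math.inf),
    },
    "C": {
        "High intensity": (-2, 2),
        "Medium intensity": (-1, 1),
        "Low intensity": (0, 0),
        # No Delta N_heavy bound is visible for the Strategy C counterfactual.
    },
}


def matching_tiers(strategy, delta_n_heavy):
    """Return every tier of `strategy` whose range contains delta_n_heavy."""
    return [
        name
        for name, (low, high) in TIER_RANGES[strategy].items()
        if low <= delta_n_heavy <= high
    ]


print(matching_tiers("A", 5))
print(matching_tiers("A", 1))
print(matching_tiers("B", 2))
```

An empty result, as for Strategy B with Delta N_heavy = +2, flags a probe whose mass change fits none of the stated tiers.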

"Grow": Attach a new fragment by replacing an implicit hydrogen on the target. Target: [one atom index]

"Replace_Terminal_Group": Cut a peripheral group (-OH, -CH3, halogen, ...) and swap it. Target: [one atom index of the group to be removed]

"Replace_Sidechain_or_Ring": Cut an entire sidechain or ring system attached to the main scaffold. Target: [one atom index -- MUST be the anchoring atom of the sidechain / ring]

"Delete_Group": Remove a peripheral group without replacement (bond capped by H). Target: [one atom index].

[Fragment Constraints -- SEMANTIC TAGS ONLY]
Do NOT output raw numbers for physicochemical properties. Use these tags:
- size : "Small" (MW < 150) / "Medium" (150-250) / "Large" (> 250) / "Any"
- polarity : "Hydrophilic" / "Neutral" / "Lipophilic" /...
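The size tag is the one semantic tag for which the prompt gives explicit numeric cut-offs, so the mapping from a fragment's molecular weight to its tag can be written down directly. This is a minimal sketch: `size_tag` is a hypothetical helper, and treating both 150 and 250 as "Medium" is a reading of the stated ranges rather than something the prompt spells out. The "Any" tag is a wildcard chosen by the agent and is not derived from molecular weight.

```python
def size_tag(fragment_mw):
    """Translate a fragment molecular weight into the prompt's size tag."""
    if fragment_mw < 150:
        return "Small"
    if fragment_mw <= 250:  # assumed inclusive on both ends of 150-250
        return "Medium"
    return "Large"


# Hypothetical fragment weights, for illustration only.
for mw in (78.11, 150.0, 250.0, 312.4):
    print(mw, size_tag(mw))
```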

SITE-LEVEL ABSTRACTION. Index every entry by the abstract site (s_1..s_n), not by atom indices and not by strategy name. The molecule mutates across rounds; atom indices are not stable, but each site is named after its originating fragment / pharmacophore role.

EVIDENCE AGGREGATION. For each site, pull together every probe whose target_site equals this site -- possibly drawn from multiple strategies. Cite probe_ids when stating a verdict.

STRATEGY VERDICT. For each strategy sigma_k, state whether the observed forward_shape / counterfactual_signal validates or invalidates tau_k (the trade-off sigma_k claimed to accept).

SEMANTIC ENVELOPES. Translate physical responses into the 5 semantic tags used by the Fragment Assembly Engine: size / polarity / flexibility / shape / charge.
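The site-level abstraction and evidence aggregation rules describe a group-by over probe records keyed on `target_site`. The sketch below illustrates that grouping under assumed data: the `probe_id` and `target_site` field names follow the prompt, while the `strategy` field, the record layout, and the function `aggregate_by_site` are hypothetical.

```python
from collections import defaultdict

# Hypothetical probe records; two strategies happen to probe site s_1.
probes = [
    {"probe_id": "p1", "target_site": "s_1", "strategy": "A"},
    {"probe_id": "p2", "target_site": "s_2", "strategy": "B"},
    {"probe_id": "p3", "target_site": "s_1", "strategy": "C"},
]


def aggregate_by_site(probes):
    """Group probes by abstract site so a verdict can cite every supporting probe_id."""
    by_site = defaultdict(list)
    for probe in probes:
        by_site[probe["target_site"]].append(probe)
    return {
        site: {
            "probe_ids": [p["probe_id"] for p in group],
            "strategies": sorted({p["strategy"] for p in group}),
        }
        for site, group in by_site.items()
    }


evidence = aggregate_by_site(probes)
print(evidence["s_1"])
```

Keying on the site name rather than on atom indices keeps the grouping valid across rounds, which is the reason the prompt gives for the abstraction.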