Human-level molecular optimization driven by mol-gene evolution

arXiv: 2406.12910 · v1 · submitted 2024-06-13 · cs.LG · cs.AI · cs.NE · physics.chem-ph · q-bio.BM

Jiebin Fang (1, 2), Churu Mao (2), Yuchen Zhu (3), Xiaoming Chen (2), Chang-Yu Hsieh (3), Zhongjun Ma (1, 2)

(1) Hainan Institute of Zhejiang University
(2) Institute of Marine Biology and Pharmacology, Ocean College, Zhejiang University
(3) College of Pharmaceutical Sciences and Cancer Center, Zhejiang University

Pith reviewed 2026-05-24 00:28 UTC · model grok-4.3

classification cs.LG · cs.AI · cs.NE · physics.chem-ph · q-bio.BM

keywords molecular optimization · discrete variational autoencoder · genetic algorithms · mol-gene · drug discovery · lead optimization · de novo generation · deep learning

The pith

Encoding molecules as mol-genes via a discrete VAE lets genetic algorithms perform human-level structural optimization of drug candidates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Deep Genetic Molecular Modification Algorithm to handle lead optimization after de novo molecule generation. It encodes molecules into mol-genes using a discrete variational autoencoder and then evolves those representations with genetic algorithms. This setup aims to produce pharmacologically similar but structurally distinct compounds while balancing novelty and properties. The approach is positioned as achieving modification levels comparable to medicinal chemists. Demonstrations in several applications illustrate its use for revealing optimization trade-offs.

Core claim

The DGMM brings structure modification to the level of medicinal chemists by encoding molecules as mol-genes via D-VAE and applying genetic algorithms for flexible structural optimization. The mol-gene allows for the discovery of pharmacologically similar but structurally distinct compounds, and reveals the trade-offs of structural optimization in drug discovery.

What carries the argument

The mol-gene, defined as the quantization code from the discrete variational autoencoder, acts as the genetic representation that enables evolutionary structural changes.
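To make "quantization code" concrete, the sketch below maps continuous latent vectors to the indices of their nearest codebook entries. It assumes a nearest-neighbour codebook lookup in the style of vector-quantized autoencoders; the abstract does not say how the D-VAE actually produces its codes, and `quantize`, `codebook`, and `latents` are hypothetical names with toy values.

```python
from typing import List, Sequence


def quantize(latents: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each latent vector to the index of its nearest codebook entry.

    Assumption: squared Euclidean distance and a fixed codebook; this is
    an illustration, not the paper's D-VAE.
    """
    def sqdist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: sqdist(z, codebook[k]))
        for z in latents
    ]


# Toy 2-D codebook with four entries (hypothetical values).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
# Three latent vectors, as an encoder might emit for one molecule.
latents = [(0.1, -0.2), (0.9, 0.8), (0.2, 0.7)]

mol_gene = quantize(latents, codebook)
print(mol_gene)  # [0, 3, 2]
```

The resulting list of integers is the kind of discrete sequence a genetic algorithm can recombine and mutate position by position.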

If this is right

The method supports discovery of pharmacologically similar yet structurally distinct compounds.
It highlights specific trade-offs between structural novelty and retained properties during optimization.
Effectiveness appears across multiple drug discovery applications.
The representation enables flexible modifications that mimic chemist-level adjustments.
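One common way to expose a trade-off between two objectives is to keep only the candidates that no other candidate beats on both. The sketch below computes such a Pareto front over (novelty, property score) pairs. The scores are made-up toy numbers, and treating both objectives as "higher is better" is an assumption; the abstract does not state how the paper quantifies either axis.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (structural novelty, property score), both maximized


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that are not dominated by any other point."""
    def dominates(p: Point, q: Point) -> bool:
        # p dominates q if it is at least as good on both axes and not identical.
        return p[0] >= q[0] and p[1] >= q[1] and p != q

    return [q for q in points if not any(dominates(p, q) for p in points)]


# Hypothetical candidates: higher novelty tends to cost property score.
candidates = [(0.2, 0.9), (0.5, 0.7), (0.4, 0.6), (0.8, 0.3), (0.7, 0.2)]
front = pareto_front(candidates)
print(front)  # [(0.2, 0.9), (0.5, 0.7), (0.8, 0.3)]
```

Candidates on the front are the ones where gaining novelty necessarily costs property score, which is the kind of trade-off the list above refers to.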

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The encoding might allow similar evolutionary optimization in related discrete design tasks such as materials or proteins.
If the encoding does preserve pharmacological information, the approach could reduce the need for manual structural tweaks in early-stage design.
Scalability tests on broader chemical spaces would reveal where the mol-gene representation begins to lose fidelity.

Load-bearing premise

The discrete VAE encoding preserves enough pharmacological and structural information so that evolution does not lose key properties.

What would settle it

Evolved molecules that lose essential pharmacological activity or fail structural validity checks in standard property prediction benchmarks would show the encoding does not preserve sufficient information.
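A first check along these lines is a round-trip test: encode held-out molecules, decode them again, and count exact matches. The sketch below shows the bookkeeping only. `toy_encode` and `toy_decode` are deliberately lossy stand-ins invented for illustration, not the paper's D-VAE, and comparing raw strings assumes the inputs are already in a canonical form.

```python
from typing import Callable, Iterable, List


def round_trip_accuracy(molecules: Iterable[str],
                        encode: Callable[[str], List[int]],
                        decode: Callable[[List[int]], str]) -> float:
    """Fraction of molecules recovered exactly after encode followed by decode."""
    mols = list(molecules)
    if not mols:
        raise ValueError("need at least one molecule")
    hits = sum(1 for m in mols if decode(encode(m)) == m)
    return hits / len(mols)


# Hypothetical toy codec over a tiny symbol vocabulary.
VOCAB = "CNO()=1"


def toy_encode(smiles: str) -> List[int]:
    # Unknown symbols collapse to index 0 ('C'), which makes the codec lossy.
    return [VOCAB.index(ch) if ch in VOCAB else 0 for ch in smiles]


def toy_decode(codes: List[int]) -> str:
    return "".join(VOCAB[c] for c in codes)


held_out = ["CCO", "CC(=O)O", "C1=CC=CC=C1", "CCS"]
accuracy = round_trip_accuracy(held_out, toy_encode, toy_decode)
print(accuracy)  # 0.75, because "CCS" decodes to "CCC"
```

A low score on a test like this would indicate that information is lost before any evolutionary step is applied.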

Original abstract

De novo molecule generation allows the search for more drug-like hits across a vast chemical space. However, lead optimization is still required, and the process of optimizing molecular structures faces the challenge of balancing structural novelty with pharmacological properties. This study introduces the Deep Genetic Molecular Modification Algorithm (DGMM), which brings structure modification to the level of medicinal chemists. A discrete variational autoencoder (D-VAE) is used in DGMM to encode molecules as quantization code, mol-gene, which incorporates deep learning into genetic algorithms for flexible structural optimization. The mol-gene allows for the discovery of pharmacologically similar but structurally distinct compounds, and reveals the trade-offs of structural optimization in drug discovery. We demonstrate the effectiveness of the DGMM in several applications.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

DGMM is a D-VAE plus genetic algorithm setup for molecular lead optimization, but the abstract supplies no reconstruction metrics or property preservation checks to support the human-level claim.

The paper's main contribution is framing molecule optimization as evolution on discrete VAE codes called mol-genes. It takes the standard generative-plus-evolutionary recipe and applies it to lead optimization with the goal of producing structurally novel but pharmacologically similar compounds. That integration is straightforward and could be useful in the subfield if the encoding step actually works as advertised.

Referee Report

2 major / 0 minor

Summary. The manuscript introduces the Deep Genetic Molecular Modification Algorithm (DGMM) that encodes molecules into quantization codes termed 'mol-genes' via a discrete variational autoencoder (D-VAE) and then applies genetic algorithms to perform structural optimization, claiming this achieves human-level lead optimization by discovering pharmacologically similar yet structurally distinct compounds while revealing trade-offs in drug discovery; effectiveness is demonstrated across several (unspecified) applications.

Significance. If the central claim holds with supporting validation, the work would represent a concrete integration of discrete latent representations with evolutionary search for molecular design, offering a potentially more flexible alternative to purely generative or reinforcement-learning approaches in balancing novelty against ADMET/potency constraints.

major comments (2)

[Abstract] Abstract: the claim that the D-VAE-derived mol-gene 'incorporates deep learning into genetic algorithms for flexible structural optimization' and enables 'human-level' performance rests on the unverified assumption that quantization codes preserve sufficient pharmacological and structural signal; no reconstruction fidelity, property-prediction R² on held-out molecules, or ablation of GA steps versus direct property erosion are reported, directly undermining the weakest assumption identified in the stress-test note.
[Abstract] Abstract (applications paragraph): the statement 'We demonstrate the effectiveness of the DGMM in several applications' provides no quantitative metrics, baselines, error bars, or comparison to medicinal-chemist performance, so the load-bearing claim of human-level optimization cannot be evaluated from the supplied evidence.
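For reference, the held-out R² the first comment asks for is the coefficient of determination, 1 - SS_res / SS_tot. A minimal sketch follows; the function name `r_squared` and the numbers are illustrative only and do not come from the paper.

```python
from typing import Sequence


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


measured = [1.0, 2.0, 3.0, 4.0]          # hypothetical held-out property values
perfect = r_squared(measured, measured)  # predictions equal to the truth
baseline = r_squared(measured, [2.5] * 4)  # always predicting the mean
print(perfect, baseline)  # 1.0 0.0
```

A value near 1 on held-out molecules would support the claim that the codes retain property signal; a value near 0 means the predictor does no better than the mean.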

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight areas where the abstract can be strengthened to better support the manuscript's claims. We address each point below.

Referee: [Abstract] Abstract: the claim that the D-VAE-derived mol-gene 'incorporates deep learning into genetic algorithms for flexible structural optimization' and enables 'human-level' performance rests on the unverified assumption that quantization codes preserve sufficient pharmacological and structural signal; no reconstruction fidelity, property-prediction R² on held-out molecules, or ablation of GA steps versus direct property erosion are reported, directly undermining the weakest assumption identified in the stress-test note.

Authors: We agree that explicit reporting of these metrics in the abstract would strengthen the presentation. The manuscript body validates the D-VAE through its training objective and downstream use in DGMM, but we will revise the abstract to include concise statements on reconstruction fidelity, held-out property-prediction performance, and an ablation comparing GA evolution to direct optimization, thereby directly addressing the concern about signal preservation.
revision: yes

Referee: [Abstract] Abstract (applications paragraph): the statement 'We demonstrate the effectiveness of the DGMM in several applications' provides no quantitative metrics, baselines, error bars, or comparison to medicinal-chemist performance, so the load-bearing claim of human-level optimization cannot be evaluated from the supplied evidence.

Authors: We acknowledge that the abstract's applications statement is too high-level to allow evaluation of the human-level claim. We will revise this paragraph to include the key quantitative metrics, baselines, and error bars from the experiments reported in the full manuscript. Where direct comparisons to medicinal-chemist performance exist in our results, they will be noted; otherwise the claim language will be adjusted to match the available evidence.
revision: yes

Circularity Check

0 steps flagged

No circularity: standard VAE+GA pipeline with independent encoding and search steps

The paper describes a conventional two-stage pipeline: a discrete VAE learns a quantization code (mol-gene) representation from molecular data, after which a genetic algorithm operates on those codes for optimization. No equation or claim reduces a result to its own input by construction, no fitted parameter is relabeled as a prediction, and no load-bearing premise rests on self-citation. The derivation chain is self-contained against external molecular benchmarks and does not invoke uniqueness theorems or ansatzes from prior author work.
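The second stage of such a pipeline can be sketched as a small genetic algorithm over integer code sequences. Everything here is an assumption made for illustration: one-point crossover, per-position mutation, truncation selection with elitism, and a toy fitness that counts matches to a target sequence. The abstract does not specify the operators or the scoring that DGMM uses, and in a real pipeline the fitness would decode each sequence and score the resulting molecule.

```python
import random
from typing import Callable, List

Gene = List[int]  # a mol-gene is treated here as a list of codebook indices


def crossover(a: Gene, b: Gene, rng: random.Random) -> Gene:
    """One-point crossover; assumes both parents have the same length >= 2."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(g: Gene, codebook_size: int, rate: float, rng: random.Random) -> Gene:
    """Resample each position from the codebook with probability `rate`."""
    return [rng.randrange(codebook_size) if rng.random() < rate else c for c in g]


def evolve(population: List[Gene], fitness: Callable[[Gene], float],
           codebook_size: int, generations: int = 20,
           rate: float = 0.1, seed: int = 0) -> Gene:
    """Return the fittest gene after a fixed number of generations."""
    rng = random.Random(seed)
    pop = [list(g) for g in population]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: max(2, len(pop) // 2)]  # truncation selection
        children = [pop[0]]                     # elitism: keep the current best
        while len(children) < len(pop):
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), codebook_size, rate, rng))
        pop = children
    return max(pop, key=fitness)


# Toy fitness: number of positions that match a hypothetical target sequence.
target = [3, 1, 4, 1, 5]


def toy_fitness(g: Gene) -> float:
    return float(sum(x == y for x, y in zip(g, target)))


initial = [[0, 0, 0, 0, 0], [3, 0, 0, 0, 0], [0, 1, 0, 0, 5], [1, 1, 1, 1, 1]]
best = evolve(initial, toy_fitness, codebook_size=8)
print(best, toy_fitness(best))
```

Because the best individual is carried over unchanged each generation, the best fitness never decreases; how far it improves depends on the seed and the mutation rate.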

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract alone provides no explicit details on free parameters, axioms, or invented entities; the mol-gene is introduced as a new representation, but without independent evidence or fitting details.

pith-pipeline@v0.9.0 · 5731 in / 1157 out tokens · 24571 ms · 2026-05-24T00:28:36.910446+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: A discrete variational autoencoder (D-VAE) is used in DGMM to encode molecules as quantization code, mol-gene, which incorporates deep learning into genetic algorithms for flexible structural optimization.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction
Relation between the paper passage and the cited Recognition theorem: unclear
Paper passage: The mol-gene allows for the discovery of pharmacologically similar but structurally distinct compounds

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · 2 internal anchors

[1] Jorgensen, W. L. Efficient drug lead discovery and optimization. Acc. Chem. Res. 42, 724-733 (2009)
[2] Eder, J., Sedrani, R. & Wiesmann, C. The discovery of first-in-class drugs: origins and evolution. Nat. Rev. Drug Discov. 13, 577-587 (2014)
[3] Atanasov, A. G., Zotchev, S. B., Dirsch, V. M. & Supuran, C. T. Natural products in drug discovery: advances and opportunities. Nat. Rev. Drug Discov. 20, 200-216 (2021)
[4] Luttens, A. et al. Ultralarge virtual screening identifies SARS-CoV-2 main protease inhibitors with broad-spectrum activity against coronaviruses. J. Am. Chem. Soc. 144, 2905-2920 (2022)
[5] Sadybekov, A. V. & Katritch, V. Computational approaches streamlining drug discovery. Nature 616, 673-685 (2023)
[6] Sadybekov, A. A. et al. Synthon-based ligand discovery in virtual libraries of over 11 billion compounds. Nature 601, 452-459 (2022)
[7] Spiegel, J. O. & Durrant, J. D. AutoGrow4: an open-source genetic algorithm for de novo drug design and lead optimization. J. Cheminformatics 12, 1-16 (2020)
[8] Tan, Y. et al. Drlinker: Deep reinforcement learning for optimization in fragment linking design. J. Chem. Inf. Model. 62, 5907-5917 (2022)
[9] Zhavoronkov, A. et al. Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat. Biotechnol. 37, 1038-1040 (2019)
[10] Loeffler, H. H. et al. Reinvent 4: Modern AI–driven generative molecule design. J. Cheminformatics 16, 20 (2024)
[11] Zhou, Z., Kearnes, S., Li, L., Zare, R. N. & Riley, P. Optimization of molecules via deep reinforcement learning. Sci. Rep. 9, 10752 (2019)
[12] Schneuing, A. et al. Structure-based drug design with equivariant diffusion models. Preprint at https://arxiv.org/abs/2210.13695 (2022)
[13] Huang, L. et al. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat. Commun. 15, 2657 (2024)
[14] Reader, J. C. et al. Structure-guided evolution of potent and selective CHK1 inhibitors through scaffold morphing. J. Med. Chem. 54, 8328-8342 (2011)
[15] Zhang, C. et al. Potent noncovalent inhibitors of the main protease of SARS-CoV-2 from molecular sculpting of the drug perampanel guided by free energy perturbation calculations. ACS Cent. Sci. 7, 467-475 (2021)
[16] Ho, T. C., Chan, A. H. & Ganesan, A. Thirty years of HDAC inhibitors: 2020 insight and hindsight. J. Med. Chem. 63, 12460-12484 (2020)
[17] Lamanna, G. et al. GENERA: a combined genetic/deep-learning algorithm for multiobjective target-oriented de novo design. J. Chem. Inf. Model. 63, 5107-5119 (2023)
[18] Chen, Z., Min, M. R., Parthasarathy, S. & Ning, X. A deep generative model for molecule optimization via one fragment modification. Nat. Mach. Intell. 3, 1040-1049 (2021)
[19] Kingma, D. P. & Welling, M. Auto-encoding variational bayes. Preprint at https://arxiv.org/abs/1312.6114 (2013)
[20] Gómez-Bombarelli, R. et al. Automatic chemical design using a data-driven continuous representation of molecules. ACS Cent. Sci. 4, 268-276 (2018)
[21] Feng, H., Wang, R., Zhan, C. & Wei, G. Multiobjective Molecular Optimization for Opioid Use Disorder Treatment Using Generative Network Complex. J. Med. Chem. 66, 12479-12498 (2023)
[22] Lam, H. Y. I. et al. Application of variational graph encoders as an effective generalist algorithm in computer-aided drug design. Nat. Mach. Intell. 5, 754-764 (2023)
[23] Reiser, P. et al. Graph neural networks for materials science and chemistry. Commun. Mater. 3, 93 (2022)
[24] Heid, E. et al. Chemprop: A machine learning package for chemical property prediction. J. Chem. Inf. Model. 64, 9-17 (2023)
[25] Krenn, M., Häse, F., Nigam, A., Friederich, P. & Aspuru-Guzik, A. Self-referencing embedded strings (SELFIES): A 100% robust molecular string representation. Mach. Learn.: Sci. Technol. 1, 045024 (2020)
[26] Karras, T., Laine, S. & Aila, T. A style-based generator architecture for generative adversarial networks. In Proc. IEEE/CVF conference on computer vision and pattern recognition (IEEE, 2019)
[27] Hinton, G., Vinyals, O. & Dean, J. Distilling the knowledge in a neural network. Preprint at https://arxiv.org/abs/1503.02531 (2015)
[28] Bickerton, G. R., Paolini, G. V., Besnard, J., Muresan, S. & Hopkins, A. L. Quantifying the chemical beauty of drugs. Nat. Chem. 4, 90-98 (2012)
[29] Ertl, P. & Schuffenhauer, A. Estimation of synthetic accessibility score of drug-like molecules based on molecular complexity and fragment contributions. J. Cheminformatics 1, 1-11 (2009)
[30] Lu, C. et al. OPLS4: Improving force field accuracy on challenging regimes of chemical space. J. Chem. Theory Comput. 17, 4291-4300 (2021)
[31] Yang, Y. et al. Efficient exploration of chemical space with docking and deep learning. J. Chem. Theory Comput. 17, 7106-7119 (2021)
[32] Sterling, T. & Irwin, J. J. ZINC 15–ligand discovery for everyone. J. Chem. Inf. Model. 55, 2324-2337 (2015)
[33] Gilson, M. K., Liu, T., Baitaluk, M., Nicola, G., Hwang, L. & Chong, J. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Res. 44, D1045-D1053 (2016)
[34] Matthews, T. P. et al. Identification of inhibitors of checkpoint kinase 1 through template screening. J. Med. Chem. 52, 4810-4819 (2009)
[35] Bowers, K. J. et al. Scalable algorithms for molecular dynamics simulations on commodity clusters. In Proceedings of the 2006 ACM/IEEE Conference on Supercomputing (ACM, 2006)
[36] Farid, R., Day, T., Friesner, R. A. & Pearlstein, R. A. New insights about HERG blockade obtained from protein modeling, potential energy mapping, and docking studies. Bioorg. Med. Chem. 14, 3160-3173 (2006)
[37] Mukherjee, A., Zamani, F. & Suzuki, T. Evolution of slow-binding inhibitors targeting histone deacetylase isoforms. J. Med. Chem. 66, 11672-11700 (2023)
[38] Lipinski, C. Poor aqueous solubility—an industry wide problem in drug discovery. Am. Pharm. Rev. 5, 82-85 (2002)
[39] Rossier, B., Jordan, O., Allémann, E. & Rodriguez-Nogales, C. Nanocrystals and nanosuspensions: an exploration from classic formulations to advanced drug delivery systems. Drug Deliv. Transl. Res. (2024)
[40] Xiong, L. et al. Discovery of a Potent and Cell-Active Inhibitor of DNA 6mA Demethylase ALKBH1. J. Am. Chem. Soc. (2024)
[41] Trott, O. & Olson, A. J. AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multi-threading. J. Comput. Chem. 31, 455-461 (2010)
[42] Madhavi-Sastry, G., Adzhigirey, M., Day, T., Annabhimoju, R. & Sherman, W. Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments. J. Comput. Aided Mol. Des. 27, 221-234 (2013)
