JEDEL: Zero-Shot DNA-Encoded Library Design for Early-Stage Drug Discovery
Pith reviewed 2026-06-26 09:15 UTC · model grok-4.3
The pith
JEDEL maps 3D pharmacophore patterns of ligands to scalable, synthesis-ready DNA-encoded library instructions in zero-shot fashion.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
JEDEL is the first model to map pharmacophore interaction patterns to actionable, scalable synthesis instructions, enabling the design of targeted libraries comprising potentially millions of molecules. Unlike existing generative approaches that produce virtual compounds requiring downstream synthesis planning, JEDEL operates within the space of purchasable building blocks and validated reactions, ensuring that every output is experimentally realizable by construction. JEDEL learns a predictive alignment between pharmacophore geometry and molecular structure and decodes this into combinatorial synthesis routes at scale. Across 18 protein targets, it generates focused libraries that outperform random and diversity-based baselines in predicted binding affinity, pharmacophore recovery, and sample efficiency, without target-specific retraining.
What carries the argument
The predictive alignment between pharmacophore geometry and molecular structure, decoded into combinatorial synthesis routes using only purchasable building blocks and validated reactions.
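To make "combinatorial synthesis routes" concrete, here is a minimal sketch that enumerates a toy three-cycle library as the Cartesian product of per-cycle building-block lists. The catalog contents, the cycle names, and the helper functions are hypothetical placeholders and do not reflect JEDEL's actual representation or decoder. The sketch only illustrates two points from the claim: library size is the product of per-cycle choices, and restricting every choice to a fixed catalog is what makes each route realizable by construction.

```python
from itertools import product
from math import prod

# Hypothetical catalog: one list of purchasable building blocks per
# synthesis cycle. All names are placeholders, not real reagents.
CATALOG = {
    "cycle_1": ["A1", "A2", "A3"],
    "cycle_2": ["B1", "B2"],
    "cycle_3": ["C1", "C2", "C3", "C4"],
}


def library_size(selection):
    """Number of library members: the product of per-cycle block counts."""
    return prod(len(blocks) for blocks in selection.values())


def enumerate_routes(selection):
    """Yield one tuple of building blocks (one per cycle) per library member."""
    yield from product(*selection.values())


def is_realizable(route, catalog):
    """True if the block chosen at cycle i is listed for cycle i in the catalog."""
    return all(block in allowed for block, allowed in zip(route, catalog.values()))


# A focused library is modeled here as a per-cycle subset of the catalog.
focused = {"cycle_1": ["A1", "A3"], "cycle_2": ["B2"], "cycle_3": ["C1", "C4"]}
routes = list(enumerate_routes(focused))

print(library_size(CATALOG), len(routes))  # full catalog vs. focused subset
print(all(is_realizable(r, CATALOG) for r in routes))
```

The same product explains the "millions of molecules" scale: three cycles with 200 blocks each would give 200 × 200 × 200 = 8,000,000 members.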
If this is right
- Focused libraries generated by JEDEL outperform random and diversity-based baselines in predicted binding affinity across 18 protein targets.
- The approach achieves higher pharmacophore recovery and sample efficiency without target-specific retraining.
- Every generated molecule is experimentally realizable by construction since it uses validated reactions and purchasable blocks.
- JEDEL shifts the paradigm from generating virtual compounds to designing experimentally deployable libraries at the scale of millions of molecules.
Where Pith is reading between the lines
- The same alignment approach might extend to other library formats that rely on combinatorial synthesis from standard reagents.
- Adding filters for properties such as cell permeability during the decoding step could produce libraries better suited for downstream biological tests.
- Running the model on pharmacophores from multiple ligands for the same target could further improve library focus beyond single-ligand inputs.
Load-bearing premise
The learned predictive alignment between pharmacophore geometry and molecular structure generalizes across protein targets in a zero-shot manner, and the predicted binding affinities used for evaluation accurately reflect real-world performance without requiring experimental synthesis or binding validation.
What would settle it
Synthesize a few hundred molecules from one JEDEL-designed library and one baseline library for the same protein target, then measure their experimental binding affinities to test whether the JEDEL set shows measurably stronger binding on average.
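One way to analyze such an experiment is a one-sided permutation test on the difference in mean affinity between the two sets. The sketch below assumes affinities are expressed on a scale where higher means stronger binding (for example pKd); the function name, the default permutation count, and the choice of a permutation test are assumptions made for illustration, not a protocol taken from the paper.

```python
import random
from statistics import mean


def permutation_test(sample_a, sample_b, n_permutations=10_000, seed=0):
    """One-sided permutation test for mean(sample_a) > mean(sample_b).

    Returns (observed_difference, p_value). Inputs are affinities on a
    scale where larger values mean stronger binding (assumption).
    """
    rng = random.Random(seed)
    observed = mean(sample_a) - mean(sample_b)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)

    hits = 0
    for _ in range(n_permutations):
        # Reassign labels at random and recompute the mean difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (hits + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value would indicate that the JEDEL set binds more strongly on average than label shuffling alone would explain. With a few hundred measured compounds per library, the test needs no distributional assumption about the affinities.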
Original abstract
We present JEDEL, a framework for generating synthesis-ready DNA-encoded libraries (DELs) directly from three-dimensional pharmacophore representations of active ligands. JEDEL is the first model to map pharmacophore interaction patterns to actionable, scalable synthesis instructions, enabling the design of targeted libraries comprising potentially millions of molecules. Unlike existing generative approaches that produce virtual compounds requiring downstream synthesis planning, JEDEL operates within the space of purchasable building blocks and validated reactions, ensuring that every output is experimentally realizable by construction. JEDEL learns a predictive alignment between pharmacophore geometry and molecular structure and decodes this into combinatorial synthesis routes at scale. Across 18 protein targets, it generates focused libraries that outperform random and diversity-based baselines in predicted binding affinity, pharmacophore recovery, and sample efficiency, without target-specific retraining. JEDEL enables a shift from virtual molecule generation to experimentally deployable library design.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents JEDEL, a framework for zero-shot generation of synthesis-ready DNA-encoded libraries (DELs) directly from 3D pharmacophore representations of active ligands. It claims to be the first to map pharmacophore interaction patterns to actionable combinatorial synthesis routes using purchasable building blocks and validated reactions, enabling targeted libraries of potentially millions of molecules. Across 18 protein targets, JEDEL-generated libraries are reported to outperform random and diversity-based baselines in predicted binding affinity, pharmacophore recovery, and sample efficiency, without target-specific retraining.
Significance. If the central claims hold with proper validation, this would be a notable contribution to early-stage drug discovery by shifting from virtual compound generation to directly deployable, scalable DEL design. The zero-shot aspect and focus on experimentally realizable outputs address a practical gap in the field. However, the current evidence base relies entirely on unvalidated in silico proxies, limiting immediate significance.
major comments (3)
- [Abstract / Evaluation] Abstract and evaluation sections: The reported outperformance on 'predicted binding affinity' across 18 targets supplies no methodological details on the affinity predictor (e.g., its architecture, training data, cross-validation against experimental Kd/IC50 values, or error bars), making it impossible to assess whether the metrics support the zero-shot generalization claim or reduce to self-referential evaluations.
- [Results / Methods] Results and methods: No experimental synthesis or binding assay validation is described for any JEDEL-generated libraries; all headline metrics (affinity, recovery, efficiency) rest on in silico predictions whose correlation with real-world performance is unknown, which is load-bearing for the 'actionable' and 'experimentally realizable by construction' framing.
- [Model description] § on model architecture (inferred from abstract): The claim that JEDEL 'learns a predictive alignment between pharmacophore geometry and molecular structure' lacks any equations, derivation steps, or ablation studies showing that performance does not arise from fitted parameters or baseline definitions.
minor comments (1)
- [Methods] Dataset descriptions for the 18 protein targets and baseline implementations are not provided in sufficient detail for reproducibility.
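The first major comment asks for evidence that the affinity predictor agrees with experimental Kd/IC50 values. A minimal sketch of such a check is given here, assuming paired predicted and measured affinities on a common scale (for example pKd). The function names are illustrative, and Pearson correlation plus RMSE are only one reasonable pair of summary metrics, not the paper's reported evaluation.

```python
from math import sqrt


def pearson_r(predicted, measured):
    """Pearson correlation between predicted and measured affinities."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_p * var_m)


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured affinities."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two equal-length, non-empty sequences")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return sqrt(squared / len(predicted))
```

Reporting both numbers per target, on compounds held out from the predictor's training data, would let a reader judge whether the headline "predicted binding affinity" gains are likely to survive experimental testing.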
Simulated Author's Rebuttal
We thank the referee for their constructive comments. We address each major comment point by point below, indicating planned revisions where the manuscript will be updated.
- Referee: [Abstract / Evaluation] Abstract and evaluation sections: The reported outperformance on 'predicted binding affinity' across 18 targets supplies no methodological details on the affinity predictor (e.g., its architecture, training data, cross-validation against experimental Kd/IC50 values, or error bars), making it impossible to assess whether the metrics support the zero-shot generalization claim or reduce to self-referential evaluations.
  Authors: We agree that additional details are required. The revised Methods section will include a full description of the affinity predictor architecture, training data sources, cross-validation procedure against experimental Kd/IC50 values, and reporting of error bars to enable proper evaluation of the metrics.
  Revision: yes
- Referee: [Results / Methods] Results and methods: No experimental synthesis or binding assay validation is described for any JEDEL-generated libraries; all headline metrics (affinity, recovery, efficiency) rest on in silico predictions whose correlation with real-world performance is unknown, which is load-bearing for the 'actionable' and 'experimentally realizable by construction' framing.
  Authors: This comment correctly identifies that the work is computational. We will revise the Discussion to clarify that 'experimentally realizable by construction' refers specifically to the use of purchasable building blocks and validated reactions, while noting that wet-lab synthesis and binding assays are planned as future work and outside the scope of the current manuscript. The in silico results are presented as such.
  Revision: partial
- Referee: [Model description] § on model architecture (inferred from abstract): The claim that JEDEL 'learns a predictive alignment between pharmacophore geometry and molecular structure' lacks any equations, derivation steps, or ablation studies showing that performance does not arise from fitted parameters or baseline definitions.
  Authors: We will expand the model architecture section in the revised manuscript to provide the equations describing the predictive alignment, derivation steps, and ablation studies that demonstrate performance contributions beyond fitted parameters or baseline definitions.
  Revision: yes
- Not resolved by the planned revisions: experimental synthesis and binding assay validation of JEDEL-generated libraries, as no such wet-lab experiments were conducted in this computational study.
Circularity Check
No circularity: no equations, derivations, or self-referential predictions visible
The provided abstract and text describe a generative framework for DEL design but contain no mathematical derivations, equations, fitted parameters, or self-citations that reduce any claimed result to its inputs by construction. Performance claims reference 'predicted binding affinity' and baselines, yet no load-bearing step is shown where a prediction is statistically forced by the fitting process itself or where uniqueness is imported via author self-citation. Since no such internal reduction is exhibited, the check finds no circularity in the material provided.