Target-Aware Bandit Allocation for Scalable Surrogate Optimization in Chemical Space
Pith reviewed 2026-06-26 05:26 UTC · model grok-4.3
The pith
Bandit allocation of surrogate inference across chemical partitions enables optimization over libraries too large for full evaluation.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By modeling partitions of the chemical action space as arms in a multi-armed bandit and applying optimism-under-uncertainty selection, BOBa adaptively allocates surrogate inference and downstream evaluations only to empirically promising partitions, removing the need to perform full-library inference while still identifying high-utility candidates from synthesis-on-demand libraries.
What carries the argument
The BOBa framework, which partitions the action space and treats each partition as a bandit arm whose reward signal guides allocation of surrogate computations.
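The partition-as-arm idea can be sketched in a few lines. The example below is a minimal illustration, not BOBa's implementation: it assumes UCB1 as the optimism-under-uncertainty policy, uses the mean surrogate score of a sampled batch as the arm reward, and takes a hypothetical `score_fn` standing in for the surrogate. The names `ucb1_allocate`, `ucb_index`, `batch_size`, and `c` are introduced here for illustration only.

```python
import math
import random


def ucb_index(mean, pulls, t, c):
    """UCB1-style optimistic index: empirical mean plus exploration bonus."""
    if pulls == 0:
        return math.inf  # force one pull of every arm before comparing means
    return mean + math.sqrt(c * math.log(t) / pulls)


def ucb1_allocate(partitions, score_fn, rounds, batch_size=8, c=2.0, seed=0):
    """Allocate surrogate inference across partitions (arms) with UCB1.

    partitions: list of lists of candidates (hypothetical stand-ins for molecules).
    score_fn:   hypothetical surrogate mapping a candidate to a utility in [0, 1].
    Returns (pull counts per partition, list of (score, candidate) pairs scored).
    """
    rng = random.Random(seed)
    pools = [list(p) for p in partitions]
    for pool in pools:
        rng.shuffle(pool)  # sample without replacement inside each partition

    counts = [0] * len(pools)
    reward_sums = [0.0] * len(pools)
    scored = []

    for t in range(1, rounds + 1):
        live = [i for i, pool in enumerate(pools) if pool]
        if not live:
            break  # every partition is exhausted
        arm = max(
            live,
            key=lambda i: ucb_index(
                reward_sums[i] / counts[i] if counts[i] else 0.0, counts[i], t, c
            ),
        )
        take = min(batch_size, len(pools[arm]))
        batch = [pools[arm].pop() for _ in range(take)]
        scores = [score_fn(x) for x in batch]

        # Assumption: the arm reward is the mean surrogate score of the batch.
        counts[arm] += 1
        reward_sums[arm] += sum(scores) / len(scores)
        scored.extend(zip(scores, batch))

    return counts, scored
```

Only candidates drawn from selected partitions are ever scored, so the inference cost in this sketch scales with `rounds * batch_size` rather than with the size of the library.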
If this is right
- Surrogate inference cost drops while screening performance remains competitive with full-library methods.
- Virtual screening becomes practical for current synthesis-on-demand libraries of billions to trillions of compounds.
- A tunable tradeoff appears between the fraction of partitions explored and the quality of the final candidate set.
- Optimism-under-uncertainty bandit policies are required for effective concentration of computation on high-utility regions.
Where Pith is reading between the lines
- The same partition-and-bandit pattern could be tested on other large discrete spaces such as protein design or materials libraries.
- The quality of the initial partitioning step directly limits how quickly the bandit can locate good regions.
- Combining the allocation layer with more expressive surrogates or active-learning loops could further reduce the number of expensive evaluations needed.
Load-bearing premise
The chemical action space admits a partitioning into arms such that partial observations from the bandit process can reliably identify and concentrate on high-utility regions without systematic bias from the partition boundaries.
What would settle it
A controlled experiment on a library whose highest-utility compounds lie entirely inside one partition, in which the bandit consistently under-allocates inference to that partition after initial observations and therefore ends with lower final performance than uniform allocation across all partitions.
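A toy version of such an experiment can be sketched as follows. Everything here is an assumption made for illustration: the synthetic library built by the hypothetical `make_library`, the choice of UCB1 with batch-mean rewards as the bandit, and round-robin as the uniform baseline. It is not the paper's protocol, only a way to see how a low-mean partition that hides the best compounds can be starved of inference.

```python
import math
import random


def make_library(n_parts=4, size=1000, n_needles=10):
    """Synthetic utilities: partition 0 hides every top compound (1.0) among
    low scorers (0.1); all other partitions are uniformly mediocre (0.5)."""
    library = [[0.5] * size for _ in range(n_parts)]
    library[0] = [1.0] * n_needles + [0.1] * (size - n_needles)
    return library


def run(policy, library, rounds=100, batch_size=4, seed=0):
    """Run one allocation policy; return (pulls per partition, top compounds found)."""
    rng = random.Random(seed)
    pools = [list(p) for p in library]
    for pool in pools:
        rng.shuffle(pool)

    k = len(pools)
    counts, sums, hits = [0] * k, [0.0] * k, 0
    for t in range(1, rounds + 1):
        if policy == "uniform":
            arm = (t - 1) % k  # round-robin baseline
        else:  # "ucb1": empirical mean plus exploration bonus
            arm = max(
                range(k),
                key=lambda i: math.inf
                if counts[i] == 0
                else sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i]),
            )
        batch = [pools[arm].pop() for _ in range(batch_size)]
        counts[arm] += 1
        sums[arm] += sum(batch) / batch_size  # batch-mean reward (assumption)
        hits += sum(1 for s in batch if s == 1.0)
    return counts, hits


if __name__ == "__main__":
    lib = make_library()
    for policy in ("uniform", "ucb1"):
        counts, hits = run(policy, lib)
        print(policy, "pulls per partition:", counts, "top compounds found:", hits)
```

In this construction the mean-based reward steers the bandit away from partition 0, which is the failure mode the test is meant to expose. A reward based on the batch maximum or an upper quantile would behave differently, which is one reason the choice of reward signal matters as much as the partitioning.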
Original abstract
Identifying high-utility candidates from massive discrete spaces under expensive evaluations is a recurring challenge across the sciences, with structure-based drug discovery as a prominent example. While surrogate-based optimization can increase sample efficiency by reducing the number of expensive evaluations, modern molecular libraries have reached billions to trillions of compounds, making full-library surrogate inference itself a major computational bottleneck. We introduce BOBa, a bandit-guided surrogate optimization framework that eliminates full-library inference by adaptively allocating computation across partitions of the action space. By treating partitions as arms in a multi-armed bandit, BOBa concentrates inference and evaluations on empirically promising partitions while maintaining principled exploration. Experiments on real-world synthesis-on-demand libraries demonstrate that optimism-under-uncertainty bandits, combined with meaningful action space partitioning, are essential for effective allocation of inference and evaluations. Our findings reveal a tunable tradeoff between screening performance and surrogate inference cost, which supports practical optimization over current libraries, and establishes a viable route to ultra-large library virtual screening.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BOBa, a bandit-guided surrogate optimization framework for identifying high-utility candidates in massive discrete chemical spaces. It treats partitions of the action space as arms in a multi-armed bandit to adaptively allocate surrogate inference and evaluations, avoiding full-library computation. The central claim is that optimism-under-uncertainty bandits combined with meaningful action space partitioning are essential for effective allocation, supported by experiments on real-world synthesis-on-demand libraries that also reveal a tunable tradeoff between screening performance and inference cost.
Significance. If the results hold, the approach could enable practical virtual screening over billion- to trillion-scale libraries by reducing the computational cost of surrogate inference, addressing a key scalability bottleneck in structure-based drug discovery. The explicit tradeoff analysis is a practical contribution.
Major comments (2)
- [Abstract] The claim that 'experiments on real-world synthesis-on-demand libraries demonstrate that optimism-under-uncertainty bandits, combined with meaningful action space partitioning, are essential' is presented without any quantitative results, error bars, baseline comparisons, or details on performance metrics, leaving the central empirical claim without verifiable support in the provided description.
- [Experiments (implied by abstract claim)] The load-bearing assumption that partitions admit sufficiently uniform utility within arms (so that partial observations can reliably concentrate on high-utility regions) is not tested against boundary bias. Any fixed partition (e.g., by scaffold or fingerprint clustering) can split chemically similar high-scoring molecules, potentially causing an arm with low observed mean to contain the global optimum; no ablation or sensitivity analysis on partition construction appears to address this.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback. We address each major comment below and indicate where revisions will be made to the manuscript.
- Referee: [Abstract] The claim that 'experiments on real-world synthesis-on-demand libraries demonstrate that optimism-under-uncertainty bandits, combined with meaningful action space partitioning, are essential' is presented without any quantitative results, error bars, baseline comparisons, or details on performance metrics, leaving the central empirical claim without verifiable support in the provided description.
  Authors: The abstract is a concise summary; the full quantitative results (performance metrics, baseline comparisons, error bars, and statistical details) appear in the Experiments section. To address the concern, we will revise the abstract to include a brief sentence highlighting key empirical outcomes from the synthesis-on-demand library experiments.
  Revision: yes
- Referee: [Experiments (implied by abstract claim)] The load-bearing assumption that partitions admit sufficiently uniform utility within arms (so that partial observations can reliably concentrate on high-utility regions) is not tested against boundary bias. Any fixed partition (e.g., by scaffold or fingerprint clustering) can split chemically similar high-scoring molecules, potentially causing an arm with low observed mean to contain the global optimum; no ablation or sensitivity analysis on partition construction appears to address this.
  Authors: We agree this is an important robustness consideration. The manuscript employs scaffold-based partitioning for chemical interpretability and shows effective allocation in experiments, but does not include an explicit ablation across partition types or direct sensitivity analysis to boundary effects. We will add this analysis in revision, comparing scaffold, fingerprint clustering, and alternative schemes to quantify impact on allocation quality.
  Revision: yes
Circularity Check
No significant circularity: the paper applies standard bandit algorithms to partitioned spaces without self-referential reductions.
The manuscript presents BOBa as a framework that treats chemical-space partitions as arms in a multi-armed bandit and allocates surrogate inference accordingly. The central claim rests on experimental demonstration that optimism-under-uncertainty bandits plus meaningful partitioning improve allocation efficiency. No equations, derivations, or self-citations are shown that reduce the reported performance gains to quantities defined by the method itself, to fitted parameters renamed as predictions, or to load-bearing self-citations. The work therefore applies existing bandit algorithms to a new domain without the circular patterns enumerated in the analysis criteria.