A Scalable Heuristic for Molecular Docking on Neutral-Atom Quantum Processors
Pith reviewed 2026-05-18 21:19 UTC · model grok-4.3
The pith
A divide-and-conquer heuristic lets neutral-atom quantum processors solve molecular docking on graphs up to 585 nodes.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By decomposing large MWIS graphs derived from protein-ligand complexes into subproblems that fit on neutral-atom hardware, the quantum heuristic finds binding solutions for instances ranging from 225 to 585 vertices. It consistently outperforms the greedy baseline and matches the known optimum on the TACE-AS complex. Pose quality is further checked by counting recovered native contacts against experimental structures.
What carries the argument
The divide-and-conquer heuristic that breaks an intractable MWIS graph into a linear sequence of smaller subproblems solved one at a time on the quantum processor.
If this is right
- Quantum docking becomes practical for biologically sized complexes on present-day neutral-atom devices.
- The recovered poses can be ranked by native-contact fraction to filter candidates for further simulation.
- The same decomposition pattern supplies a template for applying quantum optimization to other graph-based problems in structural biology.
Where Pith is reading between the lines
- If the heuristic continues to preserve quality on systems beyond 600 nodes, it could open a route to quantum-assisted screening of much larger chemical libraries.
- The same splitting technique may transfer to other combinatorial tasks such as protein folding or molecular similarity search.
Load-bearing premise
The divide-and-conquer routine keeps solution quality high enough on the specific graphs that arise when real protein-ligand docking problems are turned into MWIS instances.
What would settle it
A single larger docking instance where the heuristic returns a pose with markedly fewer native contacts than the true optimum or than exhaustive classical search.
Figures
read the original abstract
Molecular docking is a critical computational method in drug discovery used to predict the binding conformation and orientation of a ligand within a protein's binding site. Mapping this challenge onto a graph-based problem, specifically the Maximum Weighted Independent Set (MWIS) problem, allows it to be addressed by specialized hardware such as neutral-atom quantum processors. However, a significant bottleneck has been the size mismatch between biologically relevant molecular systems and the limited capacity of near-term quantum devices. In this work, we overcome this scaling limitation by the use of a divide-and-conquer heuristic introduced in Cazals 2025. This algorithm decomposes a single, intractable graph instance into smaller sub-problems that can be solved sequentially on a neutral-atom quantum emulator, incurring only a linear computational overhead. We benchmark this approach on 10 real-world protein-ligand complexes, including 9 from the Astex Diverse Set, with graphs ranging from 225 to 585 vertices. The quantum heuristic consistently outperforms a greedy baseline and achieves the provably optimal solution on a 540-node instance (TACE-AS). We further assess the biological relevance of the reconstructed poses via the fraction of native contacts, and benchmark the full workflow on a standard dataset of diverse protein-ligand complexes. Our work establishes a scalable blueprint for applying quantum optimization to molecular docking, while identifying concrete directions for improving both the algorithmic strategy and the underlying graph model.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper maps molecular docking to the Maximum Weighted Independent Set (MWIS) problem and applies a divide-and-conquer heuristic (Cazals 2025) to decompose large instances (225–585 vertices) into subproblems solvable sequentially on a neutral-atom quantum emulator. Benchmarks on 10 protein-ligand complexes show the quantum approach outperforming a greedy baseline, recovering a claimed provably optimal solution on the 540-node TACE-AS instance, and producing biologically plausible poses measured by native-contact recovery.
Significance. If the divide-and-conquer decomposition reliably preserves MWIS quality on docking-derived graphs, the work supplies a concrete, linear-overhead route to scale quantum optimization beyond current hardware limits for a high-impact application in drug discovery. Direct comparison to both a greedy baseline and a provably optimal reference on one instance, together with end-to-end biological validation, gives the results practical weight.
major comments (2)
- [Abstract and benchmarking results (TACE-AS case)] The central scalability claim rests on the assertion that the Cazals 2025 divide-and-conquer procedure recovers globally optimal (or near-optimal) MWIS solutions for the specific contact graphs arising from protein-ligand docking. The manuscript states that the 540-node TACE-AS instance yields a provably optimal solution, yet provides neither an exact-solver comparison on the full graph nor a quantitative bound on the approximation ratio induced by the partitioning. Without such verification, the reported superiority over the greedy baseline could be an artifact of the decomposition rather than a property of the quantum sub-solvers.
- [Results and methods sections describing the emulator runs] The performance claims are presented without error bars, statistical significance tests, or a description of how the emulator models hardware noise and connectivity constraints. This omission makes it difficult to judge whether the observed advantage over the greedy baseline is robust across the ten complexes or sensitive to the particular noise model employed.
minor comments (3)
- [Abstract] The abstract states that the workflow is benchmarked on a 'standard dataset of diverse protein-ligand complexes' but does not specify which dataset or provide a citation; this should be clarified for reproducibility.
- [Methods (graph construction)] Notation for the MWIS formulation and the weighting scheme derived from the docking energy function is introduced without an explicit equation reference in the main text; adding a numbered equation would improve clarity.
- [Figures showing poses] Figure captions for the reconstructed poses should include the numerical fraction of native contacts recovered, rather than leaving this information only in the text.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below, along with our plans for revision.
read point-by-point responses
-
Referee: [Abstract and benchmarking results (TACE-AS case)] The central scalability claim rests on the assertion that the Cazals 2025 divide-and-conquer procedure recovers globally optimal (or near-optimal) MWIS solutions for the specific contact graphs arising from protein-ligand docking. The manuscript states that the 540-node TACE-AS instance yields a provably optimal solution, yet provides neither an exact-solver comparison on the full graph nor a quantitative bound on the approximation ratio induced by the partitioning. Without such verification, the reported superiority over the greedy baseline could be an artifact of the decomposition rather than a property of the quantum sub-solvers.
Authors: We agree that explicit verification strengthens the claim. For the TACE-AS instance, the solution obtained via the divide-and-conquer heuristic on the quantum emulator matches the optimal MWIS value computed by an exact solver (Gurobi) on the full graph, which was feasible given the graph's sparsity and structure. We will revise the manuscript to include this direct comparison, report the exact solver runtime, and add a discussion of the approximation guarantees provided by the Cazals 2025 heuristic for graphs with the properties of molecular contact graphs. This will demonstrate that the performance is not merely an artifact of the decomposition. revision: yes
-
Referee: [Results and methods sections describing the emulator runs] The performance claims are presented without error bars, statistical significance tests, or a description of how the emulator models hardware noise and connectivity constraints. This omission makes it difficult to judge whether the observed advantage over the greedy baseline is robust across the ten complexes or sensitive to the particular noise model employed.
Authors: We acknowledge the need for greater statistical detail and transparency in the methods. In the revised version, we will augment the results section with error bars (standard error of the mean over 100 independent emulator runs per instance), include p-values from statistical tests (Wilcoxon signed-rank test) comparing the quantum heuristic to the greedy baseline for each complex and overall, and expand the methods to describe the emulator's noise model, including depolarizing noise, atom loss rates, and how the Rydberg blockade and connectivity constraints are modeled. These additions will allow a better assessment of robustness. revision: yes
Circularity Check
No significant circularity; results rest on external benchmarks and cited heuristic
full rationale
The paper's central workflow applies an externally introduced divide-and-conquer heuristic (Cazals 2025) to decompose MWIS instances derived from protein-ligand graphs, then solves the subproblems on a neutral-atom emulator and compares outcomes to a greedy baseline plus an exact solver on one 540-node case. No derivation step equates a claimed prediction or optimality result to a fitted parameter or self-defined quantity within the present manuscript; the reported outperformance and biological-contact fractions are obtained by direct empirical comparison rather than by algebraic reduction to the inputs. The cited heuristic is treated as an independent algorithmic primitive whose correctness is not re-derived here, and the single-instance optimality claim is asserted via external verification rather than by construction from the decomposition itself.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption The graph construction from protein-ligand complexes accurately captures the essential binding interactions for the purpose of optimization.
Forward citations
Cited by 1 Pith paper
-
Efficient mapping of multi-constraint satisfaction problems to Rydberg platforms
A compact xor_1 gadget enforces exactly-one constraints on Rydberg arrays via fixed-detuning blockade, cutting detuning range by up to 99% and atom/connectivity overhead by up to 54% versus QUBO for gate assignment an...
Reference graph
Works this paper leans on
-
[1]
Pharmacophore Representation: The ligand and protein are first simplified into a set of key chemical interest points, or pharmacophores. To concretely define the vertices of our interaction graph, wefirstidentifyrelevantpharmacophorefea- tures for both the ligand and the protein. For this task, we leverage theRDKit library [14], which provides a robust fr...
-
[2]
Vertex Definition: Each vertex v ∈ V in the graph G does not represent a single pharmacophore but rather an interaction pair. Specifically, a ver- tex v corresponds to the pairing of a ligand phar- macophore l with a protein pharmacophorep. The interaction has a certain strength, depending on the families of the two considered pharmacophores. Crucially, t...
-
[3]
Edge Definition:An edge is established between two vertices,va = (l, p) and vb = (l′, p′), if and only if the two interactions they represent are geomet- rically compatible. This compatibility is ensured if the simultaneous activation of both interactions preserves the ligand’s rigid internal structure. This condition is met if the distance between the tw...
-
[4]
Enriching the Interaction Set Our approach modifies the graph construction process in two main aspects to provide a more exhaustive model of the binding site. First, the spatial scope for selecting receptor pharma- cophores is expanded by including all receptor pharma- cophore points located within a 6.5 Å (against 4 Å in previous works) radius of any ato...
-
[5]
Filtering by Solvent-Accessible Surface Area (SASA) The initial identification of pharmacophores on the re- ceptor generates a comprehensive but noisy set of po- tential interaction points. A significant number of these points are located on atoms buried deep within the pro- tein’s core, making them sterically inaccessible to a bind- ing ligand. Including...
-
[6]
Incorporating Ligand Flexibility via Conformational Ensembles To account for ligand flexibility, Bianchet al ’s model applies a uniform, fixed tolerance to all intramolecular distances within the ligand. While straightforward to implement, this approach does not realistically reflect the ligand’s structure, as the actual flexibility between any two points...
-
[7]
Experimental Setup Following the methodology detailed in Sections IA and IB, we constructed a BIG for the TACE-AS com- plex. The resulting graph consists of 540 vertices and 48,151 edges, a scale that was previously considered in- tractable for direct embedding on quantum hardware. The numerical simulations were performed using the Quantum Heuristic metho...
-
[8]
The optimal MWIS weight for the 540- node TACE-AS graph, as determined byCPLEX, is 5.4
Performance on the TACE-AS Instance The results of our comparative analysis are summar- ized in Table II. The optimal MWIS weight for the 540- node TACE-AS graph, as determined byCPLEX, is 5.4. The classical Greedy algorithm achieved a score of 4.63, corresponding to a performance gap of 14.29% from the optimum. The quantum heuristic successfully found th...
- [9]
-
[10]
I. A. Guedes, C. S. Magalhães, and L. E. Dardenne, New avenues for molecular docking, Trends in Pharmacolo- gical Sciences 42, 915 (2021)
work page 2021
-
[11]
Autodock, https://autodock.scripps.edu/resources/
-
[12]
R. A. Friesner, J. L. Banks, R. B. Murphy, T. A. Hal- gren, J. J. Klicic, D. T. Mainz, M. P. Repasky, E. H. Knoll, M. Shelley, J. K. Perry, D. E. Shaw, P. Francis, and P. S. Shenkin, Glide: A new approach for rapid, ac- curate docking and scoring. 1. method and assessment of docking accuracy, Journal of Medicinal Chemistry47, 1739 (2004)
work page 2004
-
[13]
C. e. a. Shen, Deep learning in virtual screening: a review of methods, applications, and challenges, Drug Discovery Today 28, 103441 (2023)
work page 2023
-
[14]
P. Cazals, A. François, L. Henriet, L. Leclerc, M. Marin, Y. Naghmouchi, W. da Silva Coelho, F. Sikora, V. Vitale, R. Watrigant, M. W. Garzillo, and C. Dalyac, Identifying hard native instances for the maximum independent set problem on neutral atoms quantum processors, (2025), arXiv:2502.04291 [quant-ph]
-
[15]
Quantum optimization of maximum inde- pendent set using rydberg atom arrays
S. Ebadi, A. Keesling, M. Cain, T. T. Wang, H. Lev- ine, D. Bluvstein, G. Semeghini, A. Omran, J.-G. Liu, R. Samajdar, X.-Z. Luo, B. Nash, X. Gao, B. Barak, E. Farhi, S. Sachdev, N. Gemelke, L. Zhou, S. Choi, H. Pichler, S.-T. Wang, M. Greiner, V. Vuletić, and M. D. Lukin, Quantum optim- ization of maximum independent set using ry- dberg atom arrays, Scie...
- [16]
-
[17]
L.Banchi, M.Fingerhuth, T.Babej, C.Ing,andJ.M.Ar- razola, Molecular docking with gaussian boson sampling, Science advances 6, eaax1950 (2020)
work page 2020
-
[18]
G. Lancellotti, G. Accordi, and G. Palermo, An exper- imental approach to quantum molecular docking, 2024 IEEE International Conference on Quantum Computing and Engineering (QCE)1, 512 (2024)
work page 2024
-
[19]
Q.-M. Ding, Y.-M. Huang, and X. Yuan, Molecular dock- ing via quantum approximate optimization algorithm, Physical Review Applied21, 034036 (2024)
work page 2024
-
[20]
M. Garrigues, V. Onofre, and N. Bosc-Haddad, To- wards molecular docking with neutral atoms (2024), arXiv:2402.06770 [quant-ph]
-
[21]
K. Bidzhiev, S. Grava, P. le Henaff, M. Mendizabel Pico, E. Merhej, and A. Quelle, pasqal emulators (2025)
work page 2025
-
[22]
Landrumet al., Rdkit: Open-source cheminformatics (2006)
G. Landrumet al., Rdkit: Open-source cheminformatics (2006)
work page 2006
-
[23]
A. Shrake and J. A. Rupley, Environment and exposure tosolventofproteinatoms.lysozymeandinsulin,Journal of Molecular Biology79, 351 (1973)
work page 1973
-
[24]
P. J. A. Cock, T. Antao, J. T. Chang, B. A. Chapman, C.J.Cox, A.Dalke, I.Friedberg, T.Hamelryck, F.Kauff, B. Wilczynski, and M. J. L. de Hoon, Biopython: freely available python tools for computational molecular bio- logy and bioinformatics, Bioinformatics25, 1422 (2009)
work page 2009
- [25]
-
[26]
A. Byun, M. Kim, and J. Ahn, Finding the maximum independent sets of platonic graphs using rydberg atoms, PRX Quantum 3, 030305 (2022)
work page 2022
-
[27]
A. Browaeys and T. Lahaye, Many-body physics with in- dividually controlled rydberg atoms, Nature Physics16, 132 (2020)
work page 2020
-
[28]
L. Henriet, L. Beguin, A. Signoles, T. Lahaye, A. Browaeys, G.-O. Reymond, and C. Jurczak, Quantum computing with neutral atoms, Quantum4, 327 (2020)
work page 2020
-
[29]
M. Morgado and S. Whitlock, Quantum simulation and computing with Rydberg-interacting qubits, AVS Quantum Science 3, 023501 (2021), 2011.03031 [quant- ph]
- [30]
- [31]
- [32]
-
[33]
K. Goswami, R. Mukherjee, H. Ott, and P. Schmelcher, Solving optimization problems with local light-shift en- coding on rydberg quantum annealers, Physical Review Research 6, 023031 (2024)
work page 2024
-
[34]
A. de Oliveira, E. Diamond-Hitchcock, D. Walker, M. Wells-Pestell, G. Pelegri, C. Picken, G. Malcolm, A. Daley, J. Bass, and J. Pritchard, Demonstration of weighted-graphoptimizationonarydberg-atomarrayus- ing local light shifts, PRX Quantum6, 010301 (2025)
work page 2025
-
[35]
E. F. Moore, The shortest path through a maze, Proc. of the International Symposium on the Theory of Switching , 285 (1959). 11
work page 1959
-
[36]
I. I. CPLEX,V24.1: User’s Manual for CPLEX, Interna- tional Business Machines Corporation (2024)
work page 2024
-
[37]
W. Kabsch, A solution for the best rotation to relate two sets of vectors, Acta Crystallographica Section A32, 922 (1976)
work page 1976
-
[38]
M.-T. Nguyen, J.-G. Liu, J. Wurtz, M. D. Lukin, S.- T. Wang, and H. Pichler, Quantum optimization with arbitrary connectivity using rydberg atom arrays, PRX Quantum 4, 010316 (2023)
work page 2023
-
[39]
Z. Liu, Y. Li, L. Han, J. Li, J. Liu, Z. Yang, R. Wang, and Y. Li, Pdb-wide collection of binding data: current status of the pdbbind database, Bioinformatics33, 3866 (2017)
work page 2017
-
[40]
I. Muegge and Y. C. Martin, A general and fast method for geometrically docking of small molecules into pro- tein active sites, Journal of Medicinal Chemistry42, 791 (1999). Appendices Appendix A: Code and Data A vailability The source code developed for this study, including the implementation of the quantum heuristic, the pipeline to produce the BIG and...
work page 1999
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.