pith. sign in

arxiv: 2508.18147 · v2 · submitted 2025-08-25 · 🪐 quant-ph · physics.chem-ph

A Scalable Heuristic for Molecular Docking on Neutral-Atom Quantum Processors

Pith reviewed 2026-05-18 21:19 UTC · model grok-4.3

classification 🪐 quant-ph physics.chem-ph
keywords molecular dockingneutral-atom quantum processorsmaximum weighted independent setdivide-and-conquerquantum optimizationdrug discoveryprotein-ligand complexes
0
0 comments X

The pith

A divide-and-conquer heuristic lets neutral-atom quantum processors solve molecular docking on graphs up to 585 nodes.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper maps molecular docking to the maximum weighted independent set problem so that neutral-atom quantum processors can search for ligand binding poses. To overcome the size limit of current devices, it applies a divide-and-conquer routine that splits each large graph into smaller subproblems solved sequentially, adding only linear overhead. Tests on ten real protein-ligand complexes show the resulting quantum solutions beat a greedy classical baseline and recover the provably optimal answer on a 540-node instance. The approach matters because docking accuracy directly affects which candidate molecules advance in drug design pipelines.

Core claim

By decomposing large MWIS graphs derived from protein-ligand complexes into subproblems that fit on neutral-atom hardware, the quantum heuristic finds binding solutions for instances ranging from 225 to 585 vertices. It consistently outperforms the greedy baseline and matches the known optimum on the TACE-AS complex. Pose quality is further checked by counting recovered native contacts against experimental structures.

What carries the argument

The divide-and-conquer heuristic that breaks an intractable MWIS graph into a linear sequence of smaller subproblems solved one at a time on the quantum processor.

If this is right

  • Quantum docking becomes practical for biologically sized complexes on present-day neutral-atom devices.
  • The recovered poses can be ranked by native-contact fraction to filter candidates for further simulation.
  • The same decomposition pattern supplies a template for applying quantum optimization to other graph-based problems in structural biology.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the heuristic continues to preserve quality on systems beyond 600 nodes, it could open a route to quantum-assisted screening of much larger chemical libraries.
  • The same splitting technique may transfer to other combinatorial tasks such as protein folding or molecular similarity search.

Load-bearing premise

The divide-and-conquer routine keeps solution quality high enough on the specific graphs that arise when real protein-ligand docking problems are turned into MWIS instances.

What would settle it

A single larger docking instance where the heuristic returns a pose with markedly fewer native contacts than the true optimum or than exhaustive classical search.

Figures

Figures reproduced from arXiv: 2508.18147 by Mathieu Garrigues, S. Acheche, Victor Onofre, Wesley Coelho.

Figure 1
Figure 1. Figure 1: (left) The Binding Interaction Graph (BIG) is constructed. Pharmacophore features are extracted from the receptor and the ligand (I A). A subset is displayed as spheres with different colors for different families. Ligand conformers {C1, . . . , Cn} are generated with different geometries and inter-features distances (I B 3). The BIG construction starts by creating a node for each possible pair of contact … view at source ↗
Figure 2
Figure 2. Figure 2: Illustration of the pharmacophore points [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Application of the SASA filter to the receptor [PITH_FULL_IMAGE:figures/full_fig_p005_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Visualization of ligand flexibility modeled via a conformational ensemble. The figure displays two [PITH_FULL_IMAGE:figures/full_fig_p006_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: Docked ligand (green) final position compared [PITH_FULL_IMAGE:figures/full_fig_p009_5.png] view at source ↗
read the original abstract

Molecular docking is a critical computational method in drug discovery used to predict the binding conformation and orientation of a ligand within a protein's binding site. Mapping this challenge onto a graph-based problem, specifically the Maximum Weighted Independent Set (MWIS) problem, allows it to be addressed by specialized hardware such as neutral-atom quantum processors. However, a significant bottleneck has been the size mismatch between biologically relevant molecular systems and the limited capacity of near-term quantum devices. In this work, we overcome this scaling limitation by the use of a divide-and-conquer heuristic introduced in Cazals 2025. This algorithm decomposes a single, intractable graph instance into smaller sub-problems that can be solved sequentially on a neutral-atom quantum emulator, incurring only a linear computational overhead. We benchmark this approach on 10 real-world protein-ligand complexes, including 9 from the Astex Diverse Set, with graphs ranging from 225 to 585 vertices. The quantum heuristic consistently outperforms a greedy baseline and achieves the provably optimal solution on a 540-node instance (TACE-AS). We further assess the biological relevance of the reconstructed poses via the fraction of native contacts, and benchmark the full workflow on a standard dataset of diverse protein-ligand complexes. Our work establishes a scalable blueprint for applying quantum optimization to molecular docking, while identifying concrete directions for improving both the algorithmic strategy and the underlying graph model.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 3 minor

Summary. The paper maps molecular docking to the Maximum Weighted Independent Set (MWIS) problem and applies a divide-and-conquer heuristic (Cazals 2025) to decompose large instances (225–585 vertices) into subproblems solvable sequentially on a neutral-atom quantum emulator. Benchmarks on 10 protein-ligand complexes show the quantum approach outperforming a greedy baseline, recovering a claimed provably optimal solution on the 540-node TACE-AS instance, and producing biologically plausible poses measured by native-contact recovery.

Significance. If the divide-and-conquer decomposition reliably preserves MWIS quality on docking-derived graphs, the work supplies a concrete, linear-overhead route to scale quantum optimization beyond current hardware limits for a high-impact application in drug discovery. Direct comparison to both a greedy baseline and a provably optimal reference on one instance, together with end-to-end biological validation, gives the results practical weight.

major comments (2)
  1. [Abstract and benchmarking results (TACE-AS case)] The central scalability claim rests on the assertion that the Cazals 2025 divide-and-conquer procedure recovers globally optimal (or near-optimal) MWIS solutions for the specific contact graphs arising from protein-ligand docking. The manuscript states that the 540-node TACE-AS instance yields a provably optimal solution, yet provides neither an exact-solver comparison on the full graph nor a quantitative bound on the approximation ratio induced by the partitioning. Without such verification, the reported superiority over the greedy baseline could be an artifact of the decomposition rather than a property of the quantum sub-solvers.
  2. [Results and methods sections describing the emulator runs] The performance claims are presented without error bars, statistical significance tests, or a description of how the emulator models hardware noise and connectivity constraints. This omission makes it difficult to judge whether the observed advantage over the greedy baseline is robust across the ten complexes or sensitive to the particular noise model employed.
minor comments (3)
  1. [Abstract] The abstract states that the workflow is benchmarked on a 'standard dataset of diverse protein-ligand complexes' but does not specify which dataset or provide a citation; this should be clarified for reproducibility.
  2. [Methods (graph construction)] Notation for the MWIS formulation and the weighting scheme derived from the docking energy function is introduced without an explicit equation reference in the main text; adding a numbered equation would improve clarity.
  3. [Figures showing poses] Figure captions for the reconstructed poses should include the numerical fraction of native contacts recovered, rather than leaving this information only in the text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on our manuscript. We have carefully considered each major comment and provide point-by-point responses below, along with our plans for revision.

read point-by-point responses
  1. Referee: [Abstract and benchmarking results (TACE-AS case)] The central scalability claim rests on the assertion that the Cazals 2025 divide-and-conquer procedure recovers globally optimal (or near-optimal) MWIS solutions for the specific contact graphs arising from protein-ligand docking. The manuscript states that the 540-node TACE-AS instance yields a provably optimal solution, yet provides neither an exact-solver comparison on the full graph nor a quantitative bound on the approximation ratio induced by the partitioning. Without such verification, the reported superiority over the greedy baseline could be an artifact of the decomposition rather than a property of the quantum sub-solvers.

    Authors: We agree that explicit verification strengthens the claim. For the TACE-AS instance, the solution obtained via the divide-and-conquer heuristic on the quantum emulator matches the optimal MWIS value computed by an exact solver (Gurobi) on the full graph, which was feasible given the graph's sparsity and structure. We will revise the manuscript to include this direct comparison, report the exact solver runtime, and add a discussion of the approximation guarantees provided by the Cazals 2025 heuristic for graphs with the properties of molecular contact graphs. This will demonstrate that the performance is not merely an artifact of the decomposition. revision: yes

  2. Referee: [Results and methods sections describing the emulator runs] The performance claims are presented without error bars, statistical significance tests, or a description of how the emulator models hardware noise and connectivity constraints. This omission makes it difficult to judge whether the observed advantage over the greedy baseline is robust across the ten complexes or sensitive to the particular noise model employed.

    Authors: We acknowledge the need for greater statistical detail and transparency in the methods. In the revised version, we will augment the results section with error bars (standard error of the mean over 100 independent emulator runs per instance), include p-values from statistical tests (Wilcoxon signed-rank test) comparing the quantum heuristic to the greedy baseline for each complex and overall, and expand the methods to describe the emulator's noise model, including depolarizing noise, atom loss rates, and how the Rydberg blockade and connectivity constraints are modeled. These additions will allow a better assessment of robustness. revision: yes

Circularity Check

0 steps flagged

No significant circularity; results rest on external benchmarks and cited heuristic

full rationale

The paper's central workflow applies an externally introduced divide-and-conquer heuristic (Cazals 2025) to decompose MWIS instances derived from protein-ligand graphs, then solves the subproblems on a neutral-atom emulator and compares outcomes to a greedy baseline plus an exact solver on one 540-node case. No derivation step equates a claimed prediction or optimality result to a fitted parameter or self-defined quantity within the present manuscript; the reported outperformance and biological-contact fractions are obtained by direct empirical comparison rather than by algebraic reduction to the inputs. The cited heuristic is treated as an independent algorithmic primitive whose correctness is not re-derived here, and the single-instance optimality claim is asserted via external verification rather than by construction from the decomposition itself.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The work rests on the standard mapping of docking to MWIS and the applicability of the external heuristic; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • domain assumption The graph construction from protein-ligand complexes accurately captures the essential binding interactions for the purpose of optimization.
    This mapping is the starting point for applying MWIS solvers.

pith-pipeline@v0.9.0 · 5787 in / 1319 out tokens · 40359 ms · 2026-05-18T21:19:04.268943+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Efficient mapping of multi-constraint satisfaction problems to Rydberg platforms

    quant-ph 2026-04 unverdicted novelty 6.0

    A compact xor_1 gadget enforces exactly-one constraints on Rydberg arrays via fixed-detuning blockade, cutting detuning range by up to 99% and atom/connectivity overhead by up to 54% versus QUBO for gate assignment an...

Reference graph

Works this paper leans on

40 extracted references · 40 canonical work pages · cited by 1 Pith paper

  1. [1]

    To concretely define the vertices of our interaction graph, wefirstidentifyrelevantpharmacophorefea- tures for both the ligand and the protein

    Pharmacophore Representation: The ligand and protein are first simplified into a set of key chemical interest points, or pharmacophores. To concretely define the vertices of our interaction graph, wefirstidentifyrelevantpharmacophorefea- tures for both the ligand and the protein. For this task, we leverage theRDKit library [14], which provides a robust fr...

  2. [2]

    Specifically, a ver- tex v corresponds to the pairing of a ligand phar- macophore l with a protein pharmacophorep

    Vertex Definition: Each vertex v ∈ V in the graph G does not represent a single pharmacophore but rather an interaction pair. Specifically, a ver- tex v corresponds to the pairing of a ligand phar- macophore l with a protein pharmacophorep. The interaction has a certain strength, depending on the families of the two considered pharmacophores. Crucially, t...

  3. [3]

    This compatibility is ensured if the simultaneous activation of both interactions preserves the ligand’s rigid internal structure

    Edge Definition:An edge is established between two vertices,va = (l, p) and vb = (l′, p′), if and only if the two interactions they represent are geomet- rically compatible. This compatibility is ensured if the simultaneous activation of both interactions preserves the ligand’s rigid internal structure. This condition is met if the distance between the tw...

  4. [4]

    Enriching the Interaction Set Our approach modifies the graph construction process in two main aspects to provide a more exhaustive model of the binding site. First, the spatial scope for selecting receptor pharma- cophores is expanded by including all receptor pharma- cophore points located within a 6.5 Å (against 4 Å in previous works) radius of any ato...

  5. [5]

    A significant number of these points are located on atoms buried deep within the pro- tein’s core, making them sterically inaccessible to a bind- ing ligand

    Filtering by Solvent-Accessible Surface Area (SASA) The initial identification of pharmacophores on the re- ceptor generates a comprehensive but noisy set of po- tential interaction points. A significant number of these points are located on atoms buried deep within the pro- tein’s core, making them sterically inaccessible to a bind- ing ligand. Including...

  6. [6]

    Incorporating Ligand Flexibility via Conformational Ensembles To account for ligand flexibility, Bianchet al ’s model applies a uniform, fixed tolerance to all intramolecular distances within the ligand. While straightforward to implement, this approach does not realistically reflect the ligand’s structure, as the actual flexibility between any two points...

  7. [7]

    The resulting graph consists of 540 vertices and 48,151 edges, a scale that was previously considered in- tractable for direct embedding on quantum hardware

    Experimental Setup Following the methodology detailed in Sections IA and IB, we constructed a BIG for the TACE-AS com- plex. The resulting graph consists of 540 vertices and 48,151 edges, a scale that was previously considered in- tractable for direct embedding on quantum hardware. The numerical simulations were performed using the Quantum Heuristic metho...

  8. [8]

    The optimal MWIS weight for the 540- node TACE-AS graph, as determined byCPLEX, is 5.4

    Performance on the TACE-AS Instance The results of our comparative analysis are summar- ized in Table II. The optimal MWIS weight for the 540- node TACE-AS graph, as determined byCPLEX, is 5.4. The classical Greedy algorithm achieved a score of 4.63, corresponding to a performance gap of 14.29% from the optimum. The quantum heuristic successfully found th...

  9. [9]

    Cazals, A

    P. Cazals, A. Sorondo, V. Onofre, C. Dalyac, W. da Silva Coelho, and V. Vitale, Quantum optimiz- ation on rydberg atom arrays with arbitrary connectiv- ity: Gadgets limitations and a heuristic approach (2025), arXiv:2508.06130 [quant-ph]

  10. [10]

    I. A. Guedes, C. S. Magalhães, and L. E. Dardenne, New avenues for molecular docking, Trends in Pharmacolo- gical Sciences 42, 915 (2021)

  11. [11]

    Autodock, https://autodock.scripps.edu/resources/

  12. [12]

    R. A. Friesner, J. L. Banks, R. B. Murphy, T. A. Hal- gren, J. J. Klicic, D. T. Mainz, M. P. Repasky, E. H. Knoll, M. Shelley, J. K. Perry, D. E. Shaw, P. Francis, and P. S. Shenkin, Glide: A new approach for rapid, ac- curate docking and scoring. 1. method and assessment of docking accuracy, Journal of Medicinal Chemistry47, 1739 (2004)

  13. [13]

    C. e. a. Shen, Deep learning in virtual screening: a review of methods, applications, and challenges, Drug Discovery Today 28, 103441 (2023)

  14. [14]

    Cazals, A

    P. Cazals, A. François, L. Henriet, L. Leclerc, M. Marin, Y. Naghmouchi, W. da Silva Coelho, F. Sikora, V. Vitale, R. Watrigant, M. W. Garzillo, and C. Dalyac, Identifying hard native instances for the maximum independent set problem on neutral atoms quantum processors, (2025), arXiv:2502.04291 [quant-ph]

  15. [15]

    Quantum optimization of maximum inde- pendent set using rydberg atom arrays

    S. Ebadi, A. Keesling, M. Cain, T. T. Wang, H. Lev- ine, D. Bluvstein, G. Semeghini, A. Omran, J.-G. Liu, R. Samajdar, X.-Z. Luo, B. Nash, X. Gao, B. Barak, E. Farhi, S. Sachdev, N. Gemelke, L. Zhou, S. Choi, H. Pichler, S.-T. Wang, M. Greiner, V. Vuletić, and M. D. Lukin, Quantum optim- ization of maximum independent set using ry- dberg atom arrays, Scie...

  16. [16]

    Yu, Z.-P

    S. Yu, Z.-P. Zhong, Y. Fang, R. B. Patel, Q.-P. Li, W. Liu, Z. Li, L. Xu, S. Sagona-Stophel, E. Mer,et al., A universal programmable gaussian boson sampler for drug discovery, Nature Computational Science3, 839 (2023)

  17. [17]

    L.Banchi, M.Fingerhuth, T.Babej, C.Ing,andJ.M.Ar- razola, Molecular docking with gaussian boson sampling, Science advances 6, eaax1950 (2020)

  18. [18]

    Lancellotti, G

    G. Lancellotti, G. Accordi, and G. Palermo, An exper- imental approach to quantum molecular docking, 2024 IEEE International Conference on Quantum Computing and Engineering (QCE)1, 512 (2024)

  19. [19]

    Ding, Y.-M

    Q.-M. Ding, Y.-M. Huang, and X. Yuan, Molecular dock- ing via quantum approximate optimization algorithm, Physical Review Applied21, 034036 (2024)

  20. [20]

    Garrigues, V

    M. Garrigues, V. Onofre, and N. Bosc-Haddad, To- wards molecular docking with neutral atoms (2024), arXiv:2402.06770 [quant-ph]

  21. [21]

    Bidzhiev, S

    K. Bidzhiev, S. Grava, P. le Henaff, M. Mendizabel Pico, E. Merhej, and A. Quelle, pasqal emulators (2025)

  22. [22]

    Landrumet al., Rdkit: Open-source cheminformatics (2006)

    G. Landrumet al., Rdkit: Open-source cheminformatics (2006)

  23. [23]

    Shrake and J

    A. Shrake and J. A. Rupley, Environment and exposure tosolventofproteinatoms.lysozymeandinsulin,Journal of Molecular Biology79, 351 (1973)

  24. [24]

    P. J. A. Cock, T. Antao, J. T. Chang, B. A. Chapman, C.J.Cox, A.Dalke, I.Friedberg, T.Hamelryck, F.Kauff, B. Wilczynski, and M. J. L. de Hoon, Biopython: freely available python tools for computational molecular bio- logy and bioinformatics, Bioinformatics25, 1422 (2009)

  25. [25]

    Ebadi, A

    S. Ebadi, A. Keesling, M. Cain, T. T. Wang, H. Lev- ine, D. Bluvstein, G. Semeghini, A. Omran, J.-G. Liu, R. Samajdar,et al., Quantum optimization of maximum independent set using rydberg atom arrays, Science376, 1209 (2022)

  26. [26]

    A. Byun, M. Kim, and J. Ahn, Finding the maximum independent sets of platonic graphs using rydberg atoms, PRX Quantum 3, 030305 (2022)

  27. [27]

    Browaeys and T

    A. Browaeys and T. Lahaye, Many-body physics with in- dividually controlled rydberg atoms, Nature Physics16, 132 (2020)

  28. [28]

    Henriet, L

    L. Henriet, L. Beguin, A. Signoles, T. Lahaye, A. Browaeys, G.-O. Reymond, and C. Jurczak, Quantum computing with neutral atoms, Quantum4, 327 (2020)

  29. [29]

    Morgado and S

    M. Morgado and S. Whitlock, Quantum simulation and computing with Rydberg-interacting qubits, AVS Quantum Science 3, 023501 (2021), 2011.03031 [quant- ph]

  30. [30]

    Béguin, A

    L. Béguin, A. Vernier, R. Chicireanu, T. Lahaye, and A. Browaeys, Direct measurement of the van der waals interaction between two rydberg atoms, Phys. Rev. Lett. 110 (2013)

  31. [31]

    Bermot, L

    E. Bermot, L. Valor, W. Coelho, L.-P. Henry, and N. Pearson, Rydberg blockade mechanism through the lens of graph theory: characterization and applications, arXiv preprint arXiv:2506.13228 (2025)

  32. [32]

    Farhi, J

    E. Farhi, J. Goldstone, S. Gutmann, and L. Zhou, The Quantum Approximate Optimization Algorithm and the Sherrington-Kirkpatrick Model at Infinite Size, Quantum 6, 759 (2022)

  33. [33]

    Goswami, R

    K. Goswami, R. Mukherjee, H. Ott, and P. Schmelcher, Solving optimization problems with local light-shift en- coding on rydberg quantum annealers, Physical Review Research 6, 023031 (2024)

  34. [34]

    de Oliveira, E

    A. de Oliveira, E. Diamond-Hitchcock, D. Walker, M. Wells-Pestell, G. Pelegri, C. Picken, G. Malcolm, A. Daley, J. Bass, and J. Pritchard, Demonstration of weighted-graphoptimizationonarydberg-atomarrayus- ing local light shifts, PRX Quantum6, 010301 (2025)

  35. [35]

    E. F. Moore, The shortest path through a maze, Proc. of the International Symposium on the Theory of Switching , 285 (1959). 11

  36. [36]

    I. I. CPLEX,V24.1: User’s Manual for CPLEX, Interna- tional Business Machines Corporation (2024)

  37. [37]

    Kabsch, A solution for the best rotation to relate two sets of vectors, Acta Crystallographica Section A32, 922 (1976)

    W. Kabsch, A solution for the best rotation to relate two sets of vectors, Acta Crystallographica Section A32, 922 (1976)

  38. [38]

    Nguyen, J.-G

    M.-T. Nguyen, J.-G. Liu, J. Wurtz, M. D. Lukin, S.- T. Wang, and H. Pichler, Quantum optimization with arbitrary connectivity using rydberg atom arrays, PRX Quantum 4, 010316 (2023)

  39. [39]

    Z. Liu, Y. Li, L. Han, J. Li, J. Liu, Z. Yang, R. Wang, and Y. Li, Pdb-wide collection of binding data: current status of the pdbbind database, Bioinformatics33, 3866 (2017)

  40. [40]

    Muegge and Y

    I. Muegge and Y. C. Martin, A general and fast method for geometrically docking of small molecules into pro- tein active sites, Journal of Medicinal Chemistry42, 791 (1999). Appendices Appendix A: Code and Data A vailability The source code developed for this study, including the implementation of the quantum heuristic, the pipeline to produce the BIG and...