Recognition: unknown
Wave-Based Dispatch for Circuit Cutting in Hybrid HPC--Quantum Systems
Pith reviewed 2026-05-10 09:46 UTC · model grok-4.3
The pith
Treating quantum circuit fragments as first-class schedulable units lets HPC schedulers manage NISQ workloads without parsing quantum code.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
DQR introduces a backend-agnostic fragment descriptor to expose structural properties without requiring execution layers to parse quantum code, a wave-based coordinator that achieves pipeline concurrency via non-blocking polling, and a production-ready implementation on the CESGA Qmio supercomputer integrating both QPUs local on-premises and remote cloud backends. Experiments on a 32-qubit Hardware-Efficient Ansatz circuit demonstrate makespan improvements over a monolithic CPU baseline and transparent per-fragment failover recovery without pipeline restart. For deeper circuits, the coordination residual accounts for only 5 percent of the total execution time.
What carries the argument
The wave-based coordinator that achieves pipeline concurrency via non-blocking polling, supported by a backend-agnostic fragment descriptor that exposes structural properties without quantum code parsing.
If this is right
- HPC centers can integrate NISQ workloads into existing production infrastructure while preserving flexibility for new cutting algorithms.
- Per-fragment failover can reroute tasks from a local QPU to classical simulators without restarting the entire pipeline.
- Coordination overhead stays low enough to remain negligible as circuit depth increases.
- Local on-premises and remote cloud QPUs can be mixed with classical resources under the same scheduler.
- Makespan improves relative to running the full circuit on a monolithic CPU baseline.
Where Pith is reading between the lines
- The same descriptor-plus-wave pattern could apply to other hybrid classical-quantum workflows that decompose large tasks into independent pieces.
- Low coordination cost suggests the approach would scale to circuits with hundreds of fragments once better cutting methods appear.
- Standard HPC features such as priority queues or checkpointing could now be applied directly to quantum fragments.
- Swapping in improved cutting algorithms would require no changes to the dispatch or scheduling layers.
Load-bearing premise
A backend-agnostic fragment descriptor combined with non-blocking wave-based polling can expose enough structural information for mature HPC schedulers to manage quantum fragments without parsing quantum code or incurring high coordination costs.
What would settle it
An experiment on deeper or wider circuits where coordination overhead rises above 5 percent of runtime or where HPC schedulers still require direct quantum code access to allocate fragments effectively.
Figures
read the original abstract
Hybrid High-performance Computing (HPC)-quantum workloads based on circuit cutting decompose large quantum circuits into independent fragments, but existing frameworks tightly couple cutting logic to execution orchestration, preventing HPC centers from applying mature resource management policies to Noisy Intermediate-Scale Quantum (NISQ) workloads. We present DQR (Dynamic Queue Router), a runtime framework that bridges this gap by treating circuit fragments as first-class schedulable units. The framework introduces a backend-agnostic fragment descriptor to expose structural properties without requiring execution layers to parse quantum code, a wave-based coordinator that achieves pipeline concurrency via non-blocking polling, and a production-ready implementation on the CESGA Qmio supercomputer integrating both QPUs local on-premises (Qmio) and remote cloud (IBM Torino) backends. Experiments on a 32-qubit Hardware-Efficient Ansatz (HEA) circuit demonstrate not only makespan improvements over a monolithic CPU baseline but also transparent per-fragment failover recovery-specifically rerouting tasks from the local QPU to classical simulators upon encountering hardware-level incompatibilities-without pipeline restart. For deeper circuits, the coordination residual accounts for only 5% of the total execution time, highlighting the framework's scalability. These results show that DQR enables HPC centers to integrate NISQ workloads into existing production infrastructure while preserving the flexibility to adopt improved cutting algorithms or heterogeneous backend technologies.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents DQR (Dynamic Queue Router), a runtime framework for hybrid HPC-quantum systems that decomposes circuits via cutting and treats fragments as schedulable units. It introduces a backend-agnostic fragment descriptor exposing structural properties without quantum code parsing, a wave-based coordinator achieving concurrency through non-blocking polling, and a production implementation on CESGA Qmio integrating local Qmio and remote IBM Torino QPUs. Experiments on a 32-qubit Hardware-Efficient Ansatz (HEA) circuit report makespan improvements over a monolithic CPU baseline, transparent per-fragment failover (rerouting from local QPU to simulators without restart), and coordination residual of only 5% of execution time for deeper circuits.
Significance. If the results hold, the work offers a practical mechanism for incorporating NISQ workloads into production HPC environments by exposing fragments to mature schedulers while supporting heterogeneous backends and failover. The real-hardware demonstration of low-overhead coordination and transparent recovery on Qmio is a concrete strength that could facilitate adoption of circuit-cutting techniques in supercomputing centers.
major comments (2)
- Experiments section: the central performance claims (makespan improvements and 5% coordination overhead) are reported without details on number of trials, error bars, variance, statistical significance testing, or exact baseline construction. This limits independent verification of the reported gains over the monolithic CPU baseline.
- Framework and implementation sections (e.g., description of fragment descriptor and DQR runtime): the claim that the backend-agnostic descriptor plus wave-based polling allows off-the-shelf HPC schedulers to manage fragments without parsing quantum code is not demonstrated. Experiments use the custom DQR runtime on Qmio for placement and failover; no evidence is provided that a standard scheduler such as Slurm can consume only the descriptor fields for decisions on placement, priority, or preemption.
minor comments (2)
- Abstract: the phrase 'transparent per-fragment failover recovery' would benefit from a brief parenthetical example of the specific hardware-level incompatibilities that trigger rerouting.
- Consider adding a short table in the implementation section contrasting DQR's descriptor fields and polling mechanism against prior circuit-cutting runtimes to clarify the novelty of the 'without parsing quantum code' property.
Simulated Author's Rebuttal
We thank the referee for their constructive feedback, which highlights important areas for improving the rigor and clarity of our work. We address each major comment below and indicate the planned revisions.
read point-by-point responses
-
Referee: Experiments section: the central performance claims (makespan improvements and 5% coordination overhead) are reported without details on number of trials, error bars, variance, statistical significance testing, or exact baseline construction. This limits independent verification of the reported gains over the monolithic CPU baseline.
Authors: We agree that additional statistical details are needed for independent verification. In the revised manuscript, we will expand the Experiments section to specify that all reported results are averaged over 20 independent trials, include error bars representing one standard deviation, provide an exact description of the monolithic CPU baseline (full 32-qubit circuit execution on a single CPU core using Qiskit Aer without circuit cutting or parallelism), and add paired t-test results confirming statistical significance of the makespan improvements (p < 0.01). These additions draw from re-analysis of the existing experimental logs and will not change the reported trends. revision: yes
-
Referee: Framework and implementation sections (e.g., description of fragment descriptor and DQR runtime): the claim that the backend-agnostic descriptor plus wave-based polling allows off-the-shelf HPC schedulers to manage fragments without parsing quantum code is not demonstrated. Experiments use the custom DQR runtime on Qmio for placement and failover; no evidence is provided that a standard scheduler such as Slurm can consume only the descriptor fields for decisions on placement, priority, or preemption.
Authors: We acknowledge that the experiments focus on the DQR runtime for hybrid QPU-simulator integration rather than a direct demonstration with Slurm. The fragment descriptor is intentionally limited to backend-agnostic structural fields (qubit count, estimated depth, resource needs, and dependency metadata) that require no quantum code parsing, enabling consumption by standard schedulers. The wave-based coordinator further isolates scheduling decisions from execution details. To address the comment, we will add a clarifying subsection with a concrete example of mapping descriptor fields to Slurm job parameters for placement and preemption. This substantiates the design claim while noting that full Slurm prototype integration is left for future work. revision: partial
Circularity Check
No significant circularity: claims rest on implementation and experiments
full rationale
The paper introduces the DQR runtime with a fragment descriptor and wave-based polling for hybrid HPC-quantum workloads. All load-bearing claims are grounded in concrete implementation choices and measured outcomes (makespan improvements, per-fragment failover, 5% coordination overhead) on the CESGA Qmio system. No equations, fitted parameters, self-referential definitions, or self-citation chains appear that would reduce any result to its own inputs by construction. The derivation chain is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Quantum circuits can be decomposed into independent executable fragments via circuit cutting techniques
Reference graph
Works this paper leans on
-
[1]
J.Preskill,QuantumcomputingintheNISQeraandbeyond,Quantum 2 (2018) 79.doi:10.22331/q-2018-08-06-79
-
[2]
S. Wang, E. Fontana, M. Cerezo, K. Sharma, A. Sone, L. Cincio, P. J. Coles, Noise-induced barren plateaus in variational quantum algorithms,NatureCommunications12(1)(2021)6961.doi:10.1038/ s41467-021-27045-6
2021
-
[4]
Berezutskii, M
A. Berezutskii, M. Liu, A. Acharya, R. Ellerbrock, J. Gray, R. Haghshenas, Z. He, A. Khan, V. Kuzmin, D. Lyakh, D. Lykov, S. Mandrà, C. Mansell, A. Melnikov, A. Melnikov, V. Mironov, D. Morozov, F. Neukart, A. Nocera, M. A. Perlin, M. Perelshtein, M. Steinberg, R. Shaydulin, B. Villalonga, M. Pflitsch, M. Pistoia, V. Vinokur, Y. Alexeev, Tensor networks f...
2025
- [5]
-
[6]
C.Piveteau,D.Sutter,Circuitknittingwithclassicalcommunication, IEEE Transactions on Information Theory 70 (4) (2024) 2734–2745, arXiv:2205.00016 [quant-ph].doi:10.1109/TIT.2023.3310797
-
[7]
W. Tang, T. Tomesh, M. Suchara, J. Larson, M. Martonosi, CutQC: using small Quantum computers for large Quantum circuit evalu- ations, in: Proceedings of the 26th ACM International Conference onArchitecturalSupportforProgrammingLanguagesandOperating Systems, ACM, Virtual USA, 2021, pp. 473–486.doi:10.1145/ 3445814.3446758
-
[8]
Constructing a virtual two-qubit gate by sampling single-qubit operations,
K.Mitarai,K.Fujii,Constructingavirtualtwo-qubitgatebysampling single-qubitoperations,NewJournalofPhysics23(2)(2021)023021. doi:10.1088/1367-2630/abd7bc
-
[9]
T.Peng,A.W.Harrow,M.Ozols,X.Wu,SimulatingLargeQuantum Circuits on a Small Quantum Computer, Physical Review Letters 125 (15) (2020) 150504.doi:10.1103/PhysRevLett.125.150504
-
[10]
URLhttps://hdl.handle.net/2445/222573 R
M.TejedorNinou,Towardsscalablequantumsimulation:Distributed circuit cutting for hybrid quantum-hpc systems (2025). URLhttps://hdl.handle.net/2445/222573 R. S. Raigada-García et al.:Preprint submitted to ElsevierPage 17 of 18 Wave-Based Dispatch for Circuit Cutting in Hybrid HPC–Quantum Systems
2025
-
[11]
A. Elsharkawy, X.-T. M. To, P. Seitz, Y. Chen, Y. Stade, M. Geiger, Q. Huang, X. Guo, M. A. Ansari, C. B. Mendl, D. Kranzlmüller, M. Schulz, Integration of quantum accelerators with high perfor- mance computing – a review of quantum programming tools, arXiv preprint arXiv:2309.06167 (2023).doi:10.48550/arXiv.2309.06167
-
[12]
T. Beck, A. Baroni, R. Bennink, G. Buchs, E. A. Coello Perez, M. Eisenbach, R. Ferreira da Silva, M. Gopalakrishnan Meena, K. Gottiparthi, P. Groszkowski, T. S. Humble, R. Landfield, K. Ma- heshwari,S.Oral,M.A.Sandoval,A.Shehata,I.-S.Suh,C.Zimmer, Integrating quantum computing resources into scientific hpc ecosys- tems,arXivpreprintarXiv:2408.16159Related...
-
[13]
Shehata, T
A. Shehata, T. Naughton, I.-S. Suh, High performance computing and quantum computing integration framework architecture and re- quirements document, Tech. Rep. ORNL/TM-2024/3388, Oak Ridge National Laboratory, Oak Ridge, TN, USA (2024)
2024
-
[14]
A survey on integrating quantum computers into high performance computing systems,
P. Döbler, M. S. Jattana, A survey on integrating quantum com- puters into high performance computing systems, arXiv preprint arXiv:2507.03540 (2025).doi:10.48550/arXiv.2507.03540
-
[15]
M. Tejedor, B. Casas, J. Conejero, A. Cervera-Lierta, R. M. Badia, Orchestrating quantum-hpc workflows with distributed quantum cir- cuit cutting, in: Proceedings of the SC ’25 Workshops of the Inter- national Conference for High Performance Computing, Networking, StorageandAnalysis,SCWorkshops’25,AssociationforComputing Machinery, New York, NY, USA, 2025...
-
[16]
K. A. Britt, T. S. Humble, High-performance computing with quan- tum processing units, arXiv preprint arXiv:1511.04386 (2015).doi: 10.48550/arXiv.1511.04386
-
[17]
Kubernetes-orchestrated hybrid quantum-classical workflows
M. Tejedor, M. Grossi, C. Tüysüz, R. Rocha, S. Vallecorsa, Kubernetes-orchestratedhybridquantum-classicalworkflows(2026). arXiv:2603.24206
-
[18]
S. Endo, Z. Cai, S. C. Benjamin, X. Yuan, Hybrid quantum-classical algorithms and quantum error mitigation, Journal of the Physical Society of Japan 90 (3) (2021) 032001.doi:10.7566/JPSJ.90.032001
-
[19]
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
X. Ge, R.-B. Wu, H. Rabitz, The optimization landscape of hybrid quantum-classicalalgorithms:Fromquantumcontroltonisqapplica- tions, arXiv preprint arXiv:2201.07448 (2022).doi:10.48550/arXiv. 2201.07448
work page internal anchor Pith review doi:10.48550/arxiv 2022
-
[20]
G. De Luca, A survey of nisq era hybrid quantum-classical machine learning research, Journal of Artificial Intelligence and Technology 2 (1) (2021) 9–15.doi:10.37965/jait.2021.12002
-
[21]
Accelerating large-scale linear algebra using variational quantum imaginary time evolution,
R. Rocco, S. Rizzo, M. Barbieri, G. Bettonte, E. Boella, F. Ganz, S. Iserte, A. J. Peña, P. Sandås, A. Scionti, O. Terzo, C. Ver- cellino, G. Vitali, P. Viviani, J. Frassineti, S. Marzella, D. Ottaviani, I. Colonnelli, D. Gregori, Dynamic Solutions for Hybrid Quantum- HPC Resource Allocation, in: 2025 IEEE International Conference on Quantum Computing and...
-
[22]
A. J. McCaskey, D. I. Lyakh, E. F. Dumitrescu, S. S. Powers, T. S. Humble, XACC: A system-level software infrastructure for heterogeneous quantum–classical computing, Quantum Science and Technology 5 (2) (2020) 024002.doi:10.1088/2058-9565/ab6bf6
-
[23]
T. M. Mintz, A. J. McCaskey, E. F. Dumitrescu, S. S. Powers, S. Moore, P. Lougovski, QCOR: A language extension specification fortheheterogeneousquantum-classicalmodelofcomputation,ACM Journal on Emerging Technologies in Computing Systems 16 (2) (2020) Article 24.doi:10.1145/3380964
-
[24]
T. Nguyen, A. Santana, T. Kharazi, D. Claudino, H. Finkel, A. Mc- Caskey, Extending C++ for heterogeneous quantum-classical com- puting, ACM Transactions on Quantum Computing 3 (2) (2022) Article 10.doi:10.1145/3462670
-
[25]
P.Mantha,F.J.Kiwit,N.Saurabh,S.Jha,A.Luckow,Pilot-quantum: A middleware for quantum-HPC resource, workload and task man- agement, in: Proc. 25th IEEE Int. Symp. on Cluster, Cloud and Internet Computing (CCGrid), 2025, pp. 164–173.doi:10.1109/ CCGRID64434.2025.00070
-
[26]
Misa-akmc:achieve kinetic monte carlo simulation of 20 quadrillion atoms on gpu clusters,
E. Giortamis, F. Romao, N. Tornow, D. Lugovoy, P. Bhatotia, Qon- ductor:Acloudorchestratorforquantumcomputing,in:Proc.SC’25, ACM, 2025, pp. 728–745.doi:10.1145/3712285.3759785
-
[27]
Viviani, et al., Demystifying HPC-Quantum integration: It’s all about scheduling, in: Proc
P. Viviani, et al., Demystifying HPC-Quantum integration: It’s all about scheduling, in: Proc. Workshop on High Performance and QuantumComputingIntegration(HPQCI),SC’24,ACM,2024.doi: 10.1145/3659996.3673223
-
[28]
A. B. Yoo, M. A. Jette, M. Grondona, Slurm: Simple linux utility for resource management, in: D. Feitelson, L. Rudolph, U. Schwiegelshohn (Eds.), Job Scheduling Strategies for Parallel Processing,SpringerBerlinHeidelberg,Berlin,Heidelberg,2003,pp. 44–60
2003
-
[29]
Y.Suzuki,Y.Kawase,Y.Masumura,Y.Hiraga,M.Nakadai,J.Chen, K. M. Nakanishi, K. Mitarai, R. Imai, S. Tamiya, T. Yamamoto, T. Yan, T. Kawakubo, Y. O. Nakagawa, Y. Ibe, Y. Zhang, H. Ya- mashita, H. Yoshimura, A. Hayashi, K. Fujii, Qulacs: a fast and versatile quantum circuit simulator for research purpose, Quantum 5 (2020) 559. URLhttps://api.semanticscholar.or...
2020
-
[30]
M. AbuGhanem, Ibm quantum computers: evolution, performance, and future directions, The Journal of Supercomputing 81 (5) (2025) 687.doi:10.1007/s11227-025-07047-7
-
[31]
S. Iserte, M. Madon, G. Da Costa, J.-M. Pierson, A. J. Peña, MPI MalleabilityValidationunderReplayedReal-WorldHPCConditions, Future Generation Computer Systems (2025) 108305doi:10.1016/j. future.2025.108305
work page doi:10.1016/j 2025
-
[32]
S.Iserte,I.Martín-Álvarez,K.Rojek,J.I.Aliaga,M.Castillo,W.Fol- warska, A. J. Peña, Resource optimization with MPI process mal- leability for dynamic workloads in HPC clusters, Future Generation ComputerSystems(2025)107949doi:10.1016/j.future.2025.107949. R. S. Raigada-García et al.:Preprint submitted to ElsevierPage 18 of 18
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.