HPC-vQPU: A Service-Export Architecture for Virtual QPUs on Batch-Scheduled HPC Systems
Pith reviewed 2026-06-30 21:03 UTC · model grok-4.3
The pith
Batch-scheduled HPC systems can export interactive, device-faithful virtual QPUs using only outbound coordination.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
HPC-vQPU separates a cloud-facing control plane that owns device identity and task lifecycle from an HPC-resident execution plane that claims work through scheduler-backed jobs. All coordination is outbound and agent-initiated. The central abstraction is a topology- and calibration-aware device snapshot that is bound atomically at claim time and carried into execution as an immutable contract, ensuring each job remains hermetic while still reflecting current device semantics.
What carries the argument
The topology- and calibration-aware device snapshot bound atomically at claim time and carried as an immutable contract into the execution plane.
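To make the idea of an immutable snapshot concrete, here is a minimal Python sketch. The field names, the `DeviceSnapshot` class, and the `digest` helper are assumptions made for illustration; the paper's actual snapshot schema is not given in the text reviewed here.

```python
from dataclasses import dataclass, FrozenInstanceError, replace
import hashlib
import json


@dataclass(frozen=True)
class DeviceSnapshot:
    """Immutable view of a device at one moment (hypothetical schema)."""
    device_id: str
    version: int
    coupling_map: tuple   # qubit connectivity, e.g. ((0, 1), (1, 2))
    native_gates: tuple   # e.g. ("cz", "rz", "sx")
    calibration: tuple    # sorted (name, value) pairs

    def digest(self) -> str:
        """Content hash that an execution job could verify before it starts."""
        payload = json.dumps(
            [self.device_id, self.version, self.coupling_map,
             self.native_gates, self.calibration]
        )
        return hashlib.sha256(payload.encode()).hexdigest()


snap = DeviceSnapshot(
    device_id="vqpu-demo",
    version=1,
    coupling_map=((0, 1), (1, 2)),
    native_gates=("cz", "rz", "sx"),
    calibration=(("readout_error_q0", 0.01), ("t1_us_q0", 110.0)),
)

# In-place mutation is rejected, so a job cannot drift from what it was given.
try:
    snap.version = 2
    mutation_rejected = False
except FrozenInstanceError:
    mutation_rejected = True

# A recalibration yields a new snapshot object with a different digest.
recalibrated = replace(
    snap,
    version=2,
    calibration=(("readout_error_q0", 0.03), ("t1_us_q0", 95.0)),
)
```

A frozen dataclass with tuple fields is one simple way to get value semantics in Python. The digest gives the execution side a cheap check that the contract it received is the one that was bound.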
If this is right
- Service overhead remains bounded and additive rather than multiplicative with simulator cost.
- Workload scaling behavior stays confined to the underlying simulator.
- Snapshots that carry calibration data produce measurable shifts in simulation outputs.
- Claim-time binding prevents execution from using stale device information after a mutation occurs.
- Concurrent agents together with explicit recovery complete each task exactly once even after agent or node failure.
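The exactly-once and recovery consequences above can be sketched with a toy in-process queue. This is an illustration under stated assumptions, not the paper's mechanism: the `TaskQueue` class, the lease timeout, and the ownership check on completion are all hypothetical, and a real control plane would sit behind a network API with durable storage.

```python
import threading
import time


class TaskQueue:
    """Toy control-plane queue; all names here are hypothetical."""

    def __init__(self, task_ids, lease_seconds=30.0):
        self._lock = threading.Lock()
        self._lease = lease_seconds
        # Each task is "queued", "running", or "done".
        self._tasks = {
            t: {"state": "queued", "owner": None, "deadline": 0.0}
            for t in task_ids
        }

    def claim(self, agent, now=None):
        """Atomically hand one queued task to `agent`, or return None."""
        now = time.monotonic() if now is None else now
        with self._lock:
            for task_id, t in self._tasks.items():
                if t["state"] == "queued":
                    t.update(state="running", owner=agent,
                             deadline=now + self._lease)
                    return task_id
        return None

    def complete(self, task_id, agent):
        """Accept a result only from the current owner of a running task."""
        with self._lock:
            t = self._tasks[task_id]
            if t["state"] == "running" and t["owner"] == agent:
                t["state"] = "done"
                return True
            return False

    def recover(self, now=None):
        """Requeue running tasks whose lease has expired."""
        now = time.monotonic() if now is None else now
        requeued = []
        with self._lock:
            for task_id, t in self._tasks.items():
                if t["state"] == "running" and t["deadline"] <= now:
                    t.update(state="queued", owner=None)
                    requeued.append(task_id)
        return requeued


def run_agent(queue, name, completions):
    """Claim and complete tasks until none are left."""
    while (task_id := queue.claim(name)) is not None:
        if queue.complete(task_id, name):
            completions.append(task_id)


# Four concurrent agents drain 50 tasks.
queue = TaskQueue(range(50))
completions = []
agents = [
    threading.Thread(target=run_agent, args=(queue, f"agent-{i}", completions))
    for i in range(4)
]
for a in agents:
    a.start()
for a in agents:
    a.join()

# Recovery: agent "a" claims a task and then goes silent.
stale = TaskQueue(["t"], lease_seconds=10.0)
first_owner_task = stale.claim("a", now=0.0)
too_early = stale.recover(now=5.0)
requeued = stale.recover(now=11.0)
second_owner_task = stale.claim("b", now=12.0)
late_result_accepted = stale.complete("t", "a")
new_result_accepted = stale.complete("t", "b")
```

The ownership check in `complete` is what keeps a late result from a presumed-dead agent from being recorded after the task has been handed to another agent.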
Where Pith is reading between the lines
- The same outbound-plus-snapshot pattern could be applied to export other interactive device services from batch HPC environments.
- Changing the snapshot format might allow the service to support additional quantum simulators without altering the coordination layer.
- If queue delays become long, periodic snapshot refresh before execution could be added as an extension while keeping the outbound rule.
Load-bearing premise
An outbound-only agent-initiated coordination model together with atomic claim-time snapshot binding can preserve topology, native-gate, and calibration semantics across queue delays, node isolation, and partial failures without any inbound paths into the cluster.
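A minimal sketch of what "outbound-only, agent-initiated" means in code. Every name below is an assumption for illustration: `ControlPlane`, `InMemoryControlPlane`, and `agent_loop` are hypothetical, and `run_job` stands in for whatever submits work to the batch scheduler. The point is structural: the agent calls `claim` and `report`, and nothing ever calls into the agent.

```python
from typing import Callable, Optional, Protocol


class ControlPlane(Protocol):
    """Calls the agent makes outward; the control plane never calls back."""

    def claim(self, agent_id: str) -> Optional[dict]: ...

    def report(self, task_id: str, status: str,
               result: Optional[dict]) -> None: ...


class InMemoryControlPlane:
    """Local stand-in for a remote service so the sketch can run."""

    def __init__(self, tasks):
        self.pending = list(tasks)
        self.events = []

    def claim(self, agent_id):
        return self.pending.pop(0) if self.pending else None

    def report(self, task_id, status, result):
        self.events.append((task_id, status, result))


def agent_loop(cp: ControlPlane, agent_id: str,
               run_job: Callable[[dict], dict], max_polls: int) -> None:
    """Poll, execute, report. Every interaction is initiated by the agent."""
    for _ in range(max_polls):
        task = cp.claim(agent_id)
        if task is None:
            continue  # a real agent would sleep or back off here
        cp.report(task["id"], "running", None)
        try:
            cp.report(task["id"], "succeeded", run_job(task))
        except Exception as exc:
            # Failures are pushed out as events rather than pulled in.
            cp.report(task["id"], "failed", {"error": str(exc)})


def fake_scheduler_job(task):
    """Hypothetical placeholder for a scheduler-backed simulation job."""
    if task["shots"] <= 0:
        raise ValueError("shots must be positive")
    return {"shots": task["shots"]}


cp = InMemoryControlPlane([{"id": "t1", "shots": 100},
                           {"id": "t2", "shots": 0}])
agent_loop(cp, "agent-0", fake_scheduler_job, max_polls=3)
```

After the loop, `cp.events` holds the full task history, all of it delivered by agent-initiated calls.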
What would settle it
An experiment in which device calibration data changes after a snapshot is bound but before the job runs, yet the output still matches the new calibration instead of the bound snapshot.
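The shape of that experiment can be sketched as a small check. Everything here is a toy under stated assumptions: `bind_snapshot` and `simulate` are hypothetical helpers, and the one-line "simulator" only models a readout error, standing in for a real noisy simulation.

```python
from types import MappingProxyType


def bind_snapshot(device: dict) -> MappingProxyType:
    """Copy device state at claim time and expose the copy read-only."""
    return MappingProxyType(dict(device))


def simulate(snapshot) -> float:
    """Toy noisy simulator: P(read 0) for a qubit prepared in |0>."""
    return 1.0 - snapshot["readout_error"]


# Live, mutable device record held by the control plane.
device = {"version": 1, "readout_error": 0.02}

# Claim time: the snapshot is bound here.
bound = bind_snapshot(device)

# Recalibration lands after the bind but before the job runs.
device.update(version=2, readout_error=0.10)

observed = simulate(bound)          # what the job computes
expected_if_bound = 1.0 - 0.02      # prediction under the paper's claim
expected_if_live = simulate(device) # prediction if the bind leaked

claim_holds = abs(observed - expected_if_bound) < 1e-12
claim_refuted = abs(observed - expected_if_live) < 1e-12
```

If `claim_refuted` were ever true on the real system, the job would have read live device state and the immutability premise would fail.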
Original abstract
Device-aware quantum simulation increasingly requires HPC-scale accelerators, yet secure supercomputers expose batch-scheduled execution environments rather than the interactive, backend-oriented interfaces expected by quantum software. The key obstacle is not only remote job submission: an HPC-hosted virtual QPU must preserve topology, native-gate, and calibration semantics across queue delay, scheduler allocation, compute-node isolation, and partial execution-side failures, without opening inbound paths into the cluster. We present HPC-vQPU, a service-export architecture for virtual QPUs on batch-scheduled HPC systems. HPC-vQPU separates a cloud-facing control plane, which owns device identity, task lifecycle, snapshot binding, and event projection, from an HPC-resident execution plane, which claims work and realises it through scheduler-backed GPU jobs. Coordination is exclusively outbound and agent initiated. The central abstraction is a topology- and calibration-aware device snapshot bound atomically at claim time and carried into execution as an immutable contract, making each scheduled job hermetic while preserving fresh device semantics. We implement HPC-vQPU at the Pawsey Supercomputing Research Centre using Setonix GPUs, Qiskit-Aer/cuQuantum, and IBM Fez calibration data. Production experiments show that service overhead is bounded and additive, while workload scaling remains confined to the simulator; calibration-bearing snapshots produce measurable output shifts; claim-time binding prevents stale execution after pre-claim device mutation; concurrent agents complete 50/50 tasks exactly once; and explicit recovery restores stale running tasks after agent failure. These results show that secure, scheduler-mediated HPC infrastructure can export device-faithful quantum simulation as an interactive virtual-QPU service.
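One way to read the abstract's "bounded and additive" overhead claim, in notation introduced here for illustration (the symbols $T_{\mathrm{sim}}$, $T_{\mathrm{svc}}$, $C$, and $\alpha$ are assumptions, not the paper's):

```latex
% Additive reading: service cost is a bounded term independent of workload w.
T_{\mathrm{total}}(w) = T_{\mathrm{sim}}(w) + T_{\mathrm{svc}},
\qquad 0 \le T_{\mathrm{svc}} \le C \quad \text{for all } w.

% Multiplicative alternative that the claim rules out, for some \alpha > 0:
T_{\mathrm{total}}(w) = (1 + \alpha)\, T_{\mathrm{sim}}(w).
```

Under the additive reading the relative overhead $T_{\mathrm{svc}} / T_{\mathrm{sim}}(w)$ shrinks as the simulation grows, which is consistent with the statement that workload scaling remains confined to the simulator.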
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents HPC-vQPU, a service-export architecture separating a cloud-facing control plane (handling device identity, task lifecycle, snapshot binding, and event projection) from an HPC-resident execution plane (claiming and realizing work via scheduler-backed GPU jobs). Coordination is exclusively outbound and agent-initiated; the core abstraction is a topology- and calibration-aware device snapshot bound atomically at claim time and carried as an immutable contract. Implemented on Pawsey Setonix with Qiskit-Aer/cuQuantum and IBM Fez data, the reported experiments demonstrate bounded additive service overhead, measurable output shifts from calibration snapshots, prevention of stale execution after pre-claim mutation, exactly-once completion of concurrent tasks, and recovery after agent failure. The central claim is that this enables secure, scheduler-mediated HPC systems to export device-faithful quantum simulation as an interactive virtual-QPU service while preserving semantics across queue delays, allocation, isolation, and partial failures without inbound cluster access.
Significance. If the preservation claim holds under the full set of conditions, the work provides a practical bridge between batch-scheduled HPC resources and interactive quantum backends, enabling secure device-faithful simulation services without compromising cluster security. The implementation on production hardware and the explicit handling of concurrency and recovery are concrete strengths.
Major comments (1)
- [abstract and experimental results] The abstract and the paragraph on coordination state that the outbound-only, agent-initiated model with atomic claim-time snapshot binding must preserve topology, native-gate, and calibration semantics across queue delays, scheduler allocation, compute-node isolation, and partial failures. However, the reported experiments address calibration usage, pre-claim mutation prevention, exactly-once concurrency (50/50 tasks), and agent-failure recovery but provide no description of tests or measurements involving actual queue waiting periods before claim or node isolation effects on binding and execution. This leaves the central claim unverified for the full set of conditions listed.
Minor comments (1)
- [abstract] The abstract states experimental outcomes but provides no details on experimental design, baselines, error bars, data exclusion rules, or statistical methods.
Simulated Author's Rebuttal
We thank the referee for the careful reading and for identifying the need to more explicitly connect the experimental results to the full set of preservation conditions listed in the abstract. We address the comment below.
Point-by-point responses
Referee: [abstract and experimental results] The abstract and the paragraph on coordination state that the outbound-only, agent-initiated model with atomic claim-time snapshot binding must preserve topology, native-gate, and calibration semantics across queue delays, scheduler allocation, compute-node isolation, and partial failures. However, the reported experiments address calibration usage, pre-claim mutation prevention, exactly-once concurrency (50/50 tasks), and agent-failure recovery but provide no description of tests or measurements involving actual queue waiting periods before claim or node isolation effects on binding and execution. This leaves the central claim unverified for the full set of conditions listed.
Authors: We agree that the experiments do not report direct measurements of queue-wait durations or isolated tests of node-allocation effects. The design, however, guarantees preservation across queue delays because binding occurs atomically at claim time, after any waiting period, so the captured topology, native gates, and calibration data are always those present at execution start. Scheduler allocation and compute-node isolation are addressed by the hermetic job model: the immutable snapshot contract travels with the job, as evidenced by the concurrent-task (exactly-once) and agent-recovery experiments, which already exercise scheduler-mediated allocation and isolation. We will revise the manuscript to state this reasoning explicitly in the abstract and coordination sections, strengthening the link between design and results without new experiments.
Revision: partial
Circularity Check
No circularity detected
The paper presents an architecture for exporting virtual QPUs on batch-scheduled HPC systems, along with implementation details and experimental measurements of overhead, calibration effects, concurrency, and recovery. No mathematical derivations, equations, fitted parameters, predictions, or self-citations appear in the provided text. All claims rest on described system behavior and empirical results rather than any reduction to inputs by construction, self-definition, or load-bearing self-reference.