A PennyLane-Centric Dataset to Enhance LLM-based Quantum Code Generation using RAG

Abdul Basit; Alberto Marchisio; Minghao Shao; Muhammad Haider Asif; Muhammad Kashif; Muhammad Shafique; Nouhaila Innan

arXiv: 2503.02497 · v4 · submitted 2025-03-04 · cs.SE · cs.AI · quant-ph

Pith reviewed 2026-05-23 01:38 UTC · model grok-4.3

classification: cs.SE · cs.AI · quant-ph

keywords: PennyLane · quantum code generation · retrieval-augmented generation · LLM · dataset · quantum programming · RAG pipeline

The pith

A dataset of 3347 PennyLane code samples raises LLM success on quantum tasks when paired with retrieval augmentation.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates and releases PennyLang, a collection of 3347 PennyLane quantum code examples drawn from textbooks, documentation, and repositories, each paired with contextual descriptions. It demonstrates that inserting relevant samples from this collection into LLM prompts through a retrieval-augmented generation pipeline lifts success rates on code-generation benchmarks. The same pipeline also reduces hallucinations and improves the correctness of the generated quantum programs. The work supplies an automated construction framework and reports baseline numbers across several open-source and commercial models.

Core claim

The central claim is that the PennyLang dataset, when used as the retrieval corpus inside a RAG pipeline, substantially improves LLM performance on PennyLane-specific quantum code generation tasks. Concrete gains include Qwen 7B rising from 8.7 percent to 41.7 percent success and LLaMa 4 rising from 78.8 percent to 84.8 percent success, accompanied by fewer hallucinations and higher code correctness.
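The two headline figures are different in kind, which is easier to see when the absolute and relative changes are computed side by side. The following is a small Python sketch that uses only the four success rates quoted above; the helper name `gain` is illustrative and does not come from the paper.

```python
def gain(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, ratio after/before)."""
    return round(after - before, 1), round(after / before, 2)

# Success rates in percent, as quoted in the abstract.
reported = {"Qwen 7B": (8.7, 41.7), "LLaMa 4": (78.8, 84.8)}

for model, (before, after) in reported.items():
    points, ratio = gain(before, after)
    print(f"{model}: +{points} points, {ratio}x the no-retrieval rate")
```

The small model gains 33.0 percentage points while the stronger model gains 6.0, so most of the reported lift is concentrated where the no-retrieval baseline is weakest.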

What carries the argument

The PennyLang dataset of 3347 annotated PennyLane code samples, which functions as the external knowledge source retrieved and inserted into LLM prompts during RAG.
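To make the mechanism concrete, here is a minimal sketch of how retrieved samples can be placed in front of a task to form the augmented prompt. It is not the paper's pipeline: bag-of-words cosine similarity stands in for a learned embedding model, the record fields `description` and `code` are assumed names, and the three corpus entries are invented for illustration.

```python
import math
import re
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, corpus: list[dict], k: int = 2) -> list[dict]:
    """Return the k samples whose descriptions are most similar to the query."""
    q = bag_of_words(query)
    ranked = sorted(
        corpus,
        key=lambda s: cosine(q, bag_of_words(s["description"])),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(task: str, examples: list[dict]) -> str:
    """Prepend retrieved examples to the task (the RAG-augmented prompt)."""
    blocks = [f"# Example: {e['description']}\n{e['code']}" for e in examples]
    return "\n\n".join(blocks + [f"# Task: {task}"])

# Toy corpus: hypothetical entries, not taken from PennyLang.
corpus = [
    {"description": "Apply a Hadamard gate and measure PauliZ",
     "code": "qml.Hadamard(wires=0)\nreturn qml.expval(qml.PauliZ(0))"},
    {"description": "Create a Bell state with Hadamard and CNOT",
     "code": "qml.Hadamard(wires=0)\nqml.CNOT(wires=[0, 1])"},
    {"description": "Rotate a qubit with RX",
     "code": "qml.RX(0.5, wires=0)"},
]

task = "Prepare a Bell state on two qubits"
prompt = build_prompt(task, retrieve(task, corpus, k=1))
print(prompt)
```

A vanilla prompt in this sketch would be `build_prompt(task, [])`, which contains only the task line; the comparison the paper reports is between those two prompt styles.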

If this is right

LLMs produce more correct PennyLane code and commit fewer hallucinations when relevant examples are retrieved from PennyLang.
The automated curation framework can be reused to build comparable datasets for other quantum programming libraries.
Baseline numbers across multiple models provide a reproducible reference point for measuring future gains in AI-assisted quantum development.
Support for PennyLane extends LLM tooling beyond the Qiskit-centric studies that have dominated prior work.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same retrieval-augmentation pattern could be applied to other specialized coding domains that currently lack high-quality example collections.
Wider adoption of such datasets might lower the barrier for non-experts to produce working quantum programs.
Testing PennyLang with newer or larger models, or with different retrieval strategies, would clarify how far the observed gains can be pushed.

Load-bearing premise

The curated samples drawn from textbooks, documentation, and open repositories form a high-quality, representative knowledge source that augments LLMs without introducing systematic errors or irrelevant content.

What would settle it

Running the same RAG pipeline on a fresh collection of quantum programming problems that have no overlap with the 3347 samples in PennyLang and finding no improvement in success rate or hallucination rate would falsify the central claim.
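Whether a problem set really has "no overlap" with the corpus has to be measured rather than assumed. One simple way to do that, sketched below, is to compare word n-grams between each problem and every corpus sample and flag problems above a similarity threshold. The trigram size, the 0.5 threshold, and all function names are assumptions chosen for illustration, and the strings are invented; the paper reports no such check.

```python
import re

def ngrams(text: str, n: int = 3) -> set[tuple[str, ...]]:
    """Set of word n-grams after lowercasing and stripping punctuation."""
    tokens = re.findall(r"[a-z0-9_]+", text.lower())
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def max_overlap(problem: str, corpus: list[str], n: int = 3) -> float:
    """Highest n-gram Jaccard similarity between a problem and any sample."""
    p = ngrams(problem, n)
    return max((jaccard(p, ngrams(s, n)) for s in corpus), default=0.0)

def split_by_overlap(problems, corpus, threshold=0.5):
    """Partition problems into (clean, contaminated) by the threshold."""
    clean, contaminated = [], []
    for prob in problems:
        bucket = contaminated if max_overlap(prob, corpus) >= threshold else clean
        bucket.append(prob)
    return clean, contaminated

# Toy strings, invented for illustration.
corpus = ["Create a Bell state using a Hadamard gate and a CNOT gate"]
problems = [
    "Create a Bell state using a Hadamard gate and a CNOT gate.",
    "Estimate the ground state energy of H2 with VQE",
]
clean, contaminated = split_by_overlap(problems, corpus)
print(len(clean), len(contaminated))
```

Only the `clean` partition would count toward the falsification test described above. Lexical n-grams miss paraphrases, so an embedding-based check would be a natural complement.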

Figures

Figures reproduced from arXiv: 2503.02497 by Abdul Basit, Alberto Marchisio, Minghao Shao, Muhammad Haider Asif, Muhammad Kashif, Muhammad Shafique, Nouhaila Innan.

**Figure 2.** Methodology for PennyLang Quantum Code Generation and Evaluation. We collect and refine quantum code from GitHub, textbooks, and PennyLane documentation, then use GPT-4o to convert cleaned snippets into instruction–query pairs (with code and tests). These examples are embedded in Chroma, and LangChain performs MMR-based retrieval to produce RAG-augmented versus vanilla prompts. GPT-4o-mini, Claude 3.5 Sonn…

**Figure 4.** Distribution of instruction and response lengths with their correlation. The left plot shows the normal distribution of instruction lengths, the middle…

**Figure 5.** Feature group usage across domains: Heatmap illustrating how…

**Figure 6.** Confusion matrix showing the relationship between newly defined semantic categories and various PennyLane quantum features. The color intensity…
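The Figure 2 caption mentions MMR-based retrieval. Maximal marginal relevance picks documents one at a time, trading off similarity to the query against similarity to documents already chosen. The sketch below implements the textbook formulation with NumPy on made-up 2-D vectors; it does not call LangChain or Chroma, and the exact scoring those libraries use may differ. The name `lambda_mult` is used here for the trade-off weight.

```python
import numpy as np

def mmr(query_vec, doc_vecs, k=2, lambda_mult=0.5):
    """Pick k document indices by maximal marginal relevance."""
    def unit(x):
        return x / np.linalg.norm(x, axis=-1, keepdims=True)

    q = unit(np.asarray(query_vec, dtype=float))
    docs = unit(np.asarray(doc_vecs, dtype=float))
    relevance = docs @ q        # cosine similarity to the query
    pairwise = docs @ docs.T    # cosine similarity between documents

    selected = [int(np.argmax(relevance))]  # start with the most relevant
    while len(selected) < min(k, len(docs)):
        candidates = [i for i in range(len(docs)) if i not in selected]
        # Redundancy: closest match among the documents already selected.
        redundancy = pairwise[np.ix_(candidates, selected)].max(axis=1)
        scores = lambda_mult * relevance[candidates] - (1 - lambda_mult) * redundancy
        selected.append(candidates[int(np.argmax(scores))])
    return selected

# Toy embeddings (made up): documents 0 and 1 are near-duplicates.
query = [1.0, 0.0]
docs = [[1.0, 0.1], [1.0, 0.12], [0.6, -0.8]]
print(mmr(query, docs, k=2, lambda_mult=0.5))  # balances relevance and diversity
print(mmr(query, docs, k=2, lambda_mult=1.0))  # pure relevance ranking
```

With `lambda_mult=0.5` the second pick is document 2 rather than the near-duplicate document 1; with `lambda_mult=1.0` the selection reduces to plain relevance ranking and returns the two near-duplicates.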

Original abstract

Large Language Models (LLMs) offer powerful capabilities in code generation, natural language understanding, and domain-specific reasoning. Their application to quantum software development remains limited, in part because of the lack of high-quality datasets both for LLM training and as dependable knowledge sources. To bridge this gap, we introduce PennyLang, an off-the-shelf, high-quality dataset of 3,347 PennyLane-specific quantum code samples with contextual descriptions, curated from textbooks, official documentation, and open-source repositories. Our contributions are threefold: (1) the creation and open-source release of PennyLang, a purpose-built dataset for quantum programming with PennyLane; (2) a framework for automated quantum code dataset construction that systematizes curation, annotation, and formatting to maximize downstream LLM usability; and (3) a baseline evaluation of the dataset across multiple open-source and commercial models, including ablation studies, all conducted within a retrieval-augmented generation (RAG) pipeline. Using PennyLang with RAG substantially improves performance: for example, Qwen 7B's success rate rises from 8.7% without retrieval to 41.7% with full-context augmentation, and LLaMa 4 improves from 78.8% to 84.8%, while also reducing hallucinations and enhancing quantum code correctness. Moving beyond Qiskit-focused studies, we bring LLM-based tools and reproducible methods to PennyLane for advancing AI-assisted quantum development.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

PennyLang is a new PennyLane dataset with RAG baselines, but the reported gains rest on a test set whose construction is not described and which may overlap the curation sources.

The core of this paper is the release of PennyLang, a 3,347-sample dataset of PennyLane code with descriptions, plus a curation pipeline and some RAG experiments on open models. That fills a narrow but real gap, since most prior work has stayed with Qiskit. Releasing the data and describing the collection steps from textbooks, docs, and repos is the useful part; anyone building retrieval tools for quantum code now has a starting point they did not have before. The ablations and multi-model baselines are also straightforward to appreciate even if the numbers are modest in places.

The soft spot is the evaluation. The headline improvements (Qwen 7B 8.7% to 41.7%, LLaMa 4 78.8% to 84.8%) are presented without any stated check that the test prompts are disjoint from the 3,347 curated items. If the test cases come from the same sources, full-context retrieval is mostly supplying near-duplicates rather than demonstrating generalization. The abstract also leaves the success-rate definition and hallucination metric unspecified, so it is hard to judge how much of the lift is real versus prompt or retrieval artifact.

This is the sort of paper that matters to the small group of people working on LLM tooling for quantum software stacks, especially PennyLane users who want a ready dataset. The resource itself has value even if the current numbers need tighter controls. It should go to peer review so the overlap question and metric details can be sorted out in revision.

Referee Report

2 major / 2 minor

Summary. The paper introduces PennyLang, a curated dataset of 3,347 PennyLane-specific quantum code samples with contextual descriptions sourced from textbooks, official documentation, and open-source repositories. It presents a framework for automated dataset construction and evaluates the dataset within a RAG pipeline, claiming that retrieval augmentation substantially boosts LLM performance on quantum code generation tasks (e.g., Qwen 7B success rate from 8.7% without retrieval to 41.7% with full-context augmentation; LLaMa 4 from 78.8% to 84.8%), while reducing hallucinations and improving code correctness. The work also emphasizes open release of the dataset and reproducible methods beyond Qiskit-focused studies.

Significance. If the empirical claims hold after addressing evaluation details, the release of PennyLang and the associated curation framework would constitute a useful, purpose-built resource for LLM-assisted quantum software development with PennyLane. The open-source dataset and emphasis on reproducible RAG pipelines are explicit strengths that could support follow-on work in the intersection of quantum computing and code generation.

major comments (2)

[Evaluation / baseline results] (abstract and § on experiments): The central empirical claim rests on before-and-after success-rate comparisons, yet the manuscript provides no definition of 'success rate,' no description of test-set construction or size, no decontamination/overlap analysis between the 3,347 curated samples and the evaluation prompts, and no controls for prompt-engineering effects. This directly undermines interpretability of the reported gains (Qwen 7B 8.7% → 41.7%; LLaMa 4 78.8% → 84.8%) because near-duplicate retrieval cannot be ruled out.
[Dataset curation and RAG pipeline] (§ on dataset curation and RAG pipeline): The claim that the curated samples constitute a 'high-quality, representative, and unbiased knowledge source' is load-bearing for the augmentation argument, but no quantitative checks (e.g., diversity metrics, source-distribution statistics, or manual validation protocol) are reported to support this. Without such evidence the performance lift could reflect curation bias rather than genuine generalization.
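Since the first comment turns on the missing definition of "success rate", it may help to show what one explicit definition could look like. The sketch below counts a task as successful when the generated code runs and its accompanying test raises no exception, then reports the percentage of successful tasks. This is a hypothetical definition offered for illustration, not the one the authors used, and the toy tasks are plain Python rather than PennyLane so that the block has no dependencies.

```python
def passes(generated_code: str, test_code: str) -> bool:
    """True if the generated code runs and its test raises no exception."""
    namespace: dict = {}
    try:
        exec(generated_code, namespace)  # define the candidate solution
        exec(test_code, namespace)       # run assertions against it
        return True
    except Exception:
        return False

def success_rate(results: list[tuple[str, str]]) -> float:
    """Percentage of (generated_code, test_code) pairs that pass."""
    if not results:
        return 0.0
    return 100.0 * sum(passes(code, test) for code, test in results) / len(results)

# Toy tasks, invented for illustration.
tasks = [
    ("def double(x):\n    return 2 * x", "assert double(3) == 6"),
    ("def double(x):\n    return x + 1", "assert double(3) == 6"),  # wrong answer
    ("def double(x) return 2 * x", "assert double(3) == 6"),        # syntax error
]
print(round(success_rate(tasks), 1))
```

Running untrusted model output through `exec` is only acceptable in a throwaway sandbox; a real harness would isolate each task in a separate process with a timeout.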

minor comments (2)

[Abstract / Title] The abstract and title use 'PennyLang' and 'PennyLane-Centric' interchangeably; consistent nomenclature would improve clarity.
[Abstract] Ablation-study results are mentioned but not summarized with concrete numbers or table references in the abstract; adding a brief quantitative overview would help readers assess the contribution at a glance.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which identify areas where additional methodological detail will improve the manuscript. We address each major comment below and indicate the corresponding revisions.

Referee: [Evaluation / baseline results] (abstract and § on experiments): The central empirical claim rests on before-and-after success-rate comparisons, yet the manuscript provides no definition of 'success rate,' no description of test-set construction or size, no decontamination/overlap analysis between the 3,347 curated samples and the evaluation prompts, and no controls for prompt-engineering effects. This directly undermines interpretability of the reported gains (Qwen 7B 8.7% → 41.7%; LLaMa 4 78.8% → 84.8%) because near-duplicate retrieval cannot be ruled out.

Authors: We agree that these details are required for interpretability. In the revised manuscript we will add an explicit definition of success rate, a description of test-set construction and size, results of a decontamination analysis checking for overlap between the evaluation prompts and the 3,347 PennyLang samples, and controls that isolate prompt-engineering effects from the retrieval augmentation. These additions will be placed in the Experiments section.

Revision: yes

Referee: [Dataset curation and RAG pipeline] (§ on dataset curation and RAG pipeline): The claim that the curated samples constitute a 'high-quality, representative, and unbiased knowledge source' is load-bearing for the augmentation argument, but no quantitative checks (e.g., diversity metrics, source-distribution statistics, or manual validation protocol) are reported to support this. Without such evidence the performance lift could reflect curation bias rather than genuine generalization.

Authors: We concur that quantitative support for the dataset-quality claims is needed. The revised manuscript will include a dedicated Dataset Analysis subsection reporting diversity metrics, source-distribution statistics, and the manual validation protocol used during curation. These additions will allow readers to evaluate whether the observed gains reflect genuine generalization.

Revision: yes

Circularity Check

0 steps flagged

No circularity; empirical evaluation is self-contained.

The paper's central claims consist of dataset construction followed by direct before-and-after LLM performance measurements (success rates, hallucination reduction) in a RAG setting. No equations, fitted parameters, or derivations are present that reduce reported outcomes to inputs by construction. No self-citation chains or uniqueness theorems are invoked to support the results. The evaluation is an independent empirical comparison whose validity may be questioned on other grounds (e.g., test-set overlap) but does not exhibit any of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical dataset-construction and benchmarking paper. No free parameters, mathematical axioms, or invented physical entities are required or introduced.

Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Can LLMs Solve Science or Just Write Code? Evaluating Quantum Solver Generation
cs.SE · 2026-05 · unverdicted · novelty 6.0

Q-SAGE iteratively refines LLM-generated quantum solver scripts by comparing outputs to classical results, improving success rates while exposing persistent numerical accuracy limits.

Can LLMs Solve Science or Just Write Code? Evaluating Quantum Solver Generation
cs.SE · 2026-05 · unverdicted · novelty 4.0

Iterative refinement boosts LLM success in generating quantum solvers that match classical results, but more advanced models shift from execution errors to hard-to-detect numerical inaccuracies.

Automated Quantum Software and AI Engineering
cs.SE · 2026-04 · unverdicted · novelty 4.0

A systematic literature review maps trends in automated approaches to quantum software engineering and quantum AI, highlighting their role in hybrid quantum-classical systems.


work page 2023

[13] [13]

Resource allocation optimization in 5g networks using variational quantum regressor,

P. Pathak, V . Oad, A. Prajapati, and N. Innan, “Resource allocation optimization in 5g networks using variational quantum regressor,” in 2024 International Conference on Quantum Communications, Network- ing, and Computing (QCNC) . IEEE, 2024, pp. 101–105

work page 2024

[14] [14]

Advqunn: A methodology for analyzing the adversarial robustness of quanvolutional neural networks,

W. El Maouaki, A. Marchisio, T. Said, M. Bennai, and M. Shafique, “Advqunn: A methodology for analyzing the adversarial robustness of quanvolutional neural networks,” in2024 IEEE International Conference on Quantum Software (QSW) . IEEE, 2024, pp. 175–181

work page 2024

[15] [15]

Fedqnn: Federated learning using quantum neural networks,

N. Innan, M. A.-Z. Khan, A. Marchisio, M. Shafique, and M. Bennai, “Fedqnn: Federated learning using quantum neural networks,” in 2024 International Joint Conference on Neural Networks (IJCNN) , 2024, pp. 1–9

work page 2024

[16] [17]

Computational advantage in hybrid quantum neural networks: Myth or reality?

[Online]. Available: https://arxiv.org/abs/2412.04991

work page arXiv

[17] [18]

Quantum optimization using variational algorithms on near-term quantum devices,

N. Moll, P. Barkoutsos, L. S. Bishop, J. M. Chow, A. Cross, D. J. Egger, S. Filipp, A. Fuhrer, J. M. Gambetta, M. Ganzhorn, A. Kandala, A. Mezzacapo, P. M ¨uller, W. Riess, G. Salis, J. Smolin, I. Tavernelli, and K. Temme, “Quantum optimization using variational algorithms on near-term quantum devices,” Quantum Science and Technology, vol. 3, no. 3, p. 03...

work page doi:10.1088/2058-9565/aab822 2018

[18] [19]

A comprehensive review of quantum machine learning: from nisq to fault tolerance,

Y . Wang and J. Liu, “A comprehensive review of quantum machine learning: from nisq to fault tolerance,” Reports on Progress in Physics, vol. 87, no. 11, p. 116402, Oct. 2024. [Online]. Available: http://dx.doi.org/10.1088/1361-6633/ad7f69

work page doi:10.1088/1361-6633/ad7f69 2024

[19] [20]

PennyLane: Automatic differentiation of hybrid quantum-classical computations

V . Bergholm, J. Izaac, M. Schuld et al. , “Pennylane: Automatic differentiation of hybrid quantum-classical computations,” 2022. [Online]. Available: https://arxiv.org/abs/1811.04968

work page internal anchor Pith review Pith/arXiv arXiv 2022

[20] [21]

Qiskit: an open- source framework for quantum computing,

G. Aleksandrowicz, T. Alexander, P. Barkoutsos et al., “Qiskit: an open- source framework for quantum computing,” 2019

work page 2019

[21] [22]

Qiskit as a simulation platform for measurement-based quantum computation,

M. Kashif and S. Al-Kuwari, “Qiskit as a simulation platform for measurement-based quantum computation,” in 2022 IEEE 19th Interna- tional Conference on Software Architecture Companion (ICSA-C), 2022, pp. 152–159

work page 2022

[22] [23]

Qiskit code assistant: Training llms for generating quantum computing code,

N. Dupuis, L. Buratti, S. Vishwakarma et al. , “Qiskit code assistant: Training llms for generating quantum computing code,” 2024

work page 2024

[23] [24]

Language models are few-shot learners,

T. B. Brown, B. Mann, N. Ryder, M. Subbiah et al., “Language models are few-shot learners,” 2020

work page 2020

[24] [25]

Evaluating Large Language Models Trained on Code

M. Chen, J. Tworek, H. Jun, Q. Yuan et al., “Evaluating large language models trained on code,” arXiv preprint, vol. arXiv:2107.03374, 2021. [Online]. Available: https://arxiv.org/abs/2107.03374

work page internal anchor Pith review Pith/arXiv arXiv 2021

[25] [27]

Competition-Level Code Generation with AlphaCode

[Online]. Available: https://arxiv.org/abs/2203.07814

work page internal anchor Pith review Pith/arXiv arXiv

[26] [28]

OpenAI Codex: Programming with Natural Language,

OpenAI, “OpenAI Codex: Programming with Natural Language,” https: //openai.com/index/openai-codex/, 2021

work page 2021

[27] [29]

Code Llama: Open Foundation Models for Code

B. Rozi `ere, J. Gehring, F. Gloeckle et al. , “Code llama: Open foundation models for code,” 2024. [Online]. Available: https: //arxiv.org/abs/2308.12950

work page internal anchor Pith review Pith/arXiv arXiv 2024

[28] [30]

CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis

E. Nijkamp, B. Pang, H. Hayashi, L. Tu, H. Wang, Y . Zhou, S. Savarese, and C. Xiong, “Codegen: An open large language model for code with multi-turn program synthesis,” 2023. [Online]. Available: https://arxiv.org/abs/2203.13474

work page internal anchor Pith review Pith/arXiv arXiv 2023

[29] [31]

InCoder: A Generative Model for Code Infilling and Synthesis

D. Fried, A. Aghajanyan, J. Lin, S. Wang, E. Wallace, F. Shi, R. Zhong, W. tau Yih, L. Zettlemoyer, and M. Lewis, “Incoder: A generative model for code infilling and synthesis,” 2023. [Online]. Available: https://arxiv.org/abs/2204.05999

work page internal anchor Pith review Pith/arXiv arXiv 2023

[30] [32]

Schuld and F

M. Schuld and F. Petruccione, Supervised Learning with Quantum Computers, 1st ed. Springer, 2018

work page 2018

[31] [33]

Invited: Leveraging machine learning for quantum compilation optimization,

X. Ren, T. Zhang, X. Xu, Y .-C. Zheng, and S. Zhang, “Invited: Leveraging machine learning for quantum compilation optimization,” ser. DAC ’24. New York, NY , USA: Association for Computing Machinery, 2024. [Online]. Available: https://doi.org/10.1145/3649329. 3663510

work page doi:10.1145/3649329 2024

[32] [34]

The qiskit textbook,

Q. Community, “The qiskit textbook,” 2023. [Online]. Available: https://github.com/Qiskit/textbook

work page 2023

[33] [35]

Ibm quantum challenge,

I. Quantum, “Ibm quantum challenge,” 2024. [Online]. Available: https://github.com/qiskit-community/ibm-quantum-challenge-2024

work page 2024

[34] [36]

Qiskit open-source repositories,

I. Quantum and Q. Developers, “Qiskit open-source repositories,” 2023, available at https://github.com/Qiskit

work page 2023

[35] [37]

Q-gen quantum circuit dataset,

Y . Mao, “Q-gen quantum circuit dataset,” 2023. [Online]. Available: https://www.kaggle.com/datasets/ykmaoykmao/ q-gen-quantum-circuit-dataset

work page 2023

[36] [38]

Qdataset: Quantum datasets for machine learning,

E. Perrier, A. Youssry, and C. Ferrie, “Qdataset: Quantum datasets for machine learning,” 2021. [Online]. Available: https://arxiv.org/abs/2108. 06661

work page 2021

[37] [39]

Qcircuitnet: A large-scale hierarchical dataset for quantum algorithm design,

R. Yang, Y . Gu, Z. Wang, Y . Liang, and T. Li, “Qcircuitnet: A large-scale hierarchical dataset for quantum algorithm design,” 2024. [Online]. Available: https://arxiv.org/abs/2410.07961

work page arXiv 2024

[38] [40]

Official pennylane documentation,

P. Developers, “Official pennylane documentation,” 2023, available at https://docs.pennylane.ai/en/stable/

work page 2023

[39] [41]

Pennylaneai github repository,

——, “Pennylaneai github repository,” 2023, available at https://github. com/PennyLaneAI

work page 2023

[40] [42]

Survey of different large language model architectures: Trends, benchmarks, and challenges,

M. Shao, A. Basit, R. Karri, and M. Shafique, “Survey of different large language model architectures: Trends, benchmarks, and challenges,” IEEE Access , vol. 12, p. 188664–188706, 2024. [Online]. Available: http://dx.doi.org/10.1109/ACCESS.2024.3482107

work page doi:10.1109/access.2024.3482107 2024

[41] [43]

A Survey on Large Language Models for Code Generation

J. Jiang, F. Wang, J. Shen, S. Kim, and S. Kim, “A survey on large language models for code generation,” 2024. [Online]. Available: https://arxiv.org/abs/2406.00515

work page internal anchor Pith review Pith/arXiv arXiv 2024

[42] [44]

CodeBERT: A Pre-Trained Model for Programming and Natural Languages

Z. Feng, D. Guo, D. Tang et al. , “Codebert: A pre-trained model for programming and natural languages,” 2020. [Online]. Available: https://arxiv.org/abs/2002.08155

work page internal anchor Pith review Pith/arXiv arXiv 2020

[43] [45]

Program synthesis with large language models,

J. Austin, A. Odena, M. Nye et al. , “Program synthesis with large language models,” 2021. [Online]. Available: https://arxiv.org/abs/2108. 07732

work page 2021

[44] [46]

Codexglue: A machine learning benchmark dataset for code understanding and generation,

S. Lu, D. Guo, S. Ren, J. Huang et al. , “Codexglue: A machine learning benchmark dataset for code understanding and generation,”

work page

[45] [47]

CodeXGLUE: A Machine Learning Benchmark Dataset for Code Understanding and Generation

[Online]. Available: https://arxiv.org/abs/2102.04664

work page internal anchor Pith review Pith/arXiv arXiv

[46] [48]

Adapting pre-trained language models for quantum natural language processing,

Q. Li, B. Wang, Y . Zhu, C. Lioma, and Q. Liu, “Adapting pre-trained language models for quantum natural language processing,” 2023. [Online]. Available: https://arxiv.org/abs/2302.13812

work page arXiv 2023

[47] [49]

Ganguly, Quantum machine learning: an applied approach

S. Ganguly, Quantum machine learning: an applied approach. Springer, 2021

work page 2021

[48] [50]

E. F. Combarro, S. Gonz ´alez-Castillo, and A. Di Meglio, A practical guide to quantum machine learning and quantum optimization: Hands- on approach to modern quantum algorithms . Packt Publishing Ltd, 2023

work page 2023

[49] [51]

Pep 8 – style guide for python code,

G. Van Rossum, B. Warsaw, and N. Coghlan, “Pep 8 – style guide for python code,” 2001, available at https://peps.python.org/pep-0008/

work page 2001

[50] [52]

Transformers: State-of-the-art natural language processing,

T. Wolf, L. Debut, V . Sanh et al. , “Transformers: State-of-the-art natural language processing,” in Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Q. Liu and D. Schlangen, Eds. Online: Association for Computational Linguistics, Oct. 2020, pp. 38–45. [Online]. Available: https://aclanthology...

work page 2020

[51] [53]

Attention is all you need,

A. Vaswani, N. Shazeer, N. Parmar et al. , “Attention is all you need,”

work page

[52] [54]

Attention Is All You Need

[Online]. Available: https://arxiv.org/abs/1706.03762

work page internal anchor Pith review Pith/arXiv arXiv

[53] [55]

Neural Machine Translation by Jointly Learning to Align and Translate

D. Bahdanau, K. Cho, and Y . Bengio, “Neural machine translation by jointly learning to align and translate,” 2016. [Online]. Available: https://arxiv.org/abs/1409.0473

work page internal anchor Pith review Pith/arXiv arXiv 2016

[54] [56]

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

P. Lewis, E. Perez, A. Piktus et al. , “Retrieval-augmented generation for knowledge-intensive nlp tasks,” 2021. [Online]. Available: https: //arxiv.org/abs/2005.11401

work page internal anchor Pith review Pith/arXiv arXiv 2021