Vibe Calibration: Autonomous Bring-up of a 112-Qubit Superconducting Quantum Processor by a Skill-Orchestrating Language Agent

Cheng Ye; Hai-Feng Yu; Huikai Xu; Jiaxiu Han; Jing Gao; Lei Wang; Lin-Feng Zhang; Pan Zhang; Shigang Ou; Tianrui Che

arxiv: 2606.22376 · v1 · pith:IUCOI6MHnew · submitted 2026-06-21 · 🪐 quant-ph

Huikai Xu, Jiaxiu Han, Shigang Ou, Cheng Ye, Zisong Shen, Jing Gao, Yijia Wang, Tianrui Che, Yu Song, Weiyang Liu, Lei Wang, Lin-Feng Zhang, Pan Zhang, Hai-Feng Yu

Pith reviewed 2026-06-26 10:40 UTC · model grok-4.3

classification 🪐 quant-ph

keywords autonomous calibration · superconducting quantum processor · large language model agent · transmon qubits · skill orchestration · quantum hardware bring-up · decision tree skills

The pith

Language model agents distill expert knowledge into decision-tree Skills and autonomously calibrate 108 of 112 qubits in 4.7 hours.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that tacit expert judgment for calibrating superconducting quantum processors can be captured through a three-phase human-in-the-loop process and packaged into reusable Skills, each a decision tree of measurement commands, acceptance criteria, and audit records. These Skills are then orchestrated by fine-tuned large language model agents to execute calibration autonomously. This matters for scaling because manual methods and brittle scripts cannot keep pace with processors exceeding one hundred qubits, limiting progress toward fault-tolerant systems. The reported system completes calibration of 108 qubits in 4.7 hours with a 4-5 times speedup and shows agreement with experts on 14 of 16 qubits in a cross-validated subset, while demonstrating that core decision logic transfers across devices.

Core claim

Vibe Calibration captures expert tacit knowledge in auditable Skills, each a decision tree that packages parameterized measurement commands, quantitative acceptance criteria, and audit records. After a large language model is fine-tuned on validated trajectories, the agent-orchestrated system autonomously calibrates 108 of 112 qubits on a frequency-tunable transmon processor in 4.7 hours, a 4-5× speedup over manual calibration of the full array. It agrees with expert manual results on 14 of 16 qubits in a cross-validated subset, and the core workflows generalize to new processors after only minor interface adaptations.

What carries the argument

The Skill, a decision tree that packages parameterized measurement commands, quantitative acceptance criteria, and audit records to enable autonomous execution and self-healing.
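To make the idea concrete, here is a minimal Python sketch of a Skill as a decision tree. It is not the authors' implementation: the names `SkillNode`, `run_skill`, `fake_power_rabi`, the `amp_max` parameter, and the `fit_r2 > 0.9` threshold are all hypothetical, and the measurement is a stand-in function rather than a hardware call. It only illustrates how a node can bundle a parameterized command, an acceptance criterion, and an audit record, with a failure branch acting as a simple self-healing retry.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class SkillNode:
    """One decision-tree node: a command, its parameters, and a pass/fail test."""
    name: str
    command: Callable[[dict], dict]   # parameterized measurement command
    accept: Callable[[dict], bool]    # quantitative acceptance criterion
    params: dict = field(default_factory=dict)
    on_pass: Optional["SkillNode"] = None   # next node if the criterion is met
    on_fail: Optional["SkillNode"] = None   # recovery branch otherwise


def run_skill(root: SkillNode, max_steps: int = 20) -> list:
    """Walk the tree from the root and return the audit trail of every step."""
    audit = []
    node = root
    steps = 0
    while node is not None and steps < max_steps:
        result = node.command(node.params)
        ok = node.accept(result)
        audit.append({"node": node.name, "params": dict(node.params),
                      "result": result, "accepted": ok})
        node = node.on_pass if ok else node.on_fail
        steps += 1
    return audit


def fake_power_rabi(params: dict) -> dict:
    # Hypothetical stand-in for hardware: the fit only converges for a wide sweep.
    return {"fit_r2": 0.99 if params["amp_max"] >= 1.0 else 0.40}


def good_fit(result: dict) -> bool:
    # Hypothetical acceptance threshold, chosen only for illustration.
    return result["fit_r2"] > 0.9


retry = SkillNode("power_rabi_wide", fake_power_rabi, good_fit, {"amp_max": 1.0})
root = SkillNode("power_rabi", fake_power_rabi, good_fit, {"amp_max": 0.5},
                 on_fail=retry)

audit = run_skill(root)
for entry in audit:
    print(entry["node"], entry["params"], entry["accepted"])
```

In this toy run the first measurement fails its criterion, the tree follows the failure branch to a wider sweep, and both attempts land in the audit trail, which is the kind of record a reviewer could later inspect.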

If this is right

The system calibrates 108 qubits in 4.7 hours, a 4-5× speedup over manual calibration of all 112 qubits.
Cross-validation on a 16-qubit subset shows agreement between autonomous and expert manual results on 14 qubits.
Low-level control scripts require only minor interface adaptation while core decision logic and task orchestration transfer to new processors.
The approach removes a critical barrier to scalable quantum hardware operation by providing a reusable and auditable laboratory interface.
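The headline numbers above can be unpacked with a few lines of arithmetic. The sketch below uses only figures quoted on this page; the implied manual baseline assumes the 4-5× speedup is a plain ratio of total wall-clock times, which the abstract does not spell out, and the per-qubit figure is an average over wall-clock time, not a claim about how the run was parallelized.

```python
# Figures quoted in the abstract.
calibrated, total = 108, 112
wall_clock_h = 4.7
speedup_low, speedup_high = 4.0, 5.0
agree, subset = 14, 16

# Fraction of the array that was calibrated autonomously.
yield_fraction = calibrated / total

# Implied manual baseline, ASSUMING speedup = manual time / autonomous time.
implied_manual_h = (wall_clock_h * speedup_low, wall_clock_h * speedup_high)

# Average wall-clock minutes per successfully calibrated qubit.
minutes_per_qubit = wall_clock_h * 60 / calibrated

# Agreement rate with expert calibration on the cross-validated subset.
agreement_rate = agree / subset

print(f"yield: {yield_fraction:.1%}")
print(f"implied manual baseline: {implied_manual_h[0]:.1f}-{implied_manual_h[1]:.1f} h")
print(f"average minutes per calibrated qubit: {minutes_per_qubit:.2f}")
print(f"subset agreement: {agreement_rate:.1%}")
```

Under that assumption the manual baseline would be roughly 19 to 24 hours for the full array.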

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the Skills prove robust, the same distillation method could be applied to bring-up tasks on other quantum hardware platforms such as trapped ions or photonic circuits.
The auditable nature of the Skills creates a record that could be mined to identify and correct systematic weaknesses in calibration procedures over multiple devices.
Faster calibration cycles would allow more rapid iteration between processor fabrication and testing, potentially shortening development timelines for larger arrays.

Load-bearing premise

The three-phase human-in-the-loop distillation process fully captures expert tacit judgment into the Skills without critical omissions or biases that would cause autonomous execution to fail on unseen hardware states or anomalous signals.

What would settle it

A direct side-by-side run on the same 112-qubit processor in which the autonomous system and expert manual team produce differing calibration outcomes on more than two of the 16 cross-validated qubits, or fail to reach comparable success rates within the reported time on a new processor.

Figures

Figures reproduced from arXiv: 2606.22376 by Cheng Ye, Hai-Feng Yu, Huikai Xu, Jiaxiu Han, Jing Gao, Lei Wang, Lin-Feng Zhang, Pan Zhang, Shigang Ou, Tianrui Che, Weiyang Liu, Yijia Wang, Yu Song, Zisong Shen.

**Figure 2.** FIG. 2

**Figure 3.** FIG. 3

**Figure 1.** Validation loss curves over training for all four fine-tuning runs.

**Figure 2.** Unified Skill workflow redrawn as two main columns. The left column shows the one-dimensional characterization route. The right…

**Figure 3.** Multi-path dynamic-programming fit on representative spectrum-vs-flux panels. The fitter extracts several smooth ridge candidates…

**Figure 4.** Distilled qubit-characterization Skill as a unified decision tree. The workflow exposes two top-level routes that converge on a shared single-qubit characterization chain. Route A (left, blue) handles fixed-frequency or pre-fluxed devices: 1D S21 fixes the readout frequency, 1D spectroscopy proposes a candidate 𝑓01, topology groups are formed, and the shared chain, including Time Rabi, Power Rabi, single-s…

**Figure 5.** Representative outputs from the tl-characterize Skill on TL16. (A) wide overview scan, (B) high-power fine scan with seven detected resonator features, (C) low-power confirmation scan used for frequency matching and Kerr-shift evidence, and (D) S21-vs-power map for readout-power selection.

Original abstract

Superconducting quantum computing is one of the most mature solid-state platforms for quantum computation, with processors exceeding one hundred qubits. Yet further scaling toward fault-tolerant quantum computing is increasingly constrained by calibration complexity. Conventional scripts are brittle to anomalous signals, and expert judgment is bounded by cognitive bandwidth and serial operation time, failing to keep pace with system scale. Here we report Vibe Calibration, an autonomous calibration system orchestrated by large language model agents, which distills expert tacit knowledge into reusable Skills. Each Skill is organized as a decision tree that packages parameterized measurement commands, quantitative acceptance criteria, and audit records, enabling autonomous execution and self-healing. We capture this knowledge through a three-phase human-in-the-loop distillation process and fine-tune a large language model on validated trajectories. On a 112-qubit processor with frequency-tunable transmons, the system autonomously completes calibration of 108 out of 112 qubits in 4.7 hours, achieving a 4–5× speedup over manual calibration of the full 112 qubits. A cross-validated comparison with expert manual calibration on a 16-qubit subset shows agreement on 14 out of 16 qubits. More importantly, the model demonstrates transferable calibration workflows across devices. While low-level control scripts require minor interface adaptation for different hardware platforms, the core decision logic and task orchestration generalize to new processors, demonstrating a reusable laboratory interface rather than a memorized script. This work demonstrates, for the first time, fully autonomous calibration of a hundred-qubit superconducting processor through reusable and auditable Skills, removing a critical barrier to scalable quantum hardware operation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper shows an LLM-orchestrated system completing autonomous calibration on 108 of 112 qubits in 4.7 hours with a claimed 4-5x speedup, but the supporting validation stays narrow and the core assumption about capturing expert judgment needs more scrutiny.

The concrete result is that their Vibe Calibration setup, built around LLM agents and reusable Skills as decision trees, handled most of the bring-up on a 112-qubit frequency-tunable transmon processor without constant human intervention. They report finishing 108 qubits in 4.7 hours versus the longer manual baseline, plus a 14-out-of-16 match on a small cross-validated subset.

What the work actually adds is the three-phase human-in-the-loop distillation that turns tacit expert steps into auditable Skills, then fine-tunes the model on those trajectories. The transferability note—that core logic generalizes while only low-level interfaces need adaptation—is the part that could matter for other labs.

The soft spot is the limited evidence base. The cross-check covers only 16 qubits, with no reported details on the four failures, exact acceptance criteria across the full set, or statistical measures of agreement. The stress-test concern about missing edge cases in anomalous signals holds weight here: if the distillation trajectories did not include the full range of readout or spectroscopy anomalies, the autonomous decisions could diverge from expert practice on new hardware states without anyone noticing in the reported numbers.
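One way to see how little a 16-qubit cross-check pins down is to put a confidence interval on the reported proportions. The paper does not report such intervals; the sketch below applies a standard Wilson score interval, and it assumes each qubit can be treated as an independent pass/fail trial, which may not hold on a real chip where neighboring qubits share failure causes. The function name `wilson_interval` is ours.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Expert agreement on the cross-validated subset: 14 of 16 qubits.
subset_lo, subset_hi = wilson_interval(14, 16)

# Autonomous completion on the full array: 108 of 112 qubits.
array_lo, array_hi = wilson_interval(108, 112)

print(f"14/16 agreement:    {subset_lo:.3f} to {subset_hi:.3f}")
print(f"108/112 completion: {array_lo:.3f} to {array_hi:.3f}")
```

Under these assumptions the 14/16 agreement is compatible with a true agreement rate anywhere from roughly 0.64 to 0.97, which is why a larger audited sample would sharpen the equivalence claim.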

This is aimed at hardware teams scaling superconducting processors who need to cut calibration time. It deserves peer review because the empirical demonstration targets a real operational bottleneck with measurable outcomes, even though fuller protocol descriptions and broader validation would be required before the method could be treated as reliable.

Referee Report

2 major / 1 minor

Summary. The paper claims to introduce Vibe Calibration, an autonomous calibration system for a 112-qubit superconducting quantum processor orchestrated by LLM agents. Expert tacit knowledge is distilled into reusable 'Skills' (decision trees with parameterized commands, acceptance criteria, and audit records) via a three-phase human-in-the-loop process and fine-tuning. Key results include autonomous calibration of 108 out of 112 qubits in 4.7 hours (4-5× speedup over manual), 14/16 agreement with expert manual calibration on a 16-qubit subset, and transferability of workflows across devices.

Significance. If the empirical results hold under fuller validation, this would represent a meaningful advance in addressing calibration bottlenecks for scaling superconducting processors beyond 100 qubits. The distillation of expert judgment into auditable, reusable Skills with demonstrated speedup and partial cross-device transferability provides concrete evidence of a practical laboratory automation approach, which could reduce reliance on serial expert time.

major comments (2)

[Abstract] The headline metrics (108/112 qubits calibrated autonomously in 4.7 hours, 4–5× speedup) rest on the three-phase distillation fully encoding expert acceptance criteria, yet no details are given on measurement protocols, statistical analysis of outcomes, failure modes for the 4 unsuccessful qubits, or how criteria were validated across the full dataset.
[Abstract] Cross-validation is reported only on a 16-qubit subset (agreement on 14/16), with no independent expert audit described for the 108 autonomous results or the 4 failures; this is load-bearing for the central claim of equivalence to expert calibration on the full processor.

minor comments (1)

The term 'Skill' is presented as a novel packaging of decision trees; a brief comparison to existing automated calibration frameworks or scripted decision systems in the quantum hardware literature would improve context.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and for recognizing the potential impact of Vibe Calibration on scaling superconducting processors. We address each major comment below. We agree that the abstract and main text would benefit from expanded details on protocols, statistical methods, failure modes, and validation procedures, and we will revise the manuscript to include these.

Referee: [Abstract] The headline metrics (108/112 qubits calibrated autonomously in 4.7 hours, 4–5× speedup) rest on the three-phase distillation fully encoding expert acceptance criteria, yet no details are given on measurement protocols, statistical analysis of outcomes, failure modes for the 4 unsuccessful qubits, or how criteria were validated across the full dataset.

Authors: We agree that the abstract is concise and omits these specifics. The main text (Sections 3 and 4) describes the three-phase human-in-the-loop distillation, the structure of Skills (including parameterized commands and quantitative acceptance criteria), and the overall workflow. However, explicit discussion of measurement protocols (e.g., pulse sequences and readout settings), statistical analysis (e.g., how qubit success was determined across repeated runs), failure modes for the four qubits, and full-dataset validation is not sufficiently detailed. We will add a dedicated subsection (likely in Results or a new Methods appendix) providing these elements, including a table summarizing outcomes and failure analysis.

Revision: yes

Referee: [Abstract] Cross-validation is reported only on a 16-qubit subset (agreement on 14/16), with no independent expert audit described for the 108 autonomous results or the 4 failures; this is load-bearing for the central claim of equivalence to expert calibration on the full processor.

Authors: The 16-qubit cross-validation was designed as a direct head-to-head comparison between autonomous Skill execution and expert manual calibration on the same device. For the full 112-qubit run, equivalence is claimed via the Skills having been distilled and validated through the three-phase process with expert oversight. We acknowledge that an independent post-hoc expert audit of all 108 successful and 4 failed calibrations is not described and would strengthen the claim. We will revise the text to explicitly state the scope of expert validation performed, clarify limitations of the subset comparison, and add any available post-run audit data or metrics if they exist in our records.

Revision: partial

Circularity Check

0 steps flagged

No circularity: purely empirical hardware metrics

The paper reports measured outcomes from running an LLM-orchestrated calibration system on physical hardware (108/112 qubits completed in 4.7 h, 4–5× speedup, 14/16 subset agreement). No equations, derivations, fitted parameters, or self-citations appear in the abstract or described claims. The central results are direct experimental observations rather than any quantity computed from prior outputs by construction. The distillation process is an engineering workflow whose validity is assessed by external hardware performance, not by internal consistency with its own inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests primarily on the new methodology of distilling expert knowledge into Skills rather than on additional free parameters or unstated physical axioms.

axioms (1)

Domain assumption: Expert tacit knowledge for qubit calibration can be captured without critical loss in structured decision trees via a three-phase human-in-the-loop process.
The paper presents this distillation as the foundation enabling autonomous execution and transferability.

invented entities (1)

Skill (no independent evidence)
Purpose: Reusable decision tree packaging measurement commands, acceptance criteria, and audit records for autonomous self-healing calibration.
Core new construct introduced to replace brittle scripts.

pith-pipeline@v0.9.1-grok · 5877 in / 1323 out tokens · 34191 ms · 2026-06-26T10:40:43.934726+00:00 · methodology

Reference graph

Works this paper leans on

69 extracted references · 1 canonical work page

[1] M. Kjaergaard, M. E. Schwartz, J. Braumüller, P. Krantz, J. I.-J. Wang, S. Gustavsson, and W. D. Oliver, Superconducting qubits: Current state of play, Annual Review of Condensed Matter Physics 11, 369 (2020)

[2] Y. Kim, A. Eddins, S. Anand, K. X. Wei, E. Van Den Berg, S. Rosenblatt, H. Nayfeh, Y. Wu, M. Zaletel, K. Temme, et al., Evidence for the utility of quantum computing before fault tolerance, Nature 618, 500 (2023)

[3] F. Jin, S. Jiang, X. Zhu, Z. Bao, F. Shen, K. Wang, Z. Zhu, S. Xu, Z. Song, J. Chen, et al., Topological prethermal strong zero modes on superconducting processors, Nature 645, 626 (2025)

[4] D. Gao, D. Fan, C. Zha, J. Bei, G. Cai, J. Cai, S. Cao, F. Chen, J. Chen, K. Chen, et al., Establishing a new benchmark in quantum computational advantage with 105-qubit zuchongzhi 3.0 processor, Physical Review Letters 134, 090601 (2025)

[5] S. Bravyi, A. W. Cross, J. M. Gambetta, D. Maslov, P. Rall, and T. J. Yoder, High-threshold and low-overhead fault-tolerant quantum memory, Nature 627, 778 (2024)

[6] G. Q. AI, Quantum error correction below the surface code threshold, Nature 638, 920 (2025)

[7] T. He, W. Lin, R. Wang, Y. Li, J. Bei, J. Cai, S. Cao, D. Chen, K. Chen, X. Chen, et al., Experimental quantum error correction below the surface code threshold via all-microwave leakage suppression, Physical Review Letters 135, 260601 (2025)

[8] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. Cleland, Surface codes: Towards practical large-scale quantum computation, Physical Review A 86, 032324 (2012)

[9] J. Preskill, Quantum computing in the nisq era and beyond, Quantum 2, 79 (2018)

[11] Y. Y. Gao, M. A. Rol, S. Touzard, and C. Wang, Practical guide for building superconducting quantum devices, PRX Quantum 2, 040202 (2021)

[12] N. Wittler, F. Roy, K. Pack, M. Werninghaus, A. S. Roy, D. J. Egger, S. Filipp, F. K. Wilhelm, and S. Machnes, Integrated tool set for control, calibration, and characterization of quantum devices applied to superconducting qubits, Physical Review Applied 15, 034080 (2021)

[13] J. Kelly, R. Barends, A. Fowler, A. Megrant, E. Jeffrey, T. White, D. Sank, J. Mutus, B. Campbell, Y. Chen, et al., Scalable in situ qubit calibration during repetitive error detection, Physical Review A 94, 032321 (2016)

[14] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017)

[15] T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in Neural Information Processing Systems 33, 1877 (2020)

[16] D. A. Boiko, R. MacKnight, B. Kline, and G. Gomes, Autonomous chemical research with large language models, Nature 624, 570 (2023)

[17] N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, et al., An autonomous laboratory for the accelerated synthesis of inorganic materials, Nature 624, 86 (2023)

[18] S. Cao, Z. Zhang, M. Alghadeer, S. D. Fasciati, M. Piscitelli, M. Bakr, P. Leek, and A. Aspuru-Guzik, Agents for self-driving laboratories applied to quantum computing, arXiv preprint arXiv:2412.07978 (2024)

[19] S. Li, J. M. Miller, P. J. Lee, G. Andersson, C. R. Conner, Y. J. Joshi, B. Karimi, A. M. King, H. L. Malc, H. Mishra, et al., Large language model-assisted superconducting qubit experiments, arXiv preprint arXiv:2603.08801 (2026)

[21] NVIDIA, NVIDIA Ising Calibration 1: A 35b vision-language model for quantum calibration, https://build.nvidia.com/nvidia/ising-calibration-1-35b-a3b (2026), accessed: 2026-05-29

[22] NVIDIA, Quantum calibration agent blueprint, https://github.com/NVIDIA/Quantum-Calibration-Agent-Blueprint (2026), accessed: 2026-05-29

[23] S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory, Prentice Hall Signal Processing Series (Prentice Hall, Upper Saddle River, NJ, 1993)

[31] Moonshot AI, Kimi K2.6, https://huggingface.co/moonshotai/Kimi-K2.6 (2026)

[34] Moonshot AI, Kimi Code: Getting started with the cli, https://www.kimi.com/help/kimi-code/cli-getting-started (2026), accessed: 2026-05-19

[36] QuarkStudio Contributors, QuarkStudio: A measurement framework for quantum experiments, https://quarkstudio.readthedocs.io/en/latest/ (2026), accessed: 2026-05-28
[37] Power Rabi (home/run_powerrabi.py)

[38] SingleShot (home/run_singleshot.py)

[39] Ramsey (home/run_ramsey.py) I'll proceed with those when Spectrum finishes. * Brewed for 1m 20s * Background command "Run Spectrum on s8 (first step in pipeline)" completed (exit code 0) * Spectrum step completed. Let me check the results and proceed to Power Rabi. Read 1 file (ctrl+o to expand) * The Spectrum step completed. Let me read the full output. ...

[43] [...] Ramsey Running T1 now. * Bash(ssh s8 "bash -l -c 'cd ~/autolab && QUARK_USER=baqis uv run python home/run_t1.py'" 2>&1 | tail -40) > Load settings from /Users/baqis/autolab/quark.json Database path: /Users/baqis/autolab/home/checkpoint.db [Fig] Saved: results/characterization_20260506/T1_group0.png ... +425 lines (ctrl+o to expand) * T1 is running. ...

[45] [OK] Power Rabi -- done

[46] [OK] SingleShot -- done

[47] [OK] T1 -- done (103/108 OK across all groups)

[48] [...] Ramsey -- last step Running Ramsey now. * Bash(ssh s8 "bash -l -c 'cd ~/autolab && QUARK_USER=baqis uv run python home/run_ramsey.py'" 2>&1 | tail -40) > Running in the background (down to manage) * Ramsey is running (final step, background). Let me check progress. * Task Output(non-blocking) bnyk6r51b > (No output) * Ramsey is still running. Let me...

[49] [OK] Spectrum -- done

[50] [OK] Power Rabi -- done (0 bad fits)

[51] [OK] SingleShot -- done (median visibility ~1.6)

[52] [OK] T1 -- done (103/108 OK)

[53] [...] Ramsey -- running (final step) I'll wait for Ramsey to complete. The /loop will automatically check again in 10 minutes and proceed once done. * Cogitated for 27m 7s * Running scheduled task (May 10 10:44pm) * Checking on the Ramsey step -- it's been running in the background. * Task Output(non-blocking) bnyk6r51b > (No output) * The /loop checks in...

[54] [...] Ramsey -- still running (background task bnyk6r51b) I'll check again on the next loop cycle. * The /loop dynamic mode isn't available, but I'm still in the session. Let me check the Ramsey task directly. * Task Output(non-blocking) bnyk6r51b > (No output) * The /loop fired again (every 10m, job 40560d9d). Ramsey is still running -- the background ta...

[55] [...] Ramsey -- still in progress (task bnyk6r51b) * Churned for 45s * Background command "Run Ramsey on s8 (final step)" completed (exit code 0) * Ramsey step completed! Let me check the results. Searched for 1 pattern (ctrl+o to expand) * All 5 characterization steps completed. Full pipeline summary: Step: Spectrum Status: [OK] Done Results: All 4 gr...

[56] [OK] Spectrum -- all 4 groups done

[57] [OK] Power Rabi -- all 4 groups done (0 bad fits)

[58] [OK] SingleShot -- all 4 groups done

[59] [OK] T1 -- 103/108 OK

[60] bash -l -c 'cd ~/autolab && QUARK_USER=baqis uv run python -c \
[OK] Ramsey -- Groups 1-3 done (83/83 OK), Group 0 fit failures No remaining steps. The only incomplete item is Group 0 Ramsey (0/25 fit) -- want me to re-run that group with different parameters (e.g. shorter delay_max or more repeats)? * Baked for 12s * Running scheduled task (May 10 11:01pm) * Pipeline already fully complete. No remaining steps: +--...
[61] Qwen Team, Qwen3.6-35B-A3B: Agentic coding power, now open to all (2026)

[62] Qwen Team, Qwen3.5: Towards native multimodal agents (2026)

[63] Kimi Team, Kimi K2: Open agentic intelligence (2025), arXiv:2507.20534 [cs.LG]

[64] Kimi Team, Kimi K2.5: Visual agentic intelligence (2026), arXiv:2602.02276 [cs.CL]

[65] Moonshot AI, Kimi K2.6, https://huggingface.co/moonshotai/Kimi-K2.6 (2026)

[66] Moonshot AI, Kimi Code: Getting started with the cli, https://www.kimi.com/help/kimi-code/cli-getting-started (2026), accessed: 2026-05-19

[67] W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. Gonzalez, H. Zhang, and I. Stoica, in Proceedings of the 29th Symposium on Operating Systems Principles, SOSP ’23 (Association for Computing Machinery, New York, NY, USA, 2023) p. 611–626

[68] Anthropic, Claude Code, https://www.claude.com/product/claude-code (2025), accessed: 2026-05-19

[69] T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, and T. Scialom, in Advances in Neural Information Processing Systems (NeurIPS) (2023) arXiv:2302.04761

[70] Qwen Team, Qwen3.6 model family, Model card, https://huggingface.co/Qwen/Qwen3.6-27B (2026), accessed: 2026-05-19

[71] DeepSeek-AI, Deepseek-v4: Towards highly efficient million-token context intelligence (2026), arXiv:2606.19348 [cs.CL]

[72] Z. Allen-Zhu and Y. Li, in Proceedings of the International Conference on Machine Learning (ICML) (2024) arXiv:2309.14316

[73] P. Maini, S. Seto, H. Bai, D. Grangier, Y. Zhang, and N. Jaitly, Rephrasing the web: A recipe for compute and data-efficient language modeling (2024), ICLR Workshop on Data Problems for Foundation Models (DPFM), arXiv:2401.16380 [cs.CL]

[74] J. Gao, Z.-Y. Lu, P. Zhang, and Z.-F. Gao, Strategic over-parameterization for generalizable low-rank adaptation (2026), arXiv:2605.16470 [cs.LG]

[75] Anthropic, Equipping agents for the real world with agent skills, https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills (2025), accessed: 2026-05-19

[76] Anthropic, Agent skills, https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview (2026), accessed: 2026-05-19

[77] J. Koch, T. M. Yu, J. Gambetta, A. A. Houck, D. I. Schuster, J. Majer, A. Blais, M. H. Devoret, S. M. Girvin, and R. J. Schoelkopf, Physical Review A 76, 042319 (2007), arXiv:cond-mat/0703002

[78] P. Krantz, M. Kjaergaard, F. Yan, T. P. Orlando, S. Gustavsson, and W. D. Oliver, Applied Physics Reviews 6, 10.1063/1.5089550 (2019)

[79] A. Blais, A. L. Grimsmo, S. M. Girvin, and A. Wallraff, Reviews of Modern Physics 93, 025005 (2021)

[80] M. Boissonneault, J. M. Gambetta, and A. Blais, Physical Review A 79, 013819 (2009), arXiv:0810.1336 [quant-ph]

[81] M. S. Khalil, M. J. A. Stoutimore, F. C. Wellstood, and K. D. Osborn, Journal of Applied Physics 111, 054510 (2012), arXiv:1108.3117 [physics.ins-det]

[82] S. Probst, F. B. Song, P. A. Bushev, A. V. Ustinov, and M. Weides, Review of Scientific Instruments 86, 024706 (2015), arXiv:1410.3365 [physics.ins-det]

[83] S. Cao, Z. Zhang, A. Agarwal, G. Bratrud, N. R. Beysengulov, D. C. Cole, A. G. Frieiro, E. O. Glen, H. Hsu, G. Huang, R. Jow, G. Shaji, T. Lubowe, L. Zhu, L. M. Calderón, N. Pancotti, J. Pendleton, B. Severin, C. E. Staub, S. Sussman, A. Vepsäläinen, N. R. Vora, Y. Xu, V. Bernales, D. Bowring, E. Kyoseva, I. Rungger, G. Semeghini, S. Stanwyck, T. Cos...

[84] NVIDIA, NVIDIA Ising Calibration 1: A 35b vision-language model for quantum calibration, https://build.nvidia.com/nvidia/ising-calibration-1-35b-a3b (2026), accessed: 2026-05-29

[85] NVIDIA, Quantum calibration agent blueprint, https://github.com/NVIDIA/Quantum-Calibration-Agent-Blueprint (2026), accessed: 2026-05-29

[1] [1]

Kjaergaard, M

M. Kjaergaard, M. E. Schwartz, J. Braumüller, P. Krantz, J. I.-J. Wang, S. Gustavsson, and W. D. Oliver, Superconducting qubits: Current state of play, Annual Review of Condensed Matter Physics 11, 369 (2020)

2020

[2] [2]

Y. Kim, A. Eddins, S. Anand, K. X. Wei, E. Van Den Berg, S. Rosenblatt, H. Nayfeh, Y. Wu, M. Zale- tel, K. Temme, et al., Evidence for the utility of quan- tum computing before fault tolerance, Nature618, 500 (2023)

2023

[3] [3]

F. Jin, S. Jiang, X. Zhu, Z. Bao, F. Shen, K. Wang, Z. Zhu, S. Xu, Z. Song, J. Chen, et al. , Topological prethermal strong zero modes on superconducting pro- 9 cessors, Nature645, 626 (2025)

2025

[4] [4]

D. Gao, D. Fan, C. Zha, J. Bei, G. Cai, J. Cai, S. Cao, F. Chen, J. Chen, K. Chen, et al. , Establishing a new benchmark in quantum computational advantage with 105-qubit zuchongzhi 3.0 processor, Physical Review Let- ters 134, 090601 (2025)

2025

[5] [5]

Bravyi, A

S. Bravyi, A. W. Cross, J. M. Gambetta, D. Maslov, P. Rall, and T. J. Yoder, High-threshold and low- overhead fault-tolerant quantum memory, Nature627, 778 (2024)

2024

[6] [6]

G. Q. AI, Quantum error correction below the surface code threshold, Nature638, 920 (2025)

2025

[7] [7]

T. He, W. Lin, R. Wang, Y. Li, J. Bei, J. Cai, S. Cao, D. Chen, K. Chen, X. Chen,et al., Experimental quan- tum error correction below the surface code threshold via all-microwave leakage suppression, Physical Review Let- ters 135, 260601 (2025)

2025

[8] [8]

A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. Cleland, Surface codes: Towards practical large-scale quantum computation, Physical Review A 86, 032324 (2012)

2012

[9] [9]

Preskill, Quantum computing in the nisq era and be- yond, Quantum2, 79 (2018)

J. Preskill, Quantum computing in the nisq era and be- yond, Quantum2, 79 (2018)

2018

[10] [11]

Y. Y. Gao, M. A. Rol, S. Touzard, and C. Wang, Practi- cal guide for building superconducting quantum devices, PRX Quantum2, 040202 (2021)

2021

[11] [12]

N. Wittler, F. Roy, K. Pack, M. Werninghaus, A. S. Roy, D. J. Egger, S. Filipp, F. K. Wilhelm, and S. Machnes, Integrated tool set for control, calibration, and characterization of quantum devices applied to superconducting qubits, Physical Review Applied 15, 034080 (2021)

[12] [13]

J. Kelly, R. Barends, A. Fowler, A. Megrant, E. Jeffrey, T. White, D. Sank, J. Mutus, B. Campbell, Y. Chen, et al., Scalable in situ qubit calibration during repetitive error detection, Physical Review A 94, 032321 (2016)

[13] [14]

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, Attention is all you need, Advances in Neural Information Processing Systems 30 (2017)

[14] [15]

T. Brown, B. Mann, N. Ryder, M. Subbiah, J. D. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al., Language models are few-shot learners, Advances in Neural Information Processing Systems 33, 1877 (2020)

[15] [16]

D. A. Boiko, R. MacKnight, B. Kline, and G. Gomes, Autonomous chemical research with large language models, Nature 624, 570 (2023)

[16] [17]

N. J. Szymanski, B. Rendy, Y. Fei, R. E. Kumar, T. He, D. Milsted, M. J. McDermott, M. Gallant, E. D. Cubuk, A. Merchant, et al., An autonomous laboratory for the accelerated synthesis of inorganic materials, Nature 624, 86 (2023)

[17] [18]

S. Cao, Z. Zhang, M. Alghadeer, S. D. Fasciati, M. Piscitelli, M. Bakr, P. Leek, and A. Aspuru-Guzik, Agents for self-driving laboratories applied to quantum computing, arXiv preprint arXiv:2412.07978 (2024)

[18] [19]

S. Li, J. M. Miller, P. J. Lee, G. Andersson, C. R. Conner, Y. J. Joshi, B. Karimi, A. M. King, H. L. Malc, H. Mishra, et al., Large language model-assisted superconducting qubit experiments, arXiv preprint arXiv:2603.08801 (2026)

[19] [21]

NVIDIA, NVIDIA Ising Calibration 1: A 35b vision-language model for quantum calibration, https://build.nvidia.com/nvidia/ising-calibration-1-35b-a3b (2026), accessed: 2026-05-29

[20] [22]

NVIDIA, Quantum calibration agent blueprint, https://github.com/NVIDIA/Quantum-Calibration-Agent-Blueprint (2026), accessed: 2026-05-29

[21] [23]

S. M. Kay, Fundamentals of Statistical Signal Processing, Volume I: Estimation Theory, Prentice Hall Signal Processing Series (Prentice Hall, Upper Saddle River, NJ, 1993)

[22] [31]

Moonshot AI, Kimi K2.6, https://huggingface.co/moonshotai/Kimi-K2.6 (2026)

[23] [34]

Moonshot AI, Kimi Code: Getting started with the cli, https://www.kimi.com/help/kimi-code/cli-getting-started (2026), accessed: 2026-05-19

[24] [36]

QuarkStudio Contributors, QuarkStudio: A measurement framework for quantum experiments, https://quarkstudio.readthedocs.io/en/latest/ (2026), accessed: 2026-05-28

Supplemental Material for “Vibe Calibration: Autonomous Bring-up of a 112-Qubit Superconducting Quantum Processor by a Skill-Orchestrating Language Agent” Huikai Xu,1, ∗ Jiaxiu Han,1, ∗...

[25] [37]

Power Rabi (home/run_powerrabi.py)

[26] [38]

SingleShot (home/run_singleshot.py)

[27] [39]

Ramsey (home/run_ramsey.py) I’ll proceed with those when Spectrum finishes.
* Brewed for 1m 20s
* Background command "Run Spectrum on s8 (first step in pipeline)" completed (exit code 0)
* Spectrum step completed. Let me check the results and proceed to Power Rabi. Read 1 file (ctrl+o to expand)
* The Spectrum step completed. Let me read the full output. ...

[28] [43]

[...] Ramsey Running T1 now.
* Bash(ssh s8 "bash -l -c 'cd ~/autolab && QUARK_USER=baqis uv run python home/run_t1.py'" 2>&1 | tail -40)
> Load settings from /Users/baqis/autolab/quark.json Database path: /Users/baqis/autolab/home/checkpoint.db [Fig] Saved: results/characterization_20260506/T1_group0.png ... +425 lines (ctrl+o to expand)
* T1 is running. ...

[29] [45]

[OK] Power Rabi -- done

[30] [46]

[OK] SingleShot -- done

[31] [47]

[OK] T1 -- done (103/108 OK across all groups)

[32] [48]

[...] Ramsey -- last step Running Ramsey now.
* Bash(ssh s8 "bash -l -c 'cd ~/autolab && QUARK_USER=baqis uv run python home/run_ramsey.py'" 2>&1 | tail -40)
> Running in the background (down to manage)
* Ramsey is running (final step, background). Let me check progress.
* Task Output(non-blocking) bnyk6r51b
> (No output)
* Ramsey is still running. Let me...

[33] [49]

[OK] Spectrum -- done

[34] [50]

[OK] Power Rabi -- done (0 bad fits)

[35] [51]

[OK] SingleShot -- done (median visibility ~1.6)

[36] [52]

[OK] T1 -- done (103/108 OK)

[37] [53]

[...] Ramsey -- running (final step) I’ll wait for Ramsey to complete. The /loop will automatically check again in 10 minutes and proceed once done.
* Cogitated for 27m 7s
* Running scheduled task (May 10 10:44pm)
* Checking on the Ramsey step -- it’s been running in the background.
* Task Output(non-blocking) bnyk6r51b
> (No output)
* The /loop checks in...

[38] [54]

[...] Ramsey -- still running (background task bnyk6r51b) I’ll check again on the next loop cycle.
* The /loop dynamic mode isn’t available, but I’m still in the session. Let me check the Ramsey task directly.
* Task Output(non-blocking) bnyk6r51b
> (No output)
* The /loop fired again (every 10m, job 40560d9d). Ramsey is still running -- the background ta...

[39] [55]

[...] Ramsey -- still in progress (task bnyk6r51b)
* Churned for 45s
* Background command "Run Ramsey on s8 (final step)" completed (exit code 0)
* Ramsey step completed! Let me check the results. Searched for 1 pattern (ctrl+o to expand)
* All 5 characterization steps completed. Full pipeline summary: Step: Spectrum Status: [OK] Done Results: All 4 gr...

[40] [56]

[OK] Spectrum -- all 4 groups done

[41] [57]

[OK] Power Rabi -- all 4 groups done (0 bad fits)

[42] [58]

[OK] SingleShot -- all 4 groups done

[43] [59]

[OK] T1 -- 103/108 OK

[44] [60]

bash -l -c 'cd ~/autolab && QUARK_USER=baqis uv run python -c \

[OK] Ramsey -- Groups 1-3 done (83/83 OK), Group 0 fit failures No remaining steps. The only incomplete item is Group 0 Ramsey (0/25 fit) -- want me to re-run that group with different parameters (e.g. shorter delay_max or more repeats)?
* Baked for 12s
* Running scheduled task (May 10 11:01pm)
* Pipeline already fully complete. No remaining steps: +--...

[45] [61]

Qwen Team, Qwen3.6-35B-A3B: Agentic coding power, now open to all (2026)

[46] [62]

Qwen Team, Qwen3.5: Towards native multimodal agents (2026)

[47] [63]

Kimi Team, Kimi K2: Open agentic intelligence (2025), arXiv:2507.20534 [cs.LG]

[48] [64]

Kimi Team, Kimi K2.5: Visual agentic intelligence (2026), arXiv:2602.02276 [cs.CL]

[49] [65]

Moonshot AI, Kimi K2.6, https://huggingface.co/moonshotai/Kimi-K2.6 (2026)

[50] [66]

Moonshot AI, Kimi Code: Getting started with the cli, https://www.kimi.com/help/kimi-code/cli-getting-started (2026), accessed: 2026-05-19

[51] [67]

W. Kwon, Z. Li, S. Zhuang, Y. Sheng, L. Zheng, C. H. Yu, J. Gonzalez, H. Zhang, and I. Stoica, in Proceedings of the 29th Symposium on Operating Systems Principles, SOSP ’23 (Association for Computing Machinery, New York, NY, USA, 2023) p. 611–626

[52] [68]

Anthropic, Claude Code, https://www.claude.com/product/claude-code (2025), accessed: 2026-05-19

[53] [69]

T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, L. Zettlemoyer, N. Cancedda, and T. Scialom, in Advances in Neural Information Processing Systems (NeurIPS) (2023) arXiv:2302.04761

[54] [70]

Qwen Team, Qwen3.6 model family, Model card, https://huggingface.co/Qwen/Qwen3.6-27B (2026), accessed: 2026-05-19

[55] [71]

DeepSeek-AI, Deepseek-v4: Towards highly efficient million-token context intelligence (2026), arXiv:2606.19348 [cs.CL]

[56] [72]

Z. Allen-Zhu and Y. Li, in Proceedings of the International Conference on Machine Learning (ICML) (2024) arXiv:2309.14316

[57] [73]

P. Maini, S. Seto, H. Bai, D. Grangier, Y. Zhang, and N. Jaitly, Rephrasing the web: A recipe for compute and data-efficient language modeling (2024), ICLR Workshop on Data Problems for Foundation Models (DPFM), arXiv:2401.16380 [cs.CL]

[58] [74]

J. Gao, Z.-Y. Lu, P. Zhang, and Z.-F. Gao, Strategic over-parameterization for generalizable low-rank adaptation (2026), arXiv:2605.16470 [cs.LG]

[59] [75]

Anthropic, Equipping agents for the real world with agent skills, https://www.anthropic.com/engineering/equipping-agents-for-the-real-world-with-agent-skills (2025), accessed: 2026-05-19

[60] [76]

Anthropic, Agent skills, https://platform.claude.com/docs/en/agents-and-tools/agent-skills/overview (2026), accessed: 2026-05-19

[61] [77]

J. Koch, T. M. Yu, J. Gambetta, A. A. Houck, D. I. Schuster, J. Majer, A. Blais, M. H. Devoret, S. M. Girvin, and R. J. Schoelkopf, Physical Review A76, 042319 (2007), arXiv:cond-mat/0703002

[62] [78]

P. Krantz, M. Kjaergaard, F. Yan, T. P. Orlando, S. Gustavsson, and W. D. Oliver, Applied Physics Reviews 6, 10.1063/1.5089550 (2019)

[63] [79]

A. Blais, A. L. Grimsmo, S. M. Girvin, and A. Wallraff, Reviews of Modern Physics 93, 025005 (2021)

[64] [80]

M. Boissonneault, J. M. Gambetta, and A. Blais, Physical Review A 79, 013819 (2009), arXiv:0810.1336 [quant-ph]

[65] [81]

M. S. Khalil, M. J. A. Stoutimore, F. C. Wellstood, and K. D. Osborn, Journal of Applied Physics 111, 054510 (2012), arXiv:1108.3117 [physics.ins-det]

[66] [82]

S. Probst, F. B. Song, P. A. Bushev, A. V. Ustinov, and M. Weides, Review of Scientific Instruments 86, 024706 (2015), arXiv:1410.3365 [physics.ins-det]

[67] [83]

S. Cao, Z. Zhang, A. Agarwal, G. Bratrud, N. R. Beysengulov, D. C. Cole, A. G. Frieiro, E. O. Glen, H. Hsu, G. Huang, R. Jow, G. Shaji, T. Lubowe, L. Zhu, L. M. Calderón, N. Pancotti, J. Pendleton, B. Severin, C. E. Staub, S. Sussman, A. Vepsäläinen, N. R. Vora, Y. Xu, V. Bernales, D. Bowring, E. Kyoseva, I. Rungger, G. Semeghini, S. Stanwyck, T. Cos...

[68] [84]

NVIDIA, NVIDIA Ising Calibration 1: A 35b vision-language model for quantum calibration, https://build.nvidia.com/nvidia/ising-calibration-1-35b-a3b (2026), accessed: 2026-05-29

[69] [85]

NVIDIA, Quantum calibration agent blueprint, https://github.com/NVIDIA/Quantum-Calibration-Agent-Blueprint (2026), accessed: 2026-05-29
