Coherent Swap Regret and Channel-Proof Learning

Sohail Sarkar

arxiv: 2606.02655 · v1 · pith:E4PWNVEDnew · submitted 2026-06-01 · 🪐 quant-ph · cs.GT· cs.LG· math.OC

Coherent Swap Regret and Channel-Proof Learning

Sohail Sarkar This is my paper

Pith reviewed 2026-06-28 14:25 UTC · model grok-4.3

classification 🪐 quant-ph cs.GTcs.LGmath.OC

keywords coherent swap regretCPTP mapsquantum gamesregret minimizationquantum correlated equilibriumchannel deviationsmirror ascent

0 comments

The pith

Coherent swap regret against local CPTP maps in quantum games is set by non-unital measurement-and-preparation channels at rate Ω(√(dT log d)).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that external regret fails to capture a natural physical deviation in quantum games: a player can apply any local completely positive trace-preserving map to the state it actually received. Coherent swap regret is introduced as the appropriate benchmark against all such maps. An algorithm based on entropic mirror ascent over the CPTP Choi slice with a fixed-point play rule achieves O(√(dT log d)) regret. The analysis partitions deviations into three classes whose minimax rates differ sharply: replacement channels recover ordinary external regret, unital channels including all unitaries admit zero regret, and deterministic measurement-and-preparation channels already force the full Ω(√(dT log d)) lower bound that also upper-bounds every CPTP map. The result implies that hardness stems from non-unital use of the recommendation register rather than from coherence.

Core claim

Coherent swap regret measures performance against all local CPTP maps applied to the received or prepared state. Entropic mirror ascent on the CPTP Choi slice together with a fixed-point play rule attains O(√(dT log d)) coherent swap regret. Replacement channels recover the classical Θ(√(T log d)) external-regret rate, every unital channel has minimax regret zero, and deterministic measurement-and-preparation channels already require Ω(√(dT log d)) regret in the moderate-horizon regime; this same rate is tight for the entire class of CPTP maps.

What carries the argument

entropic mirror ascent on the CPTP Choi slice with a fixed-point play rule, which produces the O(√(dT log d)) upper bound on coherent swap regret against arbitrary local CPTP maps.

If this is right

Replacement channels recover ordinary external regret at rate Θ(√(T log d)).
Unital channels, including unitary deviations and mixtures of unitaries, have zero minimax regret.
Deterministic measurement-and-preparation channels force Ω(√(dT log d)) regret, and this rate suffices for all CPTP deviations.
Decentralized full-information learning reaches an ε-approximate separable quantum correlated equilibrium after T = O(max_i d_i log d_i / ε²) rounds.
An SDP audit certifies local CPTP exploitability of mediated quantum recommendation protocols for arbitrary finite-dimensional states.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The clean separation between unital and non-unital maps indicates that preserving the maximally mixed state is what makes a deviation easy to defend against.
The probing-bandit extension with pseudo-regret O(d^{4/3} T^{2/3} (log d)^{1/3}) under Haar-random probes may extend to other quantum online-learning settings that use random pure-state measurements.
Identifying equilibria with channel-proofness of mediated protocols suggests that experimental verification of approximate quantum correlated equilibria could be performed by checking SDP conditions on observed statistics alone.

Load-bearing premise

The fixed-point play rule combined with entropic mirror ascent on the CPTP Choi slice produces the claimed O(√(dT log d)) bound against arbitrary local CPTP maps.

What would settle it

A calculation or finite-horizon simulation showing that regret against some deterministic measurement-and-preparation channel exceeds the Ω(√(dT log d)) lower bound, or that the stated algorithm fails to keep regret O(√(dT log d)) against a general local CPTP map.

read the original abstract

External regret certifies stability only against replacing one's behavior by a fixed alternative. In a quantum game, this misses a natural physical move: a player can apply a local completely positive trace-preserving (CPTP) map to the state it actually received or prepared. We introduce coherent swap regret as the regret benchmark against all such local CPTP deviations, and give an algorithm achieving $O(\sqrt{dT\log d})$ coherent swap regret via entropic mirror ascent on the CPTP Choi slice with a fixed-point play rule. The main result is a three-level deviation-class landscape. Replacement channels recover ordinary external regret at rate $\Theta(\sqrt{T\log d})$. Unital channels, including unitary deviations and mixtures of unitaries, have zero minimax regret. Deterministic measurement-and-preparation channels already force $\Omega(\sqrt{dT\log d})$ regret in the moderate-horizon regime, and this rate is also sufficient for all CPTP deviations. Thus the hardness comes from non-unital use of the recommendation register, not from quantum coherence alone. As an application, decentralized full-information learning in finite quantum games reaches an $\varepsilon$-approximate separable quantum correlated equilibrium after $T=O(\max_i d_i\log d_i/\varepsilon^2)$ rounds. We identify these equilibria with channel-proofness of mediated quantum recommendation protocols, give an SDP audit for local CPTP exploitability applicable to arbitrary finite-dimensional states, and include a probing-bandit extension with pseudo-regret $O(d^{4/3}T^{2/3}(\log d)^{1/3})$ under Haar-random pure-state probes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces coherent swap regret against local CPTP maps and claims a three-level channel-class landscape with matching sqrt rates, but the upper bound has no derivation in the text.

read the letter

The main thing to know is that this paper defines coherent swap regret as the benchmark against local CPTP deviations in quantum games and lays out a three-level minimax landscape: ordinary replacement channels recover external regret at Theta(sqrt(T log d)), unital channels including unitaries have zero minimax regret, and deterministic measurement-preparation channels already force Omega(sqrt(dT log d)) in moderate horizons, with that rate also sufficient for all CPTP maps. The hardness is pinned on non-unital use of the recommendation register rather than coherence itself.

What is new is the coherent swap regret definition and the explicit separation by channel class. The algorithm via entropic mirror ascent on the CPTP Choi slice plus fixed-point play rule is concrete, and the application to decentralized learning reaching epsilon-approximate separable quantum correlated equilibrium after T = O(max d_i log d_i / epsilon^2) rounds is a clear use case. The SDP audit for local CPTP exploitability and the probing-bandit pseudo-regret bound are also useful additions.

The paper organizes the material cleanly and connects the regret notions to physical channel classes without obvious circularity in the claims. The lower bound statement for measurement-preparation channels is stated directly.

The soft spot is that the abstract asserts the O(sqrt(dT log d)) upper bound from the fixed-point rule and mirror ascent but supplies no potential function, regret decomposition, or handling of the non-unital constraint, so it is impossible to check whether the analysis closes or whether hidden dependencies on horizon or unitality appear. The same holds for the lower bound details. This is the central claim, so the gap matters.

This is for people working in quantum game theory and regret minimization who want to incorporate physical channel constraints. A reader focused on decentralized quantum learning or channel-proof equilibria would get value from the landscape and the SDP. It deserves a serious referee because the idea is original and the claims are specific enough to verify once the proofs are supplied.

Referee Report

1 major / 0 minor

Summary. The paper introduces coherent swap regret as the benchmark against local CPTP deviations in quantum games (beyond ordinary external regret against fixed replacements). It claims a three-level deviation landscape: replacement channels recover Θ(√(T log d)) external regret; unital channels (including unitaries) admit zero minimax regret; deterministic measurement-and-preparation channels force Ω(√(dT log d)) regret in the moderate-horizon regime, and this rate is also sufficient against arbitrary local CPTP maps. The upper bound is achieved by entropic mirror ascent on the CPTP Choi slice together with a fixed-point play rule. Applications include decentralized convergence to ε-approximate separable quantum correlated equilibrium in T = O(max_i d_i log d_i / ε²) rounds, an SDP audit for local CPTP exploitability, and a probing-bandit extension with pseudo-regret O(d^{4/3} T^{2/3} (log d)^{1/3}).

Significance. If the claimed rates and algorithm are correct, the work would clarify that hardness in quantum online learning arises from non-unital use of the recommendation register rather than from coherence itself, supplying a structured taxonomy of deviation classes and a concrete link between regret minimization and channel-proof mediated protocols. The SDP audit and equilibrium application could be practically useful for verifying exploitability in finite-dimensional quantum games.

major comments (1)

[Abstract] Abstract (and main result statement): the sufficiency claim that entropic mirror ascent on the CPTP Choi slice plus the fixed-point play rule yields O(√(dT log d)) coherent swap regret against arbitrary local CPTP maps is asserted without exhibiting the potential function, the regret decomposition, or the handling of the non-unital recommendation-register constraint. Consequently the analysis cannot be verified to close or to be free of hidden horizon or unitality dependence.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and for identifying an opportunity to improve the clarity and verifiability of the main result statement. We address the comment below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract (and main result statement): the sufficiency claim that entropic mirror ascent on the CPTP Choi slice plus the fixed-point play rule yields O(√(dT log d)) coherent swap regret against arbitrary local CPTP maps is asserted without exhibiting the potential function, the regret decomposition, or the handling of the non-unital recommendation-register constraint. Consequently the analysis cannot be verified to close or to be free of hidden horizon or unitality dependence.

Authors: The abstract is a concise summary; the full analysis appears in the body. The potential function is the negative von Neumann entropy on the Choi operator of the CPTP map (Section 3). The regret decomposition proceeds by applying the standard entropic mirror-descent inequality on the CPTP slice and using the fixed-point play rule to absorb the non-unital action on the recommendation register without introducing extra T-factors or unitality assumptions (proof of Theorem 3.1). The resulting bound is O(√(dT log d)) with no hidden horizon dependence. We agree the abstract should better signpost these elements and will add a one-sentence outline of the key steps together with a pointer to the relevant theorem and section. revision: yes

Circularity Check

0 steps flagged

No circularity; bound asserted without exhibited derivation steps or self-referential reductions

full rationale

The abstract states that an algorithm achieves O(√(dT log d)) coherent swap regret via entropic mirror ascent on the CPTP Choi slice with a fixed-point play rule, and that this rate is sufficient for all CPTP deviations while deterministic measurement-and-preparation channels force Ω(√(dT log d)). No equations, potential functions, regret decompositions, or self-citations appear in the provided text that would reduce these claims to inputs by construction (e.g., no fitted parameters renamed as predictions, no uniqueness theorems imported from prior author work, and no ansatz smuggled via citation). The three-level landscape is presented as derived from channel classes without any load-bearing step that collapses to a self-definition or fitted input. This matches the default expectation of no significant circularity when no reduction can be quoted.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Only abstract available; ledger populated from stated concepts. No explicit free parameters or invented entities listed. Relies on standard quantum information axioms (CPTP maps, Choi representation) and game-theoretic regret definitions.

axioms (2)

domain assumption Local CPTP maps are the natural physical deviations a player can apply to its received or prepared state.
Abstract opens with this as the motivation for coherent swap regret.
standard math Finite-dimensional quantum systems with dimension d.
Bounds and rates are stated in terms of d.

pith-pipeline@v0.9.1-grok · 5822 in / 1386 out tokens · 23084 ms · 2026-06-28T14:25:27.942122+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

22 extracted references · 5 canonical work pages

[1]

Online learning of quantum states

Scott Aaronson, Xinyi Chen, Elad Hazan, Satyen Kale, and Ashwin Nayak. Online learning of quantum states. Advances in Neural Information Processing Systems, 31, 2018.https://arxiv.org/abs/1802.09025

arXiv 2018
[2]

Robert J. Aumann. Subjectivity and correlation in randomized strategies.Journal of Mathematical Economics 1(1):67–96, 1974.https://doi.org/10.1016/0304-4068(74)90037-8

work page doi:10.1016/0304-4068(74)90037-8 1974
[3]

Online learning of a panoply of quantum objects.Quantum Machine Intelligence7, article 78, 2025.https://arxiv.org/abs/2406.04245

Akshay Bansal, Ian George, Soumik Ghosh, Jamie Sikora, and Alice Zheng. Online learning of a panoply of quantum objects.Quantum Machine Intelligence7, article 78, 2025.https://arxiv.org/abs/2406.04245

arXiv 2025
[4]

From external to internal regret.Journal of Machine Learning Research 8:1307–1324, 2007.https://www.jmlr.org/papers/v8/blum07a.html

Avrim Blum and Yishay Mansour. From external to internal regret.Journal of Machine Learning Research 8:1307–1324, 2007.https://www.jmlr.org/papers/v8/blum07a.html

2007
[5]

Quantum game theory and the complexity of approximating quantum Nash equilibria.Quantum6:882, 2022.https://doi.org/10.22331/q-2022-12-22-882

John Bostanci and John Watrous. Quantum game theory and the complexity of approximating quantum Nash equilibria.Quantum6:882, 2022.https://doi.org/10.22331/q-2022-12-22-882

work page doi:10.22331/q-2022-12-22-882 2022
[6]

Eric A. Carlen. Trace inequalities and quantum entropy: an introductory course. InEntropy and the Quantum, Contemporary Mathematics 529, pages 73–140. American Mathematical Society, 2010.https://doi.org/10. 1090/conm/529/10428

2010
[7]

Quantum correlated equilibria in classical complete information games

Alan Deckelbaum. Quantum correlated equilibria in classical complete information games. arXiv:1101.3380, 2011.https://arxiv.org/abs/1101.3380

Pith/arXiv arXiv 2011
[8]

From external to swap regret 2.0: An efficient reduction for large action spaces

Yuval Dagan, Constantinos Daskalakis, Maxwell Fishelson, and Noah Golowich. From external to swap regret 2.0: An efficient reduction for large action spaces. InProceedings of the 56th Annual ACM Symposium on Theory of Computing, 2024.https://arxiv.org/abs/2310.19786

arXiv 2024
[9]

Instance-optimal matrix multiplicative weight update and its quantum applications

Weiyuan Gong, Tongyang Li, Xinzhao Wang, and Zhiyu Zhang. Instance-optimal matrix multiplicative weight update and its quantum applications. arXiv:2509.08911, 2025.https://arxiv.org/abs/2509.08911

arXiv 2025
[10]

Gordon, Amy Greenwald, and Casey Marks

Geoffrey J. Gordon, Amy Greenwald, and Casey Marks. No-regret learning in convex games. InProceedings of the 25th International Conference on Machine Learning, pages 360–367, 2008.https://www.cs.cmu.edu/ ~ggordon/gordon-greenwald-marks-icml-phi-regret.pdf

2008
[11]

Toward a general theory of quantum games

Gus Gutoski and John Watrous. Toward a general theory of quantum games. InProceedings of the 39th ACM Symposium on Theory of Computing, pages 565–574, 2007.https://arxiv.org/abs/quant-ph/0611234

Pith/arXiv arXiv 2007
[12]

A simple adaptive procedure leading to correlated equilibrium.Econo- metrica68(5):1127–1150, 2000.https://doi.org/10.1111/1468-0262.00153

Sergiu Hart and Andreu Mas-Colell. A simple adaptive procedure leading to correlated equilibrium.Econo- metrica68(5):1127–1150, 2000.https://doi.org/10.1111/1468-0262.00153

work page doi:10.1111/1468-0262.00153 2000
[13]

A tight lower bound and efficient reduction for swap regret

Shinji Ito. A tight lower bound and efficient reduction for swap regret. InAdvances in Neural Infor- mation Processing Systems, 33:18550–18559, 2020.https://proceedings.neurips.cc/paper/2020/hash/ d79c8788088c2193f0244d8f1f36d2db-Abstract.html

2020
[14]

Parallel approximation of non-interactive zero-sum quantum games

Rahul Jain and John Watrous. Parallel approximation of non-interactive zero-sum quantum games. In Proceedings of the 24th IEEE Conference on Computational Complexity, pages 243–253, 2009.https: //arxiv.org/abs/0808.2775

Pith/arXiv arXiv 2009
[15]

Payoff-based learning with matrix multiplicative weights in quantum games.Advances in Neural Information Processing Systems36, 2023.https://arxiv.org/abs/2311.02423

Kyriakos Lotidis, Panayotis Mertikopoulos, Nicholas Bambos, and Jose Blanchet. Payoff-based learning with matrix multiplicative weights in quantum games.Advances in Neural Information Processing Systems36, 2023.https://arxiv.org/abs/2311.02423

arXiv 2023
[16]

No-regret learning and equilibrium computation in quantum games.Quantum8:1569, 2024.https://doi.org/10.22331/q-2024-12-17-1569

Wayne Lin, Georgios Piliouras, Ryann Sim, and Antonios Varvitsiotis. No-regret learning and equilibrium computation in quantum games.Quantum8:1569, 2024.https://doi.org/10.22331/q-2024-12-17-1569

work page doi:10.22331/q-2024-12-17-1569 2024
[17]

Caro, Jens Eisert, and Sumeet Khatri

Asad Raza, Matthias C. Caro, Jens Eisert, and Sumeet Khatri. Online learning of quantum processes. arXiv:2406.04250, 2024.https://arxiv.org/abs/2406.04250

arXiv 2024
[18]

Full characterization of quantum correlated equilibria.Quantum Informa- tion and Computation13(9–10):846–860, 2013.https://arxiv.org/abs/1105.5353

Zhaohui Wei and Shengyu Zhang. Full characterization of quantum correlated equilibria.Quantum Informa- tion and Computation13(9–10):846–860, 2013.https://arxiv.org/abs/1105.5353

Pith/arXiv arXiv 2013
[19]

Quantum game players can have advantage without discord.Information and Computation256:174–181, 2017.https://arxiv.org/abs/1502.00207

Zhaohui Wei and Shengyu Zhang. Quantum game players can have advantage without discord.Information and Computation256:174–181, 2017.https://arxiv.org/abs/1502.00207

Pith/arXiv arXiv 2017
[20]

Quantum strategic game theory

Shengyu Zhang. Quantum strategic game theory. InProceedings of the 3rd Innovations in Theoretical Com- puter Science Conference, pages 39–59, 2012.https://arxiv.org/abs/1012.5141

Pith/arXiv arXiv 2012
[21]

Francisca Vasconcelos, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos, Georgios Pil- iouras, and Michael I. Jordan. A quadratic speedup in finding Nash equilibria of quantum zero-sum games. Quantum9:1737, 2025.https://doi.org/10.22331/q-2025-05-06-1737. COHERENT SWAP REGRET AND CHANNEL-PROOF LEARNING 23

work page doi:10.22331/q-2025-05-06-1737 2025
[22]

Cambridge University Press, 2018

John Watrous.The Theory of Quantum Information. Cambridge University Press, 2018. Author’s draft: https://cs.uwaterloo.ca/~watrous/TQI/TQI.pdf

2018

[1] [1]

Online learning of quantum states

Scott Aaronson, Xinyi Chen, Elad Hazan, Satyen Kale, and Ashwin Nayak. Online learning of quantum states. Advances in Neural Information Processing Systems, 31, 2018.https://arxiv.org/abs/1802.09025

arXiv 2018

[2] [2]

Robert J. Aumann. Subjectivity and correlation in randomized strategies.Journal of Mathematical Economics 1(1):67–96, 1974.https://doi.org/10.1016/0304-4068(74)90037-8

work page doi:10.1016/0304-4068(74)90037-8 1974

[3] [3]

Online learning of a panoply of quantum objects.Quantum Machine Intelligence7, article 78, 2025.https://arxiv.org/abs/2406.04245

Akshay Bansal, Ian George, Soumik Ghosh, Jamie Sikora, and Alice Zheng. Online learning of a panoply of quantum objects.Quantum Machine Intelligence7, article 78, 2025.https://arxiv.org/abs/2406.04245

arXiv 2025

[4] [4]

From external to internal regret.Journal of Machine Learning Research 8:1307–1324, 2007.https://www.jmlr.org/papers/v8/blum07a.html

Avrim Blum and Yishay Mansour. From external to internal regret.Journal of Machine Learning Research 8:1307–1324, 2007.https://www.jmlr.org/papers/v8/blum07a.html

2007

[5] [5]

Quantum game theory and the complexity of approximating quantum Nash equilibria.Quantum6:882, 2022.https://doi.org/10.22331/q-2022-12-22-882

John Bostanci and John Watrous. Quantum game theory and the complexity of approximating quantum Nash equilibria.Quantum6:882, 2022.https://doi.org/10.22331/q-2022-12-22-882

work page doi:10.22331/q-2022-12-22-882 2022

[6] [6]

Eric A. Carlen. Trace inequalities and quantum entropy: an introductory course. InEntropy and the Quantum, Contemporary Mathematics 529, pages 73–140. American Mathematical Society, 2010.https://doi.org/10. 1090/conm/529/10428

2010

[7] [7]

Quantum correlated equilibria in classical complete information games

Alan Deckelbaum. Quantum correlated equilibria in classical complete information games. arXiv:1101.3380, 2011.https://arxiv.org/abs/1101.3380

Pith/arXiv arXiv 2011

[8] [8]

From external to swap regret 2.0: An efficient reduction for large action spaces

Yuval Dagan, Constantinos Daskalakis, Maxwell Fishelson, and Noah Golowich. From external to swap regret 2.0: An efficient reduction for large action spaces. InProceedings of the 56th Annual ACM Symposium on Theory of Computing, 2024.https://arxiv.org/abs/2310.19786

arXiv 2024

[9] [9]

Instance-optimal matrix multiplicative weight update and its quantum applications

Weiyuan Gong, Tongyang Li, Xinzhao Wang, and Zhiyu Zhang. Instance-optimal matrix multiplicative weight update and its quantum applications. arXiv:2509.08911, 2025.https://arxiv.org/abs/2509.08911

arXiv 2025

[10] [10]

Gordon, Amy Greenwald, and Casey Marks

Geoffrey J. Gordon, Amy Greenwald, and Casey Marks. No-regret learning in convex games. InProceedings of the 25th International Conference on Machine Learning, pages 360–367, 2008.https://www.cs.cmu.edu/ ~ggordon/gordon-greenwald-marks-icml-phi-regret.pdf

2008

[11] [11]

Toward a general theory of quantum games

Gus Gutoski and John Watrous. Toward a general theory of quantum games. InProceedings of the 39th ACM Symposium on Theory of Computing, pages 565–574, 2007.https://arxiv.org/abs/quant-ph/0611234

Pith/arXiv arXiv 2007

[12] [12]

A simple adaptive procedure leading to correlated equilibrium.Econo- metrica68(5):1127–1150, 2000.https://doi.org/10.1111/1468-0262.00153

Sergiu Hart and Andreu Mas-Colell. A simple adaptive procedure leading to correlated equilibrium.Econo- metrica68(5):1127–1150, 2000.https://doi.org/10.1111/1468-0262.00153

work page doi:10.1111/1468-0262.00153 2000

[13] [13]

A tight lower bound and efficient reduction for swap regret

Shinji Ito. A tight lower bound and efficient reduction for swap regret. InAdvances in Neural Infor- mation Processing Systems, 33:18550–18559, 2020.https://proceedings.neurips.cc/paper/2020/hash/ d79c8788088c2193f0244d8f1f36d2db-Abstract.html

2020

[14] [14]

Parallel approximation of non-interactive zero-sum quantum games

Rahul Jain and John Watrous. Parallel approximation of non-interactive zero-sum quantum games. In Proceedings of the 24th IEEE Conference on Computational Complexity, pages 243–253, 2009.https: //arxiv.org/abs/0808.2775

Pith/arXiv arXiv 2009

[15] [15]

Payoff-based learning with matrix multiplicative weights in quantum games.Advances in Neural Information Processing Systems36, 2023.https://arxiv.org/abs/2311.02423

Kyriakos Lotidis, Panayotis Mertikopoulos, Nicholas Bambos, and Jose Blanchet. Payoff-based learning with matrix multiplicative weights in quantum games.Advances in Neural Information Processing Systems36, 2023.https://arxiv.org/abs/2311.02423

arXiv 2023

[16] [16]

No-regret learning and equilibrium computation in quantum games.Quantum8:1569, 2024.https://doi.org/10.22331/q-2024-12-17-1569

Wayne Lin, Georgios Piliouras, Ryann Sim, and Antonios Varvitsiotis. No-regret learning and equilibrium computation in quantum games.Quantum8:1569, 2024.https://doi.org/10.22331/q-2024-12-17-1569

work page doi:10.22331/q-2024-12-17-1569 2024

[17] [17]

Caro, Jens Eisert, and Sumeet Khatri

Asad Raza, Matthias C. Caro, Jens Eisert, and Sumeet Khatri. Online learning of quantum processes. arXiv:2406.04250, 2024.https://arxiv.org/abs/2406.04250

arXiv 2024

[18] [18]

Full characterization of quantum correlated equilibria.Quantum Informa- tion and Computation13(9–10):846–860, 2013.https://arxiv.org/abs/1105.5353

Zhaohui Wei and Shengyu Zhang. Full characterization of quantum correlated equilibria.Quantum Informa- tion and Computation13(9–10):846–860, 2013.https://arxiv.org/abs/1105.5353

Pith/arXiv arXiv 2013

[19] [19]

Quantum game players can have advantage without discord.Information and Computation256:174–181, 2017.https://arxiv.org/abs/1502.00207

Zhaohui Wei and Shengyu Zhang. Quantum game players can have advantage without discord.Information and Computation256:174–181, 2017.https://arxiv.org/abs/1502.00207

Pith/arXiv arXiv 2017

[20] [20]

Quantum strategic game theory

Shengyu Zhang. Quantum strategic game theory. InProceedings of the 3rd Innovations in Theoretical Com- puter Science Conference, pages 39–59, 2012.https://arxiv.org/abs/1012.5141

Pith/arXiv arXiv 2012

[21] [21]

Francisca Vasconcelos, Emmanouil-Vasileios Vlatakis-Gkaragkounis, Panayotis Mertikopoulos, Georgios Pil- iouras, and Michael I. Jordan. A quadratic speedup in finding Nash equilibria of quantum zero-sum games. Quantum9:1737, 2025.https://doi.org/10.22331/q-2025-05-06-1737. COHERENT SWAP REGRET AND CHANNEL-PROOF LEARNING 23

work page doi:10.22331/q-2025-05-06-1737 2025

[22] [22]

Cambridge University Press, 2018

John Watrous.The Theory of Quantum Information. Cambridge University Press, 2018. Author’s draft: https://cs.uwaterloo.ca/~watrous/TQI/TQI.pdf

2018