Batch Entanglement Detection in Parameterized Qubit States using Classical Bandit Algorithms

K. Bharati; Krishna Jagannathan; Vikesh Siddhu

arxiv: 2406.19738 · v2 · submitted 2024-06-28 · 🪐 quant-ph · cs.AI· cs.LG

Batch Entanglement Detection in Parameterized Qubit States using Classical Bandit Algorithms

K. Bharati , Vikesh Siddhu , Krishna Jagannathan This is my paper

Pith reviewed 2026-05-23 23:46 UTC · model grok-4.3

classification 🪐 quant-ph cs.AIcs.LG

keywords entanglement detectionbatch detectionthresholding banditsmulti-armed banditsentanglement witnessestwo-qubit statesquantum machine learning

0 comments

The pith

A single-parameter witness family plus classical thresholding bandits detects entanglement in batches of two-qubit states from class F.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes a method to identify all entangled states among K unknown two-qubit states by measuring a one-parameter family of witnesses and routing the outcomes to a classical thresholding bandit algorithm. The approach yields conclusive detection together with explicit sample-complexity bounds exactly when the states belong to class F, which includes depolarized Bell states and Bell diagonal states. A sympathetic reader would care because entanglement detection normally demands either joint measurements forbidden by no-go theorems or full tomography, and the batch formulation borrows efficient decision rules from classical multi-armed bandits to lower the experimental cost.

Core claim

By mapping batch entanglement detection onto the classical thresholding bandit problem, the authors show that measurements of the Zhu-Teo-Englert single-parameter witness family suffice to separate entangled from separable states inside class F, delivering both conclusive identification of every entangled member of a batch and explicit upper bounds on the total number of measurements required.

What carries the argument

The thresholding bandit algorithm applied to statistics obtained from the single-parameter entanglement witness family of Zhu, Teo, and Englert.

If this is right

Conclusive identification of every entangled state in a batch is possible without full tomography whenever the states lie in class F.
The mapping to the thresholding bandit problem supplies explicit measurement-complexity guarantees derived from classical multi-armed bandit theory.
Numerical simulations and a physical experiment confirm that the method works on the targeted state families.
The technique applies directly to the practically motivated families inside class F, such as depolarized Bell states and Bell diagonal states.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same bandit reduction could be tried on other quantum properties, such as coherence or non-stabilizerness, that admit low-parameter witness families.
If analogous witness families exist for higher-dimensional or multi-qubit systems, the sample savings might carry over to those settings.
Hybrid quantum-classical detection loops of this form may become routine for real-time verification tasks in quantum networks.

Load-bearing premise

The unknown states must belong to class F, the restricted set for which the chosen witness family is guaranteed to separate entangled from separable cases.

What would settle it

A depolarized Bell state or other member of class F that the algorithm still misclassifies after the number of measurements guaranteed by the bandit bound has been collected.

Figures

Figures reproduced from arXiv: 2406.19738 by K. Bharati, Krishna Jagannathan, Vikesh Siddhu.

**Figure 2.** Figure 2: Average number of samples v/s δ for the (1, K)-quantum MAB problem Algorithm 3 (m, K)-quantum MAB policy for states in F Input: threshold ζ = 0, acceptance error rates δ, arms A ← [K], WBMs {E1, E2} Output: Aent With E ← E1, run Algorithm 2 for K arms with U(t, δ) ← q log( 4Kt2 δ ) 2t and return stopping time τ1 and entangled arms A (1) ent if (|A (1) ent | == 1 and m == 1) or (|A (1) ent | == K) then A (2… view at source ↗

**Figure 3.** Figure 3: Average stopping time v/s δ for the (m, K)-quantum MAB problem B. Entanglement Detection in Arbitrary Quantum States In this section, we present a routine for detecting entanglement in arbitrary quantum states and provide numerical results for the (1, K)-quantum Multi-Armed Bandit (MAB) problem. To generate random density matrices, we follow the method described in [46]. Specifically, we start by generatin… view at source ↗

**Figure 4.** Figure 4: Entanglement Detection ratio v/s δ for the (1, K)-quantum MAB problem for arbitrary quantum states TABLE VI: Examples of arbitrary pure entangled states detected by the family of witnesses (4) Pure entangled states |ψ1⟩, |ψ2⟩ and |ψ3⟩ Values under (SEi ) 6 i=1 [0.2687 + 0.0375i; 0.2406 + 0.4090i; 0.0502 + 0.6162i; 0.2413 + 0.5107i] (−0.1851, 0.3160, 0.1598, −0.0058, 0.2177, −0.1947) [0.0565 + 0.3355i; 0.05… view at source ↗

read the original abstract

Entanglement is a key property of quantum states that acts as a resource for a wide range of tasks in quantum computing. Entanglement detection is a key conceptual and practical challenge. Without adaptive or joint measurements, entanglement detection is constrained by no-go theorems (Lu et al. [Phys. Rev. Lett., 116, 230501 (2016)]), necessitating full state tomography. Batch entanglement detection refers to the problem of identifying all entangled states from amongst a set of $K$ unknown states, which finds applications in quantum information processing. We devise a method for performing batch entanglement detection by measuring a single-parameter family of entanglement witnesses, as proposed by Zhu, Teo, and Englert [Phys. Rev. A, 81, 052339, 2010], followed by a thresholding bandit algorithm on the measurement data. The proposed method can perform batch entanglement detection conclusively when the unknown states are drawn from a practically well-motivated class of two-qubit states $\mathcal{F}$, which includes Depolarised Bell states, Bell diagonal states, etc. Our key novelty lies in drawing a connection between batch entanglement detection and a Thresholding Bandit problem in classical Multi-Armed Bandits (MAB). The connection to the MAB problem also enables us to derive theoretical guarantees on the measurement/sample complexity of the proposed technique. We demonstrate the performance of the proposed method through numerical simulations and an experimental implementation. More broadly, this paper highlights the potential for employing classical machine learning techniques for quantum entanglement detection.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper maps batch entanglement detection for states in class F to a classical thresholding bandit problem and pulls sample bounds from MAB theory.

read the letter

The central move is to treat the problem of identifying which of K states in class F are entangled as a thresholding multi-armed bandit instance. They measure the single-parameter witness family from Zhu, Teo, and Englert on each state, feed the empirical expectations into a bandit algorithm, and obtain theoretical measurement-complexity guarantees that follow from existing MAB results once the witness statistics are in hand. This is new; the explicit reduction and the resulting bounds are not in the prior witness or no-go literature they cite. They also run numerical simulations and an experiment, which is useful for seeing how the method behaves in practice. The restriction to class F is stated up front, and within that class the witness family is supposed to separate entangled from separable states, so the decision rule is sound on paper. Outside F the same witnesses can fail to detect entanglement, which limits the scope but is not hidden. The main soft spot is that finite-shot estimation of the witness values introduces noise, and it is not clear how much slack the bandit analysis has to absorb that noise without extra error terms or additional assumptions. The abstract does not show the derivations, so the exact gap between the ideal MAB bounds and the quantum setting is hard to judge from what is here. This is for quantum information researchers who already work with entanglement witnesses and want a concrete way to do batch detection without full tomography on states that fit class F. It is worth sending to a serious referee because the MAB connection is concrete, the application is practical, and the claims are scoped appropriately even if the experimental validation is preliminary.

Referee Report

2 major / 1 minor

Summary. The manuscript claims that batch entanglement detection on K unknown two-qubit states drawn from class F (Depolarised Bell states, Bell diagonal states, etc.) can be performed conclusively by measuring a single-parameter family of witnesses (Zhu et al., Phys. Rev. A 81, 052339, 2010) and feeding the outcomes into a classical thresholding bandit algorithm; the connection to the Thresholding Bandit problem is asserted to yield theoretical guarantees on measurement/sample complexity, with supporting numerical simulations and an experimental demonstration.

Significance. If the witness family is complete for F and the MAB reduction preserves the exact decision boundary, the approach could lower the measurement overhead for batch detection tasks in quantum information processing by importing existing MAB sample-complexity bounds. The explicit linkage to the classical Thresholding Bandit problem and the resulting guarantees constitute a clear technical contribution.

major comments (2)

[Abstract] Abstract and the paragraph defining class F: the claim of conclusive detection requires that the single-parameter witness family separates every entangled state in F from all separable states. The manuscript must supply or cite a self-contained argument establishing this completeness property for the full class F; without it the thresholding rule can produce false negatives even when the MAB algorithm runs perfectly.
[MAB reduction section] Section describing the MAB reduction (likely the section containing the connection to the Thresholding Bandit problem): the stated sample-complexity guarantees appear to treat witness expectations as known exactly, yet they are estimated from finite-shot measurements. The manuscript must derive or bound the additional error introduced by this estimation and show whether the classical MAB bounds survive or require modified constants; this step is load-bearing for the theoretical guarantees.

minor comments (1)

Notation for the witness parameter and the decision threshold should be introduced once and used consistently; occasional re-use of the same symbol for different quantities appears in the methods description.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the constructive major comments. Both points identify places where the manuscript's claims would benefit from additional explicit justification. We address each below and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract and the paragraph defining class F: the claim of conclusive detection requires that the single-parameter witness family separates every entangled state in F from all separable states. The manuscript must supply or cite a self-contained argument establishing this completeness property for the full class F; without it the thresholding rule can produce false negatives even when the MAB algorithm runs perfectly.

Authors: We agree that conclusive detection for the full class F requires the witness family to separate every entangled state in F from all separable states. The manuscript defines class F by reference to the states for which Zhu et al. (Phys. Rev. A 81, 052339, 2010) explicitly construct the single-parameter family and prove detection (Depolarized Bell states, Bell-diagonal states, and the listed extensions). To make the argument self-contained we will add, in the paragraph defining class F, a short paragraph that (i) recalls the relevant theorem from Zhu et al. and (ii) states that, by construction of F, every entangled member of F yields a witness expectation below the separable threshold for some parameter value in the family. This addition will also be reflected in a revised abstract sentence. revision: yes
Referee: [MAB reduction section] Section describing the MAB reduction (likely the section containing the connection to the Thresholding Bandit problem): the stated sample-complexity guarantees appear to treat witness expectations as known exactly, yet they are estimated from finite-shot measurements. The manuscript must derive or bound the additional error introduced by this estimation and show whether the classical MAB bounds survive or require modified constants; this step is load-bearing for the theoretical guarantees.

Authors: The referee is correct that the current theoretical guarantees are stated for exact witness expectations while the experimental and numerical sections use finite-shot estimates. We will add a new subsection (or appendix) that (i) applies Hoeffding or empirical-Bernstein bounds to control the deviation between the estimated and true expectations with high probability, (ii) shows that choosing a per-arm shot budget scaling as O(log(K/δ)/ε²) propagates the estimation error into the thresholding decision without altering the asymptotic sample-complexity order, and (iii) states the modified constants that appear in the final MAB bound. The revised text will therefore make explicit that the classical Thresholding Bandit guarantees survive up to these (explicitly bounded) factors. revision: yes

Circularity Check

0 steps flagged

No circularity; MAB connection and witness family supply independent content

full rationale

The paper explicitly restricts its conclusive detection claim to states in class F (where the cited single-parameter witness family separates entangled from separable states) and derives measurement-complexity guarantees by reduction to the classical Thresholding Bandit problem. Both the witness construction and the bandit guarantees originate in external literature (Zhu et al. and standard MAB results) with no overlap to the present authors. No equation or claim reduces a fitted parameter or self-citation back into the target result; the derivation chain remains open to external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that all states lie inside class F (so that the witness family separates entangled from separable cases) and on standard results from the classical multi-armed-bandit literature; no new free parameters or invented entities are introduced in the abstract.

axioms (1)

domain assumption Unknown states are drawn from class F (Depolarised Bell states, Bell diagonal states, etc.) for which the single-parameter witness family separates entangled from separable states.
Stated explicitly in the abstract as the setting in which conclusive detection is possible.

pith-pipeline@v0.9.0 · 5813 in / 1309 out tokens · 22169 ms · 2026-05-23T23:46:40.697032+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

53 extracted references · 53 canonical work pages

[1]

Tomography is necessary for universal entanglement detection with single-copy observables,

D. Lu, T. Xin, N. Yu, Z. Ji, J. Chen, G. Long, J. Baugh, X. Peng, B. Zeng, and R. Laflamme, “Tomography is necessary for universal entanglement detection with single-copy observables,” Phys. Rev. Lett. , vol. 116, p. 230501, Jun 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.116.230501

work page doi:10.1103/physrevlett.116.230501 2016
[2]

Minimal tomography with entanglement witnesses,

H. Zhu, Y . S. Teo, and B.-G. Englert, “Minimal tomography with entanglement witnesses,” Phys. Rev. A, vol. 81, p. 052339, May 2010. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.81.052339

work page doi:10.1103/physreva.81.052339 2010
[3]

Teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels,

C. H. Bennett, G. Brassard, C. Cr ´epeau, R. Jozsa, A. Peres, and W. K. Wootters, “Teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels,” Phys. Rev. Lett., vol. 70, pp. 1895–1899, Mar 1993

work page 1993
[4]

Quantum entanglement and communication complexity,

H. Buhrman, R. Cleve, and W. van Dam, “Quantum entanglement and communication complexity,” SIAM Journal on Computing, vol. 30, no. 6, pp. 1829–1841, 2001

work page 2001
[5]

Quantum entanglement,

R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki, “Quantum entanglement,” Reviews of Modern Physics , vol. 81, no. 2, pp. 865–942, Jun. 2009

work page 2009
[6]

Low rank matrix recovery from rank one measurements,

R. Kueng, H. Rauhut, and U. Terstiege, “Low rank matrix recovery from rank one measurements,” Applied and Computational Harmonic Analysis , vol. 42, no. 1, pp. 88–116, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1063520315001037

work page 2017
[7]

Confidence polytopes in quantum state tomography,

J. Wang, V . B. Scholz, and R. Renner, “Confidence polytopes in quantum state tomography,” Physical Review Letters , vol. 122, no. 19, May 2019. [Online]. Available: http://dx.doi.org/10.1103/PhysRevLett.122.190401

work page doi:10.1103/physrevlett.122.190401 2019
[8]

Efficient quantum tomography,

R. O’Donnell and J. Wright, “Efficient quantum tomography,” Proceedings of the forty-eighth annual ACM symposium on Theory of Computing , 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:769062

work page 2015
[9]

Efficient quantum tomography ii,

——, “Efficient quantum tomography ii,” Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing ,

work page
[10]

Available: https://api.semanticscholar.org/CorpusID:5245926

[Online]. Available: https://api.semanticscholar.org/CorpusID:5245926

work page
[11]

Focus on quantum tomography,

K. Banaszek, M. Cramer, and D. Gross, “Focus on quantum tomography,” New Journal of Physics , vol. 15, no. 12, p. 125020, dec 2013. [Online]. Available: https://dx.doi.org/10.1088/1367-2630/15/12/125020

work page doi:10.1088/1367-2630/15/12/125020 2013
[12]

Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators,

S. T. Flammia, D. Gross, Y .-K. Liu, and J. Eisert, “Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators,” New Journal of Physics , vol. 14, no. 9, p. 095022, sep 2012. [Online]. Available: https://dx.doi.org/10.1088/1367-2630/14/9/095022

work page doi:10.1088/1367-2630/14/9/095022 2012
[13]

Fast state tomography with optimal error bounds,

M. Guta, J. Kahn, R. Kueng, and J. A. Tropp, “Fast state tomography with optimal error bounds,” Journal of Physics A: Mathematical and Theoretical , vol. 53, no. 20, p. 204001, apr 2020. [Online]. Available: https: //dx.doi.org/10.1088/1751-8121/ab8111

work page doi:10.1088/1751-8121/ab8111 2020
[14]

Neural-network quantum state tomography,

G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, “Neural-network quantum state tomography,” Nature Physics, vol. 14, no. 5, p. 447–450, Feb. 2018. [Online]. Available: http://dx.doi.org/10.1038/s41567-018-0048-5

work page doi:10.1038/s41567-018-0048-5 2018
[15]

Adaptive quantum state tomography with neural networks,

Y . Quek, S. Fort, and H. K. Ng, “Adaptive quantum state tomography with neural networks,” 2018

work page 2018
[16]

Neural-network quantum state tomography,

D. Koutn ´y, L. Motka, Z. Hradil, J. ˇReh´aˇcek, and L. L. S ´anchez-Soto, “Neural-network quantum state tomography,” Physical Review A, vol. 106, no. 1, Jul. 2022. [Online]. Available: http://dx.doi.org/10.1103/PhysRevA.106.012409

work page doi:10.1103/physreva.106.012409 2022
[17]

Efficient quantum state tomography with convolutional neural networks,

T. Schmale, M. Reh, and M. G ¨arttner, “Efficient quantum state tomography with convolutional neural networks,” npj Quantum Information, vol. 8, no. 1, Sep. 2022. [Online]. Available: http://dx.doi.org/10.1038/s41534-022-00621-4

work page doi:10.1038/s41534-022-00621-4 2022
[18]

Fast and Robust Quantum State Tomography from Few Basis Measurements,

D. S. Franc ¸a, F. G. L. Brand ˜ao, and R. Kueng, “Fast and Robust Quantum State Tomography from Few Basis Measurements,” in 16th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2021) , ser. Leibniz International Proceedings in Informatics (LIPIcs), M.-H. Hsieh, Ed., vol. 197. Dagstuhl, Germany: Schloss Dagstuhl – Leib...

work page doi:10.4230/lipics.tqc.2021.7 2021
[19]

Sample-optimal tomography of quantum states,

J. Haah, A. W. Harrow, Z. Ji, X. Wu, and N. Yu, “Sample-optimal tomography of quantum states,” IEEE Transactions on Information Theory, p. 1–1, 2017. [Online]. Available: http://dx.doi.org/10.1109/TIT.2017.2719044

work page doi:10.1109/tit.2017.2719044 2017
[20]

Physical Review Letters 85(10), 2200–2203 (2000)

Y . S. Teo, H. Zhu, B.-G. Englert, J. ˇReh´aˇcek, and Z. c. v. Hradil, “Quantum-state reconstruction by maximizing likelihood and entropy,” Phys. Rev. Lett. , vol. 107, p. 020404, Jul 2011. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett. 107.020404

work page doi:10.1103/physrevlett 2011
[21]

Maximum a posteriori probability estimates for quantum tomography,

V . Siddhu, “Maximum a posteriori probability estimates for quantum tomography,” Physical Review A , vol. 99, no. 1, Jan

work page
[22]

Available: http://dx.doi.org/10.1103/PhysRevA.99.012342

[Online]. Available: http://dx.doi.org/10.1103/PhysRevA.99.012342

work page doi:10.1103/physreva.99.012342
[24]

Bell inequalities and the separability criterion,

B. M. Terhal, “Bell inequalities and the separability criterion,” Physics Letters A , vol. 271, no. 5, pp. 319–326, 2000. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0375960100004011

work page 2000
[26]

Entanglement witnesses: construction, analysis and classification,

D. Chru ´sci´nski and G. Sarbicki, “Entanglement witnesses: construction, analysis and classification,” Journal of Physics A: Mathematical and Theoretical , vol. 47, no. 48, p. 483001, Nov. 2014. [Online]. Available: http: //dx.doi.org/10.1088/1751-8113/47/48/483001

work page doi:10.1088/1751-8113/47/48/483001 2014
[27]

Finite-time analysis of the multiarmed bandit problem,

P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time analysis of the multiarmed bandit problem,” Machine Learning, vol. 47, pp. 235–256, 05 2002

work page 2002
[28]

Best arm identification in multi-armed bandits

J.-Y . Audibert, S. Bubeck, and R. Munos, “Best arm identification in multi-armed bandits.” in COLT, 2010, pp. 41–53

work page 2010
[29]

Good arm identification via bandit feedback,

H. Kano, J. Honda, K. Sakamaki, K. Matsuura, A. Nakamura, and M. Sugiyama, “Good arm identification via bandit feedback,” 2018

work page 2018
[30]

Optimization of entanglement witnesses,

M. Lewenstein, B. Kraus, J. I. Cirac, and P. Horodecki, “Optimization of entanglement witnesses,” Phys. Rev. A , vol. 62, p. 052310, Oct 2000. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.62.052310

work page doi:10.1103/physreva.62.052310 2000
[31]

Bengtsson and K

I. Bengtsson and K. Zyczkowski, Geometry of Quantum States: An Introduction to Quantum Entanglement . Cambridge University Press, 2006

work page 2006
[32]

Separability of mixed states: necessary and sufficient conditions,

M. Horodecki, P. Horodecki, and R. Horodecki, “Separability of mixed states: necessary and sufficient conditions,” Physics Letters A , vol. 223, no. 1, pp. 1–8, 1996. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S0375960196007062

work page 1996
[33]

Separability criterion for density matrices,

A. Peres, “Separability criterion for density matrices,” Phys. Rev. Lett. , vol. 77, pp. 1413–1415, Aug 1996. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.77.1413

work page doi:10.1103/physrevlett.77.1413 1996
[34]

Separability criterion and inseparable mixed states with positive partial transposition,

P. Horodecki, “Separability criterion and inseparable mixed states with positive partial transposition,” Physics Letters A , vol. 232, no. 5, pp. 333–339, 1997. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0375960197004167

work page 1997
[35]

A separability criterion for density operators,

O. Rudolph, “A separability criterion for density operators,” Journal of Physics A: Mathematical and General , vol. 33, no. 21, p. 3951–3955, May 2000. [Online]. Available: http://dx.doi.org/10.1088/0305-4470/33/21/308

work page doi:10.1088/0305-4470/33/21/308 2000
[36]

Covariance matrices and the separability problem,

O. G ¨uhne, P. Hyllus, O. Gittsovich, and J. Eisert, “Covariance matrices and the separability problem,” Physical Review Letters, vol. 99, no. 13, Sep. 2007. [Online]. Available: http://dx.doi.org/10.1103/PhysRevLett.99.130504

work page doi:10.1103/physrevlett.99.130504 2007
[37]

Classical deterministic complexity of edmonds’ problem and quantum entanglement,

L. Gurvits, “Classical deterministic complexity of edmonds’ problem and quantum entanglement,” 2003

work page 2003
[38]

Complete family of separability criteria,

A. C. Doherty, P. A. Parrilo, and F. M. Spedalieri, “Complete family of separability criteria,” Phys. Rev. A , vol. 69, p. 022308, Feb 2004. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.69.022308

work page doi:10.1103/physreva.69.022308 2004
[39]

Pac bounds for multi-armed bandit and markov decision processes,

E. Even-Dar, S. Mannor, and Y . Mansour, “Pac bounds for multi-armed bandit and markov decision processes,” ser. COLT ’02. Berlin, Heidelberg: Springer-Verlag, 2002, p. 255–270

work page 2002
[40]

Pac subset selection in stochastic multi-armed bandits,

S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone, “Pac subset selection in stochastic multi-armed bandits,” in Proceedings of the 29th International Coference on International Conference on Machine Learning , ser. ICML’12. Madison, WI, USA: Omnipress, 2012, p. 227–234

work page 2012
[41]

Almost optimal exploration in multi-armed bandits,

Z. Karnin, T. Koren, and O. Somekh, “Almost optimal exploration in multi-armed bandits,” in Proceedings of the 30th International Conference on Machine Learning , 2013, pp. 1238–1246

work page 2013
[42]

The sample complexity of exploration in the multi-armed bandit problem,

S. Mannor and J. N. Tsitsiklis, “The sample complexity of exploration in the multi-armed bandit problem,” J. Mach. Learn. Res., vol. 5, p. 623–648, dec 2004

work page 2004
[43]

Asymptotic Behavior of Expected Sample Size in Certain One Sided Tests,

R. H. Farrell, “Asymptotic Behavior of Expected Sample Size in Certain One Sided Tests,” The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 36 – 72, 1964. [Online]. Available: https://doi.org/10.1214/aoms/1177703731

work page doi:10.1214/aoms/1177703731 1964
[44]

lil’ ucb : An optimal exploration algorithm for multi-armed bandits,

K. Jamieson, M. Malloy, R. Nowak, and S. Bubeck, “lil’ ucb : An optimal exploration algorithm for multi-armed bandits,” in Proceedings of The 27th Conference on Learning Theory , ser. Proceedings of Machine Learning Research, M. F. Balcan, V . Feldman, and C. Szepesv´ari, Eds., vol. 35. Barcelona, Spain: PMLR, 13–15 Jun 2014, pp. 423–439. [Online]. Availa...

work page 2014
[45]

An optimal algorithm for the thresholding bandit problem,

A. Locatelli, M. Gutzeit, and A. Carpentier, “An optimal algorithm for the thresholding bandit problem,” in Proceedings of The 33rd International Conference on Machine Learning , ser. Proceedings of Machine Learning Research, M. F. Balcan and K. Q. Weinberger, Eds., vol. 48. New York, New York, USA: PMLR, 20–22 Jun 2016, pp. 1690–1698. [Online]. Available...

work page 2016
[46]

lil’hdoc: An algorithm for good arm identification under small threshold gap,

T.-H. Tsai, Y .-D. Tsai, and S.-D. Lin, “lil’hdoc: An algorithm for good arm identification under small threshold gap,” 2024

work page 2024
[47]

Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states,

J. Lumbreras, E. Haapasalo, and M. Tomamichel, “Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states,” Quantum, vol. 6, p. 749, Jun. 2022. [Online]. Available: http: //dx.doi.org/10.22331/q-2022-06-29-749

work page doi:10.22331/q-2022-06-29-749 2022
[48]

https://doi.org/10.1088/0305-4470/34/35/335 Manuscript submitted to ACM

K. Zyczkowski and H.-J. Sommers, “Induced measures in the space of mixed quantum states,” Journal of Physics A: Mathematical and General , vol. 34, no. 35, p. 7111–7125, Aug. 2001. [Online]. Available: http: //dx.doi.org/10.1088/0305-4470/34/35/335

work page doi:10.1088/0305-4470/34/35/335 2001
[49]

Machine-learning-derived entanglement witnesses,

A. C. Greenwood, L. T. Wu, E. Y . Zhu, B. T. Kirby, and L. Qian, “Machine-learning-derived entanglement witnesses,” Phys. Rev. Appl., vol. 19, p. 034058, Mar 2023. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevApplied.19.034058

work page doi:10.1103/physrevapplied.19.034058 2023
[50]

Sample complexity of partition identification using multi-armed bandits,

S. Juneja and S. Krishnasamy, “Sample complexity of partition identification using multi-armed bandits,” 2019. 18 VII. S UPPLEMENTARY MATERIAL The following lemma is useful for some calculations. Lemma 14. For t ≥ 1, c > 0, ε ∈ (0, 1), 0 < w ≤ 1, 1 t log log ((1 +ε)t) w ≥ c =⇒ t ≤ 1 c log   2 log (1+ε) cw w   . (16) A. Proof for Section IV-A

work page 2019
[51]

Let B denote the ”good” event that at any time t > 0 and for all arms i ∈ [K], the true value SE(ρi) is well concentrated around its estimate ˆSi,Ni(t)

Proof of Lemma 8: Proof. Let B denote the ”good” event that at any time t > 0 and for all arms i ∈ [K], the true value SE(ρi) is well concentrated around its estimate ˆSi,Ni(t). B := K[ i=1 ∞[ t=1 | ˆSi,Ni(t) − Si| ≤ U Ni(t), δ cεK From Lemma 7 and by applying the union bound, we get that P [B] ≥ 1 − cεK δ cεK 1+ε ≥ 1 − δ (17) where Eq. 17 holds because ε...

work page
[52]

Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) ≥ SE(ρ3)

Proof of Theorem 9: Proof. Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) ≥ SE(ρ3) . . . > SE(ρK−1) > 0 > S E(ρK). Let us consider the case that the event B described in Lemma 8 holds. As outlined in Algorithm 1, the arm i⋆ will be dropped from the active set Ω if LCB i⋆(t) > 0. That is, ˆSi⋆,Ni⋆(t) − U Ni⋆(t), δ cεK ...

work page
[53]

Let us consider the case where B holds

Proof of Theorem 10: Proof. Let us consider the case where B holds. By the elimination rule of Algorithm 1, an arm i is removed from the active set Ω if LCB i(t) > 0. We have that, ˆSi,Ni(t) − U Ni(t), δ cεK ≥ ζ ˆSi,Ni(t) − Si + ∆i ≥ U Ni(t), δ cεK =⇒ ∆i ≥ 2U Ni(t), δ cεK (18) Let us denote Ni to be the number of samples of arm i, that is, Ni = inf {t : U...

work page
[54]

Firstly, we show that Algorithm 2 is (λ, δ)-PAC for arbitrary λ ∈ [K]

Proof of Lemma 11: Proof. Firstly, we show that Algorithm 2 is (λ, δ)-PAC for arbitrary λ ∈ [K]. In the case where there are arms greater than or equal to λ, we show that P { ˆm < λ } ∪S i∈Aent{Si < ζ } ≤ δ where ˆm is the number of good arms identified by the agent. Since we are now considering the case when m ≥ λ, the event { ˆm < λ } implies that at le...

work page
[55]

Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2)

Proof of Theorem 12: Proof. Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) . . . > SE(ρK−m) > 0 > S E(ρK−m+1) . . . > S E(ρK), with m being unknown. Let us consider the case that the event B described in Lemma 8 holds. As outlined in Algorithm 2, an arm i will be dropped if LCB i(t) > 0. That is, ˆSi,Ni(t) − U Ni(t), ...

work page

[1] [1]

Tomography is necessary for universal entanglement detection with single-copy observables,

D. Lu, T. Xin, N. Yu, Z. Ji, J. Chen, G. Long, J. Baugh, X. Peng, B. Zeng, and R. Laflamme, “Tomography is necessary for universal entanglement detection with single-copy observables,” Phys. Rev. Lett. , vol. 116, p. 230501, Jun 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.116.230501

work page doi:10.1103/physrevlett.116.230501 2016

[2] [2]

Minimal tomography with entanglement witnesses,

H. Zhu, Y . S. Teo, and B.-G. Englert, “Minimal tomography with entanglement witnesses,” Phys. Rev. A, vol. 81, p. 052339, May 2010. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.81.052339

work page doi:10.1103/physreva.81.052339 2010

[3] [3]

Teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels,

C. H. Bennett, G. Brassard, C. Cr ´epeau, R. Jozsa, A. Peres, and W. K. Wootters, “Teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels,” Phys. Rev. Lett., vol. 70, pp. 1895–1899, Mar 1993

work page 1993

[4] [4]

Quantum entanglement and communication complexity,

H. Buhrman, R. Cleve, and W. van Dam, “Quantum entanglement and communication complexity,” SIAM Journal on Computing, vol. 30, no. 6, pp. 1829–1841, 2001

work page 2001

[5] [5]

Quantum entanglement,

R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki, “Quantum entanglement,” Reviews of Modern Physics , vol. 81, no. 2, pp. 865–942, Jun. 2009

work page 2009

[6] [6]

Low rank matrix recovery from rank one measurements,

R. Kueng, H. Rauhut, and U. Terstiege, “Low rank matrix recovery from rank one measurements,” Applied and Computational Harmonic Analysis , vol. 42, no. 1, pp. 88–116, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1063520315001037

work page 2017

[7] [7]

Confidence polytopes in quantum state tomography,

J. Wang, V . B. Scholz, and R. Renner, “Confidence polytopes in quantum state tomography,” Physical Review Letters , vol. 122, no. 19, May 2019. [Online]. Available: http://dx.doi.org/10.1103/PhysRevLett.122.190401

work page doi:10.1103/physrevlett.122.190401 2019

[8] [8]

Efficient quantum tomography,

R. O’Donnell and J. Wright, “Efficient quantum tomography,” Proceedings of the forty-eighth annual ACM symposium on Theory of Computing , 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:769062

work page 2015

[9] [9]

Efficient quantum tomography ii,

——, “Efficient quantum tomography ii,” Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing ,

work page

[10] [10]

Available: https://api.semanticscholar.org/CorpusID:5245926

[Online]. Available: https://api.semanticscholar.org/CorpusID:5245926

work page

[11] [11]

Focus on quantum tomography,

K. Banaszek, M. Cramer, and D. Gross, “Focus on quantum tomography,” New Journal of Physics , vol. 15, no. 12, p. 125020, dec 2013. [Online]. Available: https://dx.doi.org/10.1088/1367-2630/15/12/125020

work page doi:10.1088/1367-2630/15/12/125020 2013

[12] [12]

Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators,

S. T. Flammia, D. Gross, Y .-K. Liu, and J. Eisert, “Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators,” New Journal of Physics , vol. 14, no. 9, p. 095022, sep 2012. [Online]. Available: https://dx.doi.org/10.1088/1367-2630/14/9/095022

work page doi:10.1088/1367-2630/14/9/095022 2012

[13] [13]

Fast state tomography with optimal error bounds,

M. Guta, J. Kahn, R. Kueng, and J. A. Tropp, “Fast state tomography with optimal error bounds,” Journal of Physics A: Mathematical and Theoretical , vol. 53, no. 20, p. 204001, apr 2020. [Online]. Available: https: //dx.doi.org/10.1088/1751-8121/ab8111

work page doi:10.1088/1751-8121/ab8111 2020

[14] [14]

Neural-network quantum state tomography,

G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, “Neural-network quantum state tomography,” Nature Physics, vol. 14, no. 5, p. 447–450, Feb. 2018. [Online]. Available: http://dx.doi.org/10.1038/s41567-018-0048-5

work page doi:10.1038/s41567-018-0048-5 2018

[15] [15]

Adaptive quantum state tomography with neural networks,

Y . Quek, S. Fort, and H. K. Ng, “Adaptive quantum state tomography with neural networks,” 2018

work page 2018

[16] [16]

Neural-network quantum state tomography,

D. Koutn ´y, L. Motka, Z. Hradil, J. ˇReh´aˇcek, and L. L. S ´anchez-Soto, “Neural-network quantum state tomography,” Physical Review A, vol. 106, no. 1, Jul. 2022. [Online]. Available: http://dx.doi.org/10.1103/PhysRevA.106.012409

work page doi:10.1103/physreva.106.012409 2022

[17] [17]

Efficient quantum state tomography with convolutional neural networks,

T. Schmale, M. Reh, and M. G ¨arttner, “Efficient quantum state tomography with convolutional neural networks,” npj Quantum Information, vol. 8, no. 1, Sep. 2022. [Online]. Available: http://dx.doi.org/10.1038/s41534-022-00621-4

work page doi:10.1038/s41534-022-00621-4 2022

[18] [18]

Fast and Robust Quantum State Tomography from Few Basis Measurements,

D. S. Franc ¸a, F. G. L. Brand ˜ao, and R. Kueng, “Fast and Robust Quantum State Tomography from Few Basis Measurements,” in 16th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2021) , ser. Leibniz International Proceedings in Informatics (LIPIcs), M.-H. Hsieh, Ed., vol. 197. Dagstuhl, Germany: Schloss Dagstuhl – Leib...

work page doi:10.4230/lipics.tqc.2021.7 2021

[19] [19]

Sample-optimal tomography of quantum states,

J. Haah, A. W. Harrow, Z. Ji, X. Wu, and N. Yu, “Sample-optimal tomography of quantum states,” IEEE Transactions on Information Theory, p. 1–1, 2017. [Online]. Available: http://dx.doi.org/10.1109/TIT.2017.2719044

work page doi:10.1109/tit.2017.2719044 2017

[20] [20]

Physical Review Letters 85(10), 2200–2203 (2000)

Y . S. Teo, H. Zhu, B.-G. Englert, J. ˇReh´aˇcek, and Z. c. v. Hradil, “Quantum-state reconstruction by maximizing likelihood and entropy,” Phys. Rev. Lett. , vol. 107, p. 020404, Jul 2011. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett. 107.020404

work page doi:10.1103/physrevlett 2011

[21] [21]

Maximum a posteriori probability estimates for quantum tomography,

V . Siddhu, “Maximum a posteriori probability estimates for quantum tomography,” Physical Review A , vol. 99, no. 1, Jan

work page

[22] [22]

Available: http://dx.doi.org/10.1103/PhysRevA.99.012342

[Online]. Available: http://dx.doi.org/10.1103/PhysRevA.99.012342

work page doi:10.1103/physreva.99.012342

[23] [24]

Bell inequalities and the separability criterion,

B. M. Terhal, “Bell inequalities and the separability criterion,” Physics Letters A , vol. 271, no. 5, pp. 319–326, 2000. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0375960100004011

work page 2000

[24] [26]

Entanglement witnesses: construction, analysis and classification,

D. Chru ´sci´nski and G. Sarbicki, “Entanglement witnesses: construction, analysis and classification,” Journal of Physics A: Mathematical and Theoretical , vol. 47, no. 48, p. 483001, Nov. 2014. [Online]. Available: http: //dx.doi.org/10.1088/1751-8113/47/48/483001

work page doi:10.1088/1751-8113/47/48/483001 2014

[25] [27]

Finite-time analysis of the multiarmed bandit problem,

P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time analysis of the multiarmed bandit problem,” Machine Learning, vol. 47, pp. 235–256, 05 2002

work page 2002

[26] [28]

Best arm identification in multi-armed bandits

J.-Y . Audibert, S. Bubeck, and R. Munos, “Best arm identification in multi-armed bandits.” in COLT, 2010, pp. 41–53

work page 2010

[27] [29]

Good arm identification via bandit feedback,

H. Kano, J. Honda, K. Sakamaki, K. Matsuura, A. Nakamura, and M. Sugiyama, “Good arm identification via bandit feedback,” 2018

work page 2018

[28] [30]

Optimization of entanglement witnesses,

M. Lewenstein, B. Kraus, J. I. Cirac, and P. Horodecki, “Optimization of entanglement witnesses,” Phys. Rev. A , vol. 62, p. 052310, Oct 2000. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.62.052310

work page doi:10.1103/physreva.62.052310 2000

[29] [31]

Bengtsson and K

I. Bengtsson and K. Zyczkowski, Geometry of Quantum States: An Introduction to Quantum Entanglement . Cambridge University Press, 2006

work page 2006

[30] [32]

Separability of mixed states: necessary and sufficient conditions,

M. Horodecki, P. Horodecki, and R. Horodecki, “Separability of mixed states: necessary and sufficient conditions,” Physics Letters A , vol. 223, no. 1, pp. 1–8, 1996. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S0375960196007062

work page 1996

[31] [33]

Separability criterion for density matrices,

A. Peres, “Separability criterion for density matrices,” Phys. Rev. Lett. , vol. 77, pp. 1413–1415, Aug 1996. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.77.1413

work page doi:10.1103/physrevlett.77.1413 1996

[32] [34]

Separability criterion and inseparable mixed states with positive partial transposition,

P. Horodecki, “Separability criterion and inseparable mixed states with positive partial transposition,” Physics Letters A , vol. 232, no. 5, pp. 333–339, 1997. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0375960197004167

work page 1997

[33] [35]

A separability criterion for density operators,

O. Rudolph, “A separability criterion for density operators,” Journal of Physics A: Mathematical and General , vol. 33, no. 21, p. 3951–3955, May 2000. [Online]. Available: http://dx.doi.org/10.1088/0305-4470/33/21/308

work page doi:10.1088/0305-4470/33/21/308 2000

[34] [36]

Covariance matrices and the separability problem,

O. G ¨uhne, P. Hyllus, O. Gittsovich, and J. Eisert, “Covariance matrices and the separability problem,” Physical Review Letters, vol. 99, no. 13, Sep. 2007. [Online]. Available: http://dx.doi.org/10.1103/PhysRevLett.99.130504

work page doi:10.1103/physrevlett.99.130504 2007

[35] [37]

Classical deterministic complexity of edmonds’ problem and quantum entanglement,

L. Gurvits, “Classical deterministic complexity of edmonds’ problem and quantum entanglement,” 2003

work page 2003

[36] [38]

Complete family of separability criteria,

A. C. Doherty, P. A. Parrilo, and F. M. Spedalieri, “Complete family of separability criteria,” Phys. Rev. A , vol. 69, p. 022308, Feb 2004. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.69.022308

work page doi:10.1103/physreva.69.022308 2004

[37] [39]

Pac bounds for multi-armed bandit and markov decision processes,

E. Even-Dar, S. Mannor, and Y . Mansour, “Pac bounds for multi-armed bandit and markov decision processes,” ser. COLT ’02. Berlin, Heidelberg: Springer-Verlag, 2002, p. 255–270

work page 2002

[38] [40]

Pac subset selection in stochastic multi-armed bandits,

S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone, “Pac subset selection in stochastic multi-armed bandits,” in Proceedings of the 29th International Coference on International Conference on Machine Learning , ser. ICML’12. Madison, WI, USA: Omnipress, 2012, p. 227–234

work page 2012

[39] [41]

Almost optimal exploration in multi-armed bandits,

Z. Karnin, T. Koren, and O. Somekh, “Almost optimal exploration in multi-armed bandits,” in Proceedings of the 30th International Conference on Machine Learning , 2013, pp. 1238–1246

work page 2013

[40] [42]

The sample complexity of exploration in the multi-armed bandit problem,

S. Mannor and J. N. Tsitsiklis, “The sample complexity of exploration in the multi-armed bandit problem,” J. Mach. Learn. Res., vol. 5, p. 623–648, dec 2004

work page 2004

[41] [43]

Asymptotic Behavior of Expected Sample Size in Certain One Sided Tests,

R. H. Farrell, “Asymptotic Behavior of Expected Sample Size in Certain One Sided Tests,” The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 36 – 72, 1964. [Online]. Available: https://doi.org/10.1214/aoms/1177703731

work page doi:10.1214/aoms/1177703731 1964

[42] [44]

lil’ ucb : An optimal exploration algorithm for multi-armed bandits,

K. Jamieson, M. Malloy, R. Nowak, and S. Bubeck, “lil’ ucb : An optimal exploration algorithm for multi-armed bandits,” in Proceedings of The 27th Conference on Learning Theory , ser. Proceedings of Machine Learning Research, M. F. Balcan, V . Feldman, and C. Szepesv´ari, Eds., vol. 35. Barcelona, Spain: PMLR, 13–15 Jun 2014, pp. 423–439. [Online]. Availa...

work page 2014

[43] [45]

An optimal algorithm for the thresholding bandit problem,

A. Locatelli, M. Gutzeit, and A. Carpentier, “An optimal algorithm for the thresholding bandit problem,” in Proceedings of The 33rd International Conference on Machine Learning , ser. Proceedings of Machine Learning Research, M. F. Balcan and K. Q. Weinberger, Eds., vol. 48. New York, New York, USA: PMLR, 20–22 Jun 2016, pp. 1690–1698. [Online]. Available...

work page 2016

[44] [46]

lil’hdoc: An algorithm for good arm identification under small threshold gap,

T.-H. Tsai, Y .-D. Tsai, and S.-D. Lin, “lil’hdoc: An algorithm for good arm identification under small threshold gap,” 2024

work page 2024

[45] [47]

Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states,

J. Lumbreras, E. Haapasalo, and M. Tomamichel, “Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states,” Quantum, vol. 6, p. 749, Jun. 2022. [Online]. Available: http: //dx.doi.org/10.22331/q-2022-06-29-749

work page doi:10.22331/q-2022-06-29-749 2022

[46] [48]

https://doi.org/10.1088/0305-4470/34/35/335 Manuscript submitted to ACM

K. Zyczkowski and H.-J. Sommers, “Induced measures in the space of mixed quantum states,” Journal of Physics A: Mathematical and General , vol. 34, no. 35, p. 7111–7125, Aug. 2001. [Online]. Available: http: //dx.doi.org/10.1088/0305-4470/34/35/335

work page doi:10.1088/0305-4470/34/35/335 2001

[47] [49]

Machine-learning-derived entanglement witnesses,

A. C. Greenwood, L. T. Wu, E. Y . Zhu, B. T. Kirby, and L. Qian, “Machine-learning-derived entanglement witnesses,” Phys. Rev. Appl., vol. 19, p. 034058, Mar 2023. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevApplied.19.034058

work page doi:10.1103/physrevapplied.19.034058 2023

[48] [50]

Sample complexity of partition identification using multi-armed bandits,

S. Juneja and S. Krishnasamy, “Sample complexity of partition identification using multi-armed bandits,” 2019. 18 VII. S UPPLEMENTARY MATERIAL The following lemma is useful for some calculations. Lemma 14. For t ≥ 1, c > 0, ε ∈ (0, 1), 0 < w ≤ 1, 1 t log log ((1 +ε)t) w ≥ c =⇒ t ≤ 1 c log   2 log (1+ε) cw w   . (16) A. Proof for Section IV-A

work page 2019

[49] [51]

Let B denote the ”good” event that at any time t > 0 and for all arms i ∈ [K], the true value SE(ρi) is well concentrated around its estimate ˆSi,Ni(t)

Proof of Lemma 8: Proof. Let B denote the ”good” event that at any time t > 0 and for all arms i ∈ [K], the true value SE(ρi) is well concentrated around its estimate ˆSi,Ni(t). B := K[ i=1 ∞[ t=1 | ˆSi,Ni(t) − Si| ≤ U Ni(t), δ cεK From Lemma 7 and by applying the union bound, we get that P [B] ≥ 1 − cεK δ cεK 1+ε ≥ 1 − δ (17) where Eq. 17 holds because ε...

work page

[50] [52]

Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) ≥ SE(ρ3)

Proof of Theorem 9: Proof. Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) ≥ SE(ρ3) . . . > SE(ρK−1) > 0 > S E(ρK). Let us consider the case that the event B described in Lemma 8 holds. As outlined in Algorithm 1, the arm i⋆ will be dropped from the active set Ω if LCB i⋆(t) > 0. That is, ˆSi⋆,Ni⋆(t) − U Ni⋆(t), δ cεK ...

work page

[51] [53]

Let us consider the case where B holds

Proof of Theorem 10: Proof. Let us consider the case where B holds. By the elimination rule of Algorithm 1, an arm i is removed from the active set Ω if LCB i(t) > 0. We have that, ˆSi,Ni(t) − U Ni(t), δ cεK ≥ ζ ˆSi,Ni(t) − Si + ∆i ≥ U Ni(t), δ cεK =⇒ ∆i ≥ 2U Ni(t), δ cεK (18) Let us denote Ni to be the number of samples of arm i, that is, Ni = inf {t : U...

work page

[52] [54]

Firstly, we show that Algorithm 2 is (λ, δ)-PAC for arbitrary λ ∈ [K]

Proof of Lemma 11: Proof. Firstly, we show that Algorithm 2 is (λ, δ)-PAC for arbitrary λ ∈ [K]. In the case where there are arms greater than or equal to λ, we show that P { ˆm < λ } ∪S i∈Aent{Si < ζ } ≤ δ where ˆm is the number of good arms identified by the agent. Since we are now considering the case when m ≥ λ, the event { ˆm < λ } implies that at le...

work page

[53] [55]

Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2)

Proof of Theorem 12: Proof. Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) . . . > SE(ρK−m) > 0 > S E(ρK−m+1) . . . > S E(ρK), with m being unknown. Let us consider the case that the event B described in Lemma 8 holds. As outlined in Algorithm 2, an arm i will be dropped if LCB i(t) > 0. That is, ˆSi,Ni(t) − U Ni(t), ...

work page