Batch Entanglement Detection in Parameterized Qubit States using Classical Bandit Algorithms
Pith reviewed 2026-05-23 23:46 UTC · model grok-4.3
The pith
A single-parameter witness family plus classical thresholding bandits detects entanglement in batches of two-qubit states from class F.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By mapping batch entanglement detection onto the classical thresholding bandit problem, the authors show that measurements of the Zhu-Teo-Englert single-parameter witness family suffice to separate entangled from separable states inside class F, delivering both conclusive identification of every entangled member of a batch and explicit upper bounds on the total number of measurements required.
What carries the argument
The thresholding bandit algorithm applied to statistics obtained from the single-parameter entanglement witness family of Zhu, Teo, and Englert.
If this is right
- Conclusive identification of every entangled state in a batch is possible without full tomography whenever the states lie in class F.
- The mapping to the thresholding bandit problem supplies explicit measurement-complexity guarantees derived from classical multi-armed bandit theory.
- Numerical simulations and a physical experiment confirm that the method works on the targeted state families.
- The technique applies directly to the practically motivated families inside class F, such as depolarized Bell states and Bell diagonal states.
Where Pith is reading between the lines
- The same bandit reduction could be tried on other quantum properties, such as coherence or non-stabilizerness, that admit low-parameter witness families.
- If analogous witness families exist for higher-dimensional or multi-qubit systems, the sample savings might carry over to those settings.
- Hybrid quantum-classical detection loops of this form may become routine for real-time verification tasks in quantum networks.
Load-bearing premise
The unknown states must belong to class F, the restricted set for which the chosen witness family is guaranteed to separate entangled from separable cases.
What would settle it
A depolarized Bell state or other member of class F that the algorithm still misclassifies after the number of measurements guaranteed by the bandit bound has been collected.
Figures
read the original abstract
Entanglement is a key property of quantum states that acts as a resource for a wide range of tasks in quantum computing. Entanglement detection is a key conceptual and practical challenge. Without adaptive or joint measurements, entanglement detection is constrained by no-go theorems (Lu et al. [Phys. Rev. Lett., 116, 230501 (2016)]), necessitating full state tomography. Batch entanglement detection refers to the problem of identifying all entangled states from amongst a set of $K$ unknown states, which finds applications in quantum information processing. We devise a method for performing batch entanglement detection by measuring a single-parameter family of entanglement witnesses, as proposed by Zhu, Teo, and Englert [Phys. Rev. A, 81, 052339, 2010], followed by a thresholding bandit algorithm on the measurement data. The proposed method can perform batch entanglement detection conclusively when the unknown states are drawn from a practically well-motivated class of two-qubit states $\mathcal{F}$, which includes Depolarised Bell states, Bell diagonal states, etc. Our key novelty lies in drawing a connection between batch entanglement detection and a Thresholding Bandit problem in classical Multi-Armed Bandits (MAB). The connection to the MAB problem also enables us to derive theoretical guarantees on the measurement/sample complexity of the proposed technique. We demonstrate the performance of the proposed method through numerical simulations and an experimental implementation. More broadly, this paper highlights the potential for employing classical machine learning techniques for quantum entanglement detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that batch entanglement detection on K unknown two-qubit states drawn from class F (Depolarised Bell states, Bell diagonal states, etc.) can be performed conclusively by measuring a single-parameter family of witnesses (Zhu et al., Phys. Rev. A 81, 052339, 2010) and feeding the outcomes into a classical thresholding bandit algorithm; the connection to the Thresholding Bandit problem is asserted to yield theoretical guarantees on measurement/sample complexity, with supporting numerical simulations and an experimental demonstration.
Significance. If the witness family is complete for F and the MAB reduction preserves the exact decision boundary, the approach could lower the measurement overhead for batch detection tasks in quantum information processing by importing existing MAB sample-complexity bounds. The explicit linkage to the classical Thresholding Bandit problem and the resulting guarantees constitute a clear technical contribution.
major comments (2)
- [Abstract] Abstract and the paragraph defining class F: the claim of conclusive detection requires that the single-parameter witness family separates every entangled state in F from all separable states. The manuscript must supply or cite a self-contained argument establishing this completeness property for the full class F; without it the thresholding rule can produce false negatives even when the MAB algorithm runs perfectly.
- [MAB reduction section] Section describing the MAB reduction (likely the section containing the connection to the Thresholding Bandit problem): the stated sample-complexity guarantees appear to treat witness expectations as known exactly, yet they are estimated from finite-shot measurements. The manuscript must derive or bound the additional error introduced by this estimation and show whether the classical MAB bounds survive or require modified constants; this step is load-bearing for the theoretical guarantees.
minor comments (1)
- Notation for the witness parameter and the decision threshold should be introduced once and used consistently; occasional re-use of the same symbol for different quantities appears in the methods description.
Simulated Author's Rebuttal
We thank the referee for the careful reading and the constructive major comments. Both points identify places where the manuscript's claims would benefit from additional explicit justification. We address each below and will revise the manuscript accordingly.
read point-by-point responses
-
Referee: [Abstract] Abstract and the paragraph defining class F: the claim of conclusive detection requires that the single-parameter witness family separates every entangled state in F from all separable states. The manuscript must supply or cite a self-contained argument establishing this completeness property for the full class F; without it the thresholding rule can produce false negatives even when the MAB algorithm runs perfectly.
Authors: We agree that conclusive detection for the full class F requires the witness family to separate every entangled state in F from all separable states. The manuscript defines class F by reference to the states for which Zhu et al. (Phys. Rev. A 81, 052339, 2010) explicitly construct the single-parameter family and prove detection (Depolarized Bell states, Bell-diagonal states, and the listed extensions). To make the argument self-contained we will add, in the paragraph defining class F, a short paragraph that (i) recalls the relevant theorem from Zhu et al. and (ii) states that, by construction of F, every entangled member of F yields a witness expectation below the separable threshold for some parameter value in the family. This addition will also be reflected in a revised abstract sentence. revision: yes
-
Referee: [MAB reduction section] Section describing the MAB reduction (likely the section containing the connection to the Thresholding Bandit problem): the stated sample-complexity guarantees appear to treat witness expectations as known exactly, yet they are estimated from finite-shot measurements. The manuscript must derive or bound the additional error introduced by this estimation and show whether the classical MAB bounds survive or require modified constants; this step is load-bearing for the theoretical guarantees.
Authors: The referee is correct that the current theoretical guarantees are stated for exact witness expectations while the experimental and numerical sections use finite-shot estimates. We will add a new subsection (or appendix) that (i) applies Hoeffding or empirical-Bernstein bounds to control the deviation between the estimated and true expectations with high probability, (ii) shows that choosing a per-arm shot budget scaling as O(log(K/δ)/ε²) propagates the estimation error into the thresholding decision without altering the asymptotic sample-complexity order, and (iii) states the modified constants that appear in the final MAB bound. The revised text will therefore make explicit that the classical Thresholding Bandit guarantees survive up to these (explicitly bounded) factors. revision: yes
Circularity Check
No circularity; MAB connection and witness family supply independent content
full rationale
The paper explicitly restricts its conclusive detection claim to states in class F (where the cited single-parameter witness family separates entangled from separable states) and derives measurement-complexity guarantees by reduction to the classical Thresholding Bandit problem. Both the witness construction and the bandit guarantees originate in external literature (Zhu et al. and standard MAB results) with no overlap to the present authors. No equation or claim reduces a fitted parameter or self-citation back into the target result; the derivation chain remains open to external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Unknown states are drawn from class F (Depolarised Bell states, Bell diagonal states, etc.) for which the single-parameter witness family separates entangled from separable states.
Reference graph
Works this paper leans on
-
[1]
Tomography is necessary for universal entanglement detection with single-copy observables,
D. Lu, T. Xin, N. Yu, Z. Ji, J. Chen, G. Long, J. Baugh, X. Peng, B. Zeng, and R. Laflamme, “Tomography is necessary for universal entanglement detection with single-copy observables,” Phys. Rev. Lett. , vol. 116, p. 230501, Jun 2016. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.116.230501
-
[2]
Minimal tomography with entanglement witnesses,
H. Zhu, Y . S. Teo, and B.-G. Englert, “Minimal tomography with entanglement witnesses,” Phys. Rev. A, vol. 81, p. 052339, May 2010. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.81.052339
-
[3]
Teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels,
C. H. Bennett, G. Brassard, C. Cr ´epeau, R. Jozsa, A. Peres, and W. K. Wootters, “Teleporting an unknown quantum state via dual classical and einstein-podolsky-rosen channels,” Phys. Rev. Lett., vol. 70, pp. 1895–1899, Mar 1993
work page 1993
-
[4]
Quantum entanglement and communication complexity,
H. Buhrman, R. Cleve, and W. van Dam, “Quantum entanglement and communication complexity,” SIAM Journal on Computing, vol. 30, no. 6, pp. 1829–1841, 2001
work page 2001
-
[5]
R. Horodecki, P. Horodecki, M. Horodecki, and K. Horodecki, “Quantum entanglement,” Reviews of Modern Physics , vol. 81, no. 2, pp. 865–942, Jun. 2009
work page 2009
-
[6]
Low rank matrix recovery from rank one measurements,
R. Kueng, H. Rauhut, and U. Terstiege, “Low rank matrix recovery from rank one measurements,” Applied and Computational Harmonic Analysis , vol. 42, no. 1, pp. 88–116, 2017. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S1063520315001037
work page 2017
-
[7]
Confidence polytopes in quantum state tomography,
J. Wang, V . B. Scholz, and R. Renner, “Confidence polytopes in quantum state tomography,” Physical Review Letters , vol. 122, no. 19, May 2019. [Online]. Available: http://dx.doi.org/10.1103/PhysRevLett.122.190401
-
[8]
R. O’Donnell and J. Wright, “Efficient quantum tomography,” Proceedings of the forty-eighth annual ACM symposium on Theory of Computing , 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:769062
work page 2015
-
[9]
Efficient quantum tomography ii,
——, “Efficient quantum tomography ii,” Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing ,
-
[10]
Available: https://api.semanticscholar.org/CorpusID:5245926
[Online]. Available: https://api.semanticscholar.org/CorpusID:5245926
-
[11]
K. Banaszek, M. Cramer, and D. Gross, “Focus on quantum tomography,” New Journal of Physics , vol. 15, no. 12, p. 125020, dec 2013. [Online]. Available: https://dx.doi.org/10.1088/1367-2630/15/12/125020
-
[12]
Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators,
S. T. Flammia, D. Gross, Y .-K. Liu, and J. Eisert, “Quantum tomography via compressed sensing: error bounds, sample complexity and efficient estimators,” New Journal of Physics , vol. 14, no. 9, p. 095022, sep 2012. [Online]. Available: https://dx.doi.org/10.1088/1367-2630/14/9/095022
-
[13]
Fast state tomography with optimal error bounds,
M. Guta, J. Kahn, R. Kueng, and J. A. Tropp, “Fast state tomography with optimal error bounds,” Journal of Physics A: Mathematical and Theoretical , vol. 53, no. 20, p. 204001, apr 2020. [Online]. Available: https: //dx.doi.org/10.1088/1751-8121/ab8111
-
[14]
Neural-network quantum state tomography,
G. Torlai, G. Mazzola, J. Carrasquilla, M. Troyer, R. Melko, and G. Carleo, “Neural-network quantum state tomography,” Nature Physics, vol. 14, no. 5, p. 447–450, Feb. 2018. [Online]. Available: http://dx.doi.org/10.1038/s41567-018-0048-5
-
[15]
Adaptive quantum state tomography with neural networks,
Y . Quek, S. Fort, and H. K. Ng, “Adaptive quantum state tomography with neural networks,” 2018
work page 2018
-
[16]
Neural-network quantum state tomography,
D. Koutn ´y, L. Motka, Z. Hradil, J. ˇReh´aˇcek, and L. L. S ´anchez-Soto, “Neural-network quantum state tomography,” Physical Review A, vol. 106, no. 1, Jul. 2022. [Online]. Available: http://dx.doi.org/10.1103/PhysRevA.106.012409
-
[17]
Efficient quantum state tomography with convolutional neural networks,
T. Schmale, M. Reh, and M. G ¨arttner, “Efficient quantum state tomography with convolutional neural networks,” npj Quantum Information, vol. 8, no. 1, Sep. 2022. [Online]. Available: http://dx.doi.org/10.1038/s41534-022-00621-4
-
[18]
Fast and Robust Quantum State Tomography from Few Basis Measurements,
D. S. Franc ¸a, F. G. L. Brand ˜ao, and R. Kueng, “Fast and Robust Quantum State Tomography from Few Basis Measurements,” in 16th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2021) , ser. Leibniz International Proceedings in Informatics (LIPIcs), M.-H. Hsieh, Ed., vol. 197. Dagstuhl, Germany: Schloss Dagstuhl – Leib...
-
[19]
Sample-optimal tomography of quantum states,
J. Haah, A. W. Harrow, Z. Ji, X. Wu, and N. Yu, “Sample-optimal tomography of quantum states,” IEEE Transactions on Information Theory, p. 1–1, 2017. [Online]. Available: http://dx.doi.org/10.1109/TIT.2017.2719044
-
[20]
Physical Review Letters 85(10), 2200–2203 (2000)
Y . S. Teo, H. Zhu, B.-G. Englert, J. ˇReh´aˇcek, and Z. c. v. Hradil, “Quantum-state reconstruction by maximizing likelihood and entropy,” Phys. Rev. Lett. , vol. 107, p. 020404, Jul 2011. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett. 107.020404
-
[21]
Maximum a posteriori probability estimates for quantum tomography,
V . Siddhu, “Maximum a posteriori probability estimates for quantum tomography,” Physical Review A , vol. 99, no. 1, Jan
-
[22]
Available: http://dx.doi.org/10.1103/PhysRevA.99.012342
[Online]. Available: http://dx.doi.org/10.1103/PhysRevA.99.012342
-
[24]
Bell inequalities and the separability criterion,
B. M. Terhal, “Bell inequalities and the separability criterion,” Physics Letters A , vol. 271, no. 5, pp. 319–326, 2000. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0375960100004011
work page 2000
-
[26]
Entanglement witnesses: construction, analysis and classification,
D. Chru ´sci´nski and G. Sarbicki, “Entanglement witnesses: construction, analysis and classification,” Journal of Physics A: Mathematical and Theoretical , vol. 47, no. 48, p. 483001, Nov. 2014. [Online]. Available: http: //dx.doi.org/10.1088/1751-8113/47/48/483001
-
[27]
Finite-time analysis of the multiarmed bandit problem,
P. Auer, N. Cesa-Bianchi, and P. Fischer, “Finite-time analysis of the multiarmed bandit problem,” Machine Learning, vol. 47, pp. 235–256, 05 2002
work page 2002
-
[28]
Best arm identification in multi-armed bandits
J.-Y . Audibert, S. Bubeck, and R. Munos, “Best arm identification in multi-armed bandits.” in COLT, 2010, pp. 41–53
work page 2010
-
[29]
Good arm identification via bandit feedback,
H. Kano, J. Honda, K. Sakamaki, K. Matsuura, A. Nakamura, and M. Sugiyama, “Good arm identification via bandit feedback,” 2018
work page 2018
-
[30]
Optimization of entanglement witnesses,
M. Lewenstein, B. Kraus, J. I. Cirac, and P. Horodecki, “Optimization of entanglement witnesses,” Phys. Rev. A , vol. 62, p. 052310, Oct 2000. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.62.052310
-
[31]
I. Bengtsson and K. Zyczkowski, Geometry of Quantum States: An Introduction to Quantum Entanglement . Cambridge University Press, 2006
work page 2006
-
[32]
Separability of mixed states: necessary and sufficient conditions,
M. Horodecki, P. Horodecki, and R. Horodecki, “Separability of mixed states: necessary and sufficient conditions,” Physics Letters A , vol. 223, no. 1, pp. 1–8, 1996. [Online]. Available: https://www.sciencedirect.com/science/article/pii/ S0375960196007062
work page 1996
-
[33]
Separability criterion for density matrices,
A. Peres, “Separability criterion for density matrices,” Phys. Rev. Lett. , vol. 77, pp. 1413–1415, Aug 1996. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevLett.77.1413
-
[34]
Separability criterion and inseparable mixed states with positive partial transposition,
P. Horodecki, “Separability criterion and inseparable mixed states with positive partial transposition,” Physics Letters A , vol. 232, no. 5, pp. 333–339, 1997. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0375960197004167
work page 1997
-
[35]
A separability criterion for density operators,
O. Rudolph, “A separability criterion for density operators,” Journal of Physics A: Mathematical and General , vol. 33, no. 21, p. 3951–3955, May 2000. [Online]. Available: http://dx.doi.org/10.1088/0305-4470/33/21/308
-
[36]
Covariance matrices and the separability problem,
O. G ¨uhne, P. Hyllus, O. Gittsovich, and J. Eisert, “Covariance matrices and the separability problem,” Physical Review Letters, vol. 99, no. 13, Sep. 2007. [Online]. Available: http://dx.doi.org/10.1103/PhysRevLett.99.130504
-
[37]
Classical deterministic complexity of edmonds’ problem and quantum entanglement,
L. Gurvits, “Classical deterministic complexity of edmonds’ problem and quantum entanglement,” 2003
work page 2003
-
[38]
Complete family of separability criteria,
A. C. Doherty, P. A. Parrilo, and F. M. Spedalieri, “Complete family of separability criteria,” Phys. Rev. A , vol. 69, p. 022308, Feb 2004. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevA.69.022308
-
[39]
Pac bounds for multi-armed bandit and markov decision processes,
E. Even-Dar, S. Mannor, and Y . Mansour, “Pac bounds for multi-armed bandit and markov decision processes,” ser. COLT ’02. Berlin, Heidelberg: Springer-Verlag, 2002, p. 255–270
work page 2002
-
[40]
Pac subset selection in stochastic multi-armed bandits,
S. Kalyanakrishnan, A. Tewari, P. Auer, and P. Stone, “Pac subset selection in stochastic multi-armed bandits,” in Proceedings of the 29th International Coference on International Conference on Machine Learning , ser. ICML’12. Madison, WI, USA: Omnipress, 2012, p. 227–234
work page 2012
-
[41]
Almost optimal exploration in multi-armed bandits,
Z. Karnin, T. Koren, and O. Somekh, “Almost optimal exploration in multi-armed bandits,” in Proceedings of the 30th International Conference on Machine Learning , 2013, pp. 1238–1246
work page 2013
-
[42]
The sample complexity of exploration in the multi-armed bandit problem,
S. Mannor and J. N. Tsitsiklis, “The sample complexity of exploration in the multi-armed bandit problem,” J. Mach. Learn. Res., vol. 5, p. 623–648, dec 2004
work page 2004
-
[43]
Asymptotic Behavior of Expected Sample Size in Certain One Sided Tests,
R. H. Farrell, “Asymptotic Behavior of Expected Sample Size in Certain One Sided Tests,” The Annals of Mathematical Statistics, vol. 35, no. 1, pp. 36 – 72, 1964. [Online]. Available: https://doi.org/10.1214/aoms/1177703731
-
[44]
lil’ ucb : An optimal exploration algorithm for multi-armed bandits,
K. Jamieson, M. Malloy, R. Nowak, and S. Bubeck, “lil’ ucb : An optimal exploration algorithm for multi-armed bandits,” in Proceedings of The 27th Conference on Learning Theory , ser. Proceedings of Machine Learning Research, M. F. Balcan, V . Feldman, and C. Szepesv´ari, Eds., vol. 35. Barcelona, Spain: PMLR, 13–15 Jun 2014, pp. 423–439. [Online]. Availa...
work page 2014
-
[45]
An optimal algorithm for the thresholding bandit problem,
A. Locatelli, M. Gutzeit, and A. Carpentier, “An optimal algorithm for the thresholding bandit problem,” in Proceedings of The 33rd International Conference on Machine Learning , ser. Proceedings of Machine Learning Research, M. F. Balcan and K. Q. Weinberger, Eds., vol. 48. New York, New York, USA: PMLR, 20–22 Jun 2016, pp. 1690–1698. [Online]. Available...
work page 2016
-
[46]
lil’hdoc: An algorithm for good arm identification under small threshold gap,
T.-H. Tsai, Y .-D. Tsai, and S.-D. Lin, “lil’hdoc: An algorithm for good arm identification under small threshold gap,” 2024
work page 2024
-
[47]
J. Lumbreras, E. Haapasalo, and M. Tomamichel, “Multi-armed quantum bandits: Exploration versus exploitation when learning properties of quantum states,” Quantum, vol. 6, p. 749, Jun. 2022. [Online]. Available: http: //dx.doi.org/10.22331/q-2022-06-29-749
-
[48]
https://doi.org/10.1088/0305-4470/34/35/335 Manuscript submitted to ACM
K. Zyczkowski and H.-J. Sommers, “Induced measures in the space of mixed quantum states,” Journal of Physics A: Mathematical and General , vol. 34, no. 35, p. 7111–7125, Aug. 2001. [Online]. Available: http: //dx.doi.org/10.1088/0305-4470/34/35/335
-
[49]
Machine-learning-derived entanglement witnesses,
A. C. Greenwood, L. T. Wu, E. Y . Zhu, B. T. Kirby, and L. Qian, “Machine-learning-derived entanglement witnesses,” Phys. Rev. Appl., vol. 19, p. 034058, Mar 2023. [Online]. Available: https://link.aps.org/doi/10.1103/PhysRevApplied.19.034058
-
[50]
Sample complexity of partition identification using multi-armed bandits,
S. Juneja and S. Krishnasamy, “Sample complexity of partition identification using multi-armed bandits,” 2019. 18 VII. S UPPLEMENTARY MATERIAL The following lemma is useful for some calculations. Lemma 14. For t ≥ 1, c > 0, ε ∈ (0, 1), 0 < w ≤ 1, 1 t log log ((1 +ε)t) w ≥ c =⇒ t ≤ 1 c log 2 log (1+ε) cw w . (16) A. Proof for Section IV-A
work page 2019
-
[51]
Proof of Lemma 8: Proof. Let B denote the ”good” event that at any time t > 0 and for all arms i ∈ [K], the true value SE(ρi) is well concentrated around its estimate ˆSi,Ni(t). B := K[ i=1 ∞[ t=1 | ˆSi,Ni(t) − Si| ≤ U Ni(t), δ cεK From Lemma 7 and by applying the union bound, we get that P [B] ≥ 1 − cεK δ cεK 1+ε ≥ 1 − δ (17) where Eq. 17 holds because ε...
-
[52]
Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) ≥ SE(ρ3)
Proof of Theorem 9: Proof. Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) ≥ SE(ρ3) . . . > SE(ρK−1) > 0 > S E(ρK). Let us consider the case that the event B described in Lemma 8 holds. As outlined in Algorithm 1, the arm i⋆ will be dropped from the active set Ω if LCB i⋆(t) > 0. That is, ˆSi⋆,Ni⋆(t) − U Ni⋆(t), δ cεK ...
-
[53]
Let us consider the case where B holds
Proof of Theorem 10: Proof. Let us consider the case where B holds. By the elimination rule of Algorithm 1, an arm i is removed from the active set Ω if LCB i(t) > 0. We have that, ˆSi,Ni(t) − U Ni(t), δ cεK ≥ ζ ˆSi,Ni(t) − Si + ∆i ≥ U Ni(t), δ cεK =⇒ ∆i ≥ 2U Ni(t), δ cεK (18) Let us denote Ni to be the number of samples of arm i, that is, Ni = inf {t : U...
-
[54]
Firstly, we show that Algorithm 2 is (λ, δ)-PAC for arbitrary λ ∈ [K]
Proof of Lemma 11: Proof. Firstly, we show that Algorithm 2 is (λ, δ)-PAC for arbitrary λ ∈ [K]. In the case where there are arms greater than or equal to λ, we show that P { ˆm < λ } ∪S i∈Aent{Si < ζ } ≤ δ where ˆm is the number of good arms identified by the agent. Since we are now considering the case when m ≥ λ, the event { ˆm < λ } implies that at le...
-
[55]
Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2)
Proof of Theorem 12: Proof. Recall that the threshold ζ = 0 and problem instance SE is such that SE(ρ1) ≥ SE(ρ2) . . . > SE(ρK−m) > 0 > S E(ρK−m+1) . . . > S E(ρK), with m being unknown. Let us consider the case that the event B described in Lemma 8 holds. As outlined in Algorithm 2, an arm i will be dropped if LCB i(t) > 0. That is, ˆSi,Ni(t) − U Ni(t), ...
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.