Byzantine-Resilient Federated Learning via QUBO-Based Client Selection on Quantum Annealers

(2) Columbia University); Andras Ferenczi (1); Dagen Wang (1); Jason Qizhe Qin (2) ((1) American Express Co.; Sutapa Samanta (1)

arxiv: 2605.16438 · v1 · pith:ZHFMYAHDnew · submitted 2026-05-14 · 💻 cs.LG · cs.AI

Byzantine-Resilient Federated Learning via QUBO-Based Client Selection on Quantum Annealers

Andras Ferenczi (1) , Sutapa Samanta (1) , Dagen Wang (1) , Jason Qizhe Qin (2) ((1) American Express Co. , (2) Columbia University) This is my paper

Pith reviewed 2026-05-20 19:52 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords Byzantine resilienceFederated learningQUBOQuantum annealingClient selectionMultiKrumMultiSignal ensemble

0 comments

The pith

Reformulating client selection as a QUBO problem solved on quantum annealers improves detection of advanced Byzantine attacks over classical MultiKrum in federated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to show that turning client selection into a QUBO optimization task lets quantum annealers find a globally consistent set of honest updates by minimizing pairwise distances across all subsets at once. This joint approach is meant to catch sophisticated attacks that preserve local statistics and fool per-client scoring like MultiKrum. To scale beyond small client counts where pure QUBO degrades, the authors add a MultiSignal ensemble that routes attacks using gaps in Euclidean and cosine scores and applies a penalized QUBO variant with voting. A sympathetic reader would care because federated learning relies on untrusted clients yet current defenses leave gaps against evasion strategies that mimic honest gradients.

Core claim

Encoding pairwise distances between client updates into a QUBO cost function and solving it on quantum annealers finds the mutually closest group of m clients, outperforming MultiKrum on hard attacks such as Advanced LIE at 15 clients and, through the MultiSignal ensemble, reaching 95.3 percent average detection accuracy at 100 clients on MNIST versus 91.8 percent for classical MultiKrum.

What carries the argument

The QUBO cost function that encodes pairwise distances for joint optimization over client subsets, solved by quantum annealers to identify the most mutually consistent honest group rather than scoring clients individually.

Load-bearing premise

Encoding pairwise distances into a QUBO will produce a globally optimal set of mutually consistent honest clients and the quantum annealer will return high-quality solutions even as the number of clients grows beyond small scales.

What would settle it

A scaling experiment at 100-plus clients in which the pure QUBO or MultiSignal ensemble fails to exceed MultiKrum accuracy on Sparse Lie or Advanced Lie attacks.

Figures

Figures reproduced from arXiv: 2605.16438 by (2) Columbia University), Andras Ferenczi (1), Dagen Wang (1), Jason Qizhe Qin (2) ((1) American Express Co., Sutapa Samanta (1).

**Figure 2.** Figure 2: MNIST: Round-by-round detection under Sparse Lie attack. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 5.** Figure 5: FashionMNIST: Round-by-round detection under ALIE attack. [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗

**Figure 4.** Figure 4: CIFAR-10: Round-by-round detection under Sparse Lie attack. [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗

read the original abstract

Federated Learning (FL) trains a global model across decentralized clients while preserving data privacy, but at scale it is vulnerable to malicious updates. Byzantine-resilient aggregation methods such as MultiKrum score gradients against their nearest neighbors and can miss malicious updates that preserve the statistical properties of honest ones. We propose a quantum annealing approach that reformulates client selection as a Quadratic Unconstrained Binary Optimization (QUBO) problem, encoding pairwise distances into a cost function solved by quantum annealers (QA). Unlike MultiKrum's greedy per-client scoring, the QUBO formulation jointly optimizes over all subsets to find the mutually closest group of $m$ clients. At small scale (15 clients), QUBO outperforms MultiKrum on the most challenging Byzantine attacks: e.g., Advanced LIE is detected with 95.11% accuracy versus 81.33% on MNIST and 97.78% versus 75.56% on CIFAR-10. QUBO fares poorly on simpler attacks where MultiKrum excels, so the two methods are complementary. QUBO quality also degrades as the number of clients grows. To address this, we introduce a MultiSignal ensemble that uses a dual-feature routing gate based on Euclidean and cosine Krum score gaps to classify attacks into four regimes and routes evasion attacks to a suspicion-penalized QUBO with agreement voting. At 100 clients on MNIST, MultiSignal achieves 95.3% average detection accuracy versus 91.8% for classical MultiKrum, with the largest gains on Sparse Lie (72.0% to 95.2%, +23.2 points) and Advanced Lie (80.4% to 85.2%, +4.8 points). These results show that QUBO-based quantum annealing with MultiSignal is a principled and scalable defense against the most challenging Byzantine strategies in federated learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

QUBO client selection beats MultiKrum on hard attacks at small scale but the 100-client gains look driven more by the routing gate than by quantum annealing.

read the letter

The core new piece is turning client selection into a single QUBO that jointly picks the m mutually closest updates instead of scoring them greedily like MultiKrum. That formulation is a clean departure and it shows up in the numbers: at 15 clients the QUBO version lifts detection on Advanced LIE from 81% to 95% on MNIST and from 76% to 98% on CIFAR-10. The MultiSignal routing gate that looks at Euclidean and cosine Krum gaps and sends evasion attacks to a penalized QUBO plus agreement voting is the practical fix for scale. At 100 clients the full ensemble reaches 95.3% average detection versus 91.8% for plain MultiKrum, with the biggest jump on Sparse Lie (+23 points). Those are the concrete wins the paper actually delivers. The cost function is just pairwise distances, so there is no hidden fitting to the accuracy metric. The authors are also upfront that solution quality drops as client count grows, which is why they added the router. The soft spots are the missing pieces that make the quantum claim hard to judge. There are no error bars, no hyper-parameter details, and no ablation that isolates the annealer from the classical routing and voting. The stress-test concern holds: if a good classical QUBO heuristic already finds comparable subsets on the same distance objective, then the reported lift at n=100 is mostly from the MultiSignal logic rather than the quantum hardware. Without that separation it is difficult to know how much the annealer is really contributing once client numbers move beyond toy scales. This is worth a serious referee for anyone working on robust aggregation in federated learning. The formulation is novel enough and the attack-regime results are specific enough that a reviewer can ask for the missing controls and hardware comparisons without the paper being dismissed outright. I would bring it to a reading group to walk through the QUBO encoding and the routing logic, but I would not cite it until the quantum versus classical gap is shown more clearly.

Referee Report

3 major / 2 minor

Summary. The paper claims that reformulating Byzantine client selection in federated learning as a QUBO problem solved on quantum annealers outperforms greedy methods like MultiKrum on challenging attacks at small scale (15 clients), and that a MultiSignal ensemble with a dual-feature routing gate (Euclidean and cosine Krum score gaps) plus suspicion-penalized QUBO and agreement voting extends this to 100 clients, achieving 95.3% average detection accuracy on MNIST versus 91.8% for MultiKrum, with gains of +23.2 points on Sparse Lie and +4.8 on Advanced Lie.

Significance. If the experimental claims hold under rigorous controls, the work would demonstrate a principled way to leverage joint subset optimization via QUBO for Byzantine resilience, showing complementarity with classical methods and a scalable routing mechanism. The explicit encoding of mutual consistency in the cost function and the attack-regime routing are conceptually attractive strengths.

major comments (3)

[Abstract / Experimental evaluation at scale] Abstract and large-scale results: the headline 95.3% vs 91.8% detection accuracies at 100 clients on MNIST are reported without error bars, hyper-parameter tuning protocols, or ablation of the routing gate and agreement voting; these omissions make it impossible to assess whether the gains are statistically reliable or driven by the QUBO component.
[Large-scale experiments (100 clients)] Large-scale experiments: the manuscript states that QUBO solution quality degrades with client count and that the formulation jointly optimizes for the closest group of m clients, yet supplies no quantitative metrics (e.g., energy gap or solution quality vs. classical solvers) for n=100; without this, the +23.2 point gain on Sparse Lie cannot be confidently attributed to quantum annealing rather than the classical routing and voting logic.
[QUBO formulation and evaluation methodology] QUBO formulation and evaluation: performance is measured solely by end-to-end detection accuracy on fixed public datasets via direct comparison to MultiKrum; because the cost function directly encodes pairwise distances rather than being fitted to the target metric, the evaluation risks circularity and does not test whether the annealer actually returns higher-quality mutually consistent subsets than classical QUBO heuristics.

minor comments (2)

[Abstract] Abstract: the statement that 'QUBO fares poorly on simpler attacks' is left without the corresponding numerical values or attack names, reducing completeness.
[MultiSignal ensemble description] Notation: the definition of the suspicion-penalized QUBO objective and the precise four-regime classification logic of the routing gate would benefit from an explicit equation or pseudocode block for reproducibility.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, outlining planned revisions where appropriate to strengthen the experimental reporting and clarify the role of the QUBO component.

read point-by-point responses

Referee: [Abstract / Experimental evaluation at scale] Abstract and large-scale results: the headline 95.3% vs 91.8% detection accuracies at 100 clients on MNIST are reported without error bars, hyper-parameter tuning protocols, or ablation of the routing gate and agreement voting; these omissions make it impossible to assess whether the gains are statistically reliable or driven by the QUBO component.

Authors: We agree that the current presentation lacks error bars, explicit hyper-parameter tuning details, and ablations, which limits assessment of reliability and component contributions. In the revised manuscript we will report mean detection accuracies with standard deviations computed over five independent random seeds for both the 15-client and 100-client regimes. We will also add a dedicated paragraph describing the grid search used to select routing-gate thresholds (Euclidean and cosine Krum-score gaps) and the suspicion penalty coefficient in the QUBO objective. Finally, we will include a new ablation table that isolates the routing gate and the agreement-voting step, showing their incremental effect on the Sparse Lie and Advanced Lie gains. These additions will make it possible to evaluate statistical significance and the specific contribution of the QUBO-based selection. revision: yes
Referee: [Large-scale experiments (100 clients)] Large-scale experiments: the manuscript states that QUBO solution quality degrades with client count and that the formulation jointly optimizes for the closest group of m clients, yet supplies no quantitative metrics (e.g., energy gap or solution quality vs. classical solvers) for n=100; without this, the +23.2 point gain on Sparse Lie cannot be confidently attributed to quantum annealing rather than the classical routing and voting logic.

Authors: We acknowledge that the manuscript does not supply energy-gap or solver-comparison metrics at n=100. The reported degradation statement is qualitative, and the +23.2-point gain on Sparse Lie is achieved by the full MultiSignal ensemble rather than by a standalone QUBO solve. In revision we will expand the discussion of scale limitations, explicitly stating that the routing gate directs only the most challenging attack regimes to the penalized QUBO and that the observed improvement therefore reflects the hybrid design. We will also add a brief analysis of subset consistency (average pairwise distance within the selected m clients) for the routed cases to provide indirect evidence that the QUBO component contributes to the gain. However, new quantum-annealing runs with direct classical-solver comparisons at n=100 are not feasible within the revision timeline due to hardware-access constraints. revision: partial
Referee: [QUBO formulation and evaluation methodology] QUBO formulation and evaluation: performance is measured solely by end-to-end detection accuracy on fixed public datasets via direct comparison to MultiKrum; because the cost function directly encodes pairwise distances rather than being fitted to the target metric, the evaluation risks circularity and does not test whether the annealer actually returns higher-quality mutually consistent subsets than classical QUBO heuristics.

Authors: The QUBO objective is deliberately constructed to minimize the sum of pairwise Euclidean distances among the selected clients, thereby directly encoding the joint mutual-consistency criterion that distinguishes it from MultiKrum’s per-client greedy scoring. End-to-end detection accuracy is the appropriate primary metric for a Byzantine-resilience method. To address the circularity concern we will add a supplementary experiment that compares the internal consistency (mean pairwise distance and variance of the selected subset) achieved by the quantum annealer against two classical QUBO heuristics—simulated annealing and a greedy pairwise-distance baseline—on identical problem instances drawn from the 100-client MNIST runs. This will demonstrate that the annealer returns higher-quality mutually consistent subsets independent of the downstream detection task. revision: yes

standing simulated objections not resolved

Quantitative metrics on QUBO solution quality (energy gaps or direct comparisons versus classical solvers) for the n=100 regime, owing to hardware-access limitations that prevent new annealing runs within the revision period.

Circularity Check

0 steps flagged

No significant circularity; direct reformulation and empirical benchmarking

full rationale

The paper reformulates client selection as a QUBO by directly encoding pairwise distances between updates into the cost function, then solves for the mutually closest group of m clients. This is a standard mathematical mapping rather than a self-referential definition or fitted parameter. Detection performance is measured by direct comparison against MultiKrum on fixed public datasets (MNIST, CIFAR-10) at stated client scales, without tuning any QUBO coefficients or routing thresholds to the target accuracy numbers. The MultiSignal ensemble is introduced explicitly to mitigate acknowledged QUBO degradation at n=100, using Krum score gaps as routing features. No load-bearing self-citations, uniqueness theorems, or ansatzes from prior author work appear in the derivation. The chain is self-contained against external benchmarks and does not reduce any claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that honest client updates form tighter clusters than malicious ones and on the engineering choice of a fixed subset size m; no new physical entities or free parameters fitted to the final accuracy are introduced in the abstract.

free parameters (1)

subset size m
The number of clients to select is a modeling choice that must be set in advance and directly shapes the QUBO objective.

axioms (1)

domain assumption Honest updates are more similar to one another than to malicious updates under Euclidean and cosine distances.
This is the premise that allows pairwise distances to serve as a proxy for honesty inside the QUBO cost function.

pith-pipeline@v0.9.0 · 5915 in / 1406 out tokens · 64778 ms · 2026-05-20T19:52:31.908080+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

We propose a quantum annealing approach that reformulates client selection as a Quadratic Unconstrained Binary Optimization (QUBO) problem, encoding pairwise distances into a cost function solved by quantum annealers (QA).
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

The MultiSignal ensemble ... routes evasion attacks to a suspicion-penalized QUBO with agreement voting.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 1 internal anchor

[1]

Communication-efficient learning of deep networks from decentralized data,

H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), ser. Proceedings of Machine Learning Research, vol. 54, 2017, pp. 1273–1282

work page 2017
[2]

Advances and open problems in federated learning,

P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Benniset al., “Advances and open problems in federated learning,”Foundations and Trends in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021

work page 2021
[3]

Federated machine learning: Concept and applications,

Q. Yang, Y . Liu, T. Chen, and Y . Tong, “Federated machine learning: Concept and applications,”ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, pp. 1–19, 2019

work page 2019
[4]

Machine learning with adversaries: Byzantine tolerant gradient descent,

P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine learning with adversaries: Byzantine tolerant gradient descent,” inAd- vances in Neural Information Processing Systems 30 (NeurIPS), 2017

work page 2017
[5]

A little is enough: Circum- venting defenses for distributed learning,

M. Baruch, G. Baruch, and Y . Goldberg, “A little is enough: Circum- venting defenses for distributed learning,” 2019

work page 2019
[6]

G. A. Kochenberger, J.-K. Hao, F. Glover, M. Lewis, Z. L ¨u, H. Wang, and Y . Wang,The Unconstrained Binary Quadratic Programming Prob- lem: A Survey. Annals of Operations Research, 2014, vol. 222

work page 2014
[7]

Adiabatic quantum computation and quantum annealing,

T. Albash and D. A. Lidar, “Adiabatic quantum computation and quantum annealing,”Reviews of Modern Physics, vol. 90, no. 1, p. 015002, 2018

work page 2018
[8]

MNIST handwritten digit database,

Y . LeCun and C. Cortes, “MNIST handwritten digit database,” http://yann.lecun.com/exdb/mnist/, 2010. [Online]. Available: http: //yann.lecun.com/exdb/mnist/

work page 2010
[9]

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

H. Xiao, K. Rasul, and R. V ollgraf, “Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,”arXiv preprint arXiv:1708.07747, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[10]

Learning multiple layers of features from tiny images,

A. Krizhevsky, “Learning multiple layers of features from tiny images,” University of Toronto, Technical Report, 2009. [Online]. Available: https://www.cs.toronto.edu/∼kriz/cifar.html

work page 2009
[11]

Beyond classical multi-krum: Enhancing byzantine detection with quantum embeddings,

A. Ferenczi, D. Wang, S. Samanta, and T. Hodges, “Beyond classical multi-krum: Enhancing byzantine detection with quantum embeddings,” in2025 IEEE International Conference on Quantum Artificial Intelli- gence (QAI), Naples, Italy, 2025, pp. 49–54

work page 2025
[12]

The hid- den vulnerability of distributed learning in byzantium,

E. M. El Mhamdi, R. Guerraoui, A. Guirguis, and S. Rouault, “The hid- den vulnerability of distributed learning in byzantium,” inProceedings of the 1st Conference on Machine Learning and Systems (MLSys), 2018

work page 2018
[13]

Byzantine-robust distributed learning: Towards optimal statistical rates,

D. Yin, Y . Chen, K. Ramchandran, and P. Bartlett, “Byzantine-robust distributed learning: Towards optimal statistical rates,” inProceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, 2018, pp. 5650–5659

work page 2018
[14]

DRACO: Byzantine-resilient distributed training via redundant gradients,

L. Chen, H. Wang, Z. Charles, and D. Papailiopoulos, “DRACO: Byzantine-resilient distributed training via redundant gradients,” inPro- ceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, 2018, pp. 903–912

work page 2018
[15]

FLAME: Taming backdoors in federated learning,

T. D. Nguyenet al., “FLAME: Taming backdoors in federated learning,” arXiv preprint arXiv:2101.02281, 2021

work page arXiv 2021
[16]

Sentinel: An aggregation function to secure decentralized federated learning,

C. Fenget al., “Sentinel: An aggregation function to secure decentralized federated learning,”arXiv preprint arXiv:2310.08097, 2023

work page arXiv 2023
[17]

Secure federated learning against model poisoning attacks via cosine similarity-based outlier detection,

Y . Xu and L. Lyu, “Secure federated learning against model poisoning attacks via cosine similarity-based outlier detection,”arXiv preprint arXiv:2304.00160, 2023

work page arXiv 2023
[18]

Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation,

Z. Zhang, Q. Su, and X. Sun, “Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation,” in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 339–354

work page 2022
[19]

Hy- brid quantum–classical benders’ decomposition for federated learning scheduling in distributed networks,

X. Wei, L. Fan, Y . Guo, Y . Gong, Z. Han, and Y . Wang, “Hy- brid quantum–classical benders’ decomposition for federated learning scheduling in distributed networks,”IEEE Transactions on Network Science and Engineering, vol. 11, no. 6, pp. 6038–6051, 2024

work page 2024
[20]

Quantum assisted scheduling algorithm for federated learning in distributed networks,

X. Wei, L. Fan, Y . Guo, Z. Han, and Y . Wang, “Quantum assisted scheduling algorithm for federated learning in distributed networks,” in 2023 32nd International Conference on Computer Communications and Networks (ICCCN), 2023, pp. 1–10

work page 2023
[21]

Quantum-enhanced blockchain federated learning via quantum byzantine agreement,

H.-W. Liu, Z.-P. Liu, H.-L. Yin, and Z.-B. Chen, “Quantum-enhanced blockchain federated learning via quantum byzantine agreement,”Sci- ence China Information Sciences, vol. 68, 2025

work page 2025
[22]

Zhang, C

Y . Zhang, C. Zhang, C. Zhang, L. Fan, B. Zeng, and Q. Yang, “Federated learning with quantum secure aggregation,”arXiv preprint arXiv:2207.07444, 2022

work page arXiv 2022
[23]

Defending against byzantine attacks in quantum federated learning,

Q. Xia, Z. Tao, and Q. Li, “Defending against byzantine attacks in quantum federated learning,” inProceedings of the 17th International Conference on Mobility, Sensing and Networking (MSN), 2021, pp. 145– 152

work page 2021
[24]

Hybrid quantum enhanced fed- erated learning for cyber attack detection,

G. Subramanian and M. Chinnadurai, “Hybrid quantum enhanced fed- erated learning for cyber attack detection,”Scientific Reports, vol. 14, p. 32038, 2024

work page 2024
[25]

Ising formulations of many NP problems,

A. Lucas, “Ising formulations of many NP problems,”Frontiers in Physics, vol. 2, p. 5, 2014

work page 2014

[1] [1]

Communication-efficient learning of deep networks from decentralized data,

H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), ser. Proceedings of Machine Learning Research, vol. 54, 2017, pp. 1273–1282

work page 2017

[2] [2]

Advances and open problems in federated learning,

P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Benniset al., “Advances and open problems in federated learning,”Foundations and Trends in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021

work page 2021

[3] [3]

Federated machine learning: Concept and applications,

Q. Yang, Y . Liu, T. Chen, and Y . Tong, “Federated machine learning: Concept and applications,”ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, pp. 1–19, 2019

work page 2019

[4] [4]

Machine learning with adversaries: Byzantine tolerant gradient descent,

P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine learning with adversaries: Byzantine tolerant gradient descent,” inAd- vances in Neural Information Processing Systems 30 (NeurIPS), 2017

work page 2017

[5] [5]

A little is enough: Circum- venting defenses for distributed learning,

M. Baruch, G. Baruch, and Y . Goldberg, “A little is enough: Circum- venting defenses for distributed learning,” 2019

work page 2019

[6] [6]

G. A. Kochenberger, J.-K. Hao, F. Glover, M. Lewis, Z. L ¨u, H. Wang, and Y . Wang,The Unconstrained Binary Quadratic Programming Prob- lem: A Survey. Annals of Operations Research, 2014, vol. 222

work page 2014

[7] [7]

Adiabatic quantum computation and quantum annealing,

T. Albash and D. A. Lidar, “Adiabatic quantum computation and quantum annealing,”Reviews of Modern Physics, vol. 90, no. 1, p. 015002, 2018

work page 2018

[8] [8]

MNIST handwritten digit database,

Y . LeCun and C. Cortes, “MNIST handwritten digit database,” http://yann.lecun.com/exdb/mnist/, 2010. [Online]. Available: http: //yann.lecun.com/exdb/mnist/

work page 2010

[9] [9]

Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

H. Xiao, K. Rasul, and R. V ollgraf, “Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,”arXiv preprint arXiv:1708.07747, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[10] [10]

Learning multiple layers of features from tiny images,

A. Krizhevsky, “Learning multiple layers of features from tiny images,” University of Toronto, Technical Report, 2009. [Online]. Available: https://www.cs.toronto.edu/∼kriz/cifar.html

work page 2009

[11] [11]

Beyond classical multi-krum: Enhancing byzantine detection with quantum embeddings,

A. Ferenczi, D. Wang, S. Samanta, and T. Hodges, “Beyond classical multi-krum: Enhancing byzantine detection with quantum embeddings,” in2025 IEEE International Conference on Quantum Artificial Intelli- gence (QAI), Naples, Italy, 2025, pp. 49–54

work page 2025

[12] [12]

The hid- den vulnerability of distributed learning in byzantium,

E. M. El Mhamdi, R. Guerraoui, A. Guirguis, and S. Rouault, “The hid- den vulnerability of distributed learning in byzantium,” inProceedings of the 1st Conference on Machine Learning and Systems (MLSys), 2018

work page 2018

[13] [13]

Byzantine-robust distributed learning: Towards optimal statistical rates,

D. Yin, Y . Chen, K. Ramchandran, and P. Bartlett, “Byzantine-robust distributed learning: Towards optimal statistical rates,” inProceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, 2018, pp. 5650–5659

work page 2018

[14] [14]

DRACO: Byzantine-resilient distributed training via redundant gradients,

L. Chen, H. Wang, Z. Charles, and D. Papailiopoulos, “DRACO: Byzantine-resilient distributed training via redundant gradients,” inPro- ceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, 2018, pp. 903–912

work page 2018

[15] [15]

FLAME: Taming backdoors in federated learning,

T. D. Nguyenet al., “FLAME: Taming backdoors in federated learning,” arXiv preprint arXiv:2101.02281, 2021

work page arXiv 2021

[16] [16]

Sentinel: An aggregation function to secure decentralized federated learning,

C. Fenget al., “Sentinel: An aggregation function to secure decentralized federated learning,”arXiv preprint arXiv:2310.08097, 2023

work page arXiv 2023

[17] [17]

Secure federated learning against model poisoning attacks via cosine similarity-based outlier detection,

Y . Xu and L. Lyu, “Secure federated learning against model poisoning attacks via cosine similarity-based outlier detection,”arXiv preprint arXiv:2304.00160, 2023

work page arXiv 2023

[18] [18]

Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation,

Z. Zhang, Q. Su, and X. Sun, “Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation,” in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 339–354

work page 2022

[19] [19]

Hy- brid quantum–classical benders’ decomposition for federated learning scheduling in distributed networks,

X. Wei, L. Fan, Y . Guo, Y . Gong, Z. Han, and Y . Wang, “Hy- brid quantum–classical benders’ decomposition for federated learning scheduling in distributed networks,”IEEE Transactions on Network Science and Engineering, vol. 11, no. 6, pp. 6038–6051, 2024

work page 2024

[20] [20]

Quantum assisted scheduling algorithm for federated learning in distributed networks,

X. Wei, L. Fan, Y . Guo, Z. Han, and Y . Wang, “Quantum assisted scheduling algorithm for federated learning in distributed networks,” in 2023 32nd International Conference on Computer Communications and Networks (ICCCN), 2023, pp. 1–10

work page 2023

[21] [21]

Quantum-enhanced blockchain federated learning via quantum byzantine agreement,

H.-W. Liu, Z.-P. Liu, H.-L. Yin, and Z.-B. Chen, “Quantum-enhanced blockchain federated learning via quantum byzantine agreement,”Sci- ence China Information Sciences, vol. 68, 2025

work page 2025

[22] [22]

Zhang, C

Y . Zhang, C. Zhang, C. Zhang, L. Fan, B. Zeng, and Q. Yang, “Federated learning with quantum secure aggregation,”arXiv preprint arXiv:2207.07444, 2022

work page arXiv 2022

[23] [23]

Defending against byzantine attacks in quantum federated learning,

Q. Xia, Z. Tao, and Q. Li, “Defending against byzantine attacks in quantum federated learning,” inProceedings of the 17th International Conference on Mobility, Sensing and Networking (MSN), 2021, pp. 145– 152

work page 2021

[24] [24]

Hybrid quantum enhanced fed- erated learning for cyber attack detection,

G. Subramanian and M. Chinnadurai, “Hybrid quantum enhanced fed- erated learning for cyber attack detection,”Scientific Reports, vol. 14, p. 32038, 2024

work page 2024

[25] [25]

Ising formulations of many NP problems,

A. Lucas, “Ising formulations of many NP problems,”Frontiers in Physics, vol. 2, p. 5, 2014

work page 2014