pith. sign in

arxiv: 2605.16438 · v1 · pith:ZHFMYAHDnew · submitted 2026-05-14 · 💻 cs.LG · cs.AI

Byzantine-Resilient Federated Learning via QUBO-Based Client Selection on Quantum Annealers

Pith reviewed 2026-05-20 19:52 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords Byzantine resilienceFederated learningQUBOQuantum annealingClient selectionMultiKrumMultiSignal ensemble
0
0 comments X

The pith

Reformulating client selection as a QUBO problem solved on quantum annealers improves detection of advanced Byzantine attacks over classical MultiKrum in federated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tries to show that turning client selection into a QUBO optimization task lets quantum annealers find a globally consistent set of honest updates by minimizing pairwise distances across all subsets at once. This joint approach is meant to catch sophisticated attacks that preserve local statistics and fool per-client scoring like MultiKrum. To scale beyond small client counts where pure QUBO degrades, the authors add a MultiSignal ensemble that routes attacks using gaps in Euclidean and cosine scores and applies a penalized QUBO variant with voting. A sympathetic reader would care because federated learning relies on untrusted clients yet current defenses leave gaps against evasion strategies that mimic honest gradients.

Core claim

Encoding pairwise distances between client updates into a QUBO cost function and solving it on quantum annealers finds the mutually closest group of m clients, outperforming MultiKrum on hard attacks such as Advanced LIE at 15 clients and, through the MultiSignal ensemble, reaching 95.3 percent average detection accuracy at 100 clients on MNIST versus 91.8 percent for classical MultiKrum.

What carries the argument

The QUBO cost function that encodes pairwise distances for joint optimization over client subsets, solved by quantum annealers to identify the most mutually consistent honest group rather than scoring clients individually.

Load-bearing premise

Encoding pairwise distances into a QUBO will produce a globally optimal set of mutually consistent honest clients and the quantum annealer will return high-quality solutions even as the number of clients grows beyond small scales.

What would settle it

A scaling experiment at 100-plus clients in which the pure QUBO or MultiSignal ensemble fails to exceed MultiKrum accuracy on Sparse Lie or Advanced Lie attacks.

Figures

Figures reproduced from arXiv: 2605.16438 by (2) Columbia University), Andras Ferenczi (1), Dagen Wang (1), Jason Qizhe Qin (2) ((1) American Express Co., Sutapa Samanta (1).

Figure 1
Figure 1. Figure 1: MNIST: Round-by-round detection under ALIE attack. [PITH_FULL_IMAGE:figures/full_fig_p006_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: MNIST: Round-by-round detection under Sparse Lie attack. [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗
Figure 5
Figure 5. Figure 5: FashionMNIST: Round-by-round detection under ALIE attack. [PITH_FULL_IMAGE:figures/full_fig_p007_5.png] view at source ↗
Figure 4
Figure 4. Figure 4: CIFAR-10: Round-by-round detection under Sparse Lie attack. [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
read the original abstract

Federated Learning (FL) trains a global model across decentralized clients while preserving data privacy, but at scale it is vulnerable to malicious updates. Byzantine-resilient aggregation methods such as MultiKrum score gradients against their nearest neighbors and can miss malicious updates that preserve the statistical properties of honest ones. We propose a quantum annealing approach that reformulates client selection as a Quadratic Unconstrained Binary Optimization (QUBO) problem, encoding pairwise distances into a cost function solved by quantum annealers (QA). Unlike MultiKrum's greedy per-client scoring, the QUBO formulation jointly optimizes over all subsets to find the mutually closest group of $m$ clients. At small scale (15 clients), QUBO outperforms MultiKrum on the most challenging Byzantine attacks: e.g., Advanced LIE is detected with 95.11% accuracy versus 81.33% on MNIST and 97.78% versus 75.56% on CIFAR-10. QUBO fares poorly on simpler attacks where MultiKrum excels, so the two methods are complementary. QUBO quality also degrades as the number of clients grows. To address this, we introduce a MultiSignal ensemble that uses a dual-feature routing gate based on Euclidean and cosine Krum score gaps to classify attacks into four regimes and routes evasion attacks to a suspicion-penalized QUBO with agreement voting. At 100 clients on MNIST, MultiSignal achieves 95.3% average detection accuracy versus 91.8% for classical MultiKrum, with the largest gains on Sparse Lie (72.0% to 95.2%, +23.2 points) and Advanced Lie (80.4% to 85.2%, +4.8 points). These results show that QUBO-based quantum annealing with MultiSignal is a principled and scalable defense against the most challenging Byzantine strategies in federated learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper claims that reformulating Byzantine client selection in federated learning as a QUBO problem solved on quantum annealers outperforms greedy methods like MultiKrum on challenging attacks at small scale (15 clients), and that a MultiSignal ensemble with a dual-feature routing gate (Euclidean and cosine Krum score gaps) plus suspicion-penalized QUBO and agreement voting extends this to 100 clients, achieving 95.3% average detection accuracy on MNIST versus 91.8% for MultiKrum, with gains of +23.2 points on Sparse Lie and +4.8 on Advanced Lie.

Significance. If the experimental claims hold under rigorous controls, the work would demonstrate a principled way to leverage joint subset optimization via QUBO for Byzantine resilience, showing complementarity with classical methods and a scalable routing mechanism. The explicit encoding of mutual consistency in the cost function and the attack-regime routing are conceptually attractive strengths.

major comments (3)
  1. [Abstract / Experimental evaluation at scale] Abstract and large-scale results: the headline 95.3% vs 91.8% detection accuracies at 100 clients on MNIST are reported without error bars, hyper-parameter tuning protocols, or ablation of the routing gate and agreement voting; these omissions make it impossible to assess whether the gains are statistically reliable or driven by the QUBO component.
  2. [Large-scale experiments (100 clients)] Large-scale experiments: the manuscript states that QUBO solution quality degrades with client count and that the formulation jointly optimizes for the closest group of m clients, yet supplies no quantitative metrics (e.g., energy gap or solution quality vs. classical solvers) for n=100; without this, the +23.2 point gain on Sparse Lie cannot be confidently attributed to quantum annealing rather than the classical routing and voting logic.
  3. [QUBO formulation and evaluation methodology] QUBO formulation and evaluation: performance is measured solely by end-to-end detection accuracy on fixed public datasets via direct comparison to MultiKrum; because the cost function directly encodes pairwise distances rather than being fitted to the target metric, the evaluation risks circularity and does not test whether the annealer actually returns higher-quality mutually consistent subsets than classical QUBO heuristics.
minor comments (2)
  1. [Abstract] Abstract: the statement that 'QUBO fares poorly on simpler attacks' is left without the corresponding numerical values or attack names, reducing completeness.
  2. [MultiSignal ensemble description] Notation: the definition of the suspicion-penalized QUBO objective and the precise four-regime classification logic of the routing gate would benefit from an explicit equation or pseudocode block for reproducibility.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, outlining planned revisions where appropriate to strengthen the experimental reporting and clarify the role of the QUBO component.

read point-by-point responses
  1. Referee: [Abstract / Experimental evaluation at scale] Abstract and large-scale results: the headline 95.3% vs 91.8% detection accuracies at 100 clients on MNIST are reported without error bars, hyper-parameter tuning protocols, or ablation of the routing gate and agreement voting; these omissions make it impossible to assess whether the gains are statistically reliable or driven by the QUBO component.

    Authors: We agree that the current presentation lacks error bars, explicit hyper-parameter tuning details, and ablations, which limits assessment of reliability and component contributions. In the revised manuscript we will report mean detection accuracies with standard deviations computed over five independent random seeds for both the 15-client and 100-client regimes. We will also add a dedicated paragraph describing the grid search used to select routing-gate thresholds (Euclidean and cosine Krum-score gaps) and the suspicion penalty coefficient in the QUBO objective. Finally, we will include a new ablation table that isolates the routing gate and the agreement-voting step, showing their incremental effect on the Sparse Lie and Advanced Lie gains. These additions will make it possible to evaluate statistical significance and the specific contribution of the QUBO-based selection. revision: yes

  2. Referee: [Large-scale experiments (100 clients)] Large-scale experiments: the manuscript states that QUBO solution quality degrades with client count and that the formulation jointly optimizes for the closest group of m clients, yet supplies no quantitative metrics (e.g., energy gap or solution quality vs. classical solvers) for n=100; without this, the +23.2 point gain on Sparse Lie cannot be confidently attributed to quantum annealing rather than the classical routing and voting logic.

    Authors: We acknowledge that the manuscript does not supply energy-gap or solver-comparison metrics at n=100. The reported degradation statement is qualitative, and the +23.2-point gain on Sparse Lie is achieved by the full MultiSignal ensemble rather than by a standalone QUBO solve. In revision we will expand the discussion of scale limitations, explicitly stating that the routing gate directs only the most challenging attack regimes to the penalized QUBO and that the observed improvement therefore reflects the hybrid design. We will also add a brief analysis of subset consistency (average pairwise distance within the selected m clients) for the routed cases to provide indirect evidence that the QUBO component contributes to the gain. However, new quantum-annealing runs with direct classical-solver comparisons at n=100 are not feasible within the revision timeline due to hardware-access constraints. revision: partial

  3. Referee: [QUBO formulation and evaluation methodology] QUBO formulation and evaluation: performance is measured solely by end-to-end detection accuracy on fixed public datasets via direct comparison to MultiKrum; because the cost function directly encodes pairwise distances rather than being fitted to the target metric, the evaluation risks circularity and does not test whether the annealer actually returns higher-quality mutually consistent subsets than classical QUBO heuristics.

    Authors: The QUBO objective is deliberately constructed to minimize the sum of pairwise Euclidean distances among the selected clients, thereby directly encoding the joint mutual-consistency criterion that distinguishes it from MultiKrum’s per-client greedy scoring. End-to-end detection accuracy is the appropriate primary metric for a Byzantine-resilience method. To address the circularity concern we will add a supplementary experiment that compares the internal consistency (mean pairwise distance and variance of the selected subset) achieved by the quantum annealer against two classical QUBO heuristics—simulated annealing and a greedy pairwise-distance baseline—on identical problem instances drawn from the 100-client MNIST runs. This will demonstrate that the annealer returns higher-quality mutually consistent subsets independent of the downstream detection task. revision: yes

standing simulated objections not resolved
  • Quantitative metrics on QUBO solution quality (energy gaps or direct comparisons versus classical solvers) for the n=100 regime, owing to hardware-access limitations that prevent new annealing runs within the revision period.

Circularity Check

0 steps flagged

No significant circularity; direct reformulation and empirical benchmarking

full rationale

The paper reformulates client selection as a QUBO by directly encoding pairwise distances between updates into the cost function, then solves for the mutually closest group of m clients. This is a standard mathematical mapping rather than a self-referential definition or fitted parameter. Detection performance is measured by direct comparison against MultiKrum on fixed public datasets (MNIST, CIFAR-10) at stated client scales, without tuning any QUBO coefficients or routing thresholds to the target accuracy numbers. The MultiSignal ensemble is introduced explicitly to mitigate acknowledged QUBO degradation at n=100, using Krum score gaps as routing features. No load-bearing self-citations, uniqueness theorems, or ansatzes from prior author work appear in the derivation. The chain is self-contained against external benchmarks and does not reduce any claimed result to its own inputs by construction.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that honest client updates form tighter clusters than malicious ones and on the engineering choice of a fixed subset size m; no new physical entities or free parameters fitted to the final accuracy are introduced in the abstract.

free parameters (1)
  • subset size m
    The number of clients to select is a modeling choice that must be set in advance and directly shapes the QUBO objective.
axioms (1)
  • domain assumption Honest updates are more similar to one another than to malicious updates under Euclidean and cosine distances.
    This is the premise that allows pairwise distances to serve as a proxy for honesty inside the QUBO cost function.

pith-pipeline@v0.9.0 · 5915 in / 1406 out tokens · 64778 ms · 2026-05-20T19:52:31.908080+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages · 1 internal anchor

  1. [1]

    Communication-efficient learning of deep networks from decentralized data,

    H. B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” inProceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), ser. Proceedings of Machine Learning Research, vol. 54, 2017, pp. 1273–1282

  2. [2]

    Advances and open problems in federated learning,

    P. Kairouz, H. B. McMahan, B. Avent, A. Bellet, M. Benniset al., “Advances and open problems in federated learning,”Foundations and Trends in Machine Learning, vol. 14, no. 1–2, pp. 1–210, 2021

  3. [3]

    Federated machine learning: Concept and applications,

    Q. Yang, Y . Liu, T. Chen, and Y . Tong, “Federated machine learning: Concept and applications,”ACM Transactions on Intelligent Systems and Technology, vol. 10, no. 2, pp. 1–19, 2019

  4. [4]

    Machine learning with adversaries: Byzantine tolerant gradient descent,

    P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine learning with adversaries: Byzantine tolerant gradient descent,” inAd- vances in Neural Information Processing Systems 30 (NeurIPS), 2017

  5. [5]

    A little is enough: Circum- venting defenses for distributed learning,

    M. Baruch, G. Baruch, and Y . Goldberg, “A little is enough: Circum- venting defenses for distributed learning,” 2019

  6. [6]

    G. A. Kochenberger, J.-K. Hao, F. Glover, M. Lewis, Z. L ¨u, H. Wang, and Y . Wang,The Unconstrained Binary Quadratic Programming Prob- lem: A Survey. Annals of Operations Research, 2014, vol. 222

  7. [7]

    Adiabatic quantum computation and quantum annealing,

    T. Albash and D. A. Lidar, “Adiabatic quantum computation and quantum annealing,”Reviews of Modern Physics, vol. 90, no. 1, p. 015002, 2018

  8. [8]

    MNIST handwritten digit database,

    Y . LeCun and C. Cortes, “MNIST handwritten digit database,” http://yann.lecun.com/exdb/mnist/, 2010. [Online]. Available: http: //yann.lecun.com/exdb/mnist/

  9. [9]

    Fashion-MNIST: a Novel Image Dataset for Benchmarking Machine Learning Algorithms

    H. Xiao, K. Rasul, and R. V ollgraf, “Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms,”arXiv preprint arXiv:1708.07747, 2017

  10. [10]

    Learning multiple layers of features from tiny images,

    A. Krizhevsky, “Learning multiple layers of features from tiny images,” University of Toronto, Technical Report, 2009. [Online]. Available: https://www.cs.toronto.edu/∼kriz/cifar.html

  11. [11]

    Beyond classical multi-krum: Enhancing byzantine detection with quantum embeddings,

    A. Ferenczi, D. Wang, S. Samanta, and T. Hodges, “Beyond classical multi-krum: Enhancing byzantine detection with quantum embeddings,” in2025 IEEE International Conference on Quantum Artificial Intelli- gence (QAI), Naples, Italy, 2025, pp. 49–54

  12. [12]

    The hid- den vulnerability of distributed learning in byzantium,

    E. M. El Mhamdi, R. Guerraoui, A. Guirguis, and S. Rouault, “The hid- den vulnerability of distributed learning in byzantium,” inProceedings of the 1st Conference on Machine Learning and Systems (MLSys), 2018

  13. [13]

    Byzantine-robust distributed learning: Towards optimal statistical rates,

    D. Yin, Y . Chen, K. Ramchandran, and P. Bartlett, “Byzantine-robust distributed learning: Towards optimal statistical rates,” inProceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, 2018, pp. 5650–5659

  14. [14]

    DRACO: Byzantine-resilient distributed training via redundant gradients,

    L. Chen, H. Wang, Z. Charles, and D. Papailiopoulos, “DRACO: Byzantine-resilient distributed training via redundant gradients,” inPro- ceedings of the 35th International Conference on Machine Learning (ICML), vol. 80, 2018, pp. 903–912

  15. [15]

    FLAME: Taming backdoors in federated learning,

    T. D. Nguyenet al., “FLAME: Taming backdoors in federated learning,” arXiv preprint arXiv:2101.02281, 2021

  16. [16]

    Sentinel: An aggregation function to secure decentralized federated learning,

    C. Fenget al., “Sentinel: An aggregation function to secure decentralized federated learning,”arXiv preprint arXiv:2310.08097, 2023

  17. [17]

    Secure federated learning against model poisoning attacks via cosine similarity-based outlier detection,

    Y . Xu and L. Lyu, “Secure federated learning against model poisoning attacks via cosine similarity-based outlier detection,”arXiv preprint arXiv:2304.00160, 2023

  18. [18]

    Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation,

    Z. Zhang, Q. Su, and X. Sun, “Dim-krum: Backdoor-resistant federated learning for NLP with dimension-wise krum-based aggregation,” in Findings of the Association for Computational Linguistics: EMNLP 2022, 2022, pp. 339–354

  19. [19]

    Hy- brid quantum–classical benders’ decomposition for federated learning scheduling in distributed networks,

    X. Wei, L. Fan, Y . Guo, Y . Gong, Z. Han, and Y . Wang, “Hy- brid quantum–classical benders’ decomposition for federated learning scheduling in distributed networks,”IEEE Transactions on Network Science and Engineering, vol. 11, no. 6, pp. 6038–6051, 2024

  20. [20]

    Quantum assisted scheduling algorithm for federated learning in distributed networks,

    X. Wei, L. Fan, Y . Guo, Z. Han, and Y . Wang, “Quantum assisted scheduling algorithm for federated learning in distributed networks,” in 2023 32nd International Conference on Computer Communications and Networks (ICCCN), 2023, pp. 1–10

  21. [21]

    Quantum-enhanced blockchain federated learning via quantum byzantine agreement,

    H.-W. Liu, Z.-P. Liu, H.-L. Yin, and Z.-B. Chen, “Quantum-enhanced blockchain federated learning via quantum byzantine agreement,”Sci- ence China Information Sciences, vol. 68, 2025

  22. [22]

    Zhang, C

    Y . Zhang, C. Zhang, C. Zhang, L. Fan, B. Zeng, and Q. Yang, “Federated learning with quantum secure aggregation,”arXiv preprint arXiv:2207.07444, 2022

  23. [23]

    Defending against byzantine attacks in quantum federated learning,

    Q. Xia, Z. Tao, and Q. Li, “Defending against byzantine attacks in quantum federated learning,” inProceedings of the 17th International Conference on Mobility, Sensing and Networking (MSN), 2021, pp. 145– 152

  24. [24]

    Hybrid quantum enhanced fed- erated learning for cyber attack detection,

    G. Subramanian and M. Chinnadurai, “Hybrid quantum enhanced fed- erated learning for cyber attack detection,”Scientific Reports, vol. 14, p. 32038, 2024

  25. [25]

    Ising formulations of many NP problems,

    A. Lucas, “Ising formulations of many NP problems,”Frontiers in Physics, vol. 2, p. 5, 2014