arxiv: 2605.15885 · v1 · pith:Y6L2GF54new · submitted 2026-05-15 · 💻 cs.CR

FedEDAuth -- Federated Embedding Distribution Authentication for Counterfeit IC Detection

Pith reviewed 2026-05-20 17:02 UTC · model grok-4.3

classification 💻 cs.CR
keywords federated learning · counterfeit IC detection · data poisoning · client authentication · embedding distributions · Byzantine attacks · hardware security · privacy preservation

The pith

In the reported experiments, FedEDAuth detects every malicious client in federated learning for counterfeit IC detection by checking each client's embedding distribution against a golden reference.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces FedEDAuth to make federated learning practical for collaborative counterfeit IC detection by addressing its vulnerability to data poisoning. It shows that clients can be verified at the embedding level using statistical checks against distributions from a trusted golden dataset, without any access to raw data or gradients. This matters because it enables multiple parties across the semiconductor supply chain to jointly train a model while preserving privacy and blocking attacks. If the method works as described, poisoned updates get filtered before aggregation, leaving a model that still classifies counterfeit chips accurately. The reported experiments with fifty participants under Byzantine attacks provide the supporting evidence for these steps.

Core claim

FedEDAuth is a lightweight embedding-level client authentication framework that detects and filters malicious participants before model aggregation in federated learning. It derives reference embedding distributions from a golden dataset and evaluates each client through outlier analysis, mean shift measurements, and micro-cluster behavior. In experimental settings with fifty distributed participants facing Byzantine data poisoning attacks, the framework identifies every poisoned client, producing a 100 percent malicious client detection rate. After the malicious clients are removed, the resulting federated model reaches 94.17 percent accuracy on counterfeit IC classification.

What carries the argument

Reference embedding distributions derived from a golden dataset, combined with outlier analysis, mean shift measurements, and micro-cluster checks to authenticate clients without raw data or gradient access.
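A minimal sketch of how the three checks could fit together, under assumptions the abstract does not confirm: embeddings are plain lists of floats, the golden reference is summarized by a per-dimension mean and standard deviation, and the three checks are averaged with equal weight. The function names, the `z_cut` value, and the reading of "micro-cluster behavior" as a spread comparison are hypothetical, and the data is synthetic.

```python
import math
import random
from statistics import fmean, pstdev


def reference_stats(golden):
    """Per-dimension mean and standard deviation of the golden embeddings."""
    dims = list(zip(*golden))
    mu = [fmean(d) for d in dims]
    sigma = [pstdev(d) or 1e-9 for d in dims]  # guard against zero spread
    return mu, sigma


def z_distance(vec, mu, sigma):
    """Root-mean-square z-score of one vector against the reference."""
    return math.sqrt(fmean([((x - m) / s) ** 2 for x, m, s in zip(vec, mu, sigma)]))


def suspicion_score(client, mu, sigma, z_cut=3.0):
    """Average three hypothetical checks into one score in [0, 1]."""
    # 1. Outlier analysis: share of embeddings far from the reference.
    outlier_rate = fmean(
        [1.0 if z_distance(v, mu, sigma) > z_cut else 0.0 for v in client]
    )
    # 2. Mean shift: centroid displacement, squashed into [0, 1).
    dims = list(zip(*client))
    shift = z_distance([fmean(d) for d in dims], mu, sigma)
    mean_shift = shift / (1.0 + shift)
    # 3. Micro-cluster behavior (assumed reading): how far the client's
    #    spread departs from the reference spread, capped at 1.
    ratio = fmean([pstdev(d) / s for d, s in zip(dims, sigma)])
    cluster_gap = min(1.0, abs(1.0 - ratio))
    return (outlier_rate + mean_shift + cluster_gap) / 3.0


def sample(n, dim, offset, rng):
    """Draw n synthetic Gaussian embeddings centered at `offset`."""
    return [[rng.gauss(offset, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
mu, sigma = reference_stats(sample(500, 4, 0.0, rng))
honest = suspicion_score(sample(200, 4, 0.0, rng), mu, sigma)
poisoned = suspicion_score(sample(200, 4, 3.0, rng), mu, sigma)
print(f"honest={honest:.3f} poisoned={poisoned:.3f}")
```

Only embedding statistics cross the trust boundary in this sketch, which is the property the pith attributes to the method. Whether the paper weights or thresholds the checks this way is not stated in the abstract.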

If this is right

  • Secure collaboration across the semiconductor supply chain becomes feasible without exposing sensitive data or gradients.
  • The federated model retains 94.17 percent accuracy on counterfeit IC classification once malicious participants are removed.
  • The authentication steps integrate directly into standard federated learning pipelines before the aggregation step.
  • Byzantine data poisoning attacks can be mitigated at the embedding level rather than after model updates are combined.
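The "filter before aggregation" step in the list above can be sketched as follows. This is an illustration, not the paper's implementation: it assumes each client submits a flat weight vector, that a suspicion score per client is already available, and that the surviving updates are combined with size-weighted federated averaging in the style of [2]. The names `filter_then_aggregate`, `scores`, and `threshold` are hypothetical.

```python
def fedavg(updates, sizes):
    """Average flat weight vectors, weighted by local dataset size."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total for i in range(dim)]


def filter_then_aggregate(updates, sizes, scores, threshold):
    """Drop clients scoring above `threshold`, then average the rest."""
    kept = [cid for cid in updates if scores[cid] <= threshold]
    if not kept:
        raise ValueError("every client was rejected; nothing to aggregate")
    model = fedavg([updates[c] for c in kept], [sizes[c] for c in kept])
    return model, kept


# Toy round with two honest clients and one poisoned client "m".
updates = {"a": [1.0, 1.0], "b": [3.0, 3.0], "m": [100.0, -100.0]}
sizes = {"a": 10, "b": 30, "m": 10}
scores = {"a": 0.05, "b": 0.10, "m": 0.60}  # made-up suspicion scores

model, kept = filter_then_aggregate(updates, sizes, scores, threshold=0.25)
print(kept, model)
```

Because rejection happens before the weighted sum, a rejected client contributes nothing to the global model, in contrast to robust aggregators that bound an attacker's influence after the updates are combined.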

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Embedding-based checks of this kind could be tested in other federated learning settings that face poisoning risks, such as medical diagnostics or sensor networks.
  • If the checks remain reliable when the golden dataset distribution shifts slightly, the approach might reduce the need for fully trusted central coordinators.
  • Running the same pipeline on real supply-chain data with natural distribution differences would show whether the detection rate holds outside controlled experiments.

Load-bearing premise

The method assumes that embedding distributions from a golden dataset accurately represent honest client behavior and that the statistical checks can separate malicious updates even under varied attack strategies.

What would settle it

A case in which a malicious client generates embedding distributions that pass all three checks (outlier, mean shift, and micro-cluster) without detection, or a run in which filtering honest clients drops final classification accuracy below 90 percent.
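The falsification condition above can be phrased as a small check. The sketch assumes ground-truth labels for which clients are malicious are available in the test harness; the helper names and client ids are hypothetical, the 90 percent floor comes from the sentence above, and the 5-of-50 split follows the attack scenario excerpted under Figure 3.

```python
def detection_metrics(flagged, malicious, all_clients):
    """Return (detection_rate, false_positive_rate) for one run."""
    flagged, malicious, all_clients = set(flagged), set(malicious), set(all_clients)
    honest = all_clients - malicious
    detection_rate = len(flagged & malicious) / len(malicious)
    false_positive_rate = len(flagged & honest) / len(honest)
    return detection_rate, false_positive_rate


def claim_is_refuted(flagged, malicious, all_clients, accuracy, floor=0.90):
    """True if a malicious client slipped through or accuracy fell below the floor."""
    detection_rate, _ = detection_metrics(flagged, malicious, all_clients)
    return detection_rate < 1.0 or accuracy < floor


clients = range(50)
malicious = range(5)  # hypothetical ids for the five compromised clients

# A run matching the reported outcome: all five flagged, 94.17% accuracy.
print(detection_metrics(range(5), malicious, clients))
print(claim_is_refuted(range(5), malicious, clients, accuracy=0.9417))

# A hypothetical run in which one poisoned client evades all checks.
print(claim_is_refuted(range(4), malicious, clients, accuracy=0.9417))
```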

Figures

Figures reproduced from arXiv: 2605.15885 by Dhruva Aklekar, Fareena Saqib, Minhaj Alam, Nahush Tambe, Naseeruddin Lodge, Sina Gholami, Vineet Chadalavada.

Figure 1: Federated Learning Framework
Federated learning is increasingly being adopted in privacy-sensitive fields such as healthcare, finance, semiconductor technologies, and autonomous vehicles [10]–[13]. Real-world FL applications include Google Keyboard, which uses federated learning for on-device next-word prediction [14]. Toyota applies FL to improve vehicle safety and perception models using distributed se…
Figure 2: FedEDAuth Framework
… surface texture were applied to increase diversity. Examples of the augmented samples are shown in Figure 3, illustrating slight scratches, soft smudges, texture inconsistencies, and gradual fading that mimic naturally occurring counterfeit indicators. All images were maintained at a resolution of 64×64 pixels, as this was sufficient to preserve the relevant visual cues. To support mode…
Figure 3: Augmented Images
B. Byzantine Data Poisoning Attack. In this attack scenario, five out of the fifty clients are compromised and modify their entire local dataset. Both genuine and counterfeit IC images are embedded with a small visual trigger. In this setup, the trigger takes the form of faint scratch-like patterns as illustrated in Figure 4, enlarged in the figure for clarity. By imprinting the same subtle…
Figure 5: Suspicious score of 50 clients
It is worth noting that across all experimental trials, no authentic client was incorrectly flagged as malicious, yielding a false positive rate of 0%. The clear separation visible in Figure 5 between poisoned and authentic client suspicion scores suggests that the chosen thresholds and metric weights are well-calibrated for this threat scenario. However, we acknowledge that in dep…
Figure 6: Malicious client density 90% vs 70% vs 50%
Original abstract

The widespread presence of counterfeit integrated circuits (ICs) poses severe risks to the security, reliability, and trustworthiness of modern electronic systems. Federated learning (FL) offers a privacy-preserving paradigm for collaborative counterfeit detection across the semiconductor supply chain, but its vulnerability to Byzantine data poisoning attacks limits practical deployment. This paper presents Federated Embedding Distribution Authentication (FedEDAuth), a lightweight, embedding-level client authentication framework that detects and filters malicious participants before model aggregation. FedEDAuth leverages reference embedding distributions derived from a golden dataset and evaluates clients using outlier analysis, mean shift measurements, and micro-cluster behavior without requiring access to raw data or gradients. Integrated into standard FL pipelines, FedEDAuth consistently identifies all poisoned clients in experimental settings with 50 distributed participants under the Byzantine data poisoning attack, achieving a 100% malicious client detection rate. After filtering, the federated model achieved a high counterfeit IC classification performance of 94.17% accuracy. These results not only validate FedEDAuth's effectiveness but also underscore the broader potential of secure, trustworthy FL frameworks as a critical advancement for next-generation hardware security solutions, enabling robust, collaborative intelligence across the semiconductor supply chain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces FedEDAuth, a lightweight client authentication framework integrated into federated learning pipelines for counterfeit IC detection. It derives reference embedding distributions from a golden dataset and applies outlier analysis, mean-shift measurements, and micro-cluster checks to identify and filter malicious clients under Byzantine data poisoning attacks, without requiring access to raw data or gradients. The central claims are a 100% malicious client detection rate across 50 distributed participants and a post-filtering counterfeit IC classification accuracy of 94.17%.

Significance. If the experimental results are substantiated with full details, the work could meaningfully advance secure and privacy-preserving federated learning for hardware security applications in the semiconductor supply chain. The embedding-level authentication approach is lightweight and avoids direct data exposure, addressing a practical vulnerability in collaborative counterfeit detection. The provision of reproducible code or parameter-free derivations is not evident from the description, which limits the assessed strength relative to papers that include such elements.

major comments (2)
  1. [Abstract] The claim of 100% malicious client detection and 94.17% accuracy is presented without any description of the experimental setup, including attack model specifics, data distribution across the 50 clients, choice of outlier/mean-shift thresholds, baseline defenses, error bars, or statistical tests; this absence makes the central empirical claim impossible to evaluate for soundness.
  2. [Methods] Methods/Experimental section: the core assumption that reference distributions from the golden dataset reliably represent honest client behavior under varied attack strategies is load-bearing for the detection guarantees, yet no analysis of distribution shift, mimicry attacks, or sensitivity to threshold choices is provided to support it.
minor comments (2)
  1. [Introduction] The abstract and introduction would benefit from explicit comparison to prior FL defense techniques (e.g., Krum, Trimmed Mean) to clarify the novelty of the embedding-distribution approach.
  2. Notation for embedding distributions and the precise definitions of 'outlier analysis' and 'micro-cluster behavior' should be formalized with equations or pseudocode for reproducibility.
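For context on the baselines named in minor comment 1, here is a minimal sketch of coordinate-wise trimmed-mean aggregation, the kind of robust aggregator studied in [28]. It is a generic textbook version, not code from the paper; the function name and the integer `trim_count` parameter are choices made for this illustration, and the updates are toy values.

```python
def trimmed_mean(updates, trim_count):
    """Coordinate-wise trimmed mean over a list of flat weight vectors.

    For each coordinate, the `trim_count` smallest and `trim_count`
    largest values are discarded and the remainder is averaged.
    """
    n = len(updates)
    if 2 * trim_count >= n:
        raise ValueError("trim_count removes every update")
    aggregated = []
    for coord in zip(*updates):
        kept = sorted(coord)[trim_count:n - trim_count]
        aggregated.append(sum(kept) / len(kept))
    return aggregated


# Four benign updates and one extreme update in the last position.
updates = [
    [0.0, 2.0],
    [2.0, 2.0],
    [2.0, 2.0],
    [2.0, 2.0],
    [100.0, -50.0],
]
print(trimmed_mean(updates, trim_count=1))
```

The contrast with the embedding-level approach is where the defense sits: trimmed mean limits the pull of extreme values inside the aggregation step and never identifies a client, whereas FedEDAuth, as summarized here, rejects whole clients before aggregation.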

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment below and indicate where revisions will be made to improve clarity and strengthen the presentation of results.

  1. Referee: [Abstract] The claim of 100% malicious client detection and 94.17% accuracy is presented without any description of the experimental setup, including attack model specifics, data distribution across the 50 clients, choice of outlier/mean-shift thresholds, baseline defenses, error bars, or statistical tests; this absence makes the central empirical claim impossible to evaluate for soundness.

    Authors: We agree that the abstract would benefit from additional context on the experimental conditions to allow immediate evaluation of the reported figures. The full details on attack models, client data distributions, threshold selection, baseline comparisons, and evaluation metrics are provided in the Methods and Experimental Results sections. We will revise the abstract to incorporate a concise summary of the setup, including the number of participants, the Byzantine poisoning attack considered, and the conditions under which the 100% detection rate and 94.17% accuracy were obtained. revision: yes

  2. Referee: [Methods] Methods/Experimental section: the core assumption that reference distributions from the golden dataset reliably represent honest client behavior under varied attack strategies is load-bearing for the detection guarantees, yet no analysis of distribution shift, mimicry attacks, or sensitivity to threshold choices is provided to support it.

    Authors: The reference distributions derived from the golden dataset form the basis of our outlier, mean-shift, and micro-cluster checks, and the experiments with 50 participants under Byzantine data poisoning demonstrate consistent detection performance. We acknowledge that explicit sensitivity analysis and evaluation against mimicry or distribution-shift scenarios would provide further support. We will add a new subsection to the Experimental Results section that includes threshold sensitivity studies and additional experiments simulating mimicry attacks to directly address this point. revision: yes
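A threshold sensitivity study of the kind promised in the second response could take the following shape. The sketch assumes one scalar suspicion score per client and a single decision threshold; the scores below are synthetic placeholders, not values from the paper, and `sweep_thresholds` is a hypothetical helper.

```python
def sweep_thresholds(scores, malicious, thresholds):
    """Return (threshold, detection_rate, false_positive_rate) per threshold."""
    malicious = set(malicious)
    honest = set(scores) - malicious
    rows = []
    for t in thresholds:
        flagged = {cid for cid, s in scores.items() if s > t}
        rows.append((
            t,
            len(flagged & malicious) / len(malicious),
            len(flagged & honest) / len(honest),
        ))
    return rows


# Synthetic scores: 45 honest clients (one borderline) and 5 poisoned ones.
scores = {f"h{i}": 0.10 for i in range(44)}
scores["h44"] = 0.30
scores.update({"m0": 0.45, "m1": 0.50, "m2": 0.60, "m3": 0.70, "m4": 0.80})
malicious = ["m0", "m1", "m2", "m3", "m4"]

rows = sweep_thresholds(scores, malicious, thresholds=[0.2, 0.4, 0.5])
for t, det, fpr in rows:
    print(f"threshold={t:.2f} detection={det:.2f} fpr={fpr:.3f}")
```

A table like this shows how wide the band of thresholds is that keeps detection at 100 percent with no false positives, which bears directly on the referee's concern about sensitivity to threshold choices.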

Circularity Check

0 steps flagged

No significant circularity in derivation chain

The FedEDAuth framework derives client authentication from reference embedding distributions sourced from an external golden dataset, applying standard statistical checks (outlier analysis, mean shift, micro-cluster behavior) without requiring raw data access. These components do not reduce to self-definition, fitted inputs renamed as predictions, or self-citation chains; the golden dataset functions as an independent external benchmark for honest client behavior. Experimental claims of 100% malicious client detection and 94.17% post-filter accuracy are presented as empirical outcomes rather than tautological results forced by the method's construction. No load-bearing uniqueness theorems or ansatzes are imported via self-citation. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

Review performed on abstract only; full methods, equations, and experimental details unavailable. Likely free parameters include detection thresholds for outliers and mean shift; axioms include that golden dataset embeddings represent honest behavior and that Byzantine attacks manifest as detectable distribution shifts.

free parameters (1)
  • outlier and mean-shift thresholds
    Abstract implies tunable parameters for identifying malicious clients via distribution analysis; exact values and fitting process not described.
axioms (2)
  • domain assumption Reference embedding distributions from golden dataset accurately model honest client updates
    Central to the authentication step; invoked implicitly when using these distributions for outlier detection.
  • domain assumption Byzantine poisoning attacks produce detectable shifts in embedding space without raw data access
    Underpins the claim that micro-cluster and mean-shift checks suffice for 100% detection.

pith-pipeline@v0.9.0 · 5762 in / 1498 out tokens · 57662 ms · 2026-05-20T17:02:09.148247+00:00 · methodology

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · 1 internal anchor

  1. [1]

    Future fab,

    C. Mouli and W. Carriker, “Future fab,” IEEE Spectrum, vol. 44, no. 3, pp. 38–43, 2007

  2. [2]

    Communication-efficient learning of deep networks from decentralized data,

    B. McMahan, E. Moore, D. Ramage, S. Hampson, and B. A. y Arcas, “Communication-efficient learning of deep networks from decentralized data,” in Artificial intelligence and statistics. PMLR, 2017, pp. 1273–1282

  3. [3]

    Counterfeit ic detection via federated learning: Exposure to byzantine data poisoning,

    N. Lodge, N. Tambe, D. Aklekar, and F. Saqib, “Counterfeit ic detection via federated learning: Exposure to byzantine data poisoning,” in 2025 IEEE Physical Assurance and Inspection of Electronics (PAINE). IEEE, 2025, pp. 1–7

  4. [4]

    Impacts of machine learning on counterfeit ic detection and avoidance techniques,

    O. Aramoon and G. Qu, “Impacts of machine learning on counterfeit ic detection and avoidance techniques,” in 2020 21st International Symposium on Quality Electronic Design (ISQED). IEEE, 2020, pp. 352–357

  5. [5]

    Enhancing counterfeit detection of integrated circuits through machine learning-assisted thz-tds analysis,

    C. Xi, N. Varshney, M. S. M. Khan, H. Dalir, and N. Asadizanjani, “Enhancing counterfeit detection of integrated circuits through machine learning-assisted thz-tds analysis,” in Terahertz, RF, Millimeter, and Submillimeter-Wave Technology and Applications XVII, vol. 12885. SPIE, 2024, pp. 64–72

  6. [6]

    Autodetect: Novel autoencoding architecture for counterfeit ic detection,

    C. Bhure, G. S. Nicholas, S. Ghosh, N. Asadi, and F. Saqib, “Autodetect: Novel autoencoding architecture for counterfeit ic detection,” Journal of Hardware and Systems Security, vol. 8, no. 2, pp. 113–132, 2024

  7. [7]

    Golden-free robust age estimation to triage recycled ics,

    V. R. Surabhi, P. Krishnamurthy, H. Amrouch, J. Henkel, R. Karri, and F. Khorrami, “Golden-free robust age estimation to triage recycled ics,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 42, no. 9, pp. 2839–2851, 2023

  8. [8]

    A rapid, non-destructive method to detect counterfeit integrated circuits using a resonant cavity system

    A. Nechiyil, R. Lee, and G. Chapman, “A rapid, non-destructive method to detect counterfeit integrated circuits using a resonant cavity system.” Instruments (2410-390X), vol. 8, no. 3, 2024

  9. [9]

    Learning assisted side channel delay test for detection of recycled ics,

    A. Vakil, F. Niknia, A. Mirzaeian, A. Sasan, and N. Karimi, “Learning assisted side channel delay test for detection of recycled ics,” in Proceedings of the 26th Asia and South Pacific Design Automation Conference, 2021, pp. 455–462

  10. [10]

    Federated learning with privacy-preserving and model ip-right-protection,

    Q. Yang, A. Huang, L. Fan, C. S. Chan, J. H. Lim, K. W. Ng, D. S. Ong, and B. Li, “Federated learning with privacy-preserving and model ip-right-protection,” Machine Intelligence Research, vol. 20, no. 1, pp. 19–37, 2023

  11. [11]

    Federated learning for connected and automated vehicles: A survey of existing approaches and challenges,

    V. P. Chellapandi, L. Yuan, C. G. Brinton, S. H. Żak, and Z. Wang, “Federated learning for connected and automated vehicles: A survey of existing approaches and challenges,” IEEE Transactions on Intelligent Vehicles, vol. 9, no. 1, pp. 119–137, 2023

  12. [12]

    Federated learning: A survey on enabling technologies, protocols, and applications,

    M. Aledhari, R. Razzak, R. M. Parizi, and F. Saeed, “Federated learning: A survey on enabling technologies, protocols, and applications,” IEEE Access, vol. 8, pp. 140699–140725, 2020

  13. [13]

    Federated learning review: Fundamentals, enabling technologies, and future applications,

    S. Banabilah, M. Aloqaily, E. Alsayed, N. Malik, and Y. Jararweh, “Federated learning review: Fundamentals, enabling technologies, and future applications,” Information processing & management, vol. 59, no. 6, p. 103061, 2022

  14. [14]

    Federated Learning for Mobile Keyboard Prediction

    A. Hard, K. Rao, R. Mathews, S. Ramaswamy, F. Beaufays, S. Augenstein, H. Eichner, C. Kiddon, and D. Ramage, “Federated learning for mobile keyboard prediction,” arXiv preprint arXiv:1811.03604, 2018

  15. [15]

    (2023) 8 applications of federated learning across the globe

    OpenSistemas. (2023) 8 applications of federated learning across the globe. Accessed: 2025-02-14. [Online]. Available: https://opensistemas.com/en/8-applications-of-federated-learning/

  16. [16]

    R. D. Caballar and C. Stryker. (2025) What is federated learning? IBM. Accessed: 2025-11-18. [Online]. Available: https://www.ibm.com/think/topics/federated-learning

  17. [17]

    C. He, A. Nevarez, S. Avestimehr, and R. DeFauw. (2024) Federated learning on aws using fedml, amazon eks, and amazon sagemaker. https://aws.amazon.com/blogs/machine-learning/federated-learning-on-aws-using-fedml-amazon-eks-and-amazon-sagemaker/. Amazon Web Services. Accessed: 2025-11-18

  18. [18]

    An overview of trustworthy ai: advances in ip protection, privacy-preserving federated learning, security verification, and gai safety alignment,

    Y. Zheng, C.-H. Chang, S.-H. Huang, P.-Y. Chen, and S. Picek, “An overview of trustworthy ai: advances in ip protection, privacy-preserving federated learning, security verification, and gai safety alignment,” IEEE Journal on Emerging and Selected Topics in Circuits and Systems, 2024

  19. [19]

    Certified robustness to label-flipping attacks via randomized smoothing,

    E. Rosenfeld, E. Winston, P. Ravikumar, and Z. Kolter, “Certified robustness to label-flipping attacks via randomized smoothing,” in International Conference on Machine Learning. PMLR, 2020, pp. 8230–8241

  20. [20]

    Label sanitization against label flipping poisoning attacks,

    A. Paudice, L. Muñoz-González, and E. C. Lupu, “Label sanitization against label flipping poisoning attacks,” in Joint European conference on machine learning and knowledge discovery in databases. Springer, 2018, pp. 5–15

  21. [21]

    Dba: Distributed backdoor attacks against federated learning,

    C. Xie, K. Huang, P.-Y. Chen, and B. Li, “Dba: Distributed backdoor attacks against federated learning,” in International conference on learning representations, 2019

  22. [22]

    How to backdoor federated learning,

    E. Bagdasaryan, A. Veit, Y. Hua, D. Estrin, and V. Shmatikov, “How to backdoor federated learning,” in International conference on artificial intelligence and statistics. PMLR, 2020, pp. 2938–2948

  23. [23]

    Model poisoning attacks against distributed machine learning systems,

    R. Tomsett, K. Chan, and S. Chakraborty, “Model poisoning attacks against distributed machine learning systems,” in Artificial Intelligence and Machine Learning for Multi-Domain Operations Applications, vol. 11006. SPIE, 2019, pp. 481–489

  24. [24]

    Local model poisoning attacks to Byzantine-robust federated learning,

    M. Fang, X. Cao, J. Jia, and N. Gong, “Local model poisoning attacks to Byzantine-robust federated learning,” in 29th USENIX security symposium (USENIX Security 20), 2020, pp. 1605–1622

  25. [25]

    Fall of empires: Breaking byzantine-tolerant sgd by inner product manipulation,

    C. Xie, O. Koyejo, and I. Gupta, “Fall of empires: Breaking byzantine-tolerant sgd by inner product manipulation,” in Uncertainty in Artificial Intelligence. PMLR, 2020, pp. 261–270

  26. [26]

    A survey on federated learning: challenges and applications,

    J. Wen, Z. Zhang, Y. Lan, Z. Cui, J. Cai, and W. Zhang, “A survey on federated learning: challenges and applications,” International journal of machine learning and cybernetics, vol. 14, no. 2, pp. 513–535, 2023

  27. [27]

    Challenges and approaches for mitigating byzantine attacks in federated learning,

    J. Shi, W. Wan, S. Hu, J. Lu, and L. Y. Zhang, “Challenges and approaches for mitigating byzantine attacks in federated learning,” in 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom). IEEE, 2022, pp. 139–146

  28. [28]

    Byzantine-robust distributed learning: Towards optimal statistical rates,

    D. Yin, Y. Chen, R. Kannan, and P. Bartlett, “Byzantine-robust distributed learning: Towards optimal statistical rates,” in International conference on machine learning. PMLR, 2018, pp. 5650–5659

  29. [29]

    Machine learning with adversaries: Byzantine tolerant gradient descent,

    P. Blanchard, E. M. El Mhamdi, R. Guerraoui, and J. Stainer, “Machine learning with adversaries: Byzantine tolerant gradient descent,” Advances in neural information processing systems, vol. 30, 2017

  30. [30]

    Ic-chipnet: Deep embedding learning for fine-grained retrieval, recognition, and verification of microelectronic images,

    M. A. Reza and D. J. Crandall, “Ic-chipnet: Deep embedding learning for fine-grained retrieval, recognition, and verification of microelectronic images,” in 2020 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). IEEE, 2020, pp. 1–10