arXiv: 2605.18892 · v1 · pith:ZHPMXCQ2new · submitted 2026-05-17 · 💻 cs.LG · cs.AI · cs.DC

Data-Free Client Contribution Estimation via Logit Maximization for Federated Learning

Pith reviewed 2026-05-20 13:47 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.DC
keywords federated learning · client contribution estimation · logit maximization · data-free aggregation · class imbalance · non-IID data · evidence matrix

The pith

A data-free logit maximization approach estimates client contributions class by class in federated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes CELM, a framework for federated learning that estimates each client's contribution to different classes using only logit outputs from their model updates. By assembling a cross-client evidence matrix from these scores, the server can assign higher weights to clients that provide strong evidence for underrepresented classes. This leads to improved model performance under class imbalance and non-IID data distributions without any data sharing. The method maintains stability through simplex constraints and momentum smoothing, making it easy to integrate into standard FL pipelines. Experiments on vision benchmarks with controlled imbalances show better robustness compared to standard aggregation.

Core claim

The central claim is that probing client updates for class-wise logit-based evidence scores allows construction of a matrix that quantifies per-class competence and coverage, enabling contribution weights that upweight clients strong on minority classes and yielding better aggregation in imbalanced federated settings.

What carries the argument

The cross-client evidence matrix. It is built from class-wise evidence scores obtained by logit maximization on probed client updates, and it quantifies per-class competence and coverage, from which the contribution weights are computed.
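
As a rough illustration of how a client-by-class evidence matrix might be turned into contribution weights, the sketch below uses an inverse-coverage class weighting. The function name `contribution_weights`, the nonnegativity of the scores, and the specific normalization are assumptions made for this example; the summary above does not give the paper's actual formulas.

```python
def contribution_weights(evidence):
    """Map an evidence matrix to client weights on the probability simplex.

    evidence[k][c] is a nonnegative score for client k and class c
    (hypothetical layout). Assumes at least one entry is positive.
    """
    num_classes = len(evidence[0])

    # Total evidence per class across clients: a crude proxy for coverage.
    class_totals = [sum(row[c] for row in evidence) for c in range(num_classes)]

    # Illustrative choice: classes with less total evidence count for more.
    # Classes with no evidence from any client are skipped.
    inverse = [1.0 / t if t > 0 else 0.0 for t in class_totals]
    inverse_sum = sum(inverse)
    class_weight = [v / inverse_sum for v in inverse]

    # Each client's share of the evidence for a class, mixed by class weight.
    raw = []
    for row in evidence:
        score = 0.0
        for c in range(num_classes):
            if class_totals[c] > 0:
                score += class_weight[c] * row[c] / class_totals[c]
        raw.append(score)

    # Normalize so the weights are nonnegative and sum to one.
    total = sum(raw)
    return [r / total for r in raw]
```

With a toy matrix such as `[[9.0, 0.0], [9.0, 0.0], [2.0, 4.0]]`, the third client is the only one with evidence for the second class and therefore receives the largest weight under this weighting.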

If this is right

  • Improves robustness to class imbalance and statistical heterogeneity in federated learning.
  • Enhances performance on minority classes without requiring data exchange or auxiliary datasets.
  • Produces stable aggregation through simplex constraints and momentum smoothing.
  • Remains compatible with existing federated learning training pipelines.
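
The stability mechanism named in the third bullet can be sketched as follows. This is a minimal example that assumes an exponential moving average with a hypothetical coefficient `beta` and a Euclidean projection onto the simplex; the paper's exact update rule is not stated in this summary, and all function names are illustrative.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum(w) = 1} (sort-based)."""
    sorted_desc = sorted(v, reverse=True)
    cumulative = 0.0
    theta = 0.0
    for j, value in enumerate(sorted_desc, start=1):
        cumulative += value
        candidate = (cumulative - 1.0) / j
        # Keep the shift from the largest prefix that stays strictly positive.
        if value - candidate > 0:
            theta = candidate
    return [max(x - theta, 0.0) for x in v]


def smooth_weights(previous, current, beta=0.9):
    """Blend the previous weights with new estimates, then re-project."""
    mixed = [beta * p + (1.0 - beta) * c for p, c in zip(previous, current)]
    return project_to_simplex(mixed)


def aggregate(client_params, weights):
    """Weighted average of flat parameter vectors, one vector per client."""
    dim = len(client_params[0])
    return [sum(w * p[i] for w, p in zip(weights, client_params))
            for i in range(dim)]
```

Because the smoothed weights stay on the simplex, `aggregate` is a convex combination of client parameters, which is the same form as a standard weighted FedAvg step and is why such a scheme would slot into an existing pipeline.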

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar logit-based probing could be tested in other privacy-sensitive distributed settings like edge computing.
  • Extending the evidence matrix to handle dynamic client participation might further improve long-term training stability.
  • The approach suggests that model outputs alone carry sufficient signal for contribution estimation, which could apply to other modalities if logits generalize well.

Load-bearing premise

Logit outputs from client models can reliably indicate per-class competence and class coverage without access to raw data or labels.

What would settle it

An experiment showing no correlation between the computed evidence scores and actual per-class accuracy improvements on held-out minority class data would falsify the method's effectiveness.
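
One way to run such a check is a rank correlation between per-client evidence scores for a class and the measured accuracy change on that class. The sketch below implements Spearman's coefficient in plain Python; the function names are hypothetical, and the numbers in the usage note are toy values, not results from the paper.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation; assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5
```

For example, `spearman([0.1, 0.4, 0.35, 0.8], [0.5, 2.0, 1.5, 4.0])` compares toy evidence scores against toy accuracy gains for four clients. A value near zero on real held-out minority-class data would be the negative result described above.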

Figures

Figures reproduced from arXiv: 2605.18892 by Asim Ukaye, Karthik Nandakumar, Mubarak Abdu-Aguye, Nurbek Tastan.

Figure 1: Overview of CELM. During the initial warm-up communication rounds, the server probes each client update using class-wise logit maximization, constructs a debiased client–class evidence matrix, and converts normalized class-wise evidence into contribution weights for aggregation. After warm-up, these contribution weights are frozen and reused for the remaining rounds to stabilize training and reduce overhe…
Figure 2: Visualization of the non-IID split settings used in our experiments. Each bubble plot encodes the client-class label allocation matrix: the bubble size is proportional to the number of samples of a given class held by a given client. The top marginal summarizes per-class totals across all clients, while the right marginal summarizes per-client totals across all classes. After Tw, the server broadcasts the …
Figure 3: Maverick versus free-rider client patterns. Bubble area is proportional to class sample count for each client–class pair. (a) A Maverick client contributes distinctive class evidence; (b) a free-rider client contributes weak or non-informative class signal; (c) the mixed setting used to evaluate whether aggregation methods can emphasize informative rare-class clients while suppressing non-contributors.
Figure 4: FPR-versus-threshold curves for z-score-based free-rider detection. A client is flagged as a free-rider when its standardized contribution score falls below a threshold. (a) FashionMNIST under the FRM setting (free-rider + Maverick), and (b) CIFAR-10 under the FR setting (single free-rider). Lower curves indicate more robust detection over a broad threshold range.
Figure 5: Estimation fidelity of CELM on FedISIC dataset. Panels (a) and (b) compare the true and CELM-estimated client class distributions, while panels (c) and (d) compare client-level marginals and the aggregated global class marginals. The close visual agreement indicates that CELM preserves both per-client class tendencies and cohort-level class prevalence without accessing raw client data.
Figure 6: Sensitivity of CELM to LM optimization steps L and LM learning rate η. Each heatmap cell reports final test accuracy (%) for a specific (L, η) pair across representative non-IID splits on FashionMNIST and CIFAR-10. … longer warm-up improves performance, consistent with stronger class asymmetry requiring more evidence collection. To balance accuracy and computational overhead, we set Tw = 0.05T in the main …
Figure 7: Final warm-up LM probe images (Xfinal) on FashionMNIST (Pure Label Skew). Each row is a client, and each column is a target class; red outlines indicate classes present in that client’s local data. Present-class cells often exhibit more distinct response patterns, while the images remain abstract and non-semantic, indicating useful class evidence with no visual privacy leakage.
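
The z-score rule described in the Figure 4 caption can be sketched as below. This is an illustration only: the default threshold, the use of the population standard deviation, and the function names are assumptions, and the scores in the usage note are toy values rather than results from the paper.

```python
from statistics import mean, pstdev


def flag_free_riders(scores, threshold=-1.0):
    """Flag clients whose standardized contribution score is below threshold."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        # All clients look identical, so nobody stands out as a free-rider.
        return [False] * len(scores)
    return [(s - mu) / sigma < threshold for s in scores]


def false_positive_rate(flags, is_free_rider):
    """Fraction of honest clients that were wrongly flagged."""
    honest_flags = [f for f, fr in zip(flags, is_free_rider) if not fr]
    if not honest_flags:
        return 0.0
    return sum(honest_flags) / len(honest_flags)
```

Sweeping `threshold` over a range and recording `false_positive_rate` at each value would produce a curve of the kind the caption describes. For instance, with toy scores `[0.30, 0.28, 0.32, 0.02]` only the last client falls below a threshold of -1.0.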
Original abstract

Federated learning (FL) enables collaborative learning of computer vision models, where privacy and regulatory constraints prevent centralizing data across devices or organizations. However, practical FL deployments often exhibit severe class imbalance and label skew, causing standard aggregation protocols to overfit dominant clients and degrade minority-class performance. We propose a data-free, class-wise contribution estimation and aggregation framework based on logit maximization (CELM) that does not require sharing raw data, client metadata, or auxiliary public datasets. The FL server probes client updates to obtain class-wise evidence scores and assembles a cross-client evidence matrix, which quantifies both per-class competence and class coverage. Using this matrix, we compute contribution weights that upweight clients providing strong, discriminative evidence for underrepresented classes. The resulting aggregation is stable due to simplex constraints and momentum smoothing, and it remains compatible with standard FL training pipelines. We evaluate the approach on representative vision benchmarks under controlled non-IID and pathological label splits, demonstrating that CELM-based aggregation improves robustness to imbalance and statistical heterogeneity, while yielding better performance without requiring any additional data exchange.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes CELM, a data-free client contribution estimation and aggregation framework for federated learning. The server probes client updates via logit maximization to derive class-wise evidence scores, assembles a cross-client evidence matrix quantifying per-class competence and coverage, computes contribution weights to upweight clients strong on underrepresented classes, and applies the weights with simplex constraints and momentum smoothing. The approach is evaluated on vision benchmarks under controlled non-IID and pathological label splits, claiming improved robustness to imbalance and heterogeneity without raw data sharing or auxiliary datasets.

Significance. If the logit-derived evidence scores can be shown to correlate with actual per-class discriminative power on private client distributions, the method would provide a practical, privacy-preserving alternative to existing contribution estimation techniques that often rely on public data or metadata, potentially improving FL performance on imbalanced real-world deployments.

major comments (2)
  1. [Method description of logit probing and evidence matrix construction] The core premise that class-wise evidence scores from logit maximization on client updates reliably quantify per-class competence and coverage (without raw data or labels) is load-bearing for the contribution weights and robustness claims, yet the manuscript provides no derivation or empirical validation of this correlation. If the maximization operates solely on parameters without reference to a data manifold, the matrix may capture initialization artifacts rather than discriminative power, as noted in the stress-test concern.
  2. [Evaluation section] The evaluation claims performance gains on representative vision benchmarks under non-IID and pathological splits, but the abstract and available text lack specific quantitative results, baseline comparisons, ablation studies on the evidence matrix, or error analysis to substantiate the robustness improvements.
minor comments (1)
  1. [Abstract and method overview] Clarify the exact inputs (if any) used during logit maximization, as the data-free claim is central but potentially ambiguous given that logit computation normally requires samples.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate the revisions planned for the next version.

  1. Referee: [Method description of logit probing and evidence matrix construction] The core premise that class-wise evidence scores from logit maximization on client updates reliably quantify per-class competence and coverage (without raw data or labels) is load-bearing for the contribution weights and robustness claims, yet the manuscript provides no derivation or empirical validation of this correlation. If the maximization operates solely on parameters without reference to a data manifold, the matrix may capture initialization artifacts rather than discriminative power, as noted in the stress-test concern.

    Authors: We agree that the current manuscript would benefit from an explicit derivation linking logit maximization to per-class competence. The process optimizes a synthetic input to maximize the logit for each class on the client's model parameters after local training; this serves as a proxy for how strongly the model has internalized discriminative features for that class from its private data. In the revision we will add a short theoretical subsection deriving this correlation under the assumption that local SGD aligns the model with the client's data distribution. We will also include empirical correlation analysis between the resulting evidence scores and class-wise accuracies computed on client-held validation splits. For the initialization artifact concern, we will report additional stress-test experiments using varied random initializations to confirm stability of the evidence matrix. revision: yes

  2. Referee: [Evaluation section] The evaluation claims performance gains on representative vision benchmarks under non-IID and pathological splits, but the abstract and available text lack specific quantitative results, baseline comparisons, ablation studies on the evidence matrix, or error analysis to substantiate the robustness improvements.

    Authors: The full experimental section already reports quantitative results on vision benchmarks (CIFAR-10, CIFAR-100, MNIST) under controlled non-IID and pathological partitions, with comparisons to FedAvg, FedProx and other contribution-aware baselines, plus ablations that isolate the evidence-matrix components. To address the referee's observation, we will (i) revise the abstract to include the main accuracy and robustness gains, (ii) add a dedicated ablation subsection focused on the evidence matrix, and (iii) include error-bar analysis and failure-case discussion in the main text or appendix. revision: yes
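
The probing step described in the first response above (optimizing a synthetic input to maximize each class logit of a client's model) can be sketched in a few lines. This is a toy, dependency-free illustration: it uses finite-difference gradients where a real implementation would use automatic differentiation, clamps inputs to [0, 1] as an assumed pixel range, and treats the final logit as the evidence score. `steps` and `lr` correspond to the L and η named in the Figure 6 caption; every function name here is hypothetical, and the paper's actual objective and debiasing are not reproduced.

```python
def make_linear_model(weights, bias):
    """Toy stand-in for a client model: returns one logit per class."""
    def model(x):
        return [sum(w * xi for w, xi in zip(row, x)) + b
                for row, b in zip(weights, bias)]
    return model


def logit_maximization(model, dim, target, steps=50, lr=0.1, h=1e-4):
    """Gradient ascent on a synthetic input to maximize one class logit."""
    x = [0.5] * dim  # neutral starting point (assumption)
    for _ in range(steps):
        grad = []
        for i in range(dim):
            up, down = x[:], x[:]
            up[i] += h
            down[i] -= h
            # Central difference estimate of d logit[target] / d x[i].
            grad.append((model(up)[target] - model(down)[target]) / (2 * h))
        # Ascent step, then clamp to the assumed valid input range.
        x = [min(1.0, max(0.0, xi + lr * g)) for xi, g in zip(x, grad)]
    return x, model(x)[target]


def evidence_row(model, dim, num_classes, **kwargs):
    """One row of an evidence matrix: the maximized logit for each class."""
    return [logit_maximization(model, dim, c, **kwargs)[1]
            for c in range(num_classes)]
```

In this toy setting a model with all-zero weights yields zero evidence for every class, which loosely mirrors the intuition that an uninformative update carries no class signal. Whether that intuition holds for trained deep networks is exactly the question the referee raises.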

Circularity Check

0 steps flagged

No circularity: contribution weights derived from independent logit-probing construction

full rationale

The paper's central derivation computes class-wise evidence scores by probing client model updates via logit maximization, assembles these into a cross-client evidence matrix that quantifies per-class competence and coverage, and then derives contribution weights from the matrix under simplex and momentum constraints. This chain introduces a new data-free probing step whose outputs are not defined in terms of the final aggregation performance or fitted to the target metrics; the matrix construction and weighting formulas stand as independent operations on the probed logits. No equations reduce a prediction to a fitted input by construction, no load-bearing premise rests solely on self-citation, and no uniqueness theorem or ansatz is smuggled in from prior author work. Evaluation on vision benchmarks under non-IID splits provides external empirical checks rather than tautological confirmation, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The method rests on a domain assumption about logit probing and introduces a new construct, the evidence matrix. The only free parameter mentioned is the momentum smoothing coefficient.

free parameters (1)
  • momentum smoothing parameter
    Used to stabilize aggregation weights; value not specified but implied as a tunable element for stability.
axioms (1)
  • domain assumption: Client model updates can be probed by the server to extract class-wise logit evidence without accessing raw data or labels.
    Foundational to the data-free property and evidence score computation.
invented entities (1)
  • cross-client evidence matrix (no independent evidence)
    purpose: Quantifies per-class competence and class coverage across clients to compute contribution weights.
    New structure assembled from probed logits to enable class-wise weighting.
