Data-Free Client Contribution Estimation via Logit Maximization for Federated Learning

Asim Ukaye; Karthik Nandakumar; Mubarak Abdu-Aguye; Nurbek Tastan

arxiv: 2605.18892 · v1 · pith:ZHPMXCQ2new · submitted 2026-05-17 · 💻 cs.LG · cs.AI · cs.DC


Pith reviewed 2026-05-20 13:47 UTC · model grok-4.3

classification 💻 cs.LG cs.AI cs.DC

keywords federated learning, client contribution estimation, logit maximization, data-free aggregation, class imbalance, non-IID data, evidence matrix


The pith

A data-free logit maximization approach estimates client contributions class by class in federated learning.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes CELM, a framework for federated learning that estimates each client's contribution to different classes using only logit outputs from their model updates. By assembling a cross-client evidence matrix from these scores, the server can assign higher weights to clients that provide strong evidence for underrepresented classes. This leads to improved model performance under class imbalance and non-IID data distributions without any data sharing. The method maintains stability through simplex constraints and momentum smoothing, making it easy to integrate into standard FL pipelines. Experiments on vision benchmarks with controlled imbalances show better robustness compared to standard aggregation.

Core claim

The central claim is that probing client updates for class-wise logit-based evidence scores allows construction of a matrix that quantifies per-class competence and coverage, enabling contribution weights that upweight clients strong on minority classes and yielding better aggregation in imbalanced federated settings.

What carries the argument

The argument rests on the cross-client evidence matrix. It is built from class-wise evidence scores obtained via logit maximization on probed client updates, and it quantifies per-class competence and coverage so that contribution weights can be computed from it.
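The step from evidence matrix to aggregation weights can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formula: the per-class share normalization, the function name `contribution_weights`, and the smoothing factor `beta` are all hypothetical choices made here. The source only states that weights are constrained to the simplex and smoothed with momentum.

```python
def contribution_weights(evidence, prev_weights=None, beta=0.9, eps=1e-12):
    """Turn a client-by-class evidence matrix into aggregation weights.

    evidence[k][c] is a non-negative evidence score of client k for class c.
    The normalization below is an assumed scheme for illustration only.
    """
    num_clients = len(evidence)
    num_classes = len(evidence[0])
    scores = [0.0] * num_clients

    for c in range(num_classes):
        col_total = sum(evidence[k][c] for k in range(num_clients))
        if col_total <= eps:
            continue  # no client shows evidence for this class
        # Each client's share of the evidence for class c. A client that is
        # the only source for a rare class receives the full share of 1.
        for k in range(num_clients):
            scores[k] += evidence[k][c] / col_total

    total = sum(scores)
    if total <= eps:
        weights = [1.0 / num_clients] * num_clients  # fall back to uniform
    else:
        weights = [s / total for s in scores]  # non-negative, sums to 1

    if prev_weights is not None:
        # Momentum smoothing: a convex combination of two simplex points
        # is again on the simplex.
        weights = [beta * p + (1.0 - beta) * w
                   for p, w in zip(prev_weights, weights)]
    return weights


# Made-up example: clients 0 and 1 share two common classes,
# client 2 is the only one with (weak) evidence for a third class.
example = [[5.0, 5.0, 0.0],
           [5.0, 5.0, 0.0],
           [0.0, 0.0, 1.0]]
weights = contribution_weights(example)
```

In this made-up example all three clients end up with equal weight, whereas weighting by raw evidence totals would give the third client only 1/21 of the mass. That is the kind of minority-class upweighting the summary describes.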

If this is right

Improves robustness to class imbalance and statistical heterogeneity in federated learning.
Enhances performance on minority classes without requiring data exchange or auxiliary datasets.
Produces stable aggregation through simplex constraints and momentum smoothing.
Remains compatible with existing federated learning training pipelines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Similar logit-based probing could be tested in other privacy-sensitive distributed settings like edge computing.
Extending the evidence matrix to handle dynamic client participation might further improve long-term training stability.
The approach suggests that model outputs alone carry sufficient signal for contribution estimation, which could apply to other modalities if logits generalize well.

Load-bearing premise

Logit outputs from client models can reliably indicate per-class competence and class coverage without access to raw data or labels.

What would settle it

An experiment showing no correlation between the computed evidence scores and actual per-class accuracy improvements on held-out minority class data would falsify the method's effectiveness.
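A check of this kind could be sketched as below. The sketch assumes the evidence scores and the held-out per-class accuracies are already available as two equal-length lists; the Pearson coefficient is used purely for illustration (a rank correlation would be an equally reasonable choice), and the numbers in the example are invented, not results from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return float("nan")  # correlation is undefined for a constant input
    return cov / math.sqrt(var_x * var_y)


# Invented numbers for one client: evidence score per class versus
# accuracy of that client's model on held-out samples of the same class.
evidence_scores = [0.9, 0.7, 0.1, 0.05]
class_accuracy = [0.95, 0.80, 0.30, 0.10]
r = pearson(evidence_scores, class_accuracy)
```

A coefficient near zero across many clients and classes would be the falsifying outcome described above; a consistently high positive value would support the load-bearing premise.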

Figures

Figures reproduced from arXiv: 2605.18892 by Asim Ukaye, Karthik Nandakumar, Mubarak Abdu-Aguye, Nurbek Tastan.

**Figure 1.** Overview of CELM. During the initial warm-up communication rounds, the server probes each client update using class-wise logit maximization, constructs a debiased client–class evidence matrix, and converts normalized class-wise evidence into contribution weights for aggregation. After warm-up, these contribution weights are frozen and reused for the remaining rounds to stabilize training and reduce overhe…

**Figure 2.** Visualization of the non-IID split settings used in our experiments. Each bubble plot encodes the client-class label allocation matrix: the bubble size is proportional to the number of samples of a given class held by a given client. The top marginal summarizes per-class totals across all clients, while the right marginal summarizes per-client totals across all classes.

**Figure 3.** Maverick versus free-rider client patterns. Bubble area is proportional to class sample count for each client–class pair. (a) A Maverick client contributes distinctive class evidence; (b) a free-rider client contributes weak or non-informative class signal; (c) the mixed setting used to evaluate whether aggregation methods can emphasize informative rare-class clients while suppressing non-contributors.

**Figure 4.** FPR-versus-threshold curves for z-score-based free-rider detection. A client is flagged as a free-rider when its standardized contribution score falls below a threshold. (a) FashionMNIST under the FRM setting (free-rider + Maverick), and (b) CIFAR-10 under the FR setting (single free-rider). Lower curves indicate more robust detection over a broad threshold range.

**Figure 5.** Estimation fidelity of CELM on FedISIC dataset. Panels (a) and (b) compare the true and CELM-estimated client class distributions, while panels (c) and (d) compare client-level marginals and the aggregated global class marginals. The close visual agreement indicates that CELM preserves both per-client class tendencies and cohort-level class prevalence without accessing raw client data.

**Figure 6.** Sensitivity of CELM to LM optimization steps L and LM learning rate η. Each heatmap cell reports final test accuracy (%) for a specific (L, η) pair across representative non-IID splits on FashionMNIST and CIFAR-10. (Adjacent body text captured with the caption: … longer warm-up improves performance, consistent with stronger class asymmetry requiring more evidence collection. To balance accuracy and computational overhead, we set Tw = 0.05T in the main …)

**Figure 7.** Final warm-up LM probe images (Xfinal) on FashionMNIST (Pure Label Skew). Each row is a client, and each column is a target class; red outlines indicate classes present in that client’s local data. Present-class cells often exhibit more distinct response patterns, while the images remain abstract and non-semantic, indicating useful class evidence with no visual privacy leakage.
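The detection rule named in the Figure 4 caption (flag a client when its standardized contribution score falls below a threshold) can be sketched as follows. The function name, the use of the population standard deviation, and the default threshold of -1.0 are assumptions for illustration; the paper's curves sweep the threshold rather than fixing it, and the scores below are invented.

```python
import statistics


def flag_free_riders(scores, threshold=-1.0):
    """Flag clients whose z-scored contribution falls below `threshold`.

    scores: one contribution score per client.
    Returns a list of booleans, True meaning "flagged as free-rider".
    """
    mean = statistics.fmean(scores)
    std = statistics.pstdev(scores)  # population standard deviation
    if std == 0.0:
        # All clients look identical, so nobody stands out.
        return [False] * len(scores)
    return [(s - mean) / std < threshold for s in scores]


# Invented contribution scores: four similar clients and one near-zero one.
contribution_scores = [0.24, 0.26, 0.25, 0.23, 0.02]
flags = flag_free_riders(contribution_scores)
```

With these invented scores only the last client is flagged: its z-score is roughly -2, while the other four sit above the mean.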

Original abstract

Federated learning (FL) enables collaborative learning of computer vision models, where privacy and regulatory constraints prevent centralizing data across devices or organizations. However, practical FL deployments often exhibit severe class imbalance and label skew, causing standard aggregation protocols to overfit dominant clients and degrade minority-class performance. We propose a data-free, class-wise contribution estimation and aggregation framework based on logit maximization (CELM) that does not require sharing raw data, client metadata, or auxiliary public datasets. The FL server probes client updates to obtain class-wise evidence scores and assembles a cross-client evidence matrix, which quantifies both per-class competence and class coverage. Using this matrix, we compute contribution weights that upweight clients providing strong, discriminative evidence for underrepresented classes. The resulting aggregation is stable due to simplex constraints and momentum smoothing, and it remains compatible with standard FL training pipelines. We evaluate the approach on representative vision benchmarks under controlled non-IID and pathological label splits, demonstrating that CELM-based aggregation improves robustness to imbalance and statistical heterogeneity, while yielding better performance without requiring any additional data exchange.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note

private letter to a colleague

CELM is a new data-free contribution estimation method for non-IID federated learning based on logit maximization, but the data-free logit probing needs clearer justification to support the claims.


The main thing to know is that the paper proposes CELM, a data-free method using logit maximization to estimate class-wise client contributions in federated learning and then builds a cross-client evidence matrix to adjust aggregation weights for better handling of non-IID data. It is new in combining logit probing of client updates with this matrix construction to upweight clients strong on minority classes, all without sharing data or using public datasets. The approach stays compatible with existing FL pipelines and adds stability through constraints and smoothing.

The work does a solid job identifying the issue of class imbalance and label skew in practical FL and offering a targeted fix that claims improved robustness on vision tasks under controlled splits.

The soft spot is around the reliability of those logit-based evidence scores. The stress-test concern is fair here: since the method is data-free, it's unclear what inputs are used for the maximization step. If it's not tied to actual data distributions, the scores could fail to measure true per-class competence or coverage and instead pick up on unrelated factors like model parameters. The abstract is high-level on the evaluation, so more details on the exact computation and quantitative results would help. I would want to see if the method actually improves minority class performance in the experiments and how it compares to other FL aggregation baselines.

This is for researchers focused on federated learning aggregation techniques, especially those working with heterogeneous client data. It has enough of a distinct idea to merit a serious referee. I recommend putting it through peer review, asking the authors to expand on the logit maximization procedure and include more ablation studies.

Referee Report

2 major / 1 minor

Summary. The paper proposes CELM, a data-free client contribution estimation and aggregation framework for federated learning. The server probes client updates via logit maximization to derive class-wise evidence scores, assembles a cross-client evidence matrix quantifying per-class competence and coverage, computes contribution weights to upweight clients strong on underrepresented classes, and applies the weights with simplex constraints and momentum smoothing. The approach is evaluated on vision benchmarks under controlled non-IID and pathological label splits, claiming improved robustness to imbalance and heterogeneity without raw data sharing or auxiliary datasets.

Significance. If the logit-derived evidence scores can be shown to correlate with actual per-class discriminative power on private client distributions, the method would provide a practical, privacy-preserving alternative to existing contribution estimation techniques that often rely on public data or metadata, potentially improving FL performance on imbalanced real-world deployments.

major comments (2)

[Method description of logit probing and evidence matrix construction] The core premise that class-wise evidence scores from logit maximization on client updates reliably quantify per-class competence and coverage (without raw data or labels) is load-bearing for the contribution weights and robustness claims, yet the manuscript provides no derivation or empirical validation of this correlation. If the maximization operates solely on parameters without reference to a data manifold, the matrix may capture initialization artifacts rather than discriminative power, as noted in the stress-test concern.
[Evaluation section] The evaluation claims performance gains on representative vision benchmarks under non-IID and pathological splits, but the abstract and available text lack specific quantitative results, baseline comparisons, ablation studies on the evidence matrix, or error analysis to substantiate the robustness improvements.

minor comments (1)

[Abstract and method overview] Clarify the exact inputs (if any) used during logit maximization, as the data-free claim is central but potentially ambiguous given that logit computation normally requires samples.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and indicate the revisions planned for the next version.


Referee: [Method description of logit probing and evidence matrix construction] The core premise that class-wise evidence scores from logit maximization on client updates reliably quantify per-class competence and coverage (without raw data or labels) is load-bearing for the contribution weights and robustness claims, yet the manuscript provides no derivation or empirical validation of this correlation. If the maximization operates solely on parameters without reference to a data manifold, the matrix may capture initialization artifacts rather than discriminative power, as noted in the stress-test concern.

Authors: We agree that the current manuscript would benefit from an explicit derivation linking logit maximization to per-class competence. The process optimizes a synthetic input to maximize the logit for each class on the client's model parameters after local training; this serves as a proxy for how strongly the model has internalized discriminative features for that class from its private data. In the revision we will add a short theoretical subsection deriving this correlation under the assumption that local SGD aligns the model with the client's data distribution. We will also include empirical correlation analysis between the resulting evidence scores and class-wise accuracies computed on client-held validation splits. For the initialization artifact concern, we will report additional stress-test experiments using varied random initializations to confirm stability of the evidence matrix. revision: yes
Referee: [Evaluation section] The evaluation claims performance gains on representative vision benchmarks under non-IID and pathological splits, but the abstract and available text lack specific quantitative results, baseline comparisons, ablation studies on the evidence matrix, or error analysis to substantiate the robustness improvements.

Authors: The full experimental section already reports quantitative results on vision benchmarks (CIFAR-10, CIFAR-100, MNIST) under controlled non-IID and pathological partitions, with comparisons to FedAvg, FedProx and other contribution-aware baselines, plus ablations that isolate the evidence-matrix components. To address the referee's observation, we will (i) revise the abstract to include the main accuracy and robustness gains, (ii) add a dedicated ablation subsection focused on the evidence matrix, and (iii) include error-bar analysis and failure-case discussion in the main text or appendix. revision: yes
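The first response describes logit maximization as optimizing a synthetic input so that a client's locally trained model produces a large logit for a chosen class. A minimal sketch of that idea follows, assuming a toy linear classifier so the input gradient is available in closed form. The function names, the box constraint on the input, the starting point, the step count, and the learning rate are illustrative assumptions rather than the paper's settings; a real network would need automatic differentiation instead of the hand-written gradient.

```python
def logits(weights, bias, x):
    """Logits of a linear classifier: one row of `weights` per class."""
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def maximize_logit(weights, bias, target, steps=50, lr=0.1):
    """Gradient ascent on a synthetic input to raise the `target` logit.

    The input starts at mid-gray and is clipped to [0, 1] after each step
    (an assumed constraint). For a linear model the gradient of the target
    logit with respect to the input is simply the target's weight row.
    """
    dim = len(weights[0])
    x = [0.5] * dim
    grad = weights[target]
    for _ in range(steps):
        x = [min(1.0, max(0.0, x_i + lr * g)) for x_i, g in zip(x, grad)]
    return logits(weights, bias, x)[target], x


def evidence_row(weights, bias, steps=50, lr=0.1):
    """One row of a client-by-class evidence matrix: max logit per class."""
    return [maximize_logit(weights, bias, c, steps, lr)[0]
            for c in range(len(weights))]


# Toy client model: class 0 has strong learned weights, class 1 has
# near-zero weights, as if the client had little data for it.
client_weights = [[2.0, -1.0],
                  [0.1, 0.0]]
client_bias = [0.0, 0.0]
row = evidence_row(client_weights, client_bias)
```

In this toy case the achievable logit for class 0 is much larger than for class 1, which is the sort of asymmetry an evidence score is meant to pick up. Whether the same holds for deep models trained on private data is exactly what the referee asks the authors to validate.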

Circularity Check

0 steps flagged

No circularity: contribution weights derived from independent logit-probing construction


The paper's central derivation computes class-wise evidence scores by probing client model updates via logit maximization, assembles these into a cross-client evidence matrix that quantifies per-class competence and coverage, and then derives contribution weights from the matrix under simplex and momentum constraints. This chain introduces a new data-free probing step whose outputs are not defined in terms of the final aggregation performance or fitted to the target metrics; the matrix construction and weighting formulas stand as independent operations on the probed logits. No equations reduce a prediction to a fitted input by construction, no load-bearing premise rests solely on self-citation, and no uniqueness theorem or ansatz is smuggled in from prior author work. Evaluation on vision benchmarks under non-IID splits provides external empirical checks rather than tautological confirmation, rendering the derivation self-contained.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 1 invented entity

The method rests on domain assumptions about logit probing and introduces a new evidence matrix construct; the only free parameter mentioned is the momentum smoothing parameter.

free parameters (1)

momentum smoothing parameter
Used to stabilize aggregation weights; value not specified but implied as a tunable element for stability.

axioms (1)

domain assumption: Client model updates can be probed by the server to extract class-wise logit evidence without accessing raw data or labels.
Foundational to the data-free property and evidence score computation.

invented entities (1)

cross-client evidence matrix · no independent evidence
purpose: Quantifies per-class competence and class coverage across clients to compute contribution weights.
New structure assembled from probed logits to enable class-wise weighting.

pith-pipeline@v0.9.0 · 5732 in / 1272 out tokens · 51960 ms · 2026-05-20T13:47:34.767962+00:00 · methodology


